
QNPR descriptors

These descriptors are used for QNPR (Quantitative Name Property Relationship) modelling, from which they take their name. They are derived directly from a compound's name or SMILES string.

For each molecule, either the canonical SMILES string or the IUPAC name is split into fragments whose lengths fall within a range set in the configuration. All digits 0-9 are replaced with the § symbol.

With fragment lengths 1-3, this gives:

  • CCC: C, CC and CCC as descriptors
  • c1ccccc1 (c§ccccc§ after digit substitution): c, §, c§, §c, cc, c§c, §cc, ccc and cc§ as descriptors
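The fragmentation described above can be sketched in a few lines of Python. This is only an illustration, not the tool's actual implementation: the function name `qnpr_fragments` and its parameters are hypothetical, and it assumes that every contiguous substring within the length range counts as a fragment.

```python
import re


def qnpr_fragments(text: str, min_len: int = 1, max_len: int = 3) -> set[str]:
    """Return the distinct substrings of `text` with lengths min_len..max_len.

    Hypothetical helper: assumes every contiguous substring is a fragment.
    """
    # Replace each digit 0-9 with the § symbol before fragmenting.
    masked = re.sub(r"[0-9]", "§", text)

    fragments = set()
    for length in range(min_len, max_len + 1):
        # Slide a window of the current length across the masked string.
        for start in range(len(masked) - length + 1):
            fragments.add(masked[start:start + length])
    return fragments


print(sorted(qnpr_fragments("CCC")))
print(sorted(qnpr_fragments("c1ccccc1")))
```

A set is used here because the examples list each fragment once; a counter would be the natural choice if occurrence counts are needed as descriptor values.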

Parameters

| Parameter | Effect |
|---|---|
| Fragments from to | Create string fragments with lengths from x to y |
| Minimum fragment count threshold | Descriptors whose pattern occurs fewer than # times in the whole dataset are filtered out |
| Type of fragments | Naming scheme (SMILES, IUPAC, ...) |
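As a rough illustration of the minimum fragment count threshold, the sketch below counts fragments per molecule and drops those that are too rare across the dataset. The names `count_fragments` and `filter_rare_fragments` are hypothetical, and the sketch assumes that "occurrences" means the total count summed over all molecules; the actual tool may instead count the number of molecules that contain the fragment.

```python
from collections import Counter


def count_fragments(text: str, min_len: int = 1, max_len: int = 3) -> Counter:
    """Count every contiguous substring of `text` with length min_len..max_len."""
    return Counter(
        text[start:start + length]
        for length in range(min_len, max_len + 1)
        for start in range(len(text) - length + 1)
    )


def filter_rare_fragments(per_molecule: list[Counter], min_count: int) -> list[dict]:
    """Drop fragments whose total count over the dataset is below min_count."""
    # Assumption: occurrences are summed over all molecules in the dataset.
    totals = Counter()
    for counts in per_molecule:
        totals.update(counts)

    kept = {frag for frag, total in totals.items() if total >= min_count}
    return [
        {frag: n for frag, n in counts.items() if frag in kept}
        for counts in per_molecule
    ]


dataset = ["CCC", "CCO", "CO"]  # toy SMILES strings without digits
per_molecule = [count_fragments(smiles) for smiles in dataset]
filtered = filter_rare_fragments(per_molecule, min_count=2)
for smiles, descriptors in zip(dataset, filtered):
    print(smiles, descriptors)
```

With a threshold of 2, fragments that appear only once in this toy dataset (CCC and CCO) are removed from every molecule's descriptor set.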
 

Literature

(1) Thormann M, Vidal D, Almstetter M, Pons M; Nomen Est Omen: Quantitative Prediction of Molecular Properties Directly from IUPAC Names; The Open Applied Informatics Journal; 1 (1), 28-32, 2007.
