QNPR

QNPR descriptors

These descriptors are used for QNPR (Quantitative Name Property Relationship), which gives them their name. The descriptors are derived directly from the compound's name or SMILES string.

For each molecule, either the canonical SMILES or the IUPAC name is split into fragments of a specified length range, which is set in the configuration. All digits 0-9 are replaced with the § symbol.

Thus, when using fragments of length 1-3, we get:

CCC: C, CC and CCC as descriptors
c1ccccc1: c, §, c§, c§c, §cc, ccc and cc§ as descriptors
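
The fragmentation described above can be sketched in a few lines of Python. This is a minimal illustration, not the actual implementation: the function name `qnpr_fragments` and its parameters `min_len` and `max_len` are hypothetical, and it assumes that every distinct substring of the masked string counts as one descriptor. Under that assumption the sketch also yields `cc` and `§c` for the benzene example, which the listing above does not mention; whether the real implementation produces them is not stated here.

```python
import re


def qnpr_fragments(name, min_len=1, max_len=3):
    """Return the distinct substrings of `name` with length min_len..max_len.

    Every ASCII digit 0-9 is first replaced by the '§' symbol.
    (Hypothetical helper; names and defaults are assumptions.)
    """
    masked = re.sub(r"[0-9]", "§", name)
    fragments = set()
    for length in range(min_len, max_len + 1):
        # Slide a window of the current length over the masked string.
        for start in range(len(masked) - length + 1):
            fragments.add(masked[start:start + length])
    return fragments


print(sorted(qnpr_fragments("CCC")))
print(sorted(qnpr_fragments("c1ccccc1")))
```

The sketch returns a set, so it only records which fragments are present; if occurrence counts per molecule are needed, a counter could be used instead.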

Parameters

Parameter	Effect
Fragments from to	Create string fragments with lengths from x to y
Minimum fragment count threshold	If the pattern occurs fewer than # times in the whole dataset, the descriptor is filtered out
Type of fragments	Naming scheme (SMILES, IUPAC, ...)
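
The effect of the minimum fragment count threshold can be illustrated with a short Python sketch. The function name `filter_by_count` is hypothetical, and the sketch assumes that each molecule contributes at most one occurrence per distinct fragment; the description above does not say whether repeated occurrences within a single molecule are counted separately.

```python
from collections import Counter


def filter_by_count(fragment_sets, min_count):
    """Return the fragments that occur at least `min_count` times in the dataset.

    `fragment_sets` holds one collection of distinct fragments per molecule.
    (Hypothetical helper; the counting rule is an assumption.)
    """
    counts = Counter()
    for fragments in fragment_sets:
        # Each distinct fragment of a molecule adds one to its dataset count.
        counts.update(set(fragments))
    return {frag for frag, n in counts.items() if n >= min_count}


dataset = [{"C", "CC"}, {"C", "CO"}, {"C", "CC", "CCC"}]
print(sorted(filter_by_count(dataset, 2)))
```

With a threshold of 2, only fragments found in at least two of the three example molecules remain as descriptors.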

Literature

(1) Thormann M, Vidal D, Almstetter M, Pons M; Nomen Est Omen: Quantitative Prediction of Molecular Properties Directly from IUPAC Names; The Open Applied Informatics Journal; 1 (1), 28-32, 2007.