These descriptors are used for QNPR (Quantitative Name Property Relationship) modelling, from which they take their name. The descriptors are derived directly from the compound's name or SMILES string.
For each molecule, either the canonical SMILES or the IUPAC name is split into fragments of a specified length, which is determined by the configuration. All digits 0-9 are substituted with the § symbol.
Thus we will get for
when using fragments of length 1-3
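The fragmentation step described above can be sketched in Python. This is only an illustrative sketch, not the tool's actual implementation: the function name `name_fragments` and its parameters are hypothetical, and it assumes that fragments are overlapping substrings taken with a sliding window and that each descriptor value is the number of times a fragment occurs.

```python
import re
from collections import Counter


def name_fragments(text, min_len=1, max_len=3):
    """Count substring fragments of a SMILES string or IUPAC name.

    Hypothetical helper: assumes overlapping (sliding-window) fragments
    and count-valued descriptors, which the source does not specify.
    """
    # Replace every digit 0-9 with the § symbol, as described in the text.
    masked = re.sub(r"[0-9]", "§", text)

    counts = Counter()
    # Collect all substrings of each length from min_len to max_len.
    for n in range(min_len, max_len + 1):
        for start in range(len(masked) - n + 1):
            counts[masked[start:start + n]] += 1
    return counts


# Example call on the SMILES string of cyclopropane.
fragments = name_fragments("C1CC1", min_len=1, max_len=3)
```

Under these assumptions, `C1CC1` is first masked to `C§CC§`, so the fragment `C§` is counted twice and `CC` once.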
Parameter | Effect
Fragments from to | Create string fragments with length from x to y
Minimum fragment count threshold | If the pattern occurs fewer than # times in the whole dataset, the descriptor is filtered out
Type of fragments | Naming scheme (SMILES, IUPAC, ...)
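The effect of the minimum fragment count threshold can be illustrated with a short Python sketch. The function name `filter_rare_fragments` is hypothetical, and the sketch assumes that "occurrences in the whole dataset" means the total count summed over all molecules; the tool may instead count the number of molecules containing the fragment.

```python
from collections import Counter


def filter_rare_fragments(per_molecule_counts, min_count):
    """Drop fragments whose total count over the dataset is below min_count.

    Hypothetical helper: per_molecule_counts is a list of
    {fragment: count} mappings, one per molecule.
    """
    # Sum the counts of every fragment over all molecules (assumption:
    # the threshold applies to total occurrences, not molecule count).
    totals = Counter()
    for counts in per_molecule_counts:
        totals.update(counts)

    # Keep only fragments that reach the threshold.
    kept = {frag for frag, total in totals.items() if total >= min_count}

    return [
        {frag: n for frag, n in counts.items() if frag in kept}
        for counts in per_molecule_counts
    ]


# Two molecules; with a threshold of 2 only the fragment "C" survives.
dataset = [{"C": 3, "CC": 1}, {"C": 1, "O": 1}]
filtered = filter_rare_fragments(dataset, min_count=2)
```

Filtering on dataset-wide totals keeps the descriptor columns identical for every molecule, so the result can be turned directly into a fixed-width descriptor matrix.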
(1) Thormann M, Vidal D, Almstetter M, Pons M; Nomen Est Omen: Quantitative Prediction of Molecular Properties Directly from IUPAC Names; The Open Applied Informatics Journal; 1 (1), 28-32, 2007.