Bagging (bootstrap aggregating) is a meta-learning technique that builds an ensemble of models trained on random training sets, each created from the original training set by sampling with replacement.
The final model is a simple average of the individual models within the ensemble.
In other words, bagging involves the following steps (a code sketch follows the list):
- replicating multiple training sets from the original training set, each of the same size as the original set
- defining the respective validation sets as the samples not included in the corresponding training sets; by chance, about 37% of the samples are not included in a given training set (the probability that a sample is never drawn in n draws with replacement is (1 − 1/n)^n ≈ 1/e ≈ 0.37)
- training multiple models using a particular machine learning method (the same method and parameters for each model)
- calculating the validated prediction for a sample as the average prediction of those models that had this compound in their validation sets
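A minimal sketch of this procedure, assuming scikit-learn decision trees as the base learner and a randomly generated toy dataset (the model choice, the data, and the number of models are illustrative assumptions, not part of the original description):

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))                  # toy descriptor matrix (assumption)
y = X[:, 0] + rng.normal(scale=0.1, size=200)  # toy property values (assumption)

n_samples, n_models = len(X), 50
models, oob_masks = [], []

for _ in range(n_models):
    # replicate a training set of the original size by sampling with replacement
    idx = rng.integers(0, n_samples, size=n_samples)
    # the validation (out-of-bag) set is every sample not drawn
    oob = np.ones(n_samples, dtype=bool)
    oob[idx] = False
    models.append(DecisionTreeRegressor().fit(X[idx], y[idx]))
    oob_masks.append(oob)

# validated prediction for each sample: average over the models
# that had this sample in their validation (out-of-bag) set
pred_sum = np.zeros(n_samples)
pred_cnt = np.zeros(n_samples)
for model, oob in zip(models, oob_masks):
    pred_sum[oob] += model.predict(X[oob])
    pred_cnt[oob] += 1
oob_pred = pred_sum / np.maximum(pred_cnt, 1)  # validated predictions
```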
Bagging achieves two important goals, validation and assessment of predictive uncertainty:
- Obtaining correctly validated predictive statistics (similar to cross-validation)
- Obtaining a standard deviation for each prediction, which is possible because an ensemble of models is used rather than a single model.
This standard deviation (referred to as BAGGING-STD) can be used to estimate prediction uncertainty and to define the applicability domain of the model.
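Continuing the sketch above (reusing its models, out-of-bag masks, and data), BAGGING-STD can be computed as the per-sample standard deviation over the out-of-bag predictions; the 0.5 applicability-domain cutoff is a purely illustrative assumption:

```python
# collect each model's out-of-bag predictions; NaN marks samples
# that a given model saw during training
all_preds = np.full((n_models, n_samples), np.nan)
for m, (model, oob) in enumerate(zip(models, oob_masks)):
    all_preds[m, oob] = model.predict(X[oob])

# BAGGING-STD: spread of the ensemble's out-of-bag predictions per sample
bagging_std = np.nanstd(all_preds, axis=0)

# samples with low uncertainty are treated as inside the applicability
# domain; the threshold is a hypothetical value for illustration
in_domain = bagging_std < 0.5
```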