Integrating background knowledge from internet databases into predictive toxicology models

SAR and QSAR in Environmental Research. Vol. 21, No. 1-2. London: Taylor & Francis, 2010, pp. 21-35

Year of publication: 2010

ISBN/ISSN: 1029-046X ; 1062-936X

Publication type: Journal article

Language: English

DOI/URN: 10.1080/10629360903560579

Verified by: Library

Abstract


While data integration for data analysis has been investigated extensively in biological applications (see, e.g., [1, 2, 3]), it has received far less attention in computational chemistry and QSAR research. With the availability and growing number of chemical databases on the web, such data integration efforts become an intriguing possibility (and, in fact, a necessity). In this paper, we take a first step towards the following vision and scenario for predictive toxicology applications: Given a new structure to be predicted, the first step is to gather (integrate) all relevant information from internet databases for the structure itself and for all structures with available information for the endpoint of interest. In a second step, the collected information is combined statistically into a prediction for the new structure. We simulate this scenario with three endpoints (datasets) from the DSSTox database [4] and collect information from three public chemical databases: PubChem, ChemBank and Sigma-Aldrich. In the experiments, we investigate whether adding background knowledge from the three databases can improve predictive performance (over using chemical structure alone) in a statistically significant way. To this end, we define groups of features (belonging together from an application point of view) from the three databases and perform a variant of forward selection to include those feature groups in a prediction model.
Our experiments show that the integration of background knowledge from internet databases can significantly improve prediction performance, in particular for regression tasks.
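
The abstract mentions a variant of forward selection over feature groups but does not spell it out. The following is a minimal sketch of a generic greedy group-wise forward selection, not the authors' procedure: the feature names, the toy utility values, the `score` callback and the `min_gain` threshold are all hypothetical, and the fixed gain threshold stands in for the statistical significance test the abstract refers to.

```python
from typing import Callable, Dict, List, Sequence


def forward_select_groups(
    groups: Dict[str, List[str]],
    base_features: Sequence[str],
    score: Callable[[List[str]], float],
    min_gain: float = 0.0,
) -> List[str]:
    """Greedily add whole feature groups while the score keeps improving.

    `score` maps a feature list to a quality value (higher is better),
    for example a cross-validated performance estimate.
    """
    features = list(base_features)   # start from structure-only features
    best = score(features)
    remaining = dict(groups)         # copy so the caller's dict is untouched
    selected: List[str] = []

    while remaining:
        # Gain of each candidate group when added to the current feature set.
        gains = {
            name: score(features + cols) - best
            for name, cols in remaining.items()
        }
        name = max(gains, key=gains.get)
        if gains[name] <= min_gain:  # simple stand-in for a significance test
            break
        selected.append(name)
        features = features + remaining.pop(name)
        best = score(features)

    return selected


# Toy example with made-up per-feature utilities (assumption, not real data).
toy_utility = {
    "struct_fp": 0.60,
    "pubchem_logp": 0.10,
    "pubchem_bioassay": 0.05,
    "chembank_assay": 0.02,
    "sigma_flash_point": -0.01,
}
toy_groups = {
    "pubchem": ["pubchem_logp", "pubchem_bioassay"],
    "chembank": ["chembank_assay"],
    "sigma_aldrich": ["sigma_flash_point"],
}


def toy_score(feats: List[str]) -> float:
    # Additive toy score; a real score would train and evaluate a model.
    return sum(toy_utility[f] for f in feats)


selected = forward_select_groups(
    toy_groups, ["struct_fp"], toy_score, min_gain=0.01
)
print(selected)
```

In practice `toy_score` would be replaced by a function that trains a model on the given features and returns an estimate of its predictive performance, and the stopping rule by a proper statistical test.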

Authors


Edelstein, Mira (Author)
Buchwald, Fabian (Author)
Richter, Lothar (Author)
Kramer, Stefan (Author)

Classification


DFG subject area:
4.43 - Computer Science

DDC subject group:
Computer Science