DataWarrior: an open-source program for chemistry aware data visualization and analysis.

Thomas Sander, Joel Freyss, Modest von Korff, Christian Rufener
Author Information
  1. Thomas Sander: Department of Information Management Drug Discovery, Actelion Ltd., Gewerbestrasse 16, CH-4123 Allschwil, Switzerland.

Abstract

Drug discovery projects in the pharmaceutical industry accumulate thousands of chemical structures and tens of thousands of data points from a dozen or more biological and pharmacological assays. Interpreting these data adequately requires understanding which molecular families are present, which structural motifs correlate with measured properties, and which small structural changes cause large property changes. Data visualization and analysis software with sufficient chemical intelligence to support chemists in this task is rare. To help fill this gap, we released our in-house developed, chemistry-aware data analysis program DataWarrior for free public use. This paper gives an overview of DataWarrior's functionality and architecture. As an example, a new unsupervised, two-dimensional scaling algorithm is presented, which employs vector-based or nonvector-based descriptors to visualize the chemical or pharmacophore space of even large data sets. DataWarrior uses this method to interactively explore chemical space, activity landscapes, and activity cliffs.
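The notion of an activity cliff mentioned in the abstract (a small structural change that causes a large property change) can be illustrated with a minimal sketch. This is not DataWarrior's algorithm or API: it assumes each compound is represented by a hypothetical set of integer feature IDs standing in for a fingerprint, uses plain Tanimoto similarity, and flags pairs that are similar yet differ strongly in activity. The function names, thresholds, compound names, and activity values are all made up for illustration.

```python
from itertools import combinations


def tanimoto(a, b):
    """Tanimoto (Jaccard) similarity between two sets of feature IDs."""
    if not a and not b:
        return 1.0  # convention: two empty fingerprints count as identical
    return len(a & b) / len(a | b)


def find_activity_cliffs(compounds, min_similarity=0.75, min_activity_gap=1.0):
    """Return (name1, name2, similarity, gap) for structurally similar
    pairs whose activities differ by at least min_activity_gap.

    compounds: iterable of (name, feature_set, activity) tuples.
    Thresholds are illustrative assumptions, not values from the paper.
    """
    cliffs = []
    for (n1, fp1, act1), (n2, fp2, act2) in combinations(compounds, 2):
        similarity = tanimoto(fp1, fp2)
        gap = abs(act1 - act2)
        if similarity >= min_similarity and gap >= min_activity_gap:
            cliffs.append((n1, n2, similarity, gap))
    # Largest activity gap first
    cliffs.sort(key=lambda c: c[3], reverse=True)
    return cliffs


# Hypothetical data: feature sets and pIC50-like activities are invented.
compounds = [
    ("cpd-A", frozenset({1, 2, 3, 4, 5, 6, 7, 8}), 8.2),
    ("cpd-B", frozenset({1, 2, 3, 4, 5, 6, 7, 9}), 5.9),
    ("cpd-C", frozenset({1, 2, 3, 4, 5, 6, 7, 8, 10}), 8.0),
    ("cpd-D", frozenset({20, 21, 22}), 4.0),
]

for name1, name2, similarity, gap in find_activity_cliffs(compounds):
    print(f"{name1} / {name2}: similarity={similarity:.2f}, activity gap={gap:.2f}")
```

The pairwise loop is quadratic in the number of compounds, which is fine for a sketch but would need a neighbor search or similarity cutoff strategy for large data sets.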

MeSH Terms

Algorithms
Artificial Intelligence
Combinatorial Chemistry Techniques
Data Display
Data Mining
Databases, Chemical
Drug Discovery
Drug Industry
Models, Molecular
Molecular Conformation
Programming Languages
Software
Structure-Activity Relationship
Support Vector Machine
