Data-Driven Approach to Drug Toxicity Prediction

A research article published in Cell Chemical Biology in 2016

Gayvert, K. M., Madhukar, N. S., & Elemento, O. (2016). A Data-Driven Approach to Predicting Successes and Failures of Clinical Trials. Cell Chemical Biology, 23(10), 1294–1301.

David C. Young mentioned in 2009 that the side effects of drugs are usually not identified until clinical trials, which may result in a drug failing clinical trials after a large amount of money has already been spent. Failures in clinical trials for safety reasons have skyrocketed over the past three decades. How can this obstacle be overcome? The first thing that comes to my mind is Big Data. Data-driven approaches have been used in almost all areas to solve different problems, which is why I have started blogging about research articles from this cutting-edge area that interest me.

In this specific case, Elemento et al. sought to use a “moneyball” approach, inspired by the effective use of sabermetrics in predicting successful baseball players (I don’t know baseball at all), to predict clinical toxicity, which is highly related to the successes and failures of clinical trials. This approach is called Predicting the Odds of Clinical Trial Outcomes Using Random Forest (PrOCTOR).

[Figure 1]

This approach is shown in this figure (for a detailed illustration, check the video presented by the author: click here). It integrates chemical properties, drug-likeness measures, and target-based properties of a molecule into a random forest model to predict whether the drug is likely to fail clinical trials for toxicity reasons.
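To make the idea more concrete, here is a minimal sketch of what "feeding several kinds of features into a random forest" can look like. This is not the PrOCTOR code: the three feature names, the toy data, and the rule used to label a drug as toxic are all made up for illustration, and the tiny pure-Python forest (bootstrap samples plus a random subset of features at each split) is only a simplified stand-in for a real library implementation.

```python
import random

# Hypothetical feature names, one per feature group mentioned in the text.
FEATURES = ["molecular_weight", "drug_likeness", "target_expression"]

def make_toy_data(n=120, seed=1):
    """Synthetic drugs labelled with an arbitrary, made-up toxicity rule."""
    rng = random.Random(seed)
    rows, labels = [], []
    for _ in range(n):
        mw, dl, te = rng.uniform(150, 900), rng.random(), rng.random()
        rows.append([mw, dl, te])
        labels.append(1 if (mw > 500 and dl < 0.5) or te > 0.8 else 0)
    return rows, labels

def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)

def best_split(rows, labels, feature_ids):
    """Return the (feature, threshold) that lowers weighted Gini most, or None."""
    best, best_score = None, gini(labels)
    for f in feature_ids:
        for t in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if score < best_score - 1e-12:
                best, best_score = (f, t), score
    return best

def build_tree(rows, labels, rng, depth=0, max_depth=4, n_try=2):
    """Grow one tree; each split only looks at n_try randomly chosen features."""
    leaf = sum(labels) / len(labels)  # fraction of toxic drugs at this node
    if depth >= max_depth:
        return leaf
    split = best_split(rows, labels, rng.sample(range(len(rows[0])), n_try))
    if split is None:
        return leaf
    f, t = split
    go_left = [r[f] <= t for r in rows]
    children = []
    for side in (True, False):
        sub_rows = [r for r, g in zip(rows, go_left) if g == side]
        sub_labels = [y for y, g in zip(labels, go_left) if g == side]
        children.append(build_tree(sub_rows, sub_labels, rng, depth + 1, max_depth, n_try))
    return (f, t, children[0], children[1])

def predict_tree(node, row):
    """Walk from the root to a leaf and return that leaf's toxic fraction."""
    while isinstance(node, tuple):
        f, t, left, right = node
        node = left if row[f] <= t else right
    return node

def train_forest(rows, labels, n_trees=25, seed=0):
    """Train each tree on a bootstrap sample (drawn with replacement)."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]
        forest.append(build_tree([rows[i] for i in idx], [labels[i] for i in idx], rng))
    return forest

def toxicity_score(forest, row):
    """Average the per-tree leaf fractions into a score between 0 and 1."""
    return sum(predict_tree(tree, row) for tree in forest) / len(forest)

rows, labels = make_toy_data()
forest = train_forest(rows, labels)
toxic_like = toxicity_score(forest, [850.0, 0.1, 0.95])
safe_like = toxicity_score(forest, [200.0, 0.9, 0.10])
print(f"toxic-like drug: {toxic_like:.2f}, safe-like drug: {safe_like:.2f}")
```

The point of the sketch is only the shape of the pipeline: every drug becomes one row of numbers drawn from different feature groups, and the forest turns that row into a single score. For real work I would reach for an established random forest library rather than a hand-rolled one like this.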

The set of 48 features taken into account in this research is listed in this file (click here).