Finding Needles in the Haystack: Integrating ‘Omics Data to Predict Toxicity


Advances in sequencing, mass spectrometry, and imaging have allowed us to measure numerous endpoints with high sensitivity more quickly than ever before. Instead of characterizing changes in a few genes at a time, it is now cost-efficient to measure the entire epigenome, transcriptome, proteome, and metabolome. Using these big datasets to elucidate molecular changes at a level of mechanistic detail not previously possible holds great promise for revolutionizing predictive toxicology.

The only thing holding us back is the data itself. Each ‘omics technology generates thousands of data points, and it’s easy to get lost in the numbers. New methods are needed to integrate and interpret these complex datasets. The Symposium Session “Integrated ‘Omics Approaches to Toxicity Assessment,” led by Dr. Julia Rager and Dr. Scott Auerbach during the SOT 58th Annual Meeting and ToxExpo, sought to introduce toxicologists to cutting-edge bioinformatics techniques and show just how powerful ‘omics data can be for toxicology.

Dr. Karan Uppal of Emory University highlighted the importance of dataset integration for discovering patterns across datasets. By utilizing data from multiple sources, whether across technologies or from different labs, more confidence can be placed in results pointing to the same gene or pathway. Dr. Uppal introduced the key approaches and data analysis packages for integrating datasets. The most powerful methods rely on data-driven integration, in which associations are identified from the raw data first. After a network between datasets is created, biological interpretation, often through pathway analysis, is critical for understanding the associations. xMWAS, developed by Dr. Uppal’s group, performs data-driven integration to analyze multiple ‘omics datasets and is available as both a user-friendly web-based application and an R package for more computationally inclined users. Multiple other analysis packages are available for specific needs, though some software packages outlast others as the field rapidly advances.
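To make the idea of data-driven integration concrete, here is a minimal sketch in Python. It is not xMWAS itself (which builds its networks with multivariate methods such as sparse partial least squares); instead it illustrates the simplest version of the concept: correlate features measured on the same samples across two ‘omics datasets, then keep strong associations as network edges. All data here are simulated, with one association deliberately planted.

```python
# Illustrative sketch of data-driven cross-omics integration.
# NOT the xMWAS algorithm -- a simplified correlation-network analogue.
import numpy as np

rng = np.random.default_rng(0)
n_samples = 20

# Hypothetical data: rows are the same biological samples in both matrices.
transcripts = rng.normal(size=(n_samples, 5))   # simulated transcriptomics
metabolites = rng.normal(size=(n_samples, 4))   # simulated metabolomics
# Plant one real association: metabolite 0 tracks transcript 2.
metabolites[:, 0] = transcripts[:, 2] + 0.1 * rng.normal(size=n_samples)

def cross_correlation_edges(x, y, threshold=0.7):
    """Return (i, j, r) edges where |Pearson r| between feature i of x
    and feature j of y meets the threshold."""
    xc = (x - x.mean(axis=0)) / x.std(axis=0)   # z-score each feature
    yc = (y - y.mean(axis=0)) / y.std(axis=0)
    r = xc.T @ yc / x.shape[0]                  # all-pairs correlation matrix
    return [(i, j, r[i, j])
            for i in range(r.shape[0])
            for j in range(r.shape[1])
            if abs(r[i, j]) >= threshold]

edges = cross_correlation_edges(transcripts, metabolites)
```

The planted transcript–metabolite pair is recovered as a strong edge, which downstream pathway analysis would then try to interpret biologically. Real tools add the multivariate modeling, multiple-testing control, and community detection that a raw correlation screen lacks.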

Dr. Scott Auerbach of the National Institute of Environmental Health Sciences National Toxicology Program (NTP) showed how these principles of data-driven integration could be used to predict the toxicity of aromatic phosphate flame retardants. NTP performed a five-day in vivo toxicogenomics study in rats to evaluate the toxicity of six different flame retardants, generating data on organ weights, clinical pathology, liver transcriptomics, and serum metabolomics. First, Dr. Auerbach created a network of known rat liver toxicogenomic associations using the freely available DrugMatrix database. Then each of the data streams (transcriptional and clinical endpoints) altered by the different flame retardants was overlaid onto this established association network. This proved to be a powerful method for clustering the flame retardants by similarity of biological response and for providing an interpretive landscape across multiple datasets. Next, Dr. Auerbach mapped clusters of genes that change together. By integrating metabolomics, clinical chemistry, and organ weight data, associations could be made between transcriptional changes and pathological endpoints. Overall, data integration methods showed great promise for helping us understand how transcriptional changes can lead to adverse apical outcomes, an essential step for predictive toxicology.
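The clustering step above can be sketched in a few lines. This is not NTP’s actual pipeline; it is a generic illustration, on simulated data, of how chemicals can be grouped by the similarity of their biological response profiles: represent each chemical as a vector of endpoint changes, compute pairwise correlation distances, and cluster hierarchically.

```python
# Illustrative sketch (not NTP's pipeline): cluster chemicals by
# similarity of their simulated biological response profiles.
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import pdist

rng = np.random.default_rng(1)

# Two hypothetical response signatures across 10 endpoints
# (e.g., transcriptional fold changes plus clinical measures).
signature_a = rng.normal(size=10)
signature_b = rng.normal(size=10)

# Six simulated chemicals: 0-2 share signature A, 3-5 share signature B.
profiles = np.vstack(
    [signature_a + 0.1 * rng.normal(size=10) for _ in range(3)]
    + [signature_b + 0.1 * rng.normal(size=10) for _ in range(3)]
)

dist = pdist(profiles, metric="correlation")        # 1 - Pearson r per pair
tree = linkage(dist, method="average")              # hierarchical clustering
labels = fcluster(tree, t=2, criterion="maxclust")  # cut into two clusters
```

Chemicals sharing a signature land in the same cluster, mirroring how similar flame retardants grouped together when their altered endpoints were overlaid on the association network.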

Dr. Avi Ma’ayan of the Icahn School of Medicine at Mount Sinai presented the latest machine learning technologies that not only integrate multiple datasets but also predict the toxicity of drugs and small molecules. The Ma’ayan group has released an impressive list of software and tools that make complex associations and analyses easier for the biologist. These include Enrichr (a gene-list enrichment tool), Harmonizome (a database integrating biological knowledge from hundreds of sources), and L1000FWD (visualization of drug-induced transcriptomic signatures). L1000FWD is especially relevant to toxicologists: a chemical’s mode of action can be effectively predicted from its transcriptomic signature, and new drug compounds could be designed using this transcriptomic scaffold.
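The statistic underlying gene-list enrichment tools such as Enrichr is, at its core, a Fisher’s exact (hypergeometric) test: does a query gene list overlap a pathway’s gene set more than random draws from the genome would? The sketch below shows that test on made-up gene symbols chosen purely for illustration; it is not Enrichr’s full scoring scheme, which combines this p-value with additional ranking metrics.

```python
# Minimal sketch of the hypergeometric test behind gene-set enrichment.
# Gene symbols and background size here are illustrative assumptions.
from scipy.stats import hypergeom

def enrichment_pvalue(query, gene_set, background_size):
    """P(overlap >= observed) if the query were drawn at random
    from a background of background_size genes."""
    query, gene_set = set(query), set(gene_set)
    overlap = len(query & gene_set)
    # hypergeom.sf(k - 1, M, n, N) = P(X >= k), with M genes total,
    # n of them in the pathway, and N genes in the query list.
    return hypergeom.sf(overlap - 1, background_size,
                        len(gene_set), len(query))

query = ["CYP1A1", "CYP1B1", "AHR", "NQO1", "GSTA1"]
xenobiotic_pathway = ["CYP1A1", "CYP1B1", "NQO1", "GSTA1",
                      "UGT1A6", "ALDH3A1"]
p = enrichment_pvalue(query, xenobiotic_pathway, background_size=20000)
```

Four of the five query genes fall in the six-gene pathway, so against a ~20,000-gene background the overlap is far beyond chance and the p-value is vanishingly small, which is how an enrichment tool flags the pathway as a candidate mode of action.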

It is clear that innovative and advanced computing is needed as we continue to modernize chemical toxicity testing. Robust computational methods for integrating different data types and streams will build confidence in toxicity assessment and could be used for in silico prediction of new chemicals as they are designed or prior to biological testing. These methods will also improve our confidence in the ability of in vitro model systems to predict in vivo responses and ultimately advance predictive toxicology thousands of data points at a time.

This blog was prepared by an SOT Reporter. SOT Reporters are SOT members who volunteer to write about sessions and events they attend during the SOT Annual Meeting and ToxExpo. If you are interested in participating in the SOT Reporter program in the future, please email Giuliana Macaluso.
