A group of experts in the field of chemical similarity brought their unique skillset to explaining this important concept during the Workshop Session “Understanding the Concept of Similarity and its Applications to Toxicological Research and Risk Assessment.”
Kamel Mansouri from the National Toxicology Program Interagency Center for the Evaluation of Alternative Toxicological Methods began the Workshop by introducing the concept of chemical similarity. He reminded the audience that similarity is not as clear-cut as we may think. It is important to understand the goal of your project, as it determines which kind of similarity (such as shared functionality or a common scaffold) is most useful. The basic assumption of QSAR is that congeners, being similar compounds, give similar responses. He was the first speaker to introduce activity cliffs (structurally similar compounds with different activities) versus structural cliffs (structurally diverse compounds with similar activity). He also noted that with computational modeling, whether in training or in prediction, you can overfit the model and end up with less accurate results. Generally, supervised variable selection can produce a minimal set of descriptors that are most relevant to the endpoint.
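To make the activity cliff idea concrete, here is a minimal Python sketch of how one might flag such pairs. It is not taken from the talk: the compound names, activity values, pairwise similarities, and both cutoffs (0.8 for similarity, 2.0 log units for the activity gap) are hypothetical and chosen only for illustration.

```python
from itertools import combinations

def find_activity_cliffs(activities, similarity, sim_cutoff=0.8, activity_gap=2.0):
    """Return pairs that are structurally similar yet differ strongly in activity.

    activities: dict mapping compound name -> activity (for example a pIC50).
    similarity: callable (name_a, name_b) -> similarity score in [0, 1].
    Both cutoffs are illustrative assumptions, not recommended values.
    """
    cliffs = []
    for a, b in combinations(sorted(activities), 2):
        similar = similarity(a, b) >= sim_cutoff
        different = abs(activities[a] - activities[b]) >= activity_gap
        if similar and different:
            cliffs.append((a, b))
    return cliffs

# Hypothetical data: three compounds with made-up activities and similarities.
activities = {"cmpd_A": 8.1, "cmpd_B": 5.2, "cmpd_C": 7.9}
pair_similarity = {
    frozenset(("cmpd_A", "cmpd_B")): 0.92,
    frozenset(("cmpd_A", "cmpd_C")): 0.88,
    frozenset(("cmpd_B", "cmpd_C")): 0.35,
}

cliffs = find_activity_cliffs(
    activities, lambda a, b: pair_similarity[frozenset((a, b))]
)
print(cliffs)
```

In this toy set, only the cmpd_A and cmpd_B pair is both highly similar and far apart in activity, so it is the only pair flagged. A structural cliff check would invert the first condition and look for low similarity with a small activity difference.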
Huixiao Hong from the US FDA gave a talk titled “Structure Similarity based on Chemical Descriptors, Fingerprints, and Structural Alerts.” Chemical structure similarity is a key concept that plays a vital role in in silico methods such as QSAR. Approaches to measuring it differ in how chemical structures are represented, which distance measures are used, and the intended application. Similarity between molecules can be quantified based on chemical descriptors, fingerprints, or structural alerts, though each has pros and cons. Theoretically, similarity measures based on 3D structure are ideal and should be the most precise. In practice, the biologically active conformation is difficult to determine, so similarity measures based on 2D structures often perform equally well. Emerging methods are expected to further explore structure similarity quantification, including structural features derived from deep learning (DL) and dynamic features of chemicals in target macromolecules.
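As an illustration of how the choice of representation changes the measurement, the sketch below computes a Tanimoto coefficient on fingerprints and a Euclidean distance on descriptor vectors. This is a generic example rather than the speaker's method: the fingerprints are hypothetical sets of on-bit indices, and a real workflow would generate them with a cheminformatics toolkit and scale descriptors before computing distances.

```python
import math

def tanimoto(fp_a: set, fp_b: set) -> float:
    """Tanimoto (Jaccard) coefficient between two fingerprints.

    Each fingerprint is given as the set of its on-bit indices.
    """
    if not fp_a and not fp_b:
        return 1.0  # convention assumed here: two empty fingerprints count as identical
    shared = len(fp_a & fp_b)
    return shared / (len(fp_a) + len(fp_b) - shared)

def euclidean_distance(desc_a, desc_b) -> float:
    """Euclidean distance between two descriptor vectors of equal length."""
    return math.dist(desc_a, desc_b)

# Hypothetical fingerprints (on-bit indices), for illustration only.
fp_parent = {1, 4, 7, 9, 12}
fp_analog = {1, 4, 7, 9, 15}
fp_unrelated = {2, 3, 20}

print(tanimoto(fp_parent, fp_analog))     # 4 shared bits out of 6 distinct on-bits
print(tanimoto(fp_parent, fp_unrelated))  # no shared bits
print(euclidean_distance([0.0, 0.0], [3.0, 4.0]))
```

Note that the two measures run in opposite directions: a higher Tanimoto coefficient means more similar, while a higher Euclidean distance means less similar.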
Denis Fourches from Oerth Bio discussed the challenges of chemical similarity in a talk titled “The Multifaceted Challenges of Chemical Similarity.” The speaker again covered the basics of structural similarity and similarity cliffs versus activity cliffs, and contextualized these ideas with an example of protein degraders. He emphasized that a descriptor matrix is specific to the descriptor type used to build it; when the descriptors change, the matrix no longer applies. He recommended clustering chemicals and taking time with that step, so that the groupings still make chemical sense as the similarity analysis goes deeper. Activity cliffs are typically mispredicted by structural alerts, read-across, or QSAR, in part because there are many different types of activity cliffs. He emphasized 3D fingerprinting and introduced molecular dynamics fingerprinting, which he presented as a more accurate fingerprinting approach that looks at how compounds behave in space and with other compounds.
Alessandra Roncaglioni from Istituto di Ricerche Farmacologiche Mario Negri spoke about endpoint-specific similarity in a talk titled “Endpoint Specific Similarity Measures for Read-Across and Weight of Evidence.” Read-across is a technique for data gap filling in which information for a source chemical is used to make a prediction for a target chemical considered to be similar. Read-across is commonly used under the European Union’s REACH regulation. It can draw on a composite chemical similarity index and on structural alerts. She discussed developing databases for the biological and metabolic similarity of chemicals, for endpoints such as drug-induced liver injury, and noted that similarity can also be found in ADME(T) properties. Overall, she emphasized that data collection and prior knowledge are pivotal for endpoint-specific similarity.
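The core arithmetic of read-across can be sketched as a similarity-weighted average over the most similar source chemicals. The Python below is a simplified illustration under stated assumptions, not the composite index described in the talk: the function name, the choice of k, the 0.5 similarity floor, and all values are hypothetical.

```python
def read_across(target_similarities, source_values, k=3, min_similarity=0.5):
    """Predict a target chemical's value from its most similar source chemicals.

    target_similarities: dict source name -> similarity to the target, in [0, 1].
    source_values: dict source name -> measured endpoint value.
    Returns the similarity-weighted mean of up to k sources that pass the
    similarity floor, or None when no source is similar enough.
    """
    candidates = sorted(
        (
            (sim, name)
            for name, sim in target_similarities.items()
            if sim >= min_similarity and name in source_values
        ),
        reverse=True,
    )[:k]
    if not candidates:
        return None  # no defensible analogue: leave the data gap unfilled
    total_weight = sum(sim for sim, _ in candidates)
    return sum(sim * source_values[name] for sim, name in candidates) / total_weight

# Hypothetical similarities and endpoint values, for illustration only.
similarities = {"src_1": 0.9, "src_2": 0.6, "src_3": 0.2}
values = {"src_1": 10.0, "src_2": 20.0, "src_3": 100.0}

prediction = read_across(similarities, values)
print(prediction)  # src_3 is excluded by the similarity floor
```

Returning None when nothing passes the floor is a deliberate choice in this sketch: it keeps a dissimilar source from silently driving the prediction. An endpoint-specific approach would swap in a similarity measure tuned to the endpoint in question.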
Carolina Andrade from Universidade Federal de Goias discussed supervised versus unsupervised approaches to similarity. Salient points of the talk included the importance of removing discordant results from data sets to make QSAR models as accurate as possible; that models should be analyzed for the explainability and interpretability of their results; and that better data beats fancier algorithms.
Todd Martin from the US Environmental Protection Agency Center for Computational Toxicology and Exposure spoke further on feature selection for predictive modeling in a talk titled “Comparison of Supervised vs Unsupervised Applicability Domain Measures.” Supervised applicability domain measures employ genetic algorithm feature selection to identify which descriptors best define molecular similarity for each property. The OPERA local index, average cosine similarity (embedding descriptors), and average cosine similarity (all descriptors) provide similar performance. PaDEL and WebTEST also provide similar performance. Embedding allows users to determine the essential descriptors that characterize similarity for a property. To evaluate applicability domain measures, one needs to look at both the test set chemicals that fall inside the applicability domain and the test set scores inside and outside the applicability domain.
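One way to picture an average cosine similarity measure is the Python sketch below, which scores each test chemical by its mean cosine similarity to its k nearest training chemicals and then splits the test set at a cutoff. This is an illustrative approximation, not the OPERA or WebTEST implementation: the descriptor vectors, k, and the 0.9 cutoff are hypothetical.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two descriptor vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def avg_neighbor_similarity(query, training, k=3):
    """Mean cosine similarity between a query and its k most similar training vectors."""
    if not training:
        raise ValueError("training set must not be empty")
    top = sorted((cosine_similarity(query, t) for t in training), reverse=True)[:k]
    return sum(top) / len(top)

def split_by_domain(test_set, training, cutoff, k=3):
    """Split test chemicals into those inside and outside the applicability domain."""
    inside, outside = [], []
    for name, vector in test_set.items():
        score = avg_neighbor_similarity(vector, training, k)
        (inside if score >= cutoff else outside).append(name)
    return inside, outside

# Hypothetical two-descriptor vectors, for illustration only.
training = [[1.0, 0.0], [0.9, 0.1], [0.8, 0.2]]
test_set = {"near": [1.0, 0.05], "far": [0.0, 1.0]}

inside, outside = split_by_domain(test_set, training, cutoff=0.9)
print(inside, outside)
```

With the split in hand, the evaluation described above amounts to reporting how many test chemicals land inside the domain and comparing the model's score on the inside group against the outside group. Restricting the vectors to the descriptors picked by feature selection would correspond to the supervised variant; using all descriptors would correspond to the unsupervised one.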
Overall, the speakers gave explanations and examples of myriad concepts related to chemical similarity.
This blog reports on the Workshop Session titled “Understanding the Concept of Similarity and its Applications to Toxicological Research and Risk Assessment” that was held during the 2023 SOT Annual Meeting and ToxExpo. An on-demand recording of this session is available for meeting registrants on the SOT Online Planner and SOT Event App.
This blog was prepared by an SOT Reporter and represents the views of the author. SOT Reporters are SOT members who volunteer to write about sessions and events in which they participate during the SOT Annual Meeting and ToxExpo. SOT does not propose or endorse any position by posting this article. If you are interested in participating in the SOT Reporter program in the future, please email SOT Headquarters.
#Communique:ScienceNews
#2023AnnualMeeting
#SOTReporter
#Communique:AnnualMeeting