The Development of AI in Drug Discovery
In recent years, AI-based algorithms for biological analysis have advanced
rapidly. These algorithms process biological network data by building
systems that simulate aspects of human intelligence, enabling
classification, clustering, and prediction tasks. This capability allows
AI to decode the complexity of cancer through gene interaction networks,
deepening our understanding of carcinogenesis and revealing novel
anticancer targets[4]. Since 2018, AI in pharmaceuticals has moved from
a conceptual phase ("0") to practical application ("1").
In 2024, the Nobel Prize in Physics was awarded for groundbreaking
advancements in artificial neural networks for machine learning (ML),
the foundational technology driving current AI techniques, including
deep learning (DL), natural language processing (NLP), and computer
vision[5].
Figure 1. Brief overview of AI pharmaceutical development[5].
Today, AI's capacity to analyze massive
datasets is revolutionizing drug development. AI technologies
deliver significant advantages across the entire pipeline—from target
identification and drug discovery to preclinical studies, clinical
trials, regulatory review, and post-market surveillance. This
transformative potential has spurred widespread adoption by
pharmaceutical companies, biotech firms, and research institutions
seeking to overcome traditional methodological constraints[3].
Figure 2. Overview of AI applications in the drug development
pipeline[3].
Applications of AI in Drug Discovery
AI in Virtual Screening
Virtual screening computationally analyzes large chemical libraries to
identify compounds with high binding potential to specific biological
targets. Machine learning models have long supported
ligand-based virtual screening (LBVS), where quantitative
structure-activity relationship (QSAR) models leverage known ligand
properties to predict new candidates.
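As a rough illustration of the ligand-based idea, the sketch below ranks a small library by Tanimoto similarity to known active ligands. This is a similarity search, a simpler relative of QSAR modeling, not a QSAR model itself. The fingerprints are hypothetical sets of "on" bit indices and the compound names are made up; in practice the fingerprints would come from a cheminformatics toolkit.

```python
from typing import Dict, FrozenSet, List, Tuple

Fingerprint = FrozenSet[int]  # indices of the bits that are set


def tanimoto(a: Fingerprint, b: Fingerprint) -> float:
    """Tanimoto coefficient: shared bits divided by bits set in either."""
    if not a and not b:
        return 0.0
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def rank_by_similarity(
    actives: List[Fingerprint], library: Dict[str, Fingerprint]
) -> List[Tuple[str, float]]:
    """Score each library compound by its best similarity to any active."""
    scored = []
    for name, fp in library.items():
        best = max(tanimoto(fp, ref) for ref in actives)
        scored.append((name, best))
    # Highest similarity first
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical fingerprints, for illustration only
known_actives = [frozenset({1, 4, 7, 9}), frozenset({1, 4, 8, 12})]
library = {
    "cmpd_A": frozenset({1, 4, 7, 9, 15}),
    "cmpd_B": frozenset({2, 3, 5}),
    "cmpd_C": frozenset({1, 4, 8}),
}

ranking = rank_by_similarity(known_actives, library)
for name, score in ranking:
    print(f"{name}: {score:.2f}")
```

Compounds at the top of the ranking would be the first candidates to carry forward; a QSAR model replaces the similarity score with a prediction learned from measured activities.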
The AI revolution in QSAR is more recent, fueled by
novel molecular representations and deep learning (DL)
architectures. Deep QSAR now enables efficient screening of
ultra-large compound libraries, often integrated with pharmacophore
modeling or molecular docking. The latter underpins structure-based
virtual screening (SBVS), which utilizes 3D protein structures to
identify potential inhibitors.
AI advancements have refined classification methods, binding pocket
discovery, and scoring functions for assessing ligand-protein affinity.
Emerging DL-based scoring functions—particularly convolutional neural
network (CNN) models—are gaining traction in virtual screening by
processing vast datasets and recognizing structural patterns correlated
with successful target binding[6].
Figure 3. An overall flowchart for predicting protein-ligand
interactions based on DL models[7].
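To make the CNN-based scoring idea more concrete, the sketch below shows one input representation often used with 3D convolutional models: a voxel grid of the binding site with one channel per atom type. The source does not specify a representation, so this is an assumption, as are the channel names, box size, and resolution. The sketch covers only the featurization step, not the network.

```python
from typing import Dict, List, Tuple

# (channel name, x, y, z) with coordinates in angstroms
Atom = Tuple[str, float, float, float]
Grid = List[List[List[float]]]


def voxelize(
    atoms: List[Atom],
    center: Tuple[float, float, float],
    box_size: float = 8.0,
    resolution: float = 2.0,
) -> Dict[str, Grid]:
    """Count atoms per voxel, with a separate 3D grid for each channel."""
    n = int(box_size / resolution)  # voxels along each axis
    half = box_size / 2.0
    channels = sorted({atom[0] for atom in atoms})
    grids = {
        ch: [[[0.0] * n for _ in range(n)] for _ in range(n)]
        for ch in channels
    }
    for ch, x, y, z in atoms:
        # Shift into box coordinates, then bin by voxel edge length
        idx = [
            int((coord - c + half) // resolution)
            for coord, c in zip((x, y, z), center)
        ]
        if all(0 <= i < n for i in idx):  # skip atoms outside the box
            i, j, k = idx
            grids[ch][i][j][k] += 1.0
    return grids


# Hypothetical atoms around a pocket centered at the origin
atoms = [
    ("protein_C", 0.5, 0.5, 0.5),
    ("ligand_C", -1.0, 0.0, 3.9),
    ("ligand_O", 10.0, 0.0, 0.0),  # outside the 8 angstrom box
]
grids = voxelize(atoms, center=(0.0, 0.0, 0.0))
```

A 3D CNN would take the stacked channels as input and output a predicted affinity or a binder/non-binder score. Production featurizers typically use smoothed atom densities rather than raw counts.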
AI-Driven Compound Synthesis Planning
Chemical synthesis, one of the major bottlenecks in small-molecule drug
discovery, remains a highly technical and laborious task.
Computer-aided synthesis planning (CASP) and automated synthesis of
organic compounds can relieve chemists of repetitive work, freeing
them for more innovative tasks.
Modern CASP tools leverage retrosynthetic analysis to efficiently
determine optimal reaction pathways, building upon early rule-based
systems that applied logical heuristics to synthetic planning. Recent
breakthroughs have seen transformer models successfully applied to
critical aspects of synthesis planning, including retrosynthetic
analysis, regioselectivity and stereoselectivity prediction, and
reaction fingerprint extraction[3].
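The rule-based style of retrosynthetic planning mentioned above can be sketched as a recursive search: apply a template to break the target into precursors, then repeat until every precursor is a purchasable building block. The templates, molecule names, and purchasable set below are hypothetical placeholders, and the search returns the first route it finds rather than an optimal one.

```python
from typing import Dict, List, Optional, Set

# Hypothetical templates: product -> alternative sets of precursors
TEMPLATES: Dict[str, List[List[str]]] = {
    "amide_target": [["acid_1", "amine_1"]],
    "acid_1": [["ester_1"], ["nitrile_1"]],
    "amine_1": [["nitro_1"]],
}

# Hypothetical catalogue of purchasable building blocks
PURCHASABLE: Set[str] = {"ester_1", "nitro_1"}


def plan(target: str, depth: int = 0, max_depth: int = 5) -> Optional[List[str]]:
    """Depth-first retrosynthesis; returns a list of steps or None."""
    if target in PURCHASABLE:
        return []  # nothing left to make
    if depth >= max_depth:
        return None  # give up on overly long routes
    for precursors in TEMPLATES.get(target, []):
        steps = [f"{target} <= {' + '.join(precursors)}"]
        solved = True
        for precursor in precursors:
            sub_route = plan(precursor, depth + 1, max_depth)
            if sub_route is None:
                solved = False  # this disconnection leads to a dead end
                break
            steps.extend(sub_route)
        if solved:
            return steps
    return None  # no template yields a fully purchasable route


route = plan("amide_target")
if route is not None:
    for step in route:
        print(step)
```

Data-driven CASP tools replace the hand-written template table with learned models that propose and rank disconnections, but the outer search loop is often similar in spirit.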
Purely data-driven AI approaches initially raised concerns about
reliability in complex synthesis planning. This challenge has driven
the development of hybrid systems that combine AI with established
chemical rules. One example is RetroExplainer, an interpretable
deep learning framework that formulates retrosynthesis as a
molecular assembly process. The approach is reported to outperform
conventional methods while also offering interpretability: its
decisions can be traced through quantitative attribution analysis[8].
Figure 4. Overview of RetroExplainer[8].
Case Studies of AI in Drug Discovery
GeminiMol DL Model Accelerates Large-Scale Drug Discovery
GeminiMol incorporates conformational space profiles into molecular
representation learning, capturing intricate relationships between
molecular structures and their conformational spaces. The model
demonstrates balanced, superior performance across 67 molecular property
predictions, 73 cellular activity predictions, and 171 zero-shot tasks
(including virtual screening and target identification)[9]. This
conformational space profiling strategy enables rapid exploration of
chemical space and facilitates novel drug design paradigms.
Figure 5. The pre-training and the application framework of
GeminiMol[9].
Virtual Screening Driven Efficient Identification of MYH9 Inhibitors
High-throughput virtual screening (HTVS) identified human MYH9-binding
compounds, with 9 candidates selected based on binding scores and
literature evidence. CCK-8 assays assessed their effects on primary
mouse chondrocyte proliferation, while SA-β-Gal staining evaluated
cellular senescence modulation. Subsequent validation in mouse
osteoarthritis (OA) models ultimately identified 4,5-dicaffeoylquinic
acid as a potent inhibitor that significantly alleviates both
injury-induced (DMM) and aging-related OA progression[10].
Figure 6. Drug screening strategy targeting human MYH9[10].
Summary
Overall, ongoing advancements in AI technologies are substantially
improving the efficiency and cost-effectiveness of drug development.
However, AI-designed compounds and predicted properties still require
experimental validation through wet-laboratory experiments, and human
input will still be needed to guide the direction of AI research and its
applications.
Virtual screening relies on computer simulations and
molecular docking methods to evaluate and predict the biological
activity of compounds. AI drug screening is a high-throughput
screening method that integrates AI technology with computational
chemistry. It is widely used in areas such as protein structure
prediction, new drug development, and molecular design and
optimization. AI screening uses machine learning (ML) algorithms to
analyze vast datasets, identify patterns, and generate AI scoring
functions, which improves screening efficiency and accelerates the
discovery of potential drug candidates.
The MCE AI drug screening platform integrates several advanced
methodologies, including molecular docking, deep learning, and molecular
dynamics simulations. Running on high-performance servers, it can
screen tens of millions of molecules within a few hours.
Figure 7. Application of AI technology in drug discovery.
We are committed to continuously developing and improving our platform
capabilities. Our goal is to create a one-stop drug discovery service
platform suited to scientific research and to open up new
possibilities for innovation.