jingjingyan1/README.md

Jingjing Yan, Ph.D.

Drug Discovery Scientist × AI/ML Researcher, building at the intersection of drug discovery and machine learning

I bring 8+ years of hands-on drug discovery and development experience, combined with a deep personal interest in machine learning and AI. I've advanced real drug candidates through the preclinical pipeline and, independently, built ML systems that predict molecular properties and investigate model behavior, bridging the gap between bench science and computational approaches.


Personal Research Projects

AI for Drug Discovery — ADMET Prediction

Applying graph neural networks and gradient-boosted models to predict molecular ADMET properties using the Therapeutics Data Commons benchmark suite.

TDC-ADMET-Pgp-AttrMasking: Pretrained GIN with attribute masking for P-glycoprotein (Pgp) inhibition prediction. Achieved AUROC 0.937 ± 0.004, ranking #2 on the TDC leaderboard. Includes an XGBoost + Morgan fingerprint baseline (AUROC 0.912). Preprint submitted.
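For readers unfamiliar with how a figure such as "AUROC 0.937 ± 0.004" is typically produced, here is a minimal sketch. It assumes the ± is a sample standard deviation over repeated seeded runs, which may differ from the exact leaderboard convention. The labels, scores, and the `auroc` and `summarize` helpers are hypothetical illustrations, not code or data from the repository.

```python
from statistics import mean, stdev


def auroc(labels, scores):
    """AUROC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def summarize(per_seed_aurocs):
    """Return (mean, sample standard deviation) across seeded runs."""
    return mean(per_seed_aurocs), stdev(per_seed_aurocs)


# Toy data only: 1 = Pgp inhibitor, 0 = non-inhibitor (hypothetical).
labels = [1, 0, 1, 1, 0, 0]
# Hypothetical predicted probabilities from three runs with different seeds.
runs = {
    0: [0.9, 0.2, 0.8, 0.4, 0.5, 0.1],
    1: [0.7, 0.3, 0.9, 0.6, 0.2, 0.4],
    2: [0.8, 0.6, 0.7, 0.5, 0.3, 0.2],
}

per_seed = [auroc(labels, scores) for scores in runs.values()]
avg, spread = summarize(per_seed)
print(f"AUROC {avg:.3f} ± {spread:.3f} over {len(per_seed)} seeds")
```

The pairwise loop is quadratic in the number of molecules, which is fine for a sketch; on a real test set a library routine such as scikit-learn's `roc_auc_score` would be the usual choice.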

Upcoming: expanding to additional ADMET endpoints (Caco-2, BBB, hERG, CYP450s) and exploring multi-task learning across the full 22-dataset ADMET benchmark.

LLM Alignment & Mechanistic Interpretability

SycoSteer: A causally validated activation steering framework that reduces sycophantic behavior in large language models. Applied contrastive activation addition (CAA) across Mistral-7B's transformer layers, achieving a 63% reduction in sycophancy on TruthfulQA with no measured degradation on reasoning benchmarks (ARC, MMLU). Includes full-layer sweep analysis, cross-model comparison (5 models), and causal validation via activation patching.
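The core idea of contrastive activation addition can be sketched in a few lines. This is a simplified illustration on plain Python lists, assuming activations have already been extracted at one layer for sycophantic and non-sycophantic completions. The function names, the toy vectors, and the coefficient are hypothetical and are not taken from SycoSteer.

```python
def mean_vector(vectors):
    """Element-wise mean of a list of equal-length activation vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def steering_vector(positive_acts, negative_acts):
    """Difference of mean activations: (behavior present) - (behavior absent)."""
    pos_mean = mean_vector(positive_acts)
    neg_mean = mean_vector(negative_acts)
    return [p - q for p, q in zip(pos_mean, neg_mean)]


def steer(hidden, vector, coeff):
    """Add a scaled steering vector to one hidden state.

    A negative coeff pushes the state away from the behavior the
    vector encodes (here, away from sycophancy).
    """
    return [h + coeff * v for h, v in zip(hidden, vector)]


# Toy 3-dimensional "activations" (hypothetical values).
sycophantic = [[1.0, 0.0, 2.0], [3.0, 0.0, 2.0]]
non_sycophantic = [[0.0, 1.0, 2.0], [0.0, 1.0, 2.0]]

v = steering_vector(sycophantic, non_sycophantic)
steered = steer([1.0, 1.0, 1.0], v, coeff=-0.5)
print(v, steered)
```

In a real model the same addition would be applied to the residual stream at a chosen layer during generation, for example through a forward hook, and the layer and coefficient would be selected empirically with a sweep.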


The Combination

My day job is preclinical drug development. I've led bioanalytical strategy for various programs, developed LC-MS/MS methods for drug quantification, and run PK/biodistribution studies across liver, kidney, heart, and lung, generating the ADMET data that informed candidate drug investment decisions.

On the side, I've built GNN models that predict the same molecular properties I used to measure experimentally, and designed interpretability frameworks for transformer models. Knowing what ADMET data means for a real drug program helps me ask the right questions when building models and evaluate predictions with the skepticism of someone who has generated the ground-truth data.


Technical Stack

Preclinical & DMPK: LC-MS/MS bioanalysis (Sciex 6500+/5500), PK/PD modeling (Phoenix WinNonlin), GalNAc-siRNA/ASO platforms, metabolite ID, tissue biodistribution, SPE method development, GLP-compliant method validation, IND/IDE regulatory submissions

Drug Discovery & Cheminformatics: RDKit, Morgan fingerprints, molecular descriptors, SMILES, PyTDC, DeepPurpose, DGL/DGLLife, XGBoost, molecular property prediction

Deep Learning & ML: PyTorch, PyTorch Geometric, GNN (GIN, GCN, AttentiveFP), transformer architectures, contrastive activation addition, activation patching, Hugging Face

Languages & Tools: Python, Git, Docker, Google Colab, Runpod (cloud GPU)


Selected Publications

  • Lovrić, J.; Yan, J.; Li, X.-Q.; et al. In vitro Structure-Activity Relationship Stability Study of ASO Therapeutics. Pharmacology Research & Perspectives, 2025.
  • Bhattacharya, C.; Yan, J.; et al. Application of AMS to Characterize Mass Balance Recovery and Disposition of AZD4831. Drug Metabolism & Disposition, 2023.
  • Yan, J.; MacDonald, J.; Burdette, S. MOF Decomposition Using a Photodegradable Strut. Chemistry – A European Journal, 2019.
  • Yan, J.; et al. Detection of Adsorbates on Emissive MOF Surfaces with XPS. Dalton Transactions, 2019.

Education

  • Ph.D. Chemistry & Biochemistry — Worcester Polytechnic Institute
  • B.S. & M.S. Chemistry — Nankai University

Contact

Pinned

  1. TDC-ADMET-Pgp-AttrMasking (Public)

    Pretrained GIN with AttrMasking for Pgp inhibition prediction — AUROC 0.937 ± 0.004 (TDC Leaderboard #2)

    Python