  • Jiangnan University
  • No. 1800, Lihu Avenue, Wuxi, 214122, P. R. China
unumbrela/README.md

🎓 About Me

I'm an undergraduate researcher at Jiangnan University (Project 211 · Double First-Class), School of AI & Computer Science, pursuing a B.Eng. in Digital Media Technology (Class of 2027, GPA 88/100).

My research lies at the intersection of AI for Science (biology in particular), LLM reasoning & agents, multimodal learning, and medical image analysis.

  • 📄 Authored / co-authored 7 papers: 6 as first / co-first author, 6 already accepted
  • 🧬 iGEM 2025 Gold Medal: dry-lab lead, in-person defence in Paris
  • 🛡️ 1 invention patent filed (neuro-symbolic travel planning)
  • 🏆 ~10 academic competition / honour awards (MCM/ICM HM, CUMCM, Lanqiao Cup, Wuxi City Outstanding Student Cadre, Jiangnan Honor Student …)

🔭 I am actively looking for research-assistant, summer-internship, and graduate-study opportunities, especially in AI for Life Sciences and biological sequence modelling.


🔬 Research Interests

AI for Science (Bioinformatics)   ·   Single-Cell Foundation Models   ·   Protein & Peptide Design
LLM Reasoning & Neuro-Symbolic Agents   ·   Multimodal Learning   ·   Medical Image Analysis

📄 Selected Publications

1 first author · * co-first author

| # | Paper | Venue | Role | Status |
|---|-------|-------|------|--------|
| 1 | Tokenization is Mechanism: Information-Asymmetric Token Merging for Biological Sequences | NeurIPS 2026 (CCF-A) | 1st author | Under Review |
| 2 | ProtoGene: Bridging the RT-Gap in Single-Cell Foundation Models via Biology-Aware Prototypical Fine-Tuning | ICIC 2026 Oral (CCF-C) | 1st author | ✅ Accepted |
| 3 | Extract Then Compile: Reliable Neuro-Symbolic Planning for Large Language Models | ICIC 2026 Oral (CCF-C) | 1st author | ✅ Accepted |
| 4 | Fusion Direction Matters: Alignment-Adaptive Cross-Modal Fusion for Medical Image Segmentation | ICIC 2026 Oral (CCF-C) | 1st author | ✅ Accepted |
| 5 | FWMamba-UNet: Frequency-Wavelet Enhanced Mamba UNet for Medical Image Segmentation | ICIC 2026 Oral (CCF-C) | 1st author | ✅ Accepted |
| 6 | MambaGuard: A CLIP-Mamba Approach for OOD Generated Image Detection | PRCV 2025 (CCF-C) | Co-first | ✅ Accepted |
| 7 | RA-Det: Towards Universal Detection of AI-Generated Images via Robustness Asymmetry | ICML 2026 (CCF-A) | 5th author | ✅ Accepted |

📚 Full publication list & details on my academic homepage.


🧪 Featured Projects

🩻 FWMamba-UNet

ICIC 2026 Oral · 1st Author

Frequency-domain + wavelet-transform branches augmenting a Mamba state-space UNet, capturing cross-scale boundary cues that pure spatial UNets miss.

Mamba · Wavelet · FFT · Medical Imaging

🧬 AMP Forge: iGEM 2025 Gold

Dry-Lab Lead · Paris

De-novo antimicrobial peptide design: ESM-2 / ProtT5 / Ankh + BiGRU-VAE → Latent Diffusion → Transformer decoder. SOTA on multiple metrics; wet-lab-validated variants outperform LL-37.

PLMs · Latent Diffusion · RLHF · PyTorch

🧠 Tokenization-as-Mechanism

NeurIPS 2026 (under review) · 1st Author

Information-asymmetric token merging that turns the tokenizer into an interpretable mechanism, with consistent gains across pMHC, TCR, DNA, and SMILES.

Information Theory · pMHC · TCR · DNA

🧭 SHINE: Neuro-Symbolic Travel Agent

ICIC 2026 Oral · 1st Author + Patent

Extract-Then-Compile: an LLM lifts natural-language constraints into a symbolic program, which is then solved by classical search. A national invention patent has been filed.

LLM Agent · Neuro-Symbolic · Constraint Solving
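
The extract-then-compile idea can be illustrated with a toy sketch. This is not SHINE's code: the `extracted` spec, the `COSTS` catalogue, and the function names are all hypothetical, the LLM extraction step is replaced by a hand-written dictionary, and brute-force enumeration stands in for whatever search the real system uses.

```python
from itertools import product

# Hypothetical output of the "extract" step. In a real system an LLM would
# produce this structure from a request such as
# "two days, at most 300 in total, no museum on the first day".
extracted = {
    "days": 2,
    "budget": 300,
    "forbidden": {0: {"museum"}},  # day index -> activities ruled out
}

# Hypothetical activity catalogue with costs.
COSTS = {"museum": 120, "hike": 40, "food_tour": 200}


def compile_constraints(spec):
    """Turn the extracted spec into predicates over a candidate plan."""

    def within_budget(plan):
        return sum(COSTS[a] for a in plan) <= spec["budget"]

    def respects_forbidden(plan):
        return all(
            a not in spec["forbidden"].get(day, set())
            for day, a in enumerate(plan)
        )

    def no_repeats(plan):
        return len(set(plan)) == len(plan)

    return [within_budget, respects_forbidden, no_repeats]


def solve(spec):
    """Enumerate all plans; return the cheapest feasible one, or None."""
    checks = compile_constraints(spec)
    feasible = [
        plan
        for plan in product(COSTS, repeat=spec["days"])
        if all(check(plan) for check in checks)
    ]
    return min(feasible, key=lambda p: sum(COSTS[a] for a in p), default=None)


plan = solve(extracted)
print(plan)
```

Keeping the language model out of the solving loop means every returned plan satisfies the compiled constraints by construction, and an infeasible request yields `None` rather than a plausible-looking but invalid plan.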


🛠️ Tech Stack

Deep Learning & LLMs

PyTorch · HuggingFace · DeepSpeed · CUDA · Megatron-LM

Architectures & Techniques

Transformer · Mamba/SSM · Diffusion · LoRA · RLHF · Wavelet

AI for Science

ESM-2 · ProtT5 · Ankh · scGPT · Geneformer · BioNeMo · PyRosetta

Languages & Tools

Python · C++ · LaTeX · Linux · Git


📊 GitHub Stats


"We must know. We will know." (David Hilbert)

📘 Visit my academic homepage →

Pinned

  1. AMP-Forge (Public)

    AMP Forge is an antimicrobial peptide generation project based on PLM embeddings, VAE, and latent diffusion.

    Python

  2. evo_fine_tune (Public)

    Fine-tuning pipeline for NVIDIA BioNeMo Evo2 1B on genomics sequences, including preprocessing, training, inference, embeddings, and evaluation.

    Python 1

  3. SHINE (Public)

    SHINE-LLM-Travel-Planner

    Python