Rongzhe Wei

Machine Learning · Trustworthy AI

魏容哲

rongzhe.wei - at - gatech.edu

Ph.D. Candidate in Machine Learning

Georgia Tech


Researching how language models remember, forget, and reason

Building trustworthy AI through structure, reasoning, and control.

I am a fifth-year Ph.D. student at Georgia Tech advised by Prof. Pan Li. My research explores how language models store, retrieve, forget, and exploit knowledge. To that end, I study trustworthy AI through the lens of structured internal knowledge.

My research spans LLM unlearning, AI safety, and agentic reasoning, with foundations in graph learning and differential privacy.

Some representative research areas:

  • LLM Unlearning: Do LLMs really forget? Probing whether models truly unlearn through knowledge correlation and evaluation.
  • AI Safety & Jailbreaking: Reformulating LLM jailbreaking as adaptive tree search over correlated knowledge.
  • Agentic AI: Building intelligent agents with reasoning and planning capabilities.
  • Graph Learning & Differential Privacy: Foundations in graph neural networks with privacy guarantees.

Selected Research

Representative projects across my research areas.

ICLR'26 Workshop

CKA-Agent

Reformulating LLM jailbreaking as adaptive tree search over correlated knowledge. Achieves a 96–99% jailbreak success rate on GPT-5.2, Gemini-3.0-Pro & Claude-Haiku-4.5.

NeurIPS 2025

LLM Unlearning Evaluation

Do LLMs really forget? We probe whether models truly unlearn through knowledge correlation and confidence-aware evaluation.

ICLR 2026

MoEEdit

Efficient and routing-stable knowledge editing for Mixture-of-Experts LLMs — surgically updating knowledge without disrupting expert routing.

ICML 2025

Privacy Risks in LLM Unlearning

Revealing underestimated privacy risks for minority populations in large language model unlearning — showing that current methods disproportionately fail for underrepresented groups.

News

🏆
NAIRR Pilot Award

Proposal lead author — recognized for "high degree of alignment with national AI strategic focus."

🎤
Google Research & IBM Research

Invited talks on rethinking unlearning and jailbreaking in LLMs.

🏅
CSIP Outstanding Research Award

Fortunate to receive the Georgia Tech CSIP Award for 2025.

Recent Updates

Feb 2026

Invited Talk: "From Atomic Facts to Structured Internal Knowledge: Rethinking Unlearning and Jailbreaking in LLMs" at Google Research Seminar.

Feb 2026

Proposal Lead Author: "Structured-Knowledge-Guided Agentic LLM Jailbreaking and Defense" (NAIRR Pilot Award).

Jan 2026

Invited Talk: "From Atomic Facts to Structured Internal Knowledge" at IBM Research.

Dec 2025

Fortunate to receive the CSIP Outstanding Research Award for 2025.

Nov 2025

New preprint: CKA-Agent achieves 96–99% jailbreak success on frontier LLMs via adaptive tree search.

Sep 2025

Two papers accepted at NeurIPS 2025.

May 2025

Two papers accepted at ICML 2025.

May 2025

Started AI Research Internship at Amazon, Seattle.

Publications

Full list on Google Scholar. * = equal contribution.

Preprints

The Trojan Knowledge: Bypassing Commercial LLM Guardrails via Harmless Prompt Weaving and Adaptive Tree Search

Rongzhe Wei*, Peizhi Niu*, Xinjie Shen*, Tony Tu, Yifan Li, Ruihan Wu, Eli Chien, Pin-yu Chen, Olgica Milenkovic, Pan Li

ICLR 2026 AIWILD Workshop

Guarding Multiple Secrets: Enhanced Summary Statistic Privacy for Data Sharing

Shuaiqi Wang, Rongzhe Wei, Mohsen Ghassemi, Eleonora Kreacic, Vamsi K. Potluru

ICLR 2024 PML Workshop

Conference & Journal Papers

MoEEdit: Efficient and Routing-Stable Knowledge Editing for Mixture-of-Experts LLMs

Yupu Gu, Rongzhe Wei, Andy Zhu, Pan Li

ICLR 2026

Do LLMs Really Forget? Evaluating Unlearning with Knowledge Correlation and Confidence Awareness

Rongzhe Wei*, Peizhi Niu*, Hans Hao-Hsun Hsu, Ruihan Wu, Haoteng Yin, Mohsen Ghassemi, Yifan Li, Vamsi K. Potluru, Eli Chien, Kamalika Chaudhuri, Olgica Milenkovic, Pan Li

NeurIPS 2025

Differentially Private Relational Learning with Entity-level Privacy Guarantees

Yinan Huang*, Haoteng Yin*, Eli Chien, Rongzhe Wei, Pan Li

NeurIPS 2025

Underestimated Privacy Risks for Minority Populations in Large Language Model Unlearning

Rongzhe Wei, Mufei Li, Mohsen Ghassemi, Eleonora Kreacic, Yifan Li, Xiang Yue, Bo Li, Vamsi K. Potluru, Pan Li, Eli Chien

ICML 2025

Generalization Principles for Inference over Text-Attributed Graphs with Large Language Models

Haoyu Wang, Shikun Liu, Rongzhe Wei, Pan Li

ICML 2025

Privately Learning from Graphs with Applications in Fine-tuning Large Language Models

Haoteng Yin, Rongzhe Wei, Eli Chien, Pan Li

COLM 2025

Differentially Private Graph Diffusion with Applications in Personalized PageRanks

Rongzhe Wei, Eli Chien, Pan Li

NeurIPS 2024

On the Inherent Privacy Properties of Discrete Denoising Diffusion Models

Rongzhe Wei, Eleonora Kreacic, Haoyu Wang, Haoteng Yin, Eli Chien, Vamsi K. Potluru, Pan Li

TMLR 2024 → ICLR 2025

Learning Scalable Structural Link Representations with Bloom Signatures

Tianyi Zhang*, Haoteng Yin*, Rongzhe Wei, Pan Li, Anshumali Shrivastava

WWW 2024

SLA2P: Self-supervised Anomaly Detection with Adversarial Perturbation

Yizhou Wang, Can Qin, Rongzhe Wei, Yi Xu, Yue Bai, Yun Fu

TKDE 2024

Understanding Non-linearity in Graph Neural Networks from the Bayesian-Inference Perspective

Rongzhe Wei, Haoteng Yin, Junteng Jia, Austin R. Benson, Pan Li

NeurIPS 2022

Experience

May — Aug 2025

Amazon · Seattle, WA

AI Research Intern — Reflection and Exploration-based LLM Action Planning

Jun — Aug 2023

JPMorgan Chase · New York, NY

AI Research Intern — Graph Data Generation via Margin Relaxed Schrödinger Bridges

Education

2021 — Present

Georgia Institute of Technology

Ph.D. in Machine Learning (ECE)

Advisor: Prof. Pan Li

2017 — 2021

Xi'an Jiaotong University

B.S. in Mathematics & Applied Mathematics

Qian Xuesen College · GPA 3.89/4.00

Spring 2020

Georgia Tech — Visiting

Honors Student Program, School of Mathematics

Invited Talks

Feb 2026

Google Research Seminar

"From Atomic Facts to Structured Internal Knowledge: Rethinking Unlearning and Jailbreaking in LLMs"

Jan 2026

IBM Research

"From Atomic Facts to Structured Internal Knowledge: Rethinking Unlearning and Jailbreaking in LLMs"

Honors

  • 2025 CSIP Outstanding Research Award
  • 2025 Lambda's Research Grant
  • 2025 OpenAI Researcher Access Grant
  • 2022 NeurIPS Travel Award
  • 2019 IEEE BigData Student Award
  • "Zhufeng" Scholarship — First Prize (Ministry of Education)
  • Outstanding Student Award — XJTU

Service

Conference: NeurIPS '22–25, ICML '24–25, ICLR '25–26, AISTATS '23–24, ISIT '25, AAAI '24, LoG '22–24
Journal: TMLR, Computers & Math with Applications
Teaching: Convex Optimization, Probability & Statistics, Conversational AI

Skills

Languages: Chinese (native), English
Programming: Python, C#, MATLAB, C, SQL, HTML/CSS
Piano: Band 9/9 (Central Conservatory) · Gold Award, China Outstanding Talents Art Festival