×
I am currently studying at the College of Software, Zhejiang University, under the supervision of Meng Han.
My research focuses on AI Security, specifically Model Watermarking and Fingerprinting, involving LLM and VLM. 📚🔍 If you are interested in this topic, feel free to contact me via email! ✉️
2025.08
🎉
Four papers accepted by EMNLP 2025
2 Main
2 Findings
🚀 Projects
📝 Publications
Jingxuan Zhang and Zhenhua Xu (co-first authors), Rui Hu, Wenpeng Xing, Xuhong Zhang, Meng Han Code
LLMs are widely used, raising concerns about model ownership. MEraser is a method to remove backdoor-based fingerprints from LLMs while preserving performance. It uses a two-phase fine-tuning strategy with mismatched and clean datasets, achieving fingerprint removal with minimal data. The method is transferable across models without repeated training, highlighting vulnerabilities in current techniques and setting benchmarks for better model protection.
EverTracer: Hunting Stolen Large Language Models via Stealthy and Robust Probabilistic Fingerprint
Zhenhua Xu, Meng Han, Wenpeng Xing
The proliferation of large language models (LLMs) has intensified concerns over model theft and license violations, necessitating robust and stealthy ownership verification. Existing fingerprinting methods either require impractical white-box access or introduce detectable statistical anomalies. We propose EverTracer, a novel gray-box fingerprinting framework that ensures stealthy and robust model provenance tracing. EverTracer is the first to repurpose Membership Inference Attacks (MIAs) for defensive use, embedding ownership signals via memorization instead of artificial trigger-output overfitting. It consists of Fingerprint Injection, which fine-tunes the model on any natural language data without detectable artifacts, and Verification, which leverages calibrated probability variation signal to distinguish fingerprinted models. This approach remains robust against adaptive adversaries, including input level modification, and model-level modifications. Extensive experiments across architectures demonstrate EverTracer’s state-of-the-art effectiveness, stealthness, and resilience, establishing it as a practical solution for securing LLM intellectual property.
CTCC: A Robust and Stealthy Fingerprinting Framework for Large Language Models via Cross-Turn Contextual Correlation Backdoor
Zhenhua Xu, Xixiang Zhao, Xubin Yue, shengwei tian, Changting Lin, Meng Han
The widespread deployment of large language models (LLMs) has intensified concerns around intellectual property (IP) protection, as model theft and unauthorized redistribution become increasingly feasible. To address this, model fingerprinting aims to embed verifiable ownership traces into LLMs. However, existing methods face inherent trade-offs between stealthness, robustness, and generalizability—being either detectable via distributional shifts, vulnerable to adversarial modifications, or easily invalidated once the fingerprint is revealed. In this work, we introduce CTCC, a novel rule-driven fingerprinting framework that encodes contextual correlations across multiple dialogue turns—such as counterfactual—rather than relying on token-level or single-turn triggers. CTCC enables fingerprint verification under black-box access while mitigating false positives and fingerprint leakage, supporting continuous construction under a shared semantic rule even if partial triggers are exposed. Extensive experiments across multiple LLM architectures demonstrate that CTCC consistently achieves stronger stealth and robustness than prior work. Our findings position CTCC as a reliable and practical solution for ownership verification in real-world LLM deployment scenarios.
Unlocking the Effectiveness of LoRA-FP for Seamless Transfer Implantation of Fingerprints in Downstream Models
Zhenhua Xu, Zhaokun Yan, Binhan Xu, Xin Tong, Haitao Xu, Yourong Chen, Meng Han
With the rapid development of large language models (LLMs), protecting intellectual property (IP) has become increasingly crucial. To tackle high costs and potential contamination in fingerprint integration, we propose LoRA-FP, a lightweight plug-and-play framework that encodes backdoor fingerprints into LoRA adapters via constrained fine-tuning. This enables seamless fingerprint transplantation through parameter fusion, eliminating full-parameter updates while maintaining integrity. Experiments demonstrate that LoRA-FP achieves superior robustness against various scenarios like incremental training and model fusion, while significantly reducing computational overhead compared to traditional approaches.
PREE: Towards Harmless and Adaptive Fingerprint Editing in Large Language Models via Knowledge Prefix Enhancement
Xubin Yue and Zhenhua Xu (co-first authors), Wenpeng Xing, Jiahui Yu, Mohan Li, Meng Han
Addressing the intellectual property protection challenges in commercial deployment of large language models (LLMs), existing black-box fingerprinting techniques face dual challenges from incremental fine-tuning erasure and feature-space defense due to their reliance on overfitting high-perplexity trigger patterns. We first reveal that model editing in the fingerprint domain exhibits unique advantages including significantly lower false positive rates, enhanced harmlessness, and superior robustness. Building on this foundation, this paper innovatively proposes a Prefix-enhanced Fingerprint Editing Framework (PREE), which encodes copyright information into parameter offsets through dual-channel knowledge editing to achieve covert embedding of fingerprint features. Experimental results demonstrate that the proposed solution achieves 90% trigger precision in mainstream architectures including LLaMA-3 and Qwen-2.5. The minimal parameter offset (change rate < 0.03%) effectively preserves original knowledge representation while demonstrating strong robustness against incremental fine-tuning and multi-dimensional defense strategies, maintaining zero false positive rate throughout evaluations.
Zhenhua Xu, Meng Han, Xubin Yue, Wenpeng Xing
We propose InSty, a novel fingerprinting method for LLMs in multi-turn dialogues that embeds cross-granularity (word- and sentence-level) triggers across turns, enabling robust, stealthy, and high-recall IP protection under black-box settings.
Key Preprints
- FP-VEC: Fingerprinting Large Language Models via Efficient Vector Addition, Zhenhua Xu, Wenpeng Xing, Zhebo Wang, Chang Hu, Chen Jie, Meng Han
- RAP-SM: Robust Adversarial Prompt via Shadow Models for Copyright Verification of Large Language Models, Zhebo Wang and Zhenhua Xu (co-first authors), Maike Li, Wenpeng Xing, Chunqiang Hu, Chen Zhi, Meng Han
- A Survey of LLM-Driven AI Agent Communication: Protocols, Security Risks, and Defense Countermeasures, Dezhang Kong, Shi Lin, Zhenhua Xu, Zhebo Wang, Minghao Li, Yufeng Li, Yilun Zhang, Zeyang Sha, Yuyuan Li, Changting Lin, Xun Wang, Xuan Liu, Muhammad Khurram Khan, Ningyu Zhang, Chaochao Chen, Meng Han
💻 Internships
Primary Responsibilities: Conducting research on large language model security and AI ecosystem governance, focusing on model copyright protection (digital watermarking and model fingerprinting), jailbreak attacks and defenses, adversarial attack strategies, hallucination detection frameworks, and multi-agent system security.
Key Contributions:
- Led 10+ high-quality research projects as first author and co-first author, with 10+ papers submitted to top-tier conferences and journals including ACL, EMNLP, AAAI, NDSS, and SCIENTIA SINICA
- Independently mentored multiple interns through complete research workflows, from topic selection and methodology design to experimental replication and paper writing
- Filed 8 invention patents (3 granted, 5 under review), achieving initial industrial transformation and intellectual property implementation of research outcomes
Java Backend Development Engineer |
November 2023 - May 2024 |
LianLianPay, Hangzhou, Zhejiang, China |
Primary Responsibilities: As a backend development engineer, participated in the development and maintenance of the “Account+” payment system. This system is one of the company’s core business platforms, primarily responsible for managing merchant partnerships and associated user information, handling financial operations between the company and merchants including account recharge, internal fund transfers, withdrawals, and reconciliation processes.
📖 Educations
College of Software, Zhejiang University |
June 2024 - Present |
Master of Software Engineering |
GPA: 4.27/5.0 |
Zhejiang University of Technology |
September 2020 - June 2024 |
Bachelor of Digital Media Technology |
GPA: 3.84/5.0 |
Honors and Awards: Comprehensive Assessment: 100/100 (Ranked 1st in Major), Outstanding Graduate of Zhejiang Province, Outstanding Student Award
Scholarships: Zhejiang Provincial Government Scholarship (Top 5%), First-Class Scholarship for Outstanding Students (Top 2%), First-Class Academic Scholarship
Note: Digital Media Technology is a computer science major covering fundamental courses including Computer Networks, Data Structures, Operating Systems, and Computer Architecture. While the program later specializes in game design, human-computer interaction, and 3D animation programming, my academic focus shifted toward artificial intelligence and software development, leading to my current pursuit in software engineering.