News
- 05/2026 🎉 One paper accepted by USENIX Security 2026!
- 04/2026 🎉 Four papers accepted by ACL 2026 (1 Main, 3 Findings)!
- 01/2026 🎉 Three papers accepted by ICASSP 2026!
- 12/2025 🎉 One paper accepted by IEEE INFOCOM 2026!
- 08/2025 🎉 Four papers accepted by EMNLP 2025 (2 Main, 2 Findings)!
- 08/2025 🎉 One paper accepted by SCIENTIA SINICA Informationis!
- 05/2025 🎉 One paper accepted by ACL 2025 Main Conference!
Projects
- Awesome LLM Copyright Protection - A curated collection of research and techniques for protecting the intellectual property of large language models, including watermarking, fingerprinting, and more. [Website] [Paper Link]
Publications
Conference Papers
We propose an adaptive multi-agent role-playing framework, AdaMARP, featuring an immersive message format that interleaves [Thought], (Action), <Environment>, and Speech, together with an explicit Scene Manager that governs role-playing through discrete actions (init_scene, pick_speaker, switch_scene, add_role, end) accompanied by rationales. To train these capabilities, we construct AdaRPSet for the Actor Model and AdaSMSet for supervising orchestration decisions, and introduce AdaptiveBench for trajectory-level evaluation.
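The message format and Scene Manager actions described above could be modeled roughly as follows. This is a minimal sketch, not the AdaMARP implementation: the class names, field names, and the `render` layout are hypothetical, and it assumes the brackets in [Thought], (Action), and <Environment> act as literal delimiters. Only the five action names come from the description.

```python
from dataclasses import dataclass
from enum import Enum


class SceneAction(Enum):
    """The five discrete Scene Manager actions named in the description."""
    INIT_SCENE = "init_scene"
    PICK_SPEAKER = "pick_speaker"
    SWITCH_SCENE = "switch_scene"
    ADD_ROLE = "add_role"
    END = "end"


@dataclass
class ManagerDecision:
    """One orchestration step (hypothetical schema): action, argument, rationale."""
    action: SceneAction
    argument: str
    rationale: str


@dataclass
class ActorMessage:
    """One actor turn in the interleaved format (field names are assumptions)."""
    speaker: str
    thought: str = ""
    action: str = ""
    environment: str = ""
    speech: str = ""

    def render(self) -> str:
        # Emit only the channels that are present, in a fixed order.
        parts = []
        if self.thought:
            parts.append(f"[{self.thought}]")
        if self.action:
            parts.append(f"({self.action})")
        if self.environment:
            parts.append(f"<{self.environment}>")
        if self.speech:
            parts.append(self.speech)
        return f"{self.speaker}: " + " ".join(parts)
```

For instance, `ManagerDecision(SceneAction.PICK_SPEAKER, "Alice", "Alice was just addressed")` pairs an action with its rationale, and an `ActorMessage` for Alice renders her thought, action, and speech on one line.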
We propose AttnDiff, a data-efficient white-box framework that extracts model fingerprints from intrinsic information-routing behavior. AttnDiff probes minimally edited prompt pairs that induce controlled semantic conflicts, captures the resulting differential attention patterns, summarizes them with compact spectral descriptors, and compares models using centered kernel alignment (CKA).
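To make the final comparison step concrete, here is a minimal sketch of linear CKA between two descriptor matrices, written in pure Python. It assumes each row is one probe and each column one descriptor dimension; the function names are hypothetical, and whether AttnDiff uses the linear or a kernel variant of CKA is not stated here, so treat the linear form as an assumption.

```python
import math


def center_columns(m):
    """Subtract each column's mean so every feature has zero mean."""
    n = len(m)
    means = [sum(col) / n for col in zip(*m)]
    return [[v - mu for v, mu in zip(row, means)] for row in m]


def cross_gram_sq_norm(a, b):
    """Squared Frobenius norm of a^T b for row-major matrices with equal row counts."""
    total = 0.0
    for col_a in zip(*a):
        for col_b in zip(*b):
            dot = sum(x * y for x, y in zip(col_a, col_b))
            total += dot * dot
    return total


def linear_cka(x, y):
    """Linear CKA: ||Y^T X||_F^2 / (||X^T X||_F * ||Y^T Y||_F) on centered inputs."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same number of rows (probes)")
    xc, yc = center_columns(x), center_columns(y)
    numerator = cross_gram_sq_norm(xc, yc)
    denominator = math.sqrt(cross_gram_sq_norm(xc, xc) * cross_gram_sq_norm(yc, yc))
    if denominator == 0.0:
        raise ValueError("CKA is undefined for constant features")
    return numerator / denominator
```

Linear CKA is invariant to isotropic scaling and orthogonal transformations of either input, which is why it is a common choice for comparing representations across models whose feature axes are not aligned.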
We propose EverTracer, a gray-box probabilistic fingerprint that leverages calibrated probability shifts from membership inference attack (MIA)-style memorization to enable stealthy, robust provenance tracing against input-level and model-level modifications.
@inproceedings{xuEverTracerHuntingStolen2025,
title = {{{EverTracer}}: {{Hunting Stolen Large Language Models}} via {{Stealthy}} and {{Robust Probabilistic Fingerprint}}},
booktitle = {Proceedings of the 2025 {{Conference}} on {{Empirical Methods}} in {{Natural Language Processing}}},
author = {Xu, Zhenhua and Han, Meng and Xing, Wenpeng},
editor = {Christodoulopoulos, Christos and Chakraborty, Tanmoy and Rose, Carolyn and Peng, Violet},
year = 2025,
pages = {7019--7042},
publisher = {Association for Computational Linguistics},
address = {Suzhou, China},
doi = {10.18653/v1/2025.emnlp-main.358},
urldate = {2025-11-14},
isbn = {979-8-89176-332-6}
}
We propose CTCC, a rule-driven fingerprint that encodes cross-turn contextual correlations in dialogue to achieve black-box verification with greater stealth and robustness and fewer false positives.
@inproceedings{xuCTCCRobustStealthy2025,
title = {{{CTCC}}: {{A Robust}} and {{Stealthy Fingerprinting Framework}} for {{Large Language Models}} via {{Cross-Turn Contextual Correlation Backdoor}}},
booktitle = {Proceedings of the 2025 {{Conference}} on {{Empirical Methods}} in {{Natural Language Processing}}},
author = {Xu, Zhenhua and Zhao, Xixiang and Yue, Xubin and Tian, Shengwei and Lin, Changting and Han, Meng},
editor = {Christodoulopoulos, Christos and Chakraborty, Tanmoy and Rose, Carolyn and Peng, Violet},
year = 2025,
pages = {6978--7000},
publisher = {Association for Computational Linguistics},
address = {Suzhou, China},
doi = {10.18653/v1/2025.emnlp-main.356},
urldate = {2025-11-14},
isbn = {979-8-89176-332-6}
}
We propose MEraser, a two-phase fine-tuning method that erases backdoor-based fingerprints from LLMs while preserving utility, transferring across models with minimal data and no repeated training.
@inproceedings{zhangMEraserEffectiveFingerprint2025,
title = {{{MEraser}}: {{An Effective Fingerprint Erasure Approach}} for {{Large Language Models}}},
booktitle = {Proceedings of the 63rd {{Annual Meeting}} of the {{Association}} for {{Computational Linguistics}} ({{Volume}} 1: {{Long Papers}})},
author = {Zhang, Jingxuan and Xu, Zhenhua and Hu, Rui and Xing, Wenpeng and Zhang, Xuhong and Han, Meng},
editor = {Che, Wanxiang and Nabende, Joyce and Shutova, Ekaterina and Pilehvar, Mohammad Taher},
year = 2025,
pages = {30136--30153},
publisher = {Association for Computational Linguistics},
address = {Vienna, Austria},
doi = {10.18653/v1/2025.acl-long.1455},
urldate = {2025-11-14},
isbn = {979-8-89176-251-0}
}
We propose LoRA-FP, a plug-and-play approach that encodes backdoor fingerprints into LoRA adapters and transfers them to downstream models via parameter fusion, enabling low-cost, robust, and contamination-free fingerprinting.
@inproceedings{xuUnlockingEffectivenessLoRAFP2025,
title = {Unlocking the {{Effectiveness}} of {{LoRA-FP}} for {{Seamless Transfer Implantation}} of {{Fingerprints}} in {{Downstream Models}}},
booktitle = {Findings of the {{Association}} for {{Computational Linguistics}}: {{EMNLP}} 2025},
author = {Xu, Zhenhua and Yan, Zhaokun and Xu, Binhan and Tong, Xin and Xu, Haitao and Chen, Yourong and Han, Meng},
editor = {Christodoulopoulos, Christos and Chakraborty, Tanmoy and Rose, Carolyn and Peng, Violet},
year = 2025,
pages = {4302--4312},
publisher = {Association for Computational Linguistics},
address = {Suzhou, China},
doi = {10.18653/v1/2025.findings-emnlp.230},
urldate = {2025-11-14},
isbn = {979-8-89176-335-7}
}
We propose PREE, a prefix-enhanced fingerprint editing framework that embeds copyright information as minimal parameter offsets via dual-channel knowledge editing, delivering high trigger precision and strong robustness under incremental fine-tuning and defenses.
@inproceedings{yuePREEHarmlessAdaptive2025,
title = {{{PREE}}: {{Towards Harmless}} and {{Adaptive Fingerprint Editing}} in {{Large Language Models}} via {{Knowledge Prefix Enhancement}}},
booktitle = {Findings of the {{Association}} for {{Computational Linguistics}}: {{EMNLP}} 2025},
author = {Yue, Xubin and Xu, Zhenhua and Xing, Wenpeng and Yu, Jiahui and Li, Mohan and Han, Meng},
editor = {Christodoulopoulos, Christos and Chakraborty, Tanmoy and Rose, Carolyn and Peng, Violet},
year = 2025,
pages = {3794--3804},
publisher = {Association for Computational Linguistics},
address = {Suzhou, China},
doi = {10.18653/v1/2025.findings-emnlp.204},
urldate = {2025-11-14},
isbn = {979-8-89176-335-7}
}
We propose DNF, a dual-layer nested fingerprinting framework that couples domain-specific stylistic cues with implicit semantic triggers to embed hierarchical backdoor-based fingerprints into large language models, enabling black-box ownership verification with enhanced stealth and resilience to detection and filtering.
@misc{xu2026dnfduallayernestedfingerprinting,
title={DNF: Dual-Layer Nested Fingerprinting for Large Language Model Intellectual Property Protection},
author={Zhenhua Xu and Yiran Zhao and Mengting Zhong and Dezhang Kong and Changting Lin and Tong Qiao and Meng Han},
year={2026},
eprint={2601.08223},
archivePrefix={arXiv},
primaryClass={cs.CR},
url={https://arxiv.org/abs/2601.08223},
}
We propose ForgetMark, a targeted unlearning–based fingerprint that encodes model ownership via probabilistic forgetting traces, enabling stealthy and robust black-/gray-box verification with minimal performance impact and low false positives.
@misc{xu2026forgetmarkstealthyfingerprintembedding,
title={ForgetMark: Stealthy Fingerprint Embedding via Targeted Unlearning in Language Models},
author={Zhenhua Xu and Haobo Zhang and Zhebo Wang and Qichen Liu and Haitao Xu and Wenpeng Xing and Meng Han},
year={2026},
eprint={2601.08189},
archivePrefix={arXiv},
primaryClass={cs.CR},
url={https://arxiv.org/abs/2601.08189},
}
We propose KinGuard, a hierarchical kinship-aware fingerprinting framework that models derivation relationships among LLM variants to enable fine-grained provenance tracing and robust defense against model stealing attacks.
@misc{xu2026kinguardhierarchicalkinshipawarefingerprinting,
title={KinGuard: Hierarchical Kinship-Aware Fingerprinting to Defend Against Large Language Model Stealing},
author={Zhenhua Xu and Xiaoning Tian and Wenjun Zeng and Wenpeng Xing and Tianliang Lu and Gaolei Li and Chaochao Chen and Meng Han},
year={2026},
eprint={2601.12986},
archivePrefix={arXiv},
primaryClass={cs.CR},
url={https://arxiv.org/abs/2601.12986},
}
We propose Web Fraud Attacks, a novel class of attacks that manipulate the structure of web links to deceive multi-agent systems (MAS). We design 12 representative attack variants covering techniques such as homoglyph deception, sub-directory nesting, and parameter obfuscation.
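As a defensive illustration of the three technique families named above, the sketch below flags suspicious link structure using only the Python standard library. It is not the paper's method and does not reproduce its 12 variants: the function name, flag labels, and heuristics are all assumptions chosen for illustration.

```python
import re
from urllib.parse import parse_qsl, urlsplit

# Matches path segments that look like a domain name, e.g. "www.bank.com".
DOMAIN_LIKE = re.compile(r"^(?:[a-z0-9-]+\.)+[a-z]{2,}$", re.IGNORECASE)


def url_red_flags(url: str) -> list[str]:
    """Return heuristic warnings for a URL (illustrative checks only)."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    flags = []

    # Homoglyph deception: non-ASCII or punycode labels can imitate Latin letters.
    if not host.isascii() or any(label.startswith("xn--") for label in host.split(".")):
        flags.append("non_ascii_host")

    # Sub-directory nesting: a trusted-looking domain buried in a non-final path segment.
    segments = [s for s in parts.path.split("/") if s]
    if any(DOMAIN_LIKE.match(s) for s in segments[:-1]):
        flags.append("domain_in_path")

    # Parameter obfuscation: a (percent-encoded) URL hidden in a query value.
    for _, value in parse_qsl(parts.query):
        if value.lower().startswith(("http://", "https://", "//")):
            flags.append("url_in_query")
            break

    return flags
```

A link such as `https://evil.example/www.bank.com/login` is flagged as `domain_in_path`, while an ordinary link like `https://example.com/docs/index.html` yields no flags. Real filters would need far more than these three checks.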
We propose MalURLBench, the first benchmark for evaluating LLMs' vulnerabilities to malicious URLs. MalURLBench contains 61,845 attack instances spanning 10 real-world scenarios and 7 categories of real malicious websites.
Journal Papers
We propose InSty, a novel fingerprinting method for LLMs in multi-turn dialogues that embeds cross-granularity (word- and sentence-level) triggers across turns, enabling robust, stealthy, and high-recall IP protection under black-box settings.
@article{xuInStyRobustMultilevel2025,
title = {{{InSty}}: A Robust Multi-Level Cross-Granularity Fingerprint Embedding Algorithm for Multi-Turn Dialogue in Large Language Models},
author = {Xu, Zhenhua and Han, Meng and Yue, Xubin and Xing, Wenpeng},
year = 2025,
journal = {SCIENTIA SINICA Informationis},
volume = {55},
number = {8},
pages = {1906},
publisher = {Science China Press},
issn = {1674-7267},
doi = {10.1360/SSI-2025-0022}
}
Key Preprints
@misc{xu2026psycot,
title={Improving General Role-Playing Agents via Psychology-Grounded Reasoning and Role-Aware Policy Optimization},
author={Zhenhua Xu and Dongsheng Chen and Jian Li and Yitong Lin and Zhebo Wang and Jiafu Wu and Yizhang Jin and Chengjie Wang and Meng Han and Yabiao Wang},
year={2026},
eprint={2606.27025},
archivePrefix={arXiv},
primaryClass={cs.CL},
url={https://arxiv.org/abs/2606.27025},
}
We propose Psy-CoT, a psychology-grounded chain-of-thought framework that decomposes pre-response reasoning into Interaction Perception, Psychological Empathy, and Logical Construction, so the model thinks dynamically from a character profile rather than mimicking surface patterns. We further introduce Role-Aware Policy Optimization (RAPO), which uses profile–token mutual information to weight gradients asymmetrically: role-specific tokens are amplified under positive advantage and attenuated under negative advantage to curb reward hacking. On CoSER, CharacterBench, and CharacterEval, Psy-CoT outperforms existing role-playing CoT methods, and RAPO consistently surpasses GRPO across model scales.
@misc{xu2025copyrightprotectionlargelanguage,
title={Copyright Protection for Large Language Models: A Survey of Methods, Challenges, and Trends},
author={Zhenhua Xu and Xubin Yue and Zhebo Wang and Qichen Liu and Xixiang Zhao and Jingxuan Zhang and Wenjun Zeng and Wenpeng Xing and Dezhang Kong and Changting Lin and Meng Han},
year={2025},
eprint={2508.11548},
archivePrefix={arXiv},
primaryClass={cs.CR},
url={https://arxiv.org/abs/2508.11548},
}
@misc{wang2026srafstealthyrobustadversarial,
title={SRAF: Stealthy and Robust Adversarial Fingerprint for Copyright Verification of Large Language Models},
author={Zhebo Wang and Zhenhua Xu and Maike Li and Wenpeng Xing and Chunqiang Hu and Chen Zhi and Meng Han},
year={2026},
eprint={2505.06304},
archivePrefix={arXiv},
primaryClass={cs.CR},
url={https://arxiv.org/abs/2505.06304},
}
@misc{xu2025fingerprintvectorenablingscalable,
title={Fingerprint Vector: Enabling Scalable and Efficient Model Fingerprint Transfer via Vector Addition},
author={Zhenhua Xu and Qichen Liu and Zhebo Wang and Wenpeng Xing and Dezhang Kong and Mohan Li and Meng Han},
year={2025},
eprint={2409.08846},
archivePrefix={arXiv},
primaryClass={cs.CR},
url={https://arxiv.org/abs/2409.08846},
}
@misc{kong2025surveyllmdrivenaiagent,
title={A Survey of LLM-Driven AI Agent Communication: Protocols, Security Risks, and Defense Countermeasures},
author={Dezhang Kong and Shi Lin and Zhenhua Xu and Zhebo Wang and Minghao Li and Yufeng Li and Yilun Zhang and Hujin Peng and Xiang Chen and Zeyang Sha and Yuyuan Li and Changting Lin and Xun Wang and Xuan Liu and Ningyu Zhang and Chaochao Chen and Chunming Wu and Muhammad Khurram Khan and Meng Han},
year={2025},
eprint={2506.19676},
archivePrefix={arXiv},
primaryClass={cs.CR},
url={https://arxiv.org/abs/2506.19676},
}
Internship Experience
—
Present

Primary Responsibilities: Conducting research on LLM role-playing to improve character consistency, dialogue fluency, and narrative engagement when models portray custom or specific characters.
—
Oct 2025

Primary Responsibilities: Conducting research on large language model security and AI ecosystem governance, focusing on model copyright protection (digital watermarking and model fingerprinting), jailbreak attacks and defenses, adversarial attack strategies, and agent system security risks.
—
May 2024

Primary Responsibilities: As a backend development engineer, I helped develop and maintain the "Account+" payment system, one of the company's core business platforms. The system manages merchant partnerships and the associated user information, and handles financial operations between the company and merchants, including account recharge, internal fund transfers, withdrawals, and reconciliation.
Education
—
Jun 2027
(expected)

Selected Honors
Honors and Awards: Outstanding Graduate Student (First Year), Five-Good Graduate Student (First Year)
Scholarships: 2025 National Scholarship (First Year)
—
Jun 2024

Selected Honors & Notes
Honors and Awards: Comprehensive Assessment: 100/100 (Ranked 1st in Major), Outstanding Graduate of Zhejiang Province, Outstanding Student Award
Scholarships: Zhejiang Provincial Government Scholarship (Top 5%), First-Class Scholarship for Outstanding Students (Top 2%), First-Class Academic Scholarship
Note: Digital Media Technology is a computer science major covering fundamental courses including Computer Networks, Data Structures, Operating Systems, and Computer Architecture. While the program later specializes in game design, human-computer interaction, and 3D animation programming, my academic focus shifted toward artificial intelligence and software development, which led me to my current studies in software engineering.