Zhuoran Jin

Fourth-year Ph.D. student @ Institute of Automation, Chinese Academy of Sciences


I am a fourth-year Ph.D. student in the Natural Language Processing and Knowledge Engineering Group (NLPKE) at the Institute of Automation, Chinese Academy of Sciences (CASIA). I am fortunate to be advised by Prof. Jun Zhao. Before that, I obtained my B.E. degree in Software Engineering from Northeastern University (NEU) in 2021. My research interests include natural language processing, large language models, and knowledge engineering.

My research is dedicated to bridging the gap between large language models and human knowledge, with a focus on expanding knowledge boundaries, erasing harmful knowledge, and improving reasoning capabilities. Recently, I have also become increasingly interested in multimodal large language models, particularly in understanding and enhancing their ability to solve complex real-world problems. My primary research areas are:

  • Retrieval-Augmented Generation: RAG can effectively expand the internal memory boundaries of LLMs by providing external context. My work focuses on: (1) leveraging feedback reward signals from LLMs to improve retrieval quality (InstructoR, EMNLP 2023); (2) investigating the mechanisms underlying knowledge conflicts between internal memory and external context (Tug-of-War Between Knowledge, COLING 2024; Cutting Off the Head Ends the Conflict, ACL 2024); and (3) aligning RAG model behavior with human preferences through reward modeling (RAG-RewardBench, ACL 2025).
  • Machine Unlearning: Machine unlearning enables the targeted removal of sensitive, harmful, or copyrighted knowledge from models. To better evaluate unlearning in LLMs, we propose the Real-World Knowledge Unlearning benchmark (RWKU, NeurIPS 2024). Building on this, we reveal the vulnerability of existing unlearning algorithms to adversarial attacks and propose Latent Adversarial Unlearning for robust unlearning (LAU, AAAI 2025). Moreover, to improve the naturalness of model responses after unlearning, we introduce an on-policy reinforcement learning framework that performs refusal boundary optimization (RULE).
  • Multimodal Reasoning: Multimodal reasoning is a core capability for AI systems to solve real-world tasks. However, the extent to which current models have truly advanced in this ability remains unclear. To address this gap, my research mainly involves: (1) exploring what multimodal CoT reasoning can and cannot do, and revealing the reasoning limitation known as “Look Shallow, Think Deep” (Look Shallow, Think Deep); (2) proposing a benchmark for multimodal video reasoning that exposes the challenges current models face in long-range, multi-frame inference (MMR-V).
  • Reward Modeling: Reward models serve as a critical proxy for human values, guiding optimization in RLHF. We explore reward modeling in the contexts of RAG (RAG-RewardBench, ACL 2025), agents (Agent-RewardBench, ACL 2025), and omni-modal scenarios (Omni-Reward).

If you are interested in my work or would like to collaborate, feel free to contact me at zhuoran.jin[at]nlpr[dot]ia[dot]ac.cn.

News

May 26, 2025 Three papers are released on arXiv, exploring omni-modal reward modeling, demystifying multimodal CoT reasoning, and benchmarking multimodal video reasoning.
May 16, 2025 Five papers are accepted by ACL 2025.

Selected Publications

  1. Look Shallow, Think Deep: What Multimodal Chain-of-Thought Reasoning Can and Cannot Do
    Zhuoran Jin*, Kejian Zhu*, Hongbang Yuan, Yupu Hao, Pengfei Cao, Yubo Chen, Kang Liu, and Jun Zhao
    arXiv preprint (arXiv), 2025
  2. Omni-Reward: Towards Generalist Omni-Modal Reward Modeling with Free-Form Preferences
    Zhuoran Jin*, Hongbang Yuan*, Kejian Zhu*, Pengfei Cao, Yubo Chen, Kang Liu, and Jun Zhao
    arXiv preprint (arXiv), 2025
  3. MMR-V: What’s Left Unsaid? A Benchmark for Multimodal Deep Reasoning in Videos
    Kejian Zhu, Zhuoran Jin, Hongbang Yuan, Jiachun Li, Shangqing Tu, Pengfei Cao, Yubo Chen, Kang Liu, and Jun Zhao
    arXiv preprint (arXiv), 2025
  4. RULE: Reinforcement UnLEarning Achieves Forget-Retain Pareto Optimality
    Chenlong Zhang, Zhuoran Jin, Hongbang Yuan, Jiaheng Wei, Tong Zhou, Kang Liu, Jun Zhao, and Yubo Chen
    arXiv preprint (arXiv), 2025
  5. RAG-RewardBench: Benchmarking Reward Models in Retrieval Augmented Generation for Preference Alignment
    Zhuoran Jin, Hongbang Yuan, Tianyi Men, Pengfei Cao, Yubo Chen, Kang Liu, and Jun Zhao
    In Annual Meeting of the Association for Computational Linguistics (ACL Findings), 2025
  6. Towards Robust Knowledge Unlearning: An Adversarial Framework for Assessing and Improving Unlearning Robustness in Large Language Models
    Hongbang Yuan*, Zhuoran Jin*, Pengfei Cao, Yubo Chen, Kang Liu, and Jun Zhao
    In Annual AAAI Conference on Artificial Intelligence (AAAI), 2025
  7. RWKU: Benchmarking Real-World Knowledge Unlearning for Large Language Models
    Zhuoran Jin, Pengfei Cao, Chenhao Wang, Zhitao He, Hongbang Yuan, Jiachun Li, Yubo Chen, Kang Liu, and Jun Zhao
    In Annual Conference on Neural Information Processing Systems (NeurIPS), 2024
  8. Cutting Off the Head Ends the Conflict: A Mechanism for Interpreting and Mitigating Knowledge Conflicts in Language Models
    Zhuoran Jin, Pengfei Cao, Hongbang Yuan, Yubo Chen, Jiexin Xu, Huaijun Li, Xiaojian Jiang, Kang Liu, and Jun Zhao
    In Annual Meeting of the Association for Computational Linguistics (ACL Findings), 2024
  9. Tug-of-War between Knowledge: Exploring and Resolving Knowledge Conflicts in Retrieval-Augmented Language Models
    Zhuoran Jin, Pengfei Cao, Yubo Chen, Kang Liu, Xiaojian Jiang, Jiexin Xu, Qiuxia Li, and Jun Zhao
    In International Conference on Computational Linguistics (COLING), 2024
  10. InstructoR: Instructing Unsupervised Conversational Dense Retrieval with Large Language Models
    Zhuoran Jin, Pengfei Cao, Yubo Chen, Kang Liu, and Jun Zhao
    In Conference on Empirical Methods in Natural Language Processing (EMNLP Findings), 2023