Kaiwen Zhou
This is Kaiwen Zhou, a fourth-year Ph.D. student at the University of California, Santa Cruz, fortunately advised by Prof. Xin (Eric) Wang. My current research focuses on Responsible AI and AI agents. Below is a list of research areas I’ve worked on (purple denotes first-author contributions):
- LLM Alignment Training: SafeKey (EMNLP 2025)
- Safety Evaluation: R1 Safety Eval, Multimodal Situational Safety (ICLR 2025)
- Responsible Embodied Agent: FedVLN (ECCV 2022), Navigation as the Attacker Wishes (NAACL 2024)
- LLM for Embodied Agent: ESC (ICML 2023), JARVIS (NeSy 2025 Oral)
- Multimodal Understanding & Reasoning: ViCor (ACL Findings 2024), Multipanel VQA (ACL 2024)
Before joining UCSC, I received my bachelor’s degree in statistics from Zhejiang University.
News
- Our SafeKey paper is accepted by EMNLP 2025! (08/2025)
- Invited talk at Microsoft on safety reasoning! (06/2025)
- I will join Microsoft as a research intern this summer! (03/2025)
- Our MSSBench paper is accepted by ICLR 2025! (01/2025)
- Two papers are accepted by ACL 2024! (05/2024)
- One paper is accepted by NAACL 2024! (03/2024)
- Our SlugJARVIS team won third place in the first-ever Amazon Alexa SimBot Challenge! (06/2023)
- Our ESC paper is accepted by ICML 2023! (04/2023)
- I will join Honda Research Institute as a research intern this spring and summer! (04/2023)
- Our paper FedVLN is accepted by ECCV 2022! (07/2022)
- We rank No. 1 in the Alexa Prize SimBot Public Benchmark Challenge! (04/2022)
- I will join Samsung AI Center as a research intern this summer! (04/2022)
Selected Publications
SafeKey: Amplifying Aha-Moment Insights for Safety Reasoning
Kaiwen Zhou, Xuandong Zhao, Gaowen Liu, Jayanth Srinivasa, Aosong Feng, Dawn Song, Xin Eric Wang
EMNLP 2025
[Paper] [Website] [Code] [Models]
The Hidden Risks of Large Reasoning Models: A Safety Assessment of R1
Kaiwen Zhou, Chengzhi Liu, Xuandong Zhao, Shreedhar Jangam, Jayanth Srinivasa, Gaowen Liu, Dawn Song, Xin Eric Wang
ICML 2025 R2-FM Workshop
[Paper] [Website]
Multimodal Situational Safety
Kaiwen Zhou*, Chengzhi Liu*, Xuandong Zhao, Anderson Compalas, Dawn Song, Xin Eric Wang
ICLR 2025
NeurIPS Workshop on RBFM 2024 Oral
[Paper] [Website] [Code] [Data]
Muffin or Chihuahua? Challenging Large Vision-Language Models with Multipanel VQA
Yue Fan, Jing Gu, Kaiwen Zhou, Qianqi Yan, Shan Jiang, Ching-Chen Kuo, Xinze Guan, Xin Eric Wang
ACL 2024
[Paper] [Website]
ViCor: Bridging Visual Understanding and Commonsense Reasoning with Large Language Models
Kaiwen Zhou, Kwonjoon Lee, Teruhisa Misu, Xin Eric Wang
Findings of ACL 2024
[Paper]
Navigation as the Attacker Wishes? Towards Building Byzantine-Robust Embodied Agents under Federated Learning
Yunchao Zhang, Zonglin Di, Kaiwen Zhou, Cihang Xie, Xin Eric Wang
NAACL 2024
[Paper]
ESC: Exploration with Soft Commonsense Constraints for Zero-shot Object Navigation
Kaiwen Zhou, Kaizhi Zheng, Connor Pryor, Yilin Shen, Hongxia Jin, Lise Getoor, Xin Eric Wang
ICML 2023
[Paper] [Website]
FedVLN: Privacy-preserving Federated Vision-and-Language Navigation
Kaiwen Zhou, Xin Eric Wang
ECCV 2022
[Paper] [Code]
JARVIS: A Neuro-Symbolic Commonsense Reasoning Framework for Conversational Embodied Agents
Kaizhi Zheng*, Kaiwen Zhou*, Jing Gu*, Yue Fan*, Zonglin Di*, Jialu Wang, Xuehai He, Xin Eric Wang
SoCal NLP 2022, NeSy 2025 Oral
Winner Model of the Alexa Prize SimBot Public Benchmark Challenge
[Paper]
Service
Reviewer
NeurIPS 2023, ICLR 2024, ICML 2024, ICLR 2025