Hi there! I am Hao Li (李昊 in Chinese), a final-year Ph.D. candidate in the School of Computer Science at Peking University, advised by Prof. Li Yuan and Prof. Yonghong Tian. Before that, I received my Bachelor's degree in Computer Science from Peking University, graduating summa cum laude.

My research interests include multimodal learning, visual understanding, and AI for chemical science. I have published more than 20 papers at top international AI conferences, with more than 2,000 total citations on Google Scholar.

🔥 News

2026.05: 🎉🎉 Our MHE-LLama-8B for unified molecule understanding & generation via LLM has been accepted by Nature Communications.
2025.06: 🎉🎉 We released ChemCoTBench, the first LLM benchmark for evaluating step-wise reasoning on complex chemical tasks, which has been accepted by NeurIPS-2025.
2025.06: 🎉🎉 Our SUR-LID for Face Forgery Detection has been accepted by CVPR-2025.
2024.12: 🎉🎉 Our ECDFormer for Efficient Spectra Prediction has been accepted by Nature Computational Science.
2024.08: 🎉🎉 Two papers accepted by The 18th European Conference on Computer Vision (ECCV-2024).
2023.08: 🎉🎉 One paper accepted by The International Conference on Computer Vision (ICCV-2023).
2023.06: 🎉🎉 One paper accepted by Transactions on Image Processing (TIP).
2023.04: 🎉🎉 Three papers accepted by The International Joint Conference on Artificial Intelligence (IJCAI-2023).
2022.08: 🎉🎉 One oral paper accepted by The International Conference on Multimedia and Expo (ICME-2022-Oral).

📝 Selected Publications

Nature Communications 2026

Navigating Chemical-Linguistic Sharing Space with Heterogeneous Molecular Encoding

Hao Li, Liuzhenghao Lv, Yu Wang, Zijun Chen, Yuyang Liu, Li Yuan†, Yonghong Tian†

Project

Official Homepage for MHE, a Nature Communications work on heterogeneous molecular encoding for chemical language models. MHE bridges molecular structures and natural language by unifying sequence, topology, geometry, and fragment-level information, supporting bidirectional molecular understanding and generation across chemical-linguistic space.

NeurIPS 2025

Beyond Chemical QA: Evaluating LLM’s Chemical Reasoning with Modular Chemical Operations

Hao Li, He Cao, Bin Feng, Yanjun Shao, Xiangru Tang, Zhiyuan Yan, Li Yuan†, Yonghong Tian†, Yu Li†

Project

Official Homepage for ChemCoTBench, the first step-wise reasoning benchmark for LLMs in the chemical domain, covering molecular understanding, editing, optimization, and reaction-related predictions. We contribute an expert-annotated benchmark dataset of 1.5K samples, along with a high-quality SFT-RL reasoning dataset comprising 22K instances.

TPAMI 2025

Hierarchical Banzhaf Interaction for General Video-Language Representation Learning

Hao Li, Peng Jin, Shuicheng Yan, Li Yuan†, Jie Chen†

Project

Official Homepage for HBI, a new approach that treats video and text as players in a multivariate cooperative game in order to handle uncertainty during fine-grained semantic interactions with diverse granularity, flexible combination, and vague intensity.

ECCV 2024

FreestyleRet: Retrieving Images from Style-Diversified Queries

Hao Li, Yanhao Jia, Peng Jin, Zesen Cheng, Kehan Li, Jialu Sui, Chang Liu, Li Yuan

Project

Official Code for the FreestyleRet framework. Official Release for the Diversified-Style Retrieval Dataset (DSR).

Nature Computational Science 2024

Decoupled peak property learning for efficient and interpretable ECD spectra prediction

Hao Li, Da Long, Li Yuan, Yu Wang, Yonghong Tian, Xinchang Wang, Fanyang Mo

Project

Existing predictive approaches rarely consider ECD spectra because of data scarcity, and they lack the interpretability needed for trustworthy prediction. Here, we establish a large-scale dataset for Chiral Molecular ECD spectra (CMCDS) and propose ECDFormer for accurate and interpretable ECD spectra prediction. ECDFormer decomposes ECD spectra into peak entities, employs the QFormer architecture to learn peak properties, and renders peaks into spectra. Compared to spectra sequence prediction methods, our decoupled peak prediction approach substantially enhances both accuracy and efficiency, improving the peak symbol accuracy from 37.3% to 72.7% and decreasing the time cost from an average of 4.6 CPU hours to 1.5 seconds.

ICCV 2023

DiffusionRet: Generative Text-Video Retrieval with Diffusion Model

Hao Li, Peng Jin, Zesen Cheng, Kehan Li, Chang Liu, Li Yuan, Jie Chen

Project

Official Code for the DiffusionRet model. We propose DiffusionRet, a generative text-video retrieval framework that models retrieval as generating joint distributions from noise. Unlike conventional methods, it performs well not only on standard benchmarks but also when retrieving out-of-distribution videos, demonstrating superior generalization.

🎖 Honors and Awards

2025.09 National Scholarship for Doctoral Students (Top-3), Peking University.
2023.09 Hongqiao Scholarship (Top 1%), Peking University.

📖 Education

2017.09 - 2021.06, Bachelor, School of Electronics Engineering and Computer Science (EECS), Peking University.
2021.09 - 2023.09, Master, School of Electronics and Computer Engineering (ECE), Peking University.
2023.09 - 2026.07, PhD, School of Computer Science, Peking University.
2026.06 - now, Senior Engineer, Qwen Post-Training Team.

💻 Internships

2020.07 - 2021.09, Mentored by Xu Li, Cognitive Computing Lab, Baidu Research, Beijing, China.
2022.07 - 2023.02, Mentored by Songyang Zhang, OpenMMLab, Shanghai AI Lab, Shanghai, China.
2024.12 - present, Mentored by Yu Li, International Digital Economy Academy, Shenzhen, China.