Publications
(For the most up-to-date list of publications, please visit my Google Scholar profile.)
GEM: A Gym for Generalist LLMs
Proceedings of ICLR, 2026
[pdf]
[code]
SEA Workshop @ NeurIPS, 2025 (Outstanding Paper)
Efficient Process Reward Model Training via Active Learning
Proceedings of COLM, 2025
[pdf]
[code]
[data]
[model]
Understanding R1-Zero-Like Training: A Critical Perspective
Proceedings of COLM, 2025 (Oral)
[pdf]
[code]
AI4MATH Workshop @ ICML, 2025 (Best Paper Runner-Up)
Bootstrapping Language Models with DPO Implicit Rewards
Proceedings of ICLR, 2025
[pdf]
[code]
MHFAIA @ ICML, 2024
Unlocking Large Language Model's Planning Capabilities with Maximum Diversity Fine-tuning
Findings of NAACL, 2025
[pdf]
On Learning Informative Trajectory Embeddings for Imitation, Classification and Regression
Proceedings of AAMAS, 2025
[pdf]
[code]
Sample-Efficient Alignment for LLMs
LanGame @ NeurIPS, 2024
[pdf]
[code]
Generative Modelling of Stochastic Actions with Arbitrary Constraints in Reinforcement Learning
Proceedings of NeurIPS, 2023
[pdf]
[project page]
[code]
Multiscale Generative Models: Improving Performance of a Generative Model Using Feedback from Other Dependent Generative Models
Proceedings of AAAI, 2022
[pdf]
[code]