I am a master's student at The University of Texas at Austin, where I focus on the planning and reasoning abilities of large language models. I am particularly interested in reinforcement learning and post-training methods for improving reasoning.
My most recent work investigates the impact of curriculum learning on reasoning. We introduce E2H Reasoner (Easy2Hard), a post-training method that schedules training tasks from easy to hard using cosine or Gaussian difficulty schedules. We compare supervised fine-tuning to reinforcement learning and measure their impact across trivial, easy, medium, and hard tasks.
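For intuition, here is a minimal Python sketch of how such an easy-to-hard schedule could be wired up: a cosine ramp maps training progress to a target difficulty, and a Gaussian kernel turns that target into soft sampling weights over the four task tiers. All names and parameters here (cosine_difficulty, gaussian_tier_weights, sigma=0.2) are illustrative assumptions for exposition, not the actual E2H Reasoner implementation.

```python
import math

# Difficulty tiers, ordered from easiest to hardest (illustrative labels).
TIERS = ["trivial", "easy", "medium", "hard"]

def cosine_difficulty(step: int, total_steps: int) -> float:
    """Map training progress to a target difficulty in [0, 1].

    A half-cosine ramp: starts near 0 (trivial) and ends near 1 (hard),
    moving slowly at the extremes and quickly through the middle.
    """
    progress = min(step / max(total_steps, 1), 1.0)
    return 0.5 * (1.0 - math.cos(math.pi * progress))

def gaussian_tier_weights(target: float, sigma: float = 0.2) -> list[float]:
    """Weight each tier by a Gaussian centered on the target difficulty,
    so neighboring tiers keep being sampled rather than switching abruptly."""
    centers = [i / (len(TIERS) - 1) for i in range(len(TIERS))]  # 0, 1/3, 2/3, 1
    weights = [math.exp(-((c - target) ** 2) / (2 * sigma**2)) for c in centers]
    total = sum(weights)
    return [w / total for w in weights]

# Example: tier sampling probabilities at the start, middle, and end of training.
for step in (0, 500, 1000):
    target = cosine_difficulty(step, total_steps=1000)
    probs = gaussian_tier_weights(target)
    print(step, [f"{t}={p:.2f}" for t, p in zip(TIERS, probs)])
```

At step 0 the sampler concentrates on trivial and easy tasks, and by the final step it concentrates on hard ones, with a smooth handoff in between; the same scaffolding applies whether the per-tier update is supervised fine-tuning or reinforcement learning.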
Curriculum Reinforcement Learning from Easy to Hard Tasks Improves LLM Reasoning
Shubham Parashar*, Shurui Gui*, Xiner Li*, Hongyi Ling, Sushil Vemuri, Blake Olson, Eric Li, Yu Zhang, James Caverlee, Dileep Kalathil, Shuiwang Ji
International Conference on Learning Representations (ICLR), 2026
Complex LLM Planning via Automated Heuristics Discovery
Hongyi Ling*, Shubham Parashar*, Sambhav Khurana*, Blake Olson, Anwesha Basu, Gaurangi Sinha, Zhengzhong Tu, James Caverlee, Shuiwang Ji
arXiv preprint, 2025
Inference-Time Computations for LLM Reasoning and Planning: A Benchmark and Insights
Shubham Parashar*, Blake Olson*, Sambhav Khurana*, Eric Li*, Hongyi Ling, James Caverlee, Shuiwang Ji
arXiv preprint, 2025
Fragment and Geometry Aware Tokenization of Molecules for Structure-Based Drug Design Using Language Models
Cong Fu*, Xiner Li*, Blake Olson, Heng Ji, Shuiwang Ji
International Conference on Learning Representations (ICLR), 2025
(* indicates equal contribution)
Full resume (PDF).