About Me
Google Scholar | CV Link | GitHub | Email
- Research Scientist at Salesforce AI Research
- Interested in Language-Conditioned Video Understanding & Robot Learning
- PhD at Stony Brook University, NY, USA advised by Michael Ryoo
- Former Intern at Google Research (with Srikumar Ramalingam), Meta (with Tsung-Yu Lin), Apple (with Jonathon Shlens and Alexander Toshev), and Salesforce AI Research (with Juan Carlos Niebles)
- Former Researcher at MBZUAI with Salman Khan, Muzammal Naseer, and Fahad Khan
- Enjoy ballroom dancing, cooking, and theatre in my leisure time
Have a look at my Curriculum Vitae for more details and Google Scholar for a full list of papers.
Updates
- FOFPred: Language-Driven Future OF Prediction accepted to CVPR Findings 2026!
- DAWN: Pixel Motion Diffusion for Robot Control accepted to CVPR 2026!
- IVRA: Visual-Token Relations for Robot Policy accepted to ICRA 2026!
- LVNet for Long-Video QnA accepted to EACL 2026!
Selected Publications
- July 2025: Language Repository for Long Video Understanding, ACL Findings 2025.
- April 2025: Understanding Long Videos with Multimodal Language Models, ICLR 2025.
- April 2025: LLaRA: Large Language and Robotics Assistant, ICLR 2025.
- June 2024: Localization in Visual-LLMs Improves Reasoning, CVPR 2024.
- May 2023: Language-based Video Self-Supervised Learning, NeurIPS 2023.
- March 2023: Perceptual Grouping in Contrastive VLMs, ICCV 2023.
Featured Work
- Our Diffusion Illusions work (CVPR '23 Best Demo) featured on Stony Brook News.
