About Me
Google Scholar | CV Link | GitHub | Email
- Research Scientist at Salesforce AI Research
- Interested in Language Conditioned Video Understanding & Robot Learning
- PhD at Stony Brook University, NY, USA, advised by Michael Ryoo
- Former Intern at Google Research with Srikumar Ramalingam, Meta with Tsung-Yu Lin, Apple with Jonathon Shlens and Alexander Toshev, and Salesforce AI Research with Juan Carlos Niebles
- Former Researcher at MBZUAI with Salman Khan, Muzammal Naseer, and Fahad Khan
- Enjoy ballroom dancing, cooking, and theatre during my leisure time
Have a look at my Curriculum Vitae for more details and Google Scholar for the full list of papers.
Updates
- FOFPred: Language-Driven Future OF Prediction accepted to CVPR Findings 2026!
- DAWN: Pixel Motion Diffusion for Robot Control accepted to CVPR 2026!
- IVRA: Visual-Token Relations for Robot Policy accepted to ICRA 2026!
- LVNet for Long-Video QnA accepted to EACL 2026!
Selected Publications
- July 2025: Language Repository for Long Video Understanding, ACL Findings 2025.
- April 2025: Understanding Long Videos with Multimodal Language Models, ICLR 2025.
- April 2025: LLaRA: Large Language and Robotics Assistant, ICLR 2025.
- June 2024: Localization in Visual-LLMs Improves Reasoning, CVPR 2024.
- May 2023: Language-based Video Self-Supervised Learning, NeurIPS 2023.
- October 2022: Perceptual Grouping in Contrastive VLMs, ICCV 2023.
- November 2021: Self-supervised Video Transformers, CVPR 2022 (oral).
- July 2021: Adversarial Transferability of Vision Transformers, ICLR 2022 (spotlight).
- May 2021: Intriguing Properties of Vision Transformers, NeurIPS 2021 (spotlight).
- March 2021: Orthogonal Projection Loss, ICCV 2021.
- January 2021: Conditional Generative Modeling, ICLR 2021.
- September 2020: Panoptic Segmentation, BMVC 2020 (oral).
- September 2019: Activity Recognition in Videos, TCSVT journal.
Featured Work
- Our Diffusion Illusions work (CVPR ’23 Best Demo) featured on Stony Brook News.
