Research
PhD Thesis
I defended my PhD thesis, “Data-Efficient Learning On Structured Output Data”, on Nov 19, 2024, and have since started at Samsung AI Toronto. [defense slides]
Publications

Hallucination Score: Towards Mitigating Hallucinations in Generative Image Super-Resolution
Weiming Ren*, Raghav Goyal*, Zhiming Hu*, Tristan Ty Aumentado-Armstrong*, Iqbal Mohomed, Alex Levinshtein (* equal contribution)
arXiv. 2507.14367. [pdf]

Extending Video Masked Autoencoders to 128 frames
Nitesh B. Gundavarapu*, Luke Friedman*, Raghav Goyal*, Chaitra Hegde*, Eirikur Agustsson, Sagar M. Waghmare, Mikhail Sirotenko, Ming-Hsuan Yang, Tobias Weyand, Boqing Gong, Leonid Sigal (* equal contribution)
In NeurIPS. Vancouver, Canada. 2024. [pdf]

TAM-VT: Transformation-Aware Multi-scale Video Transformer for Segmentation and Tracking
Raghav Goyal*, Wan-Cyuan Fan*, Mennatullah Siam, Leonid Sigal (* equal contribution)
In WACV. Tucson, USA. 2025. [pdf] [project page]

MINOTAUR: Multi-task Video Grounding From Multimodal Queries
Raghav Goyal, Effrosyni Mavroudi, Xitong Yang, Sainbayar Sukhbaatar, Leonid Sigal, Matt Feiszli, Lorenzo Torresani, Du Tran
arXiv. 2302.08063. [pdf]

A Simple Baseline for Weakly-Supervised Human-centric Relation Detection
Raghav Goyal, Leonid Sigal
In BMVC. Virtual. 2021. [pdf]

UniT: Unified Knowledge Transfer for Any-shot Object Detection and Segmentation
Siddhesh Khandelwal*, Raghav Goyal*, Leonid Sigal (* equal contribution)
In CVPR. Virtual. 2021. [pdf]

Improved Few-Shot Visual Classification
Peyman Bateni, Raghav Goyal, Vaden Masrani, Frank Wood, Leonid Sigal
In CVPR. Seattle, USA. 2020. [pdf]

Evaluating visual “common sense” using fine-grained classification and captioning tasks
Raghav Goyal, Farzaneh Mahdisoltani, Guillaume Berger, Waseem Gharbieh, Ingo Bax, Roland Memisevic
In ICLR Workshop. Vancouver, Canada. 2018. [pdf]

The “something something” video database for learning and evaluating visual common sense
Raghav Goyal, Samira Ebrahimi Kahou, Vincent Michalski, *, Ingo Bax, Roland Memisevic (* see paper for additional authors)
In ICCV. Venice, Italy. 2017. [pdf] [supp] [code] [data]

Natural Language Generation through Character-based RNNs with Finite-state Prior Knowledge
Raghav Goyal, Marc Dymetman, Eric Gaussier
In COLING. Osaka, Japan. 2016. [pdf]
ML Challenges
- (Sep, 2018) Placed 3rd in the Visual Dialog challenge, hosted as part of the SIVL workshop at ECCV’18. Rankings can be found here.
- (Jul, 2017) Placed 3rd in the Kinetics video recognition challenge, hosted by DeepMind at the ActivityNet workshop at CVPR’17. Our approach is detailed in this blog post.
Miscellaneous
- Reviewer: ECCV’20, ICLR’21, CVPR’21