Scaling Video Reasoning for the Real World
Position: Assistant Professor

Video understanding has advanced rapidly, yet current models still struggle with the complexity of real-world scenarios. Unlike curated benchmarks, real-world video is often long, noisy, and incomplete, making reliable reasoning significantly more challenging. In this talk, I will discuss how video understanding systems can scale across both training and inference by better identifying salient information, leveraging diverse multimodal signals, and retaining past experience through an advanced memory system. I will further highlight a shift toward more active forms of reasoning, where models move beyond passive observation and adaptively allocate computation at inference time to support long-horizon reasoning.
Dr. Jaehong Yoon is an Assistant Professor at Nanyang Technological University (NTU), Singapore. Prior to joining NTU, he was a postdoctoral research associate at UNC-Chapel Hill, working with Prof. Mohit Bansal. He received his Ph.D. from the School of Computing at KAIST, advised by Prof. Sung Ju Hwang. His research focuses on building AI systems that operate reliably and learn continuously in complex real-world environments. Dr. Yoon has received several honors, including the AAAI New Faculty Highlights (2026), the NSCC Young Investigator Seed Project Award (2026), the CoLLAs Early-Career Spotlight (2025), and the Google PaliGemma Academic Program Award (2024). He will serve as the DEI Chair for CoLLAs 2026 and has served as an Area Chair for multiple venues, including ACL 2026, NeurIPS 2025, NAACL 2025, and EMNLP 2024 and 2025.