One of the longstanding challenges in software engineering is how to reliably translate human intent into formats that machines can understand. Many important software engineering problems, such as the oracle problem in software testing, essentially stem from the difficulty of precisely capturing the desired program semantics in executable form. While such reasoning has traditionally been done by human engineers, recent advances in deep learning and language modeling techniques have yielded strong understanding of both natural and programming languages, bridging the gap between the two. In this talk, we will go over recent advances in the use of Large Language Models (LLMs) to tackle problems that require a joint understanding of natural language and software. Subsequently, we will examine the remaining challenges and interesting future research directions, especially regarding the reliability of LLM outputs.
Sungmin Kang is a postdoctoral researcher at the Korea Advanced Institute of Science and Technology (KAIST). His research focuses on the theory and practice of automated debugging techniques, including bug-reproducing test generation, explainable fault localization, and program repair. He is also interested in how developers actually do their jobs, such as how they resolve issues and maintain code, as a means to better understand how academic results could provide tangible benefits for developers. He serves as a reviewer for prominent software engineering journals such as TOSEM and TSE. Sungmin received his PhD in Computer Science from KAIST in 2024.