| Nov 08, 2025 | I’m so happy to share the news that our paper has been accepted to AAAI-26 as an oral presentation!! Do MLLMs understand sound the way humans do? For some models, the answer is YES; for others, the answer is NO. GPT and Gemini do not seem to infer sound the way humans do, while Qwen2.5’s results resembled those of the experiment conducted with humans. Our research dug into this question. This is the first paper I have participated in as a co-author, working with Jinhong Jeong under the supervision of Professor Youngjae Yu. I would like to thank all the authors: Jinhong Jeong, Jaeyoung Lee, Seonah Han, and Youngjae Yu. Jinhong Jeong led the whole research and provided deep, wonderful insights from linguistics. Jaeyoung Lee contributed marvelous ideas, helping steer the research toward mechanistic interpretability. Seonah Han devoted a lot of effort to ideation during our meetings and to constructing and preprocessing the dataset. And Youngjae Yu supervised our research closely, always helping us think outside the box and focus on the key questions researchers would ask. I, Sunghyun Lee, worked on constructing the dataset, designed the experiments to be precise and persuasive (introducing a semantic dimension to the experiment), and analyzed the attention layers. Here are the links: [github] [arxiv] See you in Singapore! |