Image
alt="The “Diffusion Forcing” method can sort through noisy data and reliably predict the next steps in a task, helping a robot complete manipulation tasks, for example. In one experiment, it helped a robotic arm rearrange toy fruits into target spots on circular mats despite starting from random positions and visual distractions (Credits: Mike Grimmett/MIT CSAIL)."
CSAIL article

In the current AI zeitgeist, sequence models have skyrocketed in popularity for their ability to analyze data and predict what to do next. For instance, you’ve likely used next-token prediction models like ChatGPT, which anticipate each word (token) in a sequence to form answers to users’ queries. There are also full-sequence diffusion models like Sora, which convert words into dazzling, realistic visuals by successively “denoising” an entire video sequence.

Image
Figure 1: Schematic overview of the framework for on-road evaluation of explanations in automated vehicles (Credit: MIT CSAIL and GIST).
CSAIL article

The Proceedings of the ACM on Interactive, Mobile, Wearable, and Ubiquitous Technologies (IMWUT) Editorial Board has awarded a Distinguished Paper Award to researchers from the MIT Computer Science and Artificial Intelligence Laboratory (CSAIL) and the Gwangju Institute of Science and Technology (GIST) for their evaluation of visual explanations in autonomous vehicles’ decision-making.

Image
alt="The “Faces in Things” dataset is a comprehensive, human-labeled collection of over 5,000 pareidolic images. The research team trained face-detection algorithms to see faces in these pictures, giving insight into how humans learned to recognize faces within their surroundings (Credits: Alex Shipps/MIT CSAIL)."
CSAIL article

In 1994, Florida jewelry designer Diana Duyser discovered what she believed to be the Virgin Mary’s image in a grilled cheese sandwich, which she preserved and later auctioned for $28,000. But how much do we really understand about pareidolia, the phenomenon of seeing faces and patterns in objects when they aren’t really there? 

Image
alt="The automated, multimodal approach developed by MIT researchers interprets artificial vision models that evaluate the properties of images (Credits: iStock)."
CSAIL article

As artificial intelligence models become increasingly prevalent and are integrated into diverse sectors like health care, finance, education, transportation, and entertainment, understanding how they work under the hood is critical. Interpreting the mechanisms underlying AI models enables us to audit them for safety and biases, with the potential to deepen our understanding of the science behind intelligence itself.