In our current age of artificial intelligence, computers can generate their own “art” by way of diffusion models, iteratively adding structure to a noisy initial state until a clear image or video emerges. Diffusion models have suddenly grabbed a seat at everyone’s table: Enter a few words and experience instantaneous, dopamine-spiking dreamscapes at the intersection of reality and fantasy. Behind the scenes, it involves a complex, time-intensive process requiring numerous iterations for the algorithm to perfect the image.
Imagine yourself glancing at a busy street for a few moments, then trying to sketch the scene you saw from memory. Most people could draw the rough positions of the major objects like cars, people, and crosswalks, but almost no one can draw every detail with pixel-perfect accuracy. The same is true for most modern computer vision algorithms: They are fantastic at capturing high-level details of a scene, but they lose fine-grained details as they process information.
Audio deepfakes have had a recent bout of bad press after an artificial intelligence-generated robocall purporting to be the voice of Joe Biden hit up New Hampshire residents, urging them not to cast ballots. Meanwhile, spear-phishers — phishing campaigns that target a specific person or group, especially using information known to be of interest to the target — go fishing for money, and actors aim to preserve their audio likeness.
In 2012, the best language models were small recurrent networks that struggled to form coherent sentences. Fast forward to today, and large language models like GPT-4 outperform most students on the SAT. How has this rapid progress been possible?
Peripheral vision enables humans to see shapes that aren’t directly in our line of sight, albeit with less detail. This ability expands our field of vision and can be helpful in many situations, such as detecting a vehicle approaching our car from the side.
Before a robot can grab dishes off a shelf to set the table, it must ensure its gripper and arm won’t crash into anything and potentially shatter the fine china. As part of its motion planning process, a robot typically runs “safety check” algorithms that verify its trajectory is collision-free.
Three MIT Computer Science and Artificial Intelligence Laboratory (CSAIL) members are among 126 early-career researchers honored with 2024 Sloan Research Fellowships by the Alfred P. Sloan Foundation. Representing the departments of Chemistry, Electrical Engineering and Computer Science, and Physics, and the MIT Sloan School of Management, the awardees will receive a two-year, $75,000 fellowship to advance their research.
Pollsters trying to predict presidential election results and physicists searching for distant exoplanets have at least one thing in common: They often use a tried-and-true scientific technique called Bayesian inference.