The Hertz Foundation announced that it has awarded fellowships to eight MIT affiliates. The prestigious award provides each recipient with five years of doctoral-level research funding (up to a total of $250,000), which gives them an unusual measure of independence in their graduate work to pursue groundbreaking research.
When you’re trying to communicate or understand ideas, words don’t always do the trick. Sometimes the more efficient approach is to do a simple sketch of that concept — for example, diagramming a circuit might help make sense of how the system works.
But what if artificial intelligence could help us explore these visualizations? While these systems are typically proficient at creating realistic paintings and cartoonish drawings, many models fail to capture the essence of sketching: its stroke-by-stroke, iterative process, which helps humans brainstorm and edit how they want to represent their ideas.
Humans naturally learn by making connections between sight and sound. For instance, we can watch someone playing the cello and recognize that the cellist’s movements are generating the music we hear.
Diffusion models like OpenAI’s DALL-E are becoming increasingly useful in helping brainstorm new designs. Humans can prompt these systems to generate an image, create a video, or refine a blueprint, and come back with ideas they hadn’t considered before.
A human clearing junk out of an attic can often guess the contents of a box simply by picking it up and giving it a shake, without the need to see what’s inside. Researchers from MIT, Amazon Robotics, and the University of British Columbia have taught robots to do something similar.
What would a behind-the-scenes look at a video generated by an artificial intelligence model be like? You might think the process is similar to stop-motion animation, where many images are created and stitched together, but that’s not quite the case for “diffusion models” like OpenAl's SORA and Google's VEO 2.
When the Venice Biennale’s 19th International Architecture Exhibition launches on May 10, its guiding theme will be applying nimble, flexible intelligence to a demanding world — an ongoing focus of its curator, MIT faculty member Carlo Ratti.
Essential for many industries ranging from Hollywood computer-generated imagery to product design, 3D modeling tools often use text or image prompts to dictate different aspects of visual appearance, like color and form. As much as this makes sense as a first point of contact, these systems are still limited in their realism due to their neglect of something central to the human experience: touch.
When visual information enters the brain, it travels through two pathways that process different aspects of the input. For decades, scientists have hypothesized that one of these pathways, the ventral visual stream, is responsible for recognizing objects, and that it might have been optimized by evolution to do just that.
Think of your most prized belongings. In an increasingly virtual world, wouldn’t it be great to save a copy of that precious item and all the memories it holds?