Top row, left to right: Matthew Caren, April Qiu Cheng, Arav Karighattam, and Benjamin Lou. Bottom row, left to right: Isabelle Quaye, Albert Qin, Ananthan Sadagopan, and Gianfranco (Franco) Yee (Credits: Photos courtesy of the Hertz Foundation).

The Hertz Foundation announced that it has awarded fellowships to eight MIT affiliates. The prestigious award provides each recipient with five years of doctoral-level research funding (up to a total of $250,000), giving them an unusual measure of independence in their graduate work to pursue groundbreaking research.

SketchAgent uses a multimodal language model to turn natural language prompts into sketches in a few seconds. It can doodle on its own or through collaboration, drawing with a human or incorporating text-based input to sketch each part separately (Credits: Alex Shipps/MIT CSAIL, with AI-generated sketches from the researchers).

When you’re trying to communicate or understand ideas, words don’t always do the trick. Sometimes the more efficient approach is to make a simple sketch of the concept — diagramming a circuit, for example, might help make sense of how the system works.

But what if artificial intelligence could help us explore these visualizations? While AI systems are typically proficient at creating realistic paintings and cartoonish drawings, many models fail to capture the essence of sketching: its stroke-by-stroke, iterative process, which helps humans brainstorm and edit how they want to represent their ideas.

PhD student Faraz Faruqi, lead author of a new paper on the project, says that TactStyle could have far-reaching applications, ranging from home decor and personal accessories to tactile learning tools (Credits: Mike Grimmett/MIT CSAIL).

Essential to many industries, from Hollywood computer-generated imagery to product design, 3D modeling tools often use text or image prompts to dictate different aspects of visual appearance, such as color and form. As much as this makes sense as a first point of contact, these systems are still limited in their realism because they neglect something central to the human experience: touch.

The models were trained on a dataset of synthetic images like the ones pictured, with objects such as tea kettles or calculators superimposed on different backgrounds. Researchers trained the model to identify one or more spatial features of an object, including rotation, location, and distance (Credits: Courtesy of the researchers).

When visual information enters the brain, it travels through two pathways that process different aspects of the input. For decades, scientists have hypothesized that one of these pathways, the ventral visual stream, is responsible for recognizing objects, and that it might have been optimized by evolution to do just that.