Origami and AI's Abstract Thinking with MIT CSAIL Postdoc Emanuele Sansone

 

Audrey Woods, MIT CSAIL Alliances | June 22, 2026
Almost thirty years ago, it was big news when an AI model beat a human grandmaster in chess. While today's models far surpass the conquering supercomputer, they still fall far short of humans on abstract reasoning, physical intelligence, and efficient real-world learning. In other words, they excel at chess but fail at origami. 

The ancient Japanese art of paper folding turns out to be a surprisingly good test of what current AI cannot yet do, which is why MIT CSAIL Postdoc Emanuele Sansone is using origami to evaluate the abstract, spatial reasoning of existing and future models. Working in Professor Armando Solar-Lezama's Computer Assisted Programming Group, Dr. Sansone develops models that can abstract meaning from images and actionable information from previous experience, leveraging neurosymbolic programming to combine the pattern-recognition power of neural networks with the interpretability and transparency of symbolic programming. 

 

ORIGAMI: A BENCHMARK FOR ABSTRACT THINKING 

Building AI models that can understand and interact with the 3D world is one of the biggest open challenges in machine learning. Beyond pattern recognition, true physical AI will require the ability to understand causal mechanisms and physical constraints, making a sequence of decisions or actions based on the laws of physics or previous experience. The goal is to find a way to model physical AI after human intelligence, which can learn from very few examples, leverage abstraction, and solve problems by efficiently drawing on past examples. 

Dr. Sansone and his colleagues examined the human version of this in a recent cognitive science study built around a Pattern Builder Task, in which participants had to recreate a target visual pattern using a small set of geometric pieces such as lines, diagonals, squares, and triangles. To build a thick cross, for instance, a participant might start with a horizontal line, combine it with another to make a thick bar, then reflect the bar diagonally and add the two together. Crucially, participants could save intermediate shapes like the thick bar as reusable "helpers," building their own personal library to draw on for future puzzles. Over time, participants converged on similar abstractions and became more efficient. Most tellingly, human solution times tracked closely with the search complexity of finding a helper, not with the raw length of the solution program. What makes a puzzle hard for humans isn't how long the answer is, but how much effort it takes to discover the right abstractions while solving it. 
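The thick-cross example can be made concrete with a short Python sketch. This is only an illustration of the idea of composing primitives and saving helpers, not the study's actual task software: the grid size, the function names, and the `helpers` dictionary are all assumptions made for this example.

```python
from typing import Dict, FrozenSet, Tuple

Cell = Tuple[int, int]        # (row, col) on a small square grid
Pattern = FrozenSet[Cell]     # a pattern is the set of filled cells

SIZE = 6  # hypothetical grid size, chosen only for this illustration


def horizontal_line(row: int) -> Pattern:
    """Primitive piece: a full-width horizontal line on the given row."""
    return frozenset((row, col) for col in range(SIZE))


def combine(a: Pattern, b: Pattern) -> Pattern:
    """Add two patterns together (set union of their filled cells)."""
    return a | b


def reflect_diagonal(p: Pattern) -> Pattern:
    """Reflect a pattern across the main diagonal (swap rows and columns)."""
    return frozenset((col, row) for row, col in p)


# Hypothetical personal library of reusable intermediate shapes.
helpers: Dict[str, Pattern] = {}


def save_helper(name: str, pattern: Pattern) -> Pattern:
    """Store an intermediate shape so later puzzles can reuse it."""
    helpers[name] = pattern
    return pattern


def render(p: Pattern) -> str:
    """Draw the pattern as text, '#' for filled cells and '.' for empty."""
    return "\n".join(
        "".join("#" if (r, c) in p else "." for c in range(SIZE))
        for r in range(SIZE)
    )


# Two horizontal lines make a thick bar, which is saved as a helper.
thick_bar = save_helper("thick_bar", combine(horizontal_line(2), horizontal_line(3)))

# The thick cross reuses the helper: the bar plus its diagonal reflection.
thick_cross = combine(helpers["thick_bar"], reflect_diagonal(helpers["thick_bar"]))

print(render(thick_cross))
# ..##..
# ..##..
# ######
# ######
# ..##..
# ..##..
```

Once `thick_bar` is in the library, the cross takes two steps instead of four, which is the kind of saving the helpers gave participants.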

"We are very good at abstraction," Dr. Sansone says. "Machines are not." 

The question is whether machines can be taught to do the same. To evaluate this, Dr. Sansone led the development of OrigamiBench, an interactive environment in which models propose folds to reach a target configuration and receive feedback on their responses. Origami, he explains, is an excellent proxy for abstract thinking for a number of reasons. For one, "it has this nice balance between being able to express very complex things, and at the same time being simple enough to systematize." With a single primitive operation, the fold, origami can produce a remarkable range of shapes, but doing so requires spatial thinking and abstracting what a fold will look like based on previous experience. To recreate a shape, a model must perceive the target, recognize its components, decompose the problem into smaller subproblems, and plan a sequence of physically valid folds. "That really requires a lot of abstraction and problem solving. It's very complex." 
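To give a feel for what "propose a fold, receive feedback" can mean geometrically, here is a deliberately tiny Python sketch. It is not OrigamiBench's interface, which the article does not describe. The function names are hypothetical, and the paper is reduced to a list of corner points, ignoring layers, faces, and physical validity. The one fact it relies on is that folding flat along a crease mirrors one side of the paper across that crease line.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def side(pt: Point, a: Point, b: Point) -> float:
    """Signed area test: positive if pt lies left of the directed line a -> b."""
    return (b[0] - a[0]) * (pt[1] - a[1]) - (b[1] - a[1]) * (pt[0] - a[0])


def reflect(pt: Point, a: Point, b: Point) -> Point:
    """Mirror pt across the line through a and b."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    # Project pt onto the line to find the foot of the perpendicular.
    t = ((pt[0] - a[0]) * dx + (pt[1] - a[1]) * dy) / (dx * dx + dy * dy)
    foot_x, foot_y = a[0] + t * dx, a[1] + t * dy
    return (2 * foot_x - pt[0], 2 * foot_y - pt[1])


def fold(points: List[Point], a: Point, b: Point) -> List[Point]:
    """Toy fold: corners left of the crease a -> b move to their mirror image."""
    return [reflect(p, a, b) if side(p, a, b) > 0 else p for p in points]


def matches(points: List[Point], target: List[Point]) -> bool:
    """Toy feedback signal: do the corner sets coincide (up to rounding)?"""
    def norm(ps: List[Point]):
        return {(round(x, 9), round(y, 9)) for x, y in ps}
    return norm(points) == norm(target)


# A unit square of paper and a target triangle (square folded on its diagonal).
square: List[Point] = [(0, 0), (1, 0), (1, 1), (0, 1)]
target: List[Point] = [(0, 0), (1, 0), (1, 1)]

# Proposed action: crease along the diagonal from (0, 0) to (1, 1).
folded = fold(square, (0, 0), (1, 1))

print(matches(square, target))  # False: the unfolded sheet is not the target
print(matches(folded, target))  # True: corner (0, 1) has landed on (1, 0)
```

Even in this stripped-down form, reaching a target means choosing the right crease and predicting where the paper lands. Chaining many such choices into a plan is where the article reports that current models struggle.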

Origami also "sits in a regime where only limited data is available," challenging the assumption of abundant data and large-scale computation that currently dominates modern machine learning. "Large-scale computation has become increasingly accessible as compute has become cheaper, while the web provides a vast resource for training machine learning models," Dr. Sansone explains. "However, data is not distributed uniformly across the web. Some domains and topics are represented far more frequently than others." Origami is one such underrepresented domain. Even synthesizing origami data for training proves problematic, since using large-scale computation to explore the space and search for 'meaningful' origami designs is "like finding a needle in a haystack." Instead, a model must understand the "underlying mathematical language for describing the synthesis of origami folds, associated in particular with the work of Robert J. Lang, which involves learning abstractions that make it possible to build increasingly complex shapes. This means, in order to master origami, algorithms would need to master its underlying language," deploying true abstract thinking to do so. 

OrigamiBench reveals just how far today's models have to go on physical planning tasks. While most models can produce syntactically valid actions, even the strongest of them fail to compose those actions into multi-step plans, often getting stuck after one or two folds and producing only the simplest of shapes. "They don't have an understanding of what it means to physically fold the paper," Dr. Sansone says. In multiple-choice tests that asked models to choose the next shape in a sequence, even the strongest models scored lower than 50% when the wrong options were drawn from the same folding sequence as the correct answer rather than from an entirely different shape. 

Dr. Sansone's goal now is to develop a "language of construction," much like the helpers in the cognitive science study, which would enable the AI to reference a library of foundational moves and then compose those moves into an increasingly complex "grammar" of actions leading to origami shapes. Here neurosymbolic programming becomes essential. Rather than the black boxes of pure neural networks, the incorporation of symbolic programs in future models will allow researchers like Dr. Sansone to see inside a model's reasoning process and verify that it is actually using the language of abstraction rather than just hallucinating correct answers. With neurosymbolic programming "you can go back and check what the model has learned, because you can check what language the model has learned. From there you can discover how the models behave, study the biases present in the data, etc." If a given training run leads a model toward the wrong abstractions, researchers can see exactly where things went off course and correct it, a crucial property for trustworthy AI. 

 

TOWARD MACHINES THAT REASON 

The challenges Dr. Sansone is tackling are core to what makes human intelligence so flexible and physical AI so difficult. Bridging this gap will require progress on three intertwined fronts: learning from small amounts of data, doing efficient inference over discrete structures, and generating diverse, creative hypotheses about the world. Each of these is a frontier in its own right, and advances in any one will push the others forward. OrigamiBench offers a structured way to measure that progress and suggests new approaches for integrating vision, language, and geometric reasoning. 

Dr. Sansone's work has implications well beyond paper folding, with applicability in scientific discovery and education. Models equipped with real-world intelligence and a "language" for describing biological mechanisms could help medical researchers discover new molecules or treatments. AI might be able to describe and discover the underlying methods used to prove math theorems by abstracting information and reasoning over it, a skill that could be transferred to new domains. This line of work also has important connections to cognitive science, especially to the study of how humans acquire, structure, and transfer skills across domains. The possibilities for education are particularly striking: a future reasoning model could watch a video lecture, capture the underlying knowledge it conveys, offer more in-depth explanations to students, and identify what is incorrect, flagging errors that a pattern-matching model would simply repeat. 

The uncertainty of research is what keeps Dr. Sansone motivated. "I make plans, but most of the time the outcomes are unexpected. There is surprise in the process of doing research that really makes me interested, because with these unexpected outcomes, you end up learning things you never would have thought of." And from those lessons, Dr. Sansone can abstract knowledge that might someday help AI models reason about the world as well as he does. 

 

Visit Dr. Emanuele Sansone's website to learn more.