PRODUCED BY: Nate Caldwell | WRITTEN BY: Matt Busekroos

Originally from China, Yunzhu Li received his bachelor’s degree in computer science from Peking University in Beijing in 2017. During his undergraduate years, Li spent seven months as a research assistant at the Stanford AI Lab. After graduation, he decided to pursue his PhD at MIT. Li began doing research in 2015, when he started learning about AI and deep learning. After seeing groundbreaking results, from the breakthroughs in the ImageNet challenge to AlphaGo outperforming human players, he decided that he wanted to contribute to research.

“Completing a PhD is a very good way to not only equip me with the cutting-edge technical skills but be able to make contributions to push these frontiers forwards,” Li said. Li recently completed his PhD at CSAIL, where he was co-advised by Professors Antonio Torralba and Russ Tedrake. His primary research interests are robotics, computer vision, and machine learning. Li works to enable robots to better perceive and interact with the world through sensing, perception, dynamics, and control.

“For sensing, we are equipping the robots with sensors that can obtain information of multiple sensory modalities, especially of vision and touch,” Li said. “In papers accepted to Nature and Nature Electronics, we developed tactile gloves, socks, vests, and robot skin, with tactile sensors covering the surface area that can give us dense tactile sensing information during physical interactions among humans, robots, and environments.”

“Then there is the perception module, which essentially asks what kind of representation we should use to describe the environments. Should the representations be at a very fine level or at some coarser and more abstract levels? Depending on the task, the manipulating object, or even different task stages, we want to choose the representation that suits the best for the task at hand.”

“Then there is a dynamics module. We are asking the question: given the representation, how can we build a world model that can predict the environment’s evolution under a given action? This capability is directly inspired by our human’s intuitive understanding of the physical environment. There is a mental model in our human mind that can imagine how the world state will change if we apply a specific action, like predicting the interaction outcomes of pushing a box, squeezing a water bottle, or spreading peanut butter on a piece of bread. We want to endow the robots with similar capabilities and learn predictive models based on the representations obtained from the perception module.”

“Given the dynamics module, the last component is planning/control, which asks: if we want to achieve a specific goal, how can we plan the robot’s behavior and derive the control signals using the predictive model? This requires us to solve the model-based optimization problem.”
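For readers who want a concrete picture of planning through a predictive model, here is a minimal sketch. It is not Li's method: the one-dimensional "box on a line" model, the discrete `ACTIONS` set, and the functions `predict`, `plan`, and `run_controller` are all hypothetical, and a brute-force search stands in for the sampling-based or gradient-based optimizers a real system would more likely use.

```python
from itertools import product

# Hypothetical discrete pushes the robot can apply at each step.
ACTIONS = (-0.5, -0.25, 0.0, 0.25, 0.5)


def predict(state, action):
    """Toy stand-in for a learned dynamics model: a box on a line moves by `action`."""
    return state + action


def plan(state, goal, horizon=3):
    """Score every action sequence by rolling it out through the model."""
    best_sequence, best_cost = None, float("inf")
    for sequence in product(ACTIONS, repeat=horizon):
        predicted, cost = state, 0.0
        for action in sequence:
            predicted = predict(predicted, action)
            cost += abs(predicted - goal)  # accumulated distance to the goal
        if cost < best_cost:
            best_sequence, best_cost = sequence, cost
    return best_sequence, best_cost


def run_controller(start, goal, steps=4):
    """Replan at every step and execute only the first planned action."""
    state = start
    for _ in range(steps):
        sequence, _ = plan(state, goal)
        # Assumption: the model doubles as the "real" world in this toy setup.
        state = predict(state, sequence[0])
    return state
```

Replanning at every step and executing only the first action is the usual model-predictive control pattern. On a real robot, the state fed back into `plan` would come from perception rather than from the model itself.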

“My research tackled the challenges over the entire robotic pipeline and made important contributions to all four components. I have successfully enabled the robots to accomplish complicated manipulation tasks in both simulation and in the real world, including manipulating an object pile, pouring a cup of water, and shaping deformable foam into a target configuration.”