Driving Down Datacenter Compute & Carbon Cost: Efficient ML

When we think about reducing our carbon footprint as organizations and individuals, we may invest in renewable energy, integrate recyclable materials into product lines, reduce fuel emissions by using public transportation and working remotely, and go paperless with digital subscriptions and electronic statements. Yet many of the measures we take to mitigate the ongoing climate crisis rely heavily on computing, which produces its own surprisingly heavy emissions.

As with other types of climate change solutions, while we can take individual actions to save computing power, such as turning machines off when not in use, installing browser plugins that trim down heavy website code, and sending fewer emails, big infrastructure issues loom large. Increased video streaming, cryptocurrency mining, cloud storage, and deep learning all contribute to explosive data growth that requires large-scale solutions. Many experts and policy makers propose artificial intelligence as a solution to our energy problems, but the carbon footprint of AI itself is alarming: In a 2019 study, researchers at the University of Massachusetts Amherst found that training just one AI model can produce 626,000 pounds of carbon dioxide, roughly the lifetime emissions of five cars.

As we move toward progressively massive models and deep neural networks that help us automate processes, make groundbreaking predictions, and analyze big data, researchers at MIT are working to shrink the carbon footprint of AI and make machine learning more energy efficient and sustainable.

Designing Efficient Deep Neural Networks

The human brain is the best machine we know of for learning and problem-solving, so efficient that it serves as the gold standard for machine learning. Computer scientists develop algorithms and programs that emulate biological neurons, the computational elements of the brain that help us learn by adjusting the strength of synaptic connections in response to stimuli.

Artificial neural networks are designed to help machines learn, perceive, predict, and make decisions much as biological neural networks do. Deep learning takes these capabilities even further. A typical neural network has an input layer, an output layer, and a “hidden” layer in between: values from the input layer are propagated to the hidden neurons, and the weighted sums those neurons compute are propagated on to the output layer. When a network has more than one hidden layer, it is a deep neural network (DNN), and deep learning becomes possible.
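As a concrete, if toy, illustration of that structure, the sketch below builds a small network with two hidden layers in PyTorch. The layer sizes are arbitrary choices for illustration and are not drawn from any of the MIT work described here.

```python
# A minimal sketch of the structure described above: an input layer, more than
# one hidden layer (which is what makes the network "deep"), and an output layer,
# with weighted sums passed forward through each layer in turn.
import torch
import torch.nn as nn

deep_network = nn.Sequential(
    nn.Linear(784, 128), nn.ReLU(),   # input layer feeding the first hidden layer
    nn.Linear(128, 64), nn.ReLU(),    # a second hidden layer makes the network "deep"
    nn.Linear(64, 10),                # output layer, e.g. 10 class scores
)

batch = torch.randn(32, 784)          # 32 example inputs with 784 features each
print(deep_network(batch).shape)      # torch.Size([32, 10])
```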

DNNs can learn high-level features with greater complexity and abstraction, achieving superior performance on many tasks. But that added complexity, and the goal of surpassing human-level accuracy in some areas, comes at the price of much higher energy consumption and hardware cost.

If living brains can do so much with so little power, how can we make artificial intelligence just as efficient?

This is the challenge of our Information Age: managing the resources of sustainable computing, including hardware, software, electricity, cloud compute time, and financial cost. Too often, a huge increase in computation buys only a marginal improvement in performance.

“We need to rethink the entire stack — from software to hardware,” Senior Research Scientist Aude Oliva, who studies computer vision and the inner workings of convolutional deep neural networks at the MIT Computer Science and Artificial Intelligence Laboratory (CSAIL), told MIT News. “Deep learning has made the recent AI revolution possible, but its growing cost in energy and carbon emissions is untenable.”

CSAIL Research Scientist Neil Thompson co-authored a paper on the current computational limits of deep learning. In the paper, the researchers write that continued progress in deep learning applications will require “dramatically more computationally efficient methods, which will either have to come from changes to deep learning or from moving to other machine learning methods.”

To meet the challenges ahead, researchers in CSAIL are making exciting progress in rethinking the way DNNs are designed so that processing and training them is more energy efficient and cost effective.

Predicting Performance of Deep Neural Networks at Scale

When developing DNNs, it is helpful to know what resources will be necessary, especially when performing specific tasks at scale. Professor Nir Shavit has helped build a framework, based on the concept of model scaling, that provides accurate performance predictions across scales while using 50 times fewer computational resources.

Potentially, data scientists and business leaders could use this new framework to make more informed decisions about the resources they want to invest in a system during the DNN development process.

“Our approach tells us things like the amount of data needed for an architecture to deliver a specific target performance, or the most computationally efficient trade-off between data and model size,” said Professor Shavit.

“We view these findings as having far-reaching implications in the field by allowing researchers in academia and industry to better understand the relationships between the different factors that have to be weighed when developing deep learning models, and to do so with the limited computational resources available to academics.”
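To build intuition for what scaling-based prediction looks like in practice, here is a hypothetical sketch, not Prof. Shavit's actual framework: a handful of cheap, small-scale training runs are fit to a simple power law, which is then extrapolated to predict error at a scale that was never trained. All numbers are invented for illustration.

```python
# Hypothetical illustration of scaling-law extrapolation (not the actual framework):
# fit error ~ a * n**(-b) to a few small pilot runs, then predict error at a
# dataset size far beyond anything that was trained.
import numpy as np

dataset_sizes = np.array([1e4, 3e4, 1e5, 3e5])      # examples used in pilot runs
val_errors    = np.array([0.42, 0.31, 0.24, 0.19])  # measured validation errors (made up)

# In log-log space a power law is a straight line: log(err) = log(a) - b * log(n).
slope, intercept = np.polyfit(np.log(dataset_sizes), np.log(val_errors), deg=1)
a, b = np.exp(intercept), -slope

target_size = 1e7                                   # a scale we never trained at
predicted_error = a * target_size ** (-b)
print(f"Predicted validation error at {target_size:.0e} examples: {predicted_error:.3f}")
```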

Finding Winning Subnetworks with the Lottery Ticket Hypothesis

With the emergence of colossal models such as OpenAI’s GPT-3, a language generator that can write convincing articles and top-performing blog posts, approaches to natural language processing (NLP) are incurring unprecedented computational and environmental costs during training. To make these DNN models more efficient and accessible, researchers are analyzing how the networks are structured.

CSAIL Professor Michael Carbin and graduate student Jonathan Frankle found that hiding within BERT, a bulky NLP model with a high training cost and a large carbon footprint, are leaner subnetworks that can complete the same tasks and predict just as accurately as the full network. Sometimes these subnetworks are even faster and more efficient.

In their Lottery Ticket Hypothesis, Prof. Carbin and Frankle propose that these “lucky” subnetworks are “winning tickets” that, if discovered within a network, could significantly reduce computing costs for NLP and make massive state-of-the-art models more accessible.
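The published procedure for finding such tickets is built around iterative magnitude pruning with rewinding. The sketch below is a heavily simplified version of that loop, with a placeholder training function and arbitrary layer sizes; it is meant to convey the idea, not reproduce the paper's exact method.

```python
# A simplified sketch of the lottery-ticket loop: train, prune the smallest
# surviving weights, rewind the rest to their original initialization, repeat.
import copy
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(784, 300), nn.ReLU(), nn.Linear(300, 10))
initial_state = copy.deepcopy(model.state_dict())   # the "ticket" initialization
masks = {name: torch.ones_like(p) for name, p in model.named_parameters() if p.dim() > 1}

def train_one_round(model, masks):
    # Placeholder for a normal training loop that re-applies the masks so that
    # pruned connections stay at zero throughout training.
    pass

for _ in range(5):                                   # several prune/rewind rounds
    train_one_round(model, masks)
    with torch.no_grad():
        for name, param in model.named_parameters():
            if name not in masks:
                continue
            mask = masks[name]
            surviving = param[mask.bool()].abs()
            threshold = torch.quantile(surviving, 0.2)          # drop smallest 20% still alive
            masks[name] = mask * (param.abs() > threshold).float()
        # Rewind: reset all weights to the original init, keeping only the mask.
        model.load_state_dict(initial_state)
        for name, param in model.named_parameters():
            if name in masks:
                param.mul_(masks[name])
```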

“The hope is that this will lower the cost, that this will make it more accessible to everyone…to the little guys who just have a laptop,” said Frankle.

Deploying Deep Learning onto Hardware and Devices

Deploying deep learning models on diverse hardware platforms presents many challenges, especially as models continue to grow in size and computation cost. Diverse hardware means different requirements, conditions, and properties, so the neural network architecture may vary from platform to platform.

To address this deployment problem, MIT CSAIL researchers have developed an automated AI system called a “once-for-all” network: a single large network is trained once, and specialized subnetworks are then derived from it and tailored to diverse hardware platforms without having to retrain from scratch each time.
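The sketch below gestures at one ingredient of that idea: an “elastic” layer whose active width is chosen at deployment time, so the same trained weights can serve both a powerful accelerator and a constrained edge device. It is a hypothetical toy, not the once-for-all system itself, which also varies depth, kernel size, and input resolution.

```python
# A toy "elastic width" layer: the full layer is trained once, and a narrower
# slice of it can be deployed on a smaller device without retraining.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ElasticLinear(nn.Module):
    """A linear layer whose active output width is chosen at deployment time."""
    def __init__(self, in_features, max_out_features):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(max_out_features, in_features) * 0.01)
        self.bias = nn.Parameter(torch.zeros(max_out_features))

    def forward(self, x, active_out):
        # Use only the first `active_out` output units of the full layer.
        return F.linear(x, self.weight[:active_out], self.bias[:active_out])

layer = ElasticLinear(in_features=64, max_out_features=256)
x = torch.randn(8, 64)

full = layer(x, active_out=256)   # configuration for a powerful accelerator
small = layer(x, active_out=64)   # configuration for a constrained edge device
print(full.shape, small.shape)    # torch.Size([8, 256]) torch.Size([8, 64])
```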

MIT Professor Song Han said that “searching efficient neural network architectures has until now had a huge carbon footprint. But we reduced that footprint by orders of magnitude with these new methods.”

Prof. Han is also working alongside his student Ji Lin to bring deep learning to IoT devices, designing compact neural networks with MCUNet.

CSAIL researcher Professor Vivienne Sze has designed a chip called Eyeriss 2 that is flexible enough to process compact, data-sparse DNN models yet versatile enough to support large DNNs, while using 10 times less energy than a mobile GPU.

Typically, large neural networks require expensive cloud infrastructure, such as GPUs. To address this, CSAIL graduate students Lucas Liebenwein and Cenk Baykal are working with CSAIL Director Prof. Daniela Rus on coreset-based pruning methods that shrink large neural networks into smaller architectures with provable performance guarantees, so that they can be deployed onto small devices. They have shown that it is possible to reduce the number of parameters by 80-90% without reducing accuracy.

The team has developed algorithms that can execute the neural network directly on the robot, without having to send data to the cloud. According to Liebenwein, this direct deployment enables them to “use the neural network for tasks that might otherwise be too computationally expensive for the robot system to run.”
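For a rough sense of what that level of compression looks like in code, the sketch below applies standard magnitude pruning from PyTorch's built-in utilities to zero out 90% of a layer's weights. It is only an illustration of the general effect on model size; the CSAIL work uses coreset-based methods with provable guarantees, which this snippet does not implement.

```python
# Illustration only: magnitude pruning removes the smallest 90% of a layer's
# weights. (Not the coreset-based method described above.)
import torch.nn as nn
import torch.nn.utils.prune as prune

layer = nn.Linear(1024, 1024)
prune.l1_unstructured(layer, name="weight", amount=0.9)   # zero out the smallest 90%

remaining = int(layer.weight.count_nonzero())
total = layer.weight.numel()
print(f"{remaining}/{total} weights remain ({remaining / total:.0%})")

prune.remove(layer, "weight")   # fold the mask into the weight tensor permanently
```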

Designing Simpler Deep Learning Models

The human brain is an extraordinary machine to aspire to when designing a neural network, but some researchers are looking to the living brains of other organisms for inspiration.

MIT CSAIL Director Daniela Rus and CSAIL student Alexander Amini were part of an international research team that took a cue from nature to improve deep learning: specifically, the nematode C. elegans, a type of worm that is able to function with very few neurons.

“Nature shows us that there is still a lot of room for improvement,” said Prof. Rus. “Therefore, our goal was to massively reduce complexity and develop a new kind of neural network architecture.”

The researchers developed a Neural Circuit Policies architecture that is simple and sparse yet still capable of solving complex tasks, sometimes better than ever before. With just a small number of artificial neurons, the system was tested on keeping a self-driving car in its lane. After being fed hours of traffic video along with the corresponding steering behavior in different situations, the network learned to connect images with the appropriate steering direction and make steering decisions. The control part of this neural circuit policy consisted of only 19 neurons.
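As a purely hypothetical sketch of what a controller that small might look like, the snippet below wires a generic recurrent layer with 19 hidden units to a single steering output. It borrows only the neuron count from the work above, not the actual Neural Circuit Policies wiring.

```python
# A generic toy controller (not the NCP architecture): 19 recurrent units map
# perception features extracted from camera frames to one steering command.
import torch
import torch.nn as nn

class TinySteeringController(nn.Module):
    def __init__(self, feature_dim=32, control_neurons=19):
        super().__init__()
        self.rnn = nn.RNN(feature_dim, control_neurons, batch_first=True)
        self.steer = nn.Linear(control_neurons, 1)       # steering angle output

    def forward(self, features):                         # (batch, time, feature_dim)
        hidden_states, _ = self.rnn(features)
        return torch.tanh(self.steer(hidden_states[:, -1]))   # angle scaled to [-1, 1]

controller = TinySteeringController()
frames = torch.randn(4, 16, 32)                          # 4 clips, 16 timesteps of features
print(controller(frames).shape)                          # torch.Size([4, 1])
print(sum(p.numel() for p in controller.parameters()), "parameters")
```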

Besides being more efficient and robust, systems with fewer neurons allow scientists to observe individual cells and their behavior with a level of interpretability not possible for larger deep learning models.

Working Toward a More Sustainable Future

The promise of AI is that it will transform our everyday lives and revolutionize industry for the better. To see that promise fulfilled, to keep up with the demands of deep learning and reduce its environmental impact, we will need to continue to find new ways to innovate and rethink the overall design of DNNs. We may even enlist AI to help reduce its own emissions, just as we are becoming more conscious of our carbon footprints and our collective impact on the planet.

Collaborate with CSAIL Researchers on Energy Efficient AI

Many CSAIL researchers actively collaborate with industry partners. If you are interested in connecting with the lab and learning more about the energy efficient AI projects mentioned here, contact Lori Glover, Managing Director, Global Strategic Alliances at loriglover@csail.mit.edu.