A quick way to measure the power consumption of AI | MIT News

Driven by the rapid rise of artificial intelligence, data centers are projected to consume up to 12 percent of total US electricity by 2028, according to the Lawrence Berkeley National Laboratory. Improving data center energy efficiency is one way scientists are working to make AI more sustainable.
To that end, researchers from MIT and the MIT-IBM Watson AI Lab have developed a rapid prediction tool that tells data center operators how much power would be consumed by running a specific AI workload on a specific processor or AI accelerator chip.
Their method produces reliable energy estimates in seconds, unlike traditional modeling approaches that can take hours or days to produce results. In addition, the prediction tool can be applied to a variety of hardware configurations – even emerging designs that have not yet been deployed.
Data center operators could use these estimates to allocate limited resources more efficiently across AI models and processors, improving energy efficiency. The tool could also allow algorithm developers and model providers to evaluate the potential energy consumption of a new model before deploying it.
“The challenge of AI sustainability is a pressing question that we need to answer. Because our measurement method is fast, simple, and provides a precise answer, we hope it makes algorithm developers and data center operators think more about reducing energy consumption,” said Kyungmi Lee, an MIT postdoc and lead author of the paper on this method.
Lee is joined on the paper by Zhiye Song, a graduate student in electrical engineering and computer science (EECS); Eun Kyung Lee and Xin Zhang, research managers at IBM Research and the MIT-IBM Watson AI Lab; Tamar Eilam, an IBM Fellow, principal scientist for sustainable computing at IBM Research, and member of the MIT-IBM Watson AI Lab; and senior author Anantha P. Chandrakasan, MIT provost, Vannevar Bush Professor of Electrical Engineering and Computer Science, and member of the MIT-IBM Watson AI Lab. The research was presented this week at the IEEE International Symposium on Performance Analysis of Systems and Software.
Accelerating energy estimates
Inside a data center, thousands of powerful graphics processing units (GPUs) handle the work of training and running AI models. The power a particular GPU consumes varies based on its configuration and the task it is handling.
Most traditional methods for predicting power consumption involve breaking the workload into individual steps and simulating how each module within the GPU is used, one step at a time. But AI workloads such as model training and data preprocessing are very large and can take hours or days to simulate this way.
“As an operator, if I want to compare different algorithms or configurations to find the most energy-efficient option, and a single simulation takes days, that’s going to be very inefficient,” Lee said.
To speed up the prediction process, the MIT researchers looked for small-scale information that could be measured quickly. They found that AI workloads tend to contain many repeating patterns, which can be used to generate the information needed for reliable but rapid energy estimates.
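To picture the idea, imagine profiling each distinct, repeated operation once and scaling by how often it recurs. The sketch below is only an illustration under that assumption; the operation names, power figures, and repetition counts are invented and do not come from the paper.

```python
# Minimal sketch (not the paper's actual algorithm) of how repetition in an AI
# workload can be exploited: profile each distinct operation once, then scale
# by how many times it repeats. All names and numbers are illustrative.

# Hypothetical one-time measurements: average power (watts) and runtime (seconds)
# for each distinct, repeated operation in the workload.
op_profile = {
    "attention_block": {"power_w": 310.0, "time_s": 0.004},
    "mlp_block":       {"power_w": 295.0, "time_s": 0.006},
    "embedding":       {"power_w": 180.0, "time_s": 0.001},
}

# Hypothetical repetition counts (e.g., layers x batches for a transformer-style model).
repetitions = {"attention_block": 32 * 1000, "mlp_block": 32 * 1000, "embedding": 1000}

def estimate_energy_joules(profile, reps):
    """Extrapolate total energy from one-time measurements of repeated operations."""
    return sum(profile[op]["power_w"] * profile[op]["time_s"] * reps[op] for op in reps)

total_j = estimate_energy_joules(op_profile, repetitions)
print(f"Estimated energy: {total_j / 3.6e6:.3f} kWh")  # 1 kWh = 3.6 million joules
```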
In most cases, algorithm developers write programs that will run as efficiently as possible on a GPU. For example, they use carefully planned configurations to distribute work evenly across processing cores and move chunks of data in an efficient manner.
“These configurations that software developers use create a standard structure, and that’s what we’re trying to leverage,” Lee explained.
The researchers developed a lightweight estimation model, called EnergAIzer, that captures the GPU’s power consumption patterns under those structured configurations.
Accurate assessment
But while their estimates were fast, the researchers found they did not account for all energy costs. For example, every time a GPU runs a program, there is a fixed energy cost required to set up and optimize that program. Then, each time the GPU starts working on a new chunk of data, additional energy costs are incurred.
In addition, due to hardware variability or contention when accessing and moving data, the GPU may not be able to use all of its available bandwidth, slowing performance and drawing more power over time.
To account for these additional costs and variations, the researchers collected real-time measurements from GPUs to generate correction values for their estimation model.
“In this way, we can get a faster and more accurate estimate,” Lee said.
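As a rough illustration of how real-time readings might be gathered, the following sketch samples GPU board power through NVIDIA's NVML Python bindings (pynvml) and turns the measurement into a simple correction factor. The calibration step is an assumption made for illustration, not the researchers' actual procedure.

```python
# Minimal sketch of one way to collect real-time GPU power readings, using
# NVIDIA's NVML Python bindings (pynvml). The calibration at the bottom is an
# illustrative assumption, not the correction procedure from the paper.
import time
import pynvml

def measure_avg_power_watts(duration_s=5.0, interval_s=0.1, gpu_index=0):
    """Sample instantaneous board power over a window and return the average."""
    pynvml.nvmlInit()
    try:
        handle = pynvml.nvmlDeviceGetHandleByIndex(gpu_index)
        samples = []
        deadline = time.time() + duration_s
        while time.time() < deadline:
            # nvmlDeviceGetPowerUsage reports milliwatts; convert to watts.
            samples.append(pynvml.nvmlDeviceGetPowerUsage(handle) / 1000.0)
            time.sleep(interval_s)
        return sum(samples) / len(samples)
    finally:
        pynvml.nvmlShutdown()

if __name__ == "__main__":
    predicted_power_w = 265.0                     # assumed output of a fast model
    measured_power_w = measure_avg_power_watts()  # sample while the workload runs
    # Reuse the measured-to-predicted ratio as a simple correction factor.
    correction = measured_power_w / predicted_power_w
    print(f"Correction factor: {correction:.2f}")
```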
Finally, the user can provide their workload information, such as the AI model they want to use and the number and length of user inputs to be processed, and EnergAIzer will generate an energy consumption estimate in a matter of seconds.
The user can also change the GPU configuration or adjust the processing speed to see how such choices affect the overall power consumption.
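To make that workflow concrete, here is a hypothetical sketch of what querying such a tool could look like. The article does not describe EnergAIzer's real interface, so the class, parameters, and values below are assumptions meant only to show how an operator might compare configurations before committing hardware.

```python
# Hypothetical usage sketch: the article does not describe EnergAIzer's actual
# interface, so every class, parameter, and number here is an assumption.
from dataclasses import dataclass

@dataclass
class WorkloadSpec:
    model_name: str        # the AI model to be run
    num_requests: int      # how many user inputs will be processed
    avg_input_tokens: int  # typical length of each input
    gpu_type: str          # target processor or accelerator
    gpu_power_cap_w: int   # knob for trading processing speed against power

def estimate_energy_kwh(spec: WorkloadSpec) -> float:
    # Toy stand-in for a fast, pattern-based estimator; a real tool would rely
    # on profiled GPU behavior rather than these made-up constants.
    joules_per_token = 0.3 * (spec.gpu_power_cap_w / 700)
    total_joules = joules_per_token * spec.num_requests * spec.avg_input_tokens
    return total_joules / 3.6e6  # joules -> kWh

baseline = WorkloadSpec("example-llm", 1_000_000, 512, "H100", gpu_power_cap_w=700)
capped = WorkloadSpec("example-llm", 1_000_000, 512, "H100", gpu_power_cap_w=400)
print(estimate_energy_kwh(baseline), estimate_energy_kwh(capped))
```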
When the researchers tested EnergAIzer with real AI workloads on real GPUs, it estimated energy consumption with an error of only 8 percent, compared to traditional methods that can take hours to produce results.
Their method can also be used to predict the power consumption of future GPUs and emerging hardware configurations, as long as the hardware does not change significantly in the near term.
In the future, the researchers want to test EnergAIzer on new GPU configurations and scale the model up to multiple GPUs working together on a single task.
“To make a real impact on sustainability, we need a tool that can provide a quick solution to measure energy across the stack, for hardware designers, data center operators, and algorithm developers, so that they all know more about energy consumption. With this tool, we’ve taken one step toward that goal,” Lee said.
This research was funded, in part, by the MIT-IBM Watson AI Lab.



