A less wasteful way to train large language models, such as those in the GPT series, finishes training in the same amount of time while using up to 30% less energy, according to a new study from the University of Michigan.
The approach could save enough energy to power 1.1 million U.S. homes in 2026, based on Wells Fargo's projections of AI energy demand. It could also chip away at the International Monetary Fund's projection that data centers could account for 1.2% of the world's carbon emissions by 2027, along with the water demand that comes with that energy use.
Some experts say these costs could be outweighed by environmental benefits. They argue that AI could be a game-changer in the fight against climate change by identifying ways to optimize supply chains and the grid, manage energy needs, and improve climate research. Even so, that does not justify wasting energy, and some of the power used to train AI has no effect at all on training time or model accuracy.
"Why spend something when there is no benefit?" said Mosharraf Chowdhury, associate professor of computer science and engineering and corresponding author of the study, presented last Monday at the 30th Symposium on Operating Systems Principles.
“We can’t keep building bigger and bigger data centers because we won’t have the capacity to operate them,” Chowdhury said. “If we can reduce the power that AI consumes, we can reduce the AI’s carbon footprint and cooling requirements and allow more calculations to fit within our current energy constraints.”
Energy is wasted when AI training work is split unevenly among GPUs, the computer processors that specialize in large-data and graphics applications. Although dividing the work opens the door to waste, it is necessary for processing enormous data sets.
"Today's AI models are so large that they cannot fit inside a single computer processor," said Jae-Won Chung, a doctoral student in computer science and engineering and first author of the study. "They need to be divided across tens of thousands of processors to be trained, but dividing the models into exactly equal pieces across all processors is practically impossible."
Training jobs are difficult to divide evenly because some tasks must be grouped together on the same processor, much as each installment of a book series is shelved together in an organized collection. Depending on how the tasks are grouped, some processors can get stuck with the AI-training equivalent of the Encyclopedia Britannica while others are assigned a fantasy trilogy.
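To make that imbalance concrete, here is a minimal sketch of why cutting a model into contiguous stages rarely yields equal loads. The layer costs and the greedy splitting rule are illustrative assumptions, not the study's method.

```python
# Illustrative sketch only -- not the study's code. Hypothetical layer
# "costs" show why contiguous splits across processors end up uneven.

layer_costs = [9, 2, 7, 1, 1, 8, 3, 5]  # per-layer compute cost (arbitrary units)

def split_contiguous(costs, n_stages):
    """Greedily cut the layer sequence into n contiguous stages,
    closing a stage once it reaches the average load."""
    target = sum(costs) / n_stages
    stages, current = [], []
    for cost in costs:
        current.append(cost)
        if sum(current) >= target and len(stages) < n_stages - 1:
            stages.append(current)
            current = []
    stages.append(current)
    return stages

for i, stage in enumerate(split_contiguous(layer_costs, 3)):
    print(f"processor {i}: layers {stage}, load {sum(stage)}")
# Loads come out as 18, 13, and 5: because layers must stay together
# in order, a perfectly equal split rarely exists.
```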
Because current training methods run every processor at full speed, processors with lighter loads finish their calculations before the others. This does not speed up training, which is complete only once every processor has finished its work, but it is wasteful because faster computation demands more power. On top of that, problems such as faulty hardware or network delays can slow a single processor's computing speed, leaving the rest to burn power while they wait.
To cut the waste, the researchers developed a software tool called Perseus, which identifies a critical path: the series of subtasks that will take the longest to complete. Perseus then slows down the processors that are not on the critical path so that they all finish their jobs at roughly the same time, eliminating unnecessary power use.
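The idea can be sketched with a toy calculation. The snippet below assumes a simple pipeline in which the slowest stage sets the iteration time, and in which a processor's dynamic power falls roughly with the cube of its clock frequency, a common simplification. The stage times and wattage are invented for illustration; this is not the Perseus code itself.

```python
# Minimal sketch of the critical-path idea -- not the Perseus code.
# Assumes one pipeline stage per processor and dynamic power scaling
# roughly with the cube of clock frequency (a simplifying assumption).

FULL_POWER = 300.0  # watts per processor at full speed (assumed)

stage_times = [10.0, 6.0, 8.0, 4.0]  # seconds per iteration at full speed
critical = max(stage_times)          # the slowest stage sets iteration time

energy_naive = 0.0
energy_slowed = 0.0
for t in stage_times:
    # Naive: every processor runs at full speed, then waits for the
    # critical-path stage (idle power ignored for simplicity).
    energy_naive += FULL_POWER * t
    # Perseus-style: stretch this stage to finish exactly when the
    # critical path does. Slowing by factor s cuts frequency by 1/s,
    # so power drops ~s**-3 while time grows by s: energy falls ~s**-2.
    s = critical / t
    energy_slowed += (FULL_POWER / s**3) * (t * s)

print(f"iteration time unchanged: {critical:.0f} s")
print(f"energy, all full speed:  {energy_naive:.0f} J")
print(f"energy, off-path slowed: {energy_slowed:.0f} J")
```

In this toy setup, stretching the three lighter stages to match the 10-second critical path cuts per-iteration energy from 8,400 J to roughly 5,400 J, a saving of about 36%, without lengthening the iteration at all.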
"Reducing the energy cost of AI can have important implications for equitable access to AI," Chowdhury said. "If a country does not have enough power to run a large model, it may need to use services from far away or be stuck running smaller, less accurate models. This gap could further perpetuate the disparity between different communities."
The team tested Perseus by training GPT-3, three other large language models, and one computer vision model.
Perseus is an open-source tool available as part of Zeus, a tool for measuring and optimizing AI energy consumption.
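To give a rough sense of how Zeus is used in practice, the sketch below wraps one training step in the project's energy-measurement window. The interface shown follows Zeus's published examples, but exact names may differ between releases, and train_one_step is a hypothetical placeholder.

```python
# Rough usage sketch following Zeus's published examples; exact class
# and attribute names may differ between releases (an assumption).
from zeus.monitor import ZeusMonitor

def train_one_step():
    """Hypothetical placeholder for one step of the real training loop."""
    pass

monitor = ZeusMonitor(gpu_indices=[0])  # measure the first GPU

monitor.begin_window("one_training_step")
train_one_step()
result = monitor.end_window("one_training_step")

# The returned measurement carries elapsed time (s) and energy (J).
print(f"elapsed: {result.time:.1f} s, energy: {result.total_energy:.0f} J")
```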