Enabling privacy-preserving AI training on everyday devices | MIT News

A new method developed by MIT researchers could speed up privacy-preserving artificial intelligence training by up to 81 percent. This advance could enable a wide range of resource-constrained edge devices, such as sensors and smartwatches, to run more accurate AI models while keeping user data secure.

MIT researchers have increased the efficiency of a technique known as federated learning, which involves a network of connected devices working together to train a shared AI model.

In federated learning, the model is distributed from a central server to wireless devices. Each device trains the model using its local data and transmits model updates back to the server. Data stays secure because it never leaves each device.
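The round-trip described above can be sketched in a few lines of Python. The toy model, client data, and local update rule here are illustrative stand-ins, not the researchers' actual implementation:

```python
import numpy as np

def local_update(weights, data, lr=0.1):
    """One illustrative local training step: a gradient step on a
    squared-error objective over the device's private data (x, y)."""
    x, y = data
    grad = 2 * x * (x * weights - y)  # d/dw of (x*w - y)^2, elementwise
    return weights - lr * grad

def federated_round(server_weights, client_datasets):
    """Server broadcasts weights, each client trains locally on data that
    never leaves the device, and the server averages the returned models."""
    client_models = [local_update(server_weights.copy(), d) for d in client_datasets]
    return np.mean(client_models, axis=0)  # FedAvg-style aggregation

weights = np.zeros(3)
# Two clients whose private targets average to 3.0 per coordinate:
clients = [(np.ones(3), np.full(3, 2.0)), (np.ones(3), np.full(3, 4.0))]
for _ in range(50):
    weights = federated_round(weights, clients)
# weights converge toward the average of the clients' targets (about 3.0)
```

The server only ever sees model weights, never the raw `(x, y)` data, which is the privacy property the article describes.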

But not all devices on the network have enough memory, computing power, and connectivity to store the model, train it, and exchange updates with the server on time. The resulting delays degrade training performance.

The MIT researchers developed a strategy to overcome these memory constraints and communication bottlenecks. Their approach is designed to handle a diverse network of wireless devices with varying limitations.

This new approach could make it easier for AI models to be used in applications with strict security and privacy standards, such as healthcare and finance.

“This work is about bringing AI to small devices where it is not possible at the moment to use these types of powerful models. We carry these devices around us in our daily lives. We need AI to be able to work on these devices, not just on large servers and GPUs, and this work is an important step to allow that,” said Irene Tenison, an electrical engineering and computer science (EECS) graduate student and lead author of the paper.

Co-authors include Anna Murphy ’25, a machine learning engineer at Lincoln Laboratory; Charles Beauville, a visiting scholar from Ecole Polytechnique Fédérale de Lausanne (EPFL) in Switzerland and a machine learning engineer at Flower Labs; and senior author Lalana Kagal, a principal research scientist at the MIT Computer Science and Artificial Intelligence Laboratory (CSAIL). The research will be presented at the IEEE International Joint Conference on Neural Networks.

Reducing idle time

Most federated learning methods assume that all devices in the network have enough memory to train a full AI model, and a stable connection to transmit updates to the server quickly.

But these assumptions often break down in real-world networks of heterogeneous devices, such as smartwatches, wireless sensors, and cell phones. These edge devices have limited memory and processing power, and often experience intermittent network connectivity.

A central server usually waits to receive model updates from all devices before averaging them to complete the training round. This process repeats until training is complete.

“This downtime can slow down the training process or cause it to fail,” Tenison said.

To overcome these limitations, the MIT researchers developed a new framework, called FTTE, that reduces the memory and communication overhead required of each edge device.

Their framework includes three new innovations.

First, rather than broadcasting the entire model to all devices, FTTE sends each device only a small subset of the model's parameters, reducing the memory required on each device. Parameters are the internal variables of the model that are adjusted during training.

FTTE uses a special search process to identify parameters that will maximize the accuracy of the model while staying within a specific memory budget. That budget is set based on each device's available memory.
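One simple way to picture a budgeted parameter search is a greedy heuristic over blocks of parameters. The per-block importance scores, memory costs, and greedy strategy below are all illustrative assumptions; the article does not detail FTTE's actual selection criterion:

```python
def select_parameters(scores, sizes, memory_budget):
    """Greedily pick the parameter blocks with the best score-per-unit-memory
    ratio until the device's memory budget is exhausted.

    scores -- hypothetical importance of each block for model accuracy
    sizes  -- memory cost of each block (e.g., in KB)
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i] / sizes[i], reverse=True)
    chosen, used = [], 0
    for i in order:
        if used + sizes[i] <= memory_budget:
            chosen.append(i)
            used += sizes[i]
    return sorted(chosen)

# Four parameter blocks with made-up importance scores and sizes (KB):
scores = [9.0, 4.0, 7.0, 1.0]
sizes = [3, 2, 2, 1]
print(select_parameters(scores, sizes, memory_budget=5))  # → [0, 2]
```

A device with a 5 KB budget receives only blocks 0 and 2, the most valuable combination that fits, rather than the full model.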

Second, the server updates the model asynchronously. Rather than waiting for responses from all devices, it collects incoming updates until a fixed-size buffer fills, then proceeds with the training round.

Third, the server weights each device's update based on its staleness, that is, how long ago the device received the model version it trained on. This way, outdated updates contribute less to the training process. Stale updates can otherwise hold the model back, slowing training and reducing accuracy.

“We use this semi-asynchronous method because we want to involve less powerful devices in the training process so they can input their data into the model, but we don’t want the powerful devices in the network to sit idle for a long time and waste resources,” Tenison said.
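The two server-side ideas above, buffering a fixed number of updates instead of waiting for every device and down-weighting stale ones, can be sketched together. The buffer size and the geometric decay schedule are illustrative choices, not the paper's actual design:

```python
import numpy as np

class SemiAsyncServer:
    """Aggregate client updates as they trickle in: once buffer_size updates
    have arrived, apply them, weighting each one by its staleness."""

    def __init__(self, weights, buffer_size=3):
        self.weights = weights
        self.version = 0            # incremented after each aggregation
        self.buffer_size = buffer_size
        self.buffer = []            # (update, model version it was based on)

    def receive(self, update, based_on_version):
        self.buffer.append((update, based_on_version))
        if len(self.buffer) >= self.buffer_size:
            self._aggregate()

    def _aggregate(self):
        # Staleness = aggregation rounds elapsed since the device got its
        # model copy; older updates get geometrically smaller weight.
        coeffs = np.array([0.5 ** (self.version - v) for _, v in self.buffer])
        coeffs /= coeffs.sum()
        updates = np.stack([u for u, _ in self.buffer])
        self.weights = self.weights + coeffs @ updates
        self.version += 1
        self.buffer = []

server = SemiAsyncServer(np.zeros(2), buffer_size=2)
server.receive(np.array([1.0, 1.0]), based_on_version=0)  # fresh update
server.receive(np.array([3.0, 3.0]), based_on_version=0)  # fresh update
# Both updates are fresh, so they receive equal weight: weights become [2.0, 2.0]
```

A slow device's update based on an old model version would arrive with a lower coefficient, so it still contributes data without dragging the model backward, while fast devices never wait on it.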

Gaining speed

The researchers tested their framework through simulations with hundreds of heterogeneous devices and a variety of models and data sets. On average, FTTE enabled the training process to reach completion 81 percent faster than conventional federated learning methods.

Their method reduced on-device memory by 80 percent and communication overhead by 69 percent, while approaching the accuracy of other techniques.

“Because we want the model to train as fast as possible to save the battery life of these resource-constrained devices, we have a tradeoff in accuracy. But a small decrease in accuracy may be acceptable for some applications, especially since our method is very fast,” Tenison said.

FTTE also scaled effectively, maintaining strong performance gains for large groups of devices.

In addition to these simulations, the researchers tested FTTE on a small network of real devices with different computing capabilities.

“Not everyone has the latest Apple iPhone. In many developing countries, for example, users may have cell phones with less power. With our strategy, we can bring the benefits of federated learning to these settings,” Tenison said.

In the future, the researchers want to explore how their method could be used to personalize AI models for each device, rather than optimizing only the model's average performance. They also want to run larger-scale tests on real hardware.

This work was funded, in part, by a Takeda PhD Fellowship.
