Intel and the National Science Foundation (NSF) have awarded a three-year grant to a research team to study how to deliver distributed machine learning computations over wireless edge networks, enabling a broad range of new wireless applications. The team is a joint group from the University of Southern California (USC) and the University of California, Berkeley. The award is part of Intel’s and the NSF’s Machine Learning for Wireless Networking Systems effort, a multi-university research program to accelerate “fundamental, broad-based research” on developing wireless-specific machine learning techniques that can be applied to new wireless systems and architecture design.
Machine learning may help manage the size and complexity of next-generation wireless networks. Intel and the NSF are funding efforts to harness discoveries in machine learning to design new algorithms, schemes, and communication protocols that can handle the density, latency, and throughput demands of complex networks. In total, US $9 million has been awarded to 15 research teams.
The USC and UC Berkeley team will focus on enhancing federated learning over wireless communications. Federated learning refers to performing machine learning securely across the data collected by hundreds of millions of devices in a large network. Specifically, the team will be researching how to apply federated learning to devices at the edge of the network, which have limited computational resources. The team is led by Salman Avestimehr, a professor in USC’s electrical and computer engineering department, and Kannan Ramchandran, a professor in UC Berkeley’s electrical engineering and computer science department.
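To make the idea concrete, the sketch below implements federated averaging, the standard baseline scheme for federated learning: each client trains on its own data and uploads only model weights, which a server averages. The linear model, learning rate, and toy data here are illustrative assumptions, not the team’s actual algorithm.

```python
import numpy as np

def local_update(weights, X, y, lr=0.1, epochs=5):
    """One client's local training: a few gradient-descent steps on a
    linear model, using only that client's private data."""
    w = weights.copy()
    for _ in range(epochs):
        grad = 2 * X.T @ (X @ w - y) / len(y)
        w -= lr * grad
    return w

def federated_averaging(global_w, clients, rounds=20):
    """Each round, clients train locally and send back only weights;
    the server averages them. Raw data never leaves a device."""
    for _ in range(rounds):
        local_models = [local_update(global_w, X, y) for X, y in clients]
        global_w = np.mean(local_models, axis=0)
    return global_w

# Toy setup: three clients, each holding private samples of y = 3x.
rng = np.random.default_rng(0)
clients = []
for _ in range(3):
    X = rng.normal(size=(50, 1))
    clients.append((X, X @ np.array([3.0])))

w = federated_averaging(np.zeros(1), clients)
print(w)  # converges close to [3.]
```

The point of the sketch is the communication pattern: only the small weight vectors cross the network, never the 150 private samples.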
“AI [artificial intelligence] and machine learning have been used in a variety of fields. Why not use it to design better wireless networks?” Avestimehr said.
Many apps and services that use machine learning—such as image processing or transaction history analysis—complete their computations in the cloud because very few devices can handle the heavy workload alone. Demand for these kinds of advanced connected services and devices is expected to grow as 5G networks become more available.
While higher speeds are often touted for next-generation networks, just as important is the scalability to meet demand. If connectivity is poor or bandwidth is low, uploading large data sets is not feasible. Machine learning across thousands, or millions, of devices means a lot of communication between devices. Splitting the workload across multiple cloud services doesn’t significantly reduce the time it takes to run a training algorithm, because at least half of that time is spent on machines communicating with each other, Avestimehr said.
There are also security and privacy concerns because users may not want their data to leave their devices. Future wireless networks need to meet the density, latency, throughput, and security requirements of these applications.
State-of-the-art federated learning schemes are currently limited to hundreds of users, Avestimehr said. “There’s a long way to get to one million.”
Avestimehr and Ramchandran’s research will focus on deploying machine learning services closer to where the data is generated, at the wireless edge. They hope that will alleviate bandwidth consumption, increase privacy, reduce latency, and boost the scalability of using machine learning on wireless networks. Their research goal is to apply a “coding-centric approach” to enhance federated learning over wireless networks.
Coded computing is a framework, pioneered by research groups at USC and UC Berkeley led by Avestimehr and Ramchandran, that takes the concepts and tools from the information and coding theory that made communication networks efficient and applies them to problems in information systems. The problems they will be tackling are the current performance bottlenecks in large-scale distributed computing and machine learning. Coding theory, for example, provides specific codes for error detection and correction, data compression, and increasing data rates. An error-correcting code adds extra, redundant bits to make data transmission robust over unreliable or noisy channels. The research teams will adapt these concepts to work with distributed computing and machine learning.
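The analogy can be made concrete with a minimal coded-computing sketch in the spirit of this framework: a matrix-vector product is split across three workers, with one extra “parity” task, so the full result survives even if a worker fails or straggles. The setup, sizes, and worker names are hypothetical, not the groups’ actual scheme.

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.normal(size=(4, 3))   # matrix to multiply, split across workers
x = rng.normal(size=3)

A1, A2 = A[:2], A[2:]         # two data shards of A

# Three workers: two compute on the shards, the third computes a
# "parity" task (A1 + A2) @ x, analogous to a parity bit in coding.
tasks = {
    "worker1": A1 @ x,
    "worker2": A2 @ x,
    "worker3": (A1 + A2) @ x,
}

# Suppose worker2 is a straggler and never returns. Its share can be
# decoded from the other two results, just like erasure correction:
recovered = tasks["worker3"] - tasks["worker1"]
result = np.concatenate([tasks["worker1"], recovered])
print(np.allclose(result, A @ x))  # True
```

With this one redundant task, any two of the three workers suffice, trading a little extra computation for resilience to the slowest machine.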
The research will build on Avestimehr’s past work on a DARPA-funded project to enable coded computing for distributed learning across geographically dispersed networks. His team injected “coded” redundant computations into the network to make computing efficient, scalable, and resilient.
After scalability, the second challenge is performing machine learning in a way that preserves privacy, so that both the input and the output are protected. Given the results of a training algorithm and the model itself, it is possible to invert the process and learn the original data. A user may be confident that an app never took the images off the device, but if the results of the computations are uploaded and can be reversed, the data is still exposed.
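A toy example shows how direct this inversion can be for a linear model with a squared loss: the uploaded gradient is simply a scaled copy of the private input, so an observer who knows the model can recover the input up to sign. This is an illustrative sketch, not the specific attack setting the researchers study.

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.normal(size=5)      # a user's private input, never uploaded
w = rng.normal(size=5)      # the current shared model weights

# The device uploads only the gradient of the loss 0.5 * (w @ x)**2
# (label taken as 0 here to keep the algebra short):
grad = (w @ x) * x

# The gradient equals the private input x scaled by s = w @ x, and an
# observer who knows w can solve for |s|, since w @ grad = s**2:
s = np.sqrt(w @ grad)
candidates = [grad / s, -grad / s]
print(any(np.allclose(c, x) for c in candidates))  # True
```

Nothing about the raw data was transmitted, yet the gradient alone gave it away, which is exactly why federated learning needs additional protection on top of keeping data local.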
“You think you just ran an algorithm on your data and gave me the results, but I know what images you had,” Avestimehr said.
Past work has looked at how to keep data private, and how to make algorithms robust enough to handle bad data. For privacy, there needs to be a way to train a machine learning model on data without seeing the data itself. But there also has to be a way to trust that the computation result is correct, and not the product of manipulated data or a misused algorithm. It is a chicken-and-egg problem, Avestimehr said. One focus area for the research is to make both possible: keeping the data and model private while running the training algorithm across multiple systems.
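One standard ingredient for the privacy half of this problem is secure aggregation with pairwise masks: each pair of clients agrees on a random mask that one adds and the other subtracts, so individual uploads look like noise while the server can still compute the sum it needs for averaging. The sketch below is a generic illustration of that idea, not necessarily the team’s protocol.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 3
updates = [rng.normal(size=4) for _ in range(n)]  # clients' true model updates

# Each pair of clients (i, j), i < j, agrees on a shared random mask.
masks = {(i, j): rng.normal(size=4) for i in range(n) for j in range(i + 1, n)}

# Client i adds the mask for every pair where it is the smaller index
# and subtracts it where it is the larger, so masks cancel only in the sum.
masked = []
for i in range(n):
    m = updates[i].copy()
    for j in range(n):
        if i < j:
            m += masks[(i, j)]
        elif j < i:
            m -= masks[(j, i)]
    masked.append(m)

# The server sees only the masked uploads, each statistically noise-like,
# yet their sum equals the true sum needed to average the model:
print(np.allclose(sum(masked), sum(updates)))  # True
```

The server learns the aggregate but not any individual update, which is the shape of guarantee the researchers want while still being able to verify that the overall computation is correct.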
“The fuel of machine learning is the data,” Avestimehr said. “The better data, the more data that you have, the better models you have.”