Multiplexing Could Give Neural Networks a Big Boost

Combining multiple data streams into one feed could speed up networks and let them tackle more than one task at a time

Figure: Data multiplexing for neural networks. Animated illustration showing phrases moving from the left through a box marked MUX, into the neural net, and out of a box labelled DEMUX. Credit: Princeton University

Just as multiplexing can help a single communication channel carry many signals at the same time, a new study reveals that multiplexing can help neural networks—the AI systems that now often power speech recognition, computer vision, and more—scan dozens of streams of data simultaneously, letting them greatly boost the rate at which they analyze information.

In artificial neural networks, components dubbed “neurons” are fed data and cooperate to solve a problem, such as recognizing images. The neural net repeatedly adjusts the links between its neurons and sees if the resulting patterns of behavior are better at finding a solution. Over time, the network discovers which patterns are best at computing results. It then adopts these as defaults, mimicking the process of learning in the human brain. The features of a neural net that change with learning, such as the nature of the connections between neurons, are known as its parameters.

Recent research suggests that modern neural networks often have vastly more parameters than they need: pruning more than 90 percent of their parameters could shrink them without harming their accuracy. This raised a question that researchers at Princeton University aimed to address. If neural networks possess more computing capacity than they need, could each one analyze multiple streams of information simultaneously to help learn a task, just as a radio channel can share its bandwidth to carry multiple signals at the same time?

The researchers developed a technique they named DataMUX, in which a neural network analyzes multiple data feeds simultaneously as one mixed clump of information. This can significantly boost its efficiency, letting it process inputs substantially faster while demanding little extra time, computation, or memory. They suggest their new method, which they detailed online 18 February on the arXiv preprint server, may be the first instance of data multiplexing in neural networks.

“We hope this can have a substantial impact on energy consumption and the environmental footprint of machine-learning models, especially for computing services that process a large number of requests at a time,” says study coauthor Vishvak Murahari, a machine-learning researcher at Princeton.

DataMUX works by adding a multiplexing layer to the input end of a neural network and a demultiplexing layer to the output end. Each signal entering the neural network is given a unique key to distinguish it from the others, and the multiplexing layer then merges these inputs into a single compressed feed. After the neural network processes this input, the demultiplexing layer converts the combined output back into multiple separate results.
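The key, merge, and split steps described above can be sketched in a few lines of Python. This is only a toy illustration, not the researchers' implementation: the random sign keys, the averaging step, the identity stand-in for the network, and all function names (`make_keys`, `multiplex`, `network`, `demultiplex`) are assumptions made for the example.

```python
import random


def make_keys(num_inputs, dim, seed=0):
    """Give each input slot a fixed key of random +1/-1 signs (an assumed key scheme)."""
    rng = random.Random(seed)
    return [[rng.choice((-1.0, 1.0)) for _ in range(dim)] for _ in range(num_inputs)]


def multiplex(inputs, keys):
    """Tag each input with its key (elementwise product), then average into one vector."""
    n, dim = len(inputs), len(inputs[0])
    return [sum(keys[i][k] * inputs[i][k] for i in range(n)) / n for k in range(dim)]


def network(mixed):
    """Stand-in for the neural network; here it simply passes the mixed feed through."""
    return list(mixed)


def demultiplex(output, keys):
    """Read out one estimate per input by re-applying that input's key."""
    n = len(keys)
    return [[n * key[k] * output[k] for k in range(len(output))] for key in keys]


inputs = [[0.5, -1.0, 2.0, 3.0], [1.0, 1.0, -2.0, 0.5], [4.0, 0.0, 1.0, -1.0]]
keys = make_keys(len(inputs), dim=4)
mixed = multiplex(inputs, keys)           # one vector, same length as a single input
estimates = demultiplex(network(mixed), keys)
print(len(mixed), len(estimates))
```

With a single input the round trip is exact, because each key entry multiplied by itself is 1. With several inputs, each estimate is the original input plus crosstalk from the others, so a fixed readout like this one is not enough on its own; a real system would need trained components to separate the results.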

The scientists conducted experiments with DataMUX using three different kinds of neural networks: transformers, multilayer perceptrons, and convolutional neural networks. The experiments involved several tasks: image recognition; sentence classification, in which a machine aims to identify whether text is spam, a business article, and so on; and named entity recognition, which involves locating and classifying named entities such as people, groups, and places.

Experiments with transformers on text-classification tasks revealed they could multiplex up to 40 inputs, achieving up to an 18-fold speedup in the rate at which they could process these inputs with as little as a 2 percent drop in accuracy.

“We could aim to increase throughput of machine-learning models manyfold,” Murahari says. “This is profound since it opens up applications where one might need to invoke the model dynamically with high frequency. For instance, imagine an assistive writing application where one could invoke state-of-the-art language models very frequently, leading to a more intuitive and a smooth running application.” In addition, “products using large machine-learning models could decrease their compute costs dramatically with DataMUX,” Murahari says. “We could also imagine models running on less specialized hardware, such as CPUs instead of GPUs, since models running DataMUX on CPUs could close the throughput gap with nonmultiplexed models running on GPU. This would enable a large number of machine-learning applications to run on low-resource edge devices.”

The 40-fold increase in the number of inputs led to only an 18-fold boost in throughput, rather than the 40-fold gain one might expect, likely because the keys associated with each input grew longer as the number of inputs increased. Future work may improve this speedup through better multiplexing and demultiplexing strategies, the researchers say.
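The gap between 40 inputs and an 18-fold speedup can be made concrete with a little arithmetic. The sketch below assumes throughput means inputs processed per unit time and that one multiplexed pass handles all 40 inputs at once; the function names are hypothetical and the interpretation is ours, not a figure reported by the researchers.

```python
def multiplexing_efficiency(num_inputs, speedup):
    """Fraction of the ideal num_inputs-fold speedup that was actually achieved."""
    return speedup / num_inputs


def relative_pass_cost(num_inputs, speedup):
    """Cost of one multiplexed pass relative to one ordinary pass.

    Assumes speedup = num_inputs / relative_pass_cost, i.e. the only thing
    eating into the ideal gain is a slower multiplexed pass.
    """
    return num_inputs / speedup


efficiency = multiplexing_efficiency(40, 18)
pass_cost = relative_pass_cost(40, 18)
print(f"{efficiency:.0%} of the ideal speedup")
print(f"one multiplexed pass costs about {pass_cost:.1f}x an ordinary pass")
```

Under these assumptions, the reported numbers correspond to about 45 percent of the ideal gain, or a multiplexed pass that takes roughly 2.2 times as long as an ordinary one.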

As for how neural networks using DataMUX avoid getting confused by this mixed feed, “we don't really know,” says study coauthor Carlos Jimenez, a machine-learning researcher at Princeton. He notes there is nothing that necessarily causes multiple inputs streaming into a neural network to interfere with each other, “but more work needs to be done to really get to the bottom of this.”

A neural network using DataMUX is not limited to performing just one task on this multiplexed data, such as just recognizing names. It could use this combined input to carry out multiple tasks that it is trained on at the same time, such as both recognizing names and classifying sentences, notes study senior author Karthik Narasimhan, a machine-learning researcher at Princeton.

In the future, the researchers aim to experiment with multiplexing state-of-the-art neural networks such as BERT and GPT-3. They would also like to investigate other multiplexing schemes with which they could scale up to hundreds or even thousands of inputs at once, “leading to even larger improvements in throughput,” Murahari says. “We could really just be at the tip of the iceberg.”

The Conversation (1)
Anjan Saha 22 Mar, 2022

MUX means many into one in signal channeling; DEMUX means the opposite, one into many. It is used in modems and satellite transponders for telecommunications systems. In biological systems, neurons (axons, dendrites, synapses) communicate with other brain cells in this fashion. AI and neural networks on silicon chips can explore the possibility of this type of communication for research and analysis of various physical and biological processes.
