Deep Learning Gets a Boost From New Reconfigurable Processor

The ReAAP processor allows AI to be faster, more efficient

This article is part of our exclusive IEEE Journal Watch series in partnership with IEEE Xplore.

Deep learning is a critical computing approach that is pushing the boundaries of technology, crunching immense amounts of data and uncovering subtle patterns that humans could never discern on their own. But for optimal performance, deep learning algorithms need to be supported with the right combination of software compiler and hardware. In particular, reconfigurable processors, which allow hardware resources to be flexibly allocated to computations as needed, are key.

In a recent study, researchers in Hong Kong report a new reconfigurable processor, dubbed ReAAP, that outperforms several computing platforms commonly used to support deep neural networks (DNNs), a widely used form of deep learning that processes large data sets through many computationally intensive layers. They describe it in a paper published on 10 October in IEEE Transactions on Computers.

Whereas ordinary processors route data through fixed hardware pathways, reconfigurable processors offer a more adaptive option: rearranging their hardware resources to process each workload in the most efficient way.

“Reconfigurable processors combine the advantages of software flexibility and hardware parallelism,” explains Jianwei Zheng, a postdoctoral researcher with the department of electronic and computer engineering at the Hong Kong University of Science and Technology, who was involved in the study.

These advantages prompted his team to create ReAAP, which is an integrated software-hardware system. Its software compiler is responsible for evaluating and optimizing diverse deep-learning workloads. Once it determines the best solution for processing data in parallel, it sends instructions to reconfigure the hardware coprocessor, which allocates appropriate hardware resources to do the parallel computing. “As an end-to-end system, ReAAP can be deployed to accelerate various deep-learning applications just by customizing a Python script in [the] software for each application,” explains Zheng.
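The paper does not spell out how ReAAP's compiler searches for its parallel schedules, but the general idea of evaluating candidate parallelizations against a cost model can be sketched in a few lines. The cost model, tile sizes, and unit count below are illustrative assumptions, not ReAAP's actual design:

```python
# Toy sketch of a compiler pass choosing a parallel schedule for one
# matrix-multiply layer: enumerate candidate tile sizes and keep the one
# with the lowest estimated cost. (Illustrative only; not ReAAP's compiler.)

def estimated_cost(m, n, k, tile, num_units=8):
    """Toy cost model for an m-by-n output computed over a k-deep reduction.

    Schedules whose tiles divide the problem evenly and keep all parallel
    units busy score lower (better).
    """
    tiles_m = -(-m // tile)                    # ceiling division
    tiles_n = -(-n // tile)
    total_tiles = tiles_m * tiles_n
    waves = -(-total_tiles // num_units)       # rounds needed on num_units units
    idle = waves * num_units - total_tiles     # idle unit-rounds (wasted work)
    return (waves + idle) * tile * tile * k

def best_schedule(m, n, k, candidates=(16, 32, 64, 128)):
    """Return the candidate tile size with the lowest estimated cost."""
    return min(candidates, key=lambda t: estimated_cost(m, n, k, t))
```

For a 100x100x100 layer, the model rejects tiles that leave units idle (128) or waste compute on padding (16), settling on a middling tile size; a real compiler would fold in memory traffic and reuse, but the search structure is the same.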

In their study, the researchers compared their proposed software compiler in ReAAP to three other baseline software compilers on an Nvidia GPU and an ARM CPU. The results show that it performs 1.9 to 5.7 times as fast as the next best software compiler running on the GPU and 1.6 to 3.3 times as fast as the same software compiler running on the CPU.

Additionally, Zheng notes that ReAAP achieves a consistently high utilization of hardware resources for a wide range of diverse compute-intensive layers.

While ReAAP is good at handling DNNs with typical data-intensive workloads, it is currently not well suited to support DNNs when data is sparse. Zheng says his team hopes to address this issue in the future. What’s more, the researchers hope to build upon ReAAP so that it can better process quantized data (data that’s processed in a way that significantly reduces the memory requirement and computational cost of neural networks).
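To make the memory savings of quantization concrete, here is a minimal sketch of uniform symmetric 8-bit quantization, the common scheme in which 32-bit floating-point weights are mapped to 8-bit integers (a 4x reduction in storage) at the cost of small rounding errors. This is a generic illustration, not the extension the ReAAP team is building:

```python
# Minimal sketch of uniform symmetric quantization: float weights are
# mapped to signed 8-bit integers via a single scale factor, cutting
# memory per value by 4x relative to float32.

def quantize(weights, num_bits=8):
    """Quantize a list of floats to signed integers with one shared scale."""
    qmax = 2 ** (num_bits - 1) - 1             # 127 for 8-bit signed
    scale = max(abs(w) for w in weights) / qmax or 1.0
    q = [round(w / scale) for w in weights]    # integers in [-qmax-1, qmax]
    return q, scale

def dequantize(q, scale):
    """Recover approximate float values from the integers and the scale."""
    return [v * scale for v in q]

weights = [0.12, -0.5, 0.33, 0.99, -0.87]
q, scale = quantize(weights)
recovered = dequantize(q, scale)
```

Each recovered value differs from the original by at most half the scale factor, which is why quantized networks usually lose little accuracy while becoming much cheaper to store and compute, especially on resource-constrained hardware.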

“After the extension [of ReAAP to better handle quantized data] is completed and evaluated, we will consider commercializing it together with a couple of other acceleration solutions for AI computing,” says Zheng, noting that this will make ReAAP more efficient on resource-constrained platforms such as various Internet-of-Things (IoT) devices.
