On-Chip Routers Could Be a Choke Point for Future Chips
Interconnect delays dwarfed by slowdowns at sluggish routers in future many-core chips, say researchers
10 March 2010—Dual- and quad-core processors are already fairly common, and the microprocessor industry has its sights set on a not-too-distant future when hundreds of cores will populate each chip.
But in research set to appear in an upcoming issue of IEEE Electron Device Letters, engineers at Georgia Tech, in Atlanta, report that in order for the semiconductor industry to make 100-core chips commercially available by 2015 (with hundreds more cores per chip a few years after that), the basic thinking behind chip design will have to change. Until now, researchers thought that delays caused by the chips’ nanometer-thin interconnects would be the bugaboo, but it turns out that routers represent a much bigger source of delay that must be overcome first.
Azad Naeemi, a professor of microelectronics and microsystems at Georgia Tech, sought to find the optimal network-on-chip design. He and a graduate student ran a series of tests on chips measuring 391 square millimeters, each containing 144 cores arranged in 12-by-12 arrays with 1.5 mm between adjacent cores. (These dimensions and the layout are predicted for 2015 by the International Technology Roadmap for Semiconductors.) The researchers concluded that, with the on-chip routers available today, the optimal interconnect width would be about 50 nanometers, or less than one-tenth the size of the fattest wires in today’s dual- and quad-core chips. Global interconnects, the wires that in today’s chips are made comparatively thick to ensure high bandwidth and low latency, should be close to that minimum dimension. (Anything smaller could not be fabricated.) Naeemi says that making them bigger would be a waste of materials and wiring area.
Why? The team discovered that in a single-hop transfer of data from one core to an adjacent core, the biggest performance issue was the latency caused by the chips’ routers, not the interconnects. They reported that the delay before the first byte arrives at its destination could be as many as 20 clock cycles. In fact, says Naeemi, router delay was so significant that the width of interconnects—usually a significant source of delay—was practically irrelevant.
“We need to make on-chip routers as simple as possible,” says Naeemi, who adds that “new router types are in the works but are still at research level.” The layout of networks on chips must also be optimized to ensure more direct routing of data from origin to destination. Today’s network-on-chip designs make the data go through several routers, resulting in unnecessary latency, he says.
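To get a feel for why router delay swamps wire delay, consider a rough back-of-the-envelope sketch in Python. It is not the Georgia Tech team's model. It takes the 12-by-12 array and the 20-cycle single-hop figure from the research described above, and it assumes two things the researchers did not report: that data travels along the shortest grid path between cores, and that the wire itself costs one clock cycle per hop (a placeholder value). All names in the code are hypothetical.

```python
# Rough latency estimate for a 2D mesh network-on-chip.
# Illustrative sketch only, not the researchers' model.

MESH_SIZE = 12              # 12-by-12 core array described in the article
ROUTER_CYCLES_PER_HOP = 20  # upper figure reported for a single-hop transfer
WIRE_CYCLES_PER_HOP = 1     # ASSUMED placeholder; not a figure from the research


def hops(src, dst):
    """Number of hops between two (row, col) cores, assuming the data
    follows a shortest grid path (Manhattan distance)."""
    return abs(src[0] - dst[0]) + abs(src[1] - dst[1])


def latency_cycles(src, dst):
    """Estimated cycles before the first byte arrives at dst."""
    return hops(src, dst) * (ROUTER_CYCLES_PER_HOP + WIRE_CYCLES_PER_HOP)


def router_share():
    """Fraction of the per-hop delay attributable to the router."""
    return ROUTER_CYCLES_PER_HOP / (ROUTER_CYCLES_PER_HOP + WIRE_CYCLES_PER_HOP)


if __name__ == "__main__":
    adjacent = latency_cycles((0, 0), (0, 1))
    corner = latency_cycles((0, 0), (MESH_SIZE - 1, MESH_SIZE - 1))
    print("adjacent cores:", adjacent, "cycles")      # 21
    print("corner to corner:", corner, "cycles")      # 462
    print("router share of delay:", round(router_share(), 3))
```

Under these assumptions the router accounts for roughly 95 percent of each hop, so halving the wire delay would barely change the total, while cutting the number of routers on a path would reduce it almost in proportion.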
Naeemi notes that once those issues are addressed, the interconnects will play a much greater role in a chip’s overall performance. Chipmakers will be itching to replace the tiny copper wires that crisscross chips with alternative interconnect technologies, such as carbon nanotubes, graphene nanoribbons, and optical interconnects. But Naeemi predicts that, at least for the next few years, designers and manufacturers will have to live with copper.
This story was corrected on 24 March 2010.