Neural rendering harnesses machine learning to paint pixels
Matthew S. Smith is a freelance consumer-tech journalist. An avid gamer, he is a former staff editor at Digital Trends and is particularly fond of wearables, e-bikes, all things smartphone, and CES, which he has attended every year since 2009.
On 20 September, Nvidia’s Vice President of Applied Deep Learning, Bryan Catanzaro, took to Twitter with a bold claim: In certain GPU-heavy games, like the classic first-person puzzle game Portal, seven out of eight pixels on the screen are generated by a new machine-learning algorithm. That’s enough, he said, to accelerate rendering by up to 5x.
This impressive feat is currently limited to a few dozen 3D games, but it’s a hint at the gains neural rendering will soon deliver. The technique will unlock new potential in everyday consumer electronics.
Neural rendering as turbocharger
Catanzaro’s claim is made possible by DLSS 3, the latest version of Nvidia’s DLSS (Deep Learning Super Sampling). It combines AI-powered image upscaling with a new feature exclusive to DLSS 3: optical multi-frame generation. Sequential frames are combined with an optical flow field that predicts changes between frames. DLSS 3 then slots unique, AI-generated frames between traditionally rendered frames.
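DLSS 3’s frame generation runs a learned model on dedicated hardware, but the core idea of warping one rendered frame toward the next along an optical flow field can be sketched in a few lines. The sketch below is a minimal stand-in, not Nvidia’s pipeline: it assumes classical Farneback flow and a crude midpoint warp, and the function name is ours.

```python
import cv2
import numpy as np

def interpolate_frame(frame_a, frame_b, t=0.5):
    """Synthesize an intermediate frame between two rendered frames."""
    gray_a = cv2.cvtColor(frame_a, cv2.COLOR_BGR2GRAY)
    gray_b = cv2.cvtColor(frame_b, cv2.COLOR_BGR2GRAY)

    # Estimate a per-pixel motion field (the "optical flow") from A to B.
    flow = cv2.calcOpticalFlowFarneback(
        gray_a, gray_b, None,
        pyr_scale=0.5, levels=3, winsize=15,
        iterations=3, poly_n=5, poly_sigma=1.2, flags=0)

    # Sample frame A a fraction t of the way along the flow field
    # (a simple backward-mapping approximation of a forward warp).
    h, w = gray_a.shape
    grid_x, grid_y = np.meshgrid(np.arange(w, dtype=np.float32),
                                 np.arange(h, dtype=np.float32))
    map_x = grid_x - t * flow[..., 0]
    map_y = grid_y - t * flow[..., 1]
    return cv2.remap(frame_a, map_x, map_y, cv2.INTER_LINEAR)
```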
“When you’re playing with DLSS super resolution on performance mode in 4K, seven out of every eight pixels are being run through a neural network,” says Catanzaro. “I think that’s one of the reasons why you see such a great speed-up. In that mode, in games that are GPU-heavy like Portal RTX […] seven out of every eight pixels are being generated by AI, and as a result we’re 530 percent faster.”
This example, which references testing by the 3D graphics publication and YouTube channel Digital Foundry, is a best-case scenario. But results in other tests remain impressive. Most show DLSS 3 delivering a two- to three-times performance gain over purely traditional rendering at 4K resolution. And while Nvidia leads the pack, it has competitors. Intel offers XeSS (Xe Super Sampling), an AI-powered upscaler. AMD’s RDNA 3 graphics architecture includes a pair of AI accelerators in each compute unit, though it’s not yet clear how the company will use them.
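The seven-out-of-eight figure is straightforward arithmetic, assuming 4K performance mode renders at one quarter of the output resolution and frame generation synthesizes every second frame entirely:

```latex
\[
\underbrace{\tfrac{1}{4}}_{\text{super resolution}}
\times
\underbrace{\tfrac{1}{2}}_{\text{frame generation}}
= \tfrac{1}{8}
\qquad\Longrightarrow\qquad
1 - \tfrac{1}{8} = \tfrac{7}{8}\ \text{of displayed pixels are AI-generated.}
\]
```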
[Video: Microsoft Flight Simulator | NVIDIA DLSS 3 - Exclusive First-Look, youtube.com]
Games have led the wave of neural rendering because they’re well suited to machine-learning techniques. “That problem there, where you look at little patches of an image and try to guess what’s missing, is just a really good fit for machine learning,” says Jon Barron, senior staff researcher at Google. The similarity between frames, along with frame rates high enough to obscure minor errors in motion, plays to machine learning’s strengths.
It’s not perfect: DLSS 3 has trouble with scene transitions, while XeSS can cause a shimmering effect in some situations. However, both Barron and Catanzaro think these quality issues can be overcome by feeding neural rendering models additional training data. 2023 will be a chance to watch the technology progress as Nvidia, Intel, and AMD work with software partners to refine their respective neural rendering techniques.
3D neural rendering steps into the spotlight
This is just the tip of the spear. Barron sees a fork between “2D neural rendering” techniques like Nvidia DLSS 3, which improve the results of a traditional graphics pipeline, and “3D neural rendering,” which generates graphics entirely through machine learning. Barron co-authored a paper on DreamFusion, a machine learning model that generates 3D objects from plain text inputs. The resulting 3D models can be exported to rendering software and game engines. Nvidia has shown equally impressive results with Instant NeRF, which generates full-color 3D scenes from 2D images.
Anton Kaplanyan, Vice President of Graphics Research at Intel, believes that neural rendering techniques will make 3D content creation more approachable. “If you look at the current social networks, it’s so much commoditized. A person can just click on a button, take a photo, share it with their friends and relatives,” says Kaplanyan. “If we want to elevate this experience into 3D, we need to pull people [in] who don’t know the professional tools, to become content creators as well.”
DreamFusion can generate 3D models from plain text inputs. Google
How quickly 3D neural rendering improves through 2023 will shape its future. It’s impressive, but unproven compared to traditional rendering. “Computer graphics are amazing, it works really well, and we have really good ways of solving a lot of problems that may be the way we do it forever,” says Barron. He notes that content creators and developers are already familiar with the tools used to create for, and optimize, a traditional graphics pipeline.
The question, then, is how quickly the graphics industry will embrace 3D neural rendering as an alternative to tried-and-true methods. It may prove an unsettling transition because of the conflicting incentives that surround it. Machine learning models often run well on modern graphics architectures, but there’s tension in how GPU, CPU, and dedicated AI co-processors—all of which are relevant to AI performance, depending on its implementation—combine in a consumer product. Betting on the wrong technique, or the wrong architecture to support it, could prove a costly mistake.
Still, Catanzaro believes the lure of 3D neural rendering will be hard to resist. “I think that we’re going to see a lot of neural rendering techniques that are even more radical,” he says, referencing generative text-to-image and text-to-3D techniques. “The graphical quality from some of these completely neural models is quite extraordinary. Some of them are able to do shadows and refractions and reflections and, you know, these things that we typically only know how to do in graphics with ray tracing, are able to be simulated by a neural network without any explicit instructions on how to do that. So I would consider those even more radical approaches to neural rendering than DLSS, and I think the future of graphics is going to use both of those things.”
Neural rendering’s best perk? Efficiency
Neural rendering is alluring not just for its potential performance but also for its potential efficiency. The 530 percent gain DLSS 3 delivers in Portal with RTX can improve framerates—or it can lower power consumption by capping the framerate at a target. In that scenario, DLSS 3 reduces the energy cost of rendering each frame.
“Moore’s Law is running out of steam. ... My personal belief is that post-Moore graphics is neural graphics.”
—Bryan Catanzaro, Nvidia VP of Applied Deep Learning
That’s a big deal, because consumer electronics has a problem. Moore’s Law is dead—or, if not dead, on life support. “Moore’s Law is running out of steam, as you know, and my personal belief is that post-Moore graphics is neural graphics,” says Catanzaro. For Nvidia, neural rendering represents a way to keep delivering big gains without doubling up on transistors.
Intel’s Kaplanyan disputes the demise of Moore’s Law (Intel CEO Pat Gelsinger insists it’s alive and well), but agrees neural rendering can improve efficiency. “There are some solutions to chip size, there are the chiplets, which Pat has talked about,” he says. “On the other hand, I also agree that we have a great opportunity with machine learning algorithms to use this energy and this area way more efficiently to produce new visuals.”
Efficiency is a battleground for AMD, Nvidia, and Intel, as all three companies work with device manufacturers to design new consumer laptops and tablets. For device makers, efficiency gains lead to thinner, lighter devices that last longer on battery, while at the same time enhancing what users can accomplish with the device.
“I am very excited about enabling... the experiences that you would otherwise see only in high-end Hollywood movies or Triple-A games, but those experiences you would be able to make yourself,” says Kaplanyan. “You’d be able to do it on your laptop, or some other very power-confined device.”
[Video: NVIDIA’s New AI: Wow, Instant Neural Graphics!, youtube.com]
It’s clear 2023 will be a foundational year for neural rendering in consumer devices. Nvidia’s RTX 40-series with DLSS 3 support will roll out broadly to consumer desktops and laptops; Intel is expected to expand its Arc graphics line with its upcoming ‘Battlemage’ architecture; and AMD will launch more variants of cards using its RDNA 3 architecture.
These releases lay the groundwork for a revolution in graphics. It won’t happen overnight, and it won’t be easy—but as consumers demand ever more impressive visuals, and more capable content creation, from smaller, thinner form factors, neural rendering could prove the best way to deliver.
Watch out, Tiger Woods: Golfi has a mean short game
Edd Gent is a freelance science and technology writer based in Bangalore, India. His writing focuses on emerging technologies across computing, engineering, energy and bioscience. He's on Twitter at @EddytheGent and email at edd dot gent at outlook dot com.
While being able to drive the ball 300 yards might get the fans excited, a solid putting game is often what separates a golf champion from the journeymen. A robot built by German researchers is quickly becoming a master of this short game using a clever combination of classical control engineering and machine learning.
In golf tournaments, players often scout out the greens the day beforehand to think through how they are going to play their shots, says Annika Junker, a doctoral student at Paderborn University in Germany. So she and her colleagues decided to see if giving a robot similar capabilities could help it to sink a putt from anywhere on the green, without assistance from a human.
Golfi, as the team has dubbed their creation, uses a 3D camera to take a snapshot of the green, which it then feeds into a physics-based model to simulate thousands of random shots from different positions. These are used to train a neural network that can then predict exactly how hard and in what direction to hit a ball to get it in the hole, from anywhere on the green.
On the green, Golfi was successful six or seven times out of ten.
Like even the best pros, it doesn’t get a hole in one every time. The goal isn’t really to build a tournament-winning golf robot though, says Junker, but to demonstrate the power of hybrid approaches to robotic control. “We try to combine data-driven and physics based methods and we searched for a nice example, which everyone can easily understand,” she says. “It's only a toy for us, but we hope to see some advantages of our approach for industrial applications.”
So far, the researchers have only tested their approach on a small mock-up green inside their lab. The robot, which is described in a paper due to be presented at the IEEE International Conference on Robotic Computing in Italy next month, navigates its way around the two-meter-square space on four wheels, two of which are powered. Once in position, it uses a belt-driven gear shaft with a putter attached to the end to strike the ball towards the hole.
First though, it needs to work out what shot to play given the position of the ball. The researchers begin by using a Microsoft Kinect 3D camera mounted on the ceiling to capture a depth map of the green. This data is then fed into a physics-based model, alongside other parameters like the rolling resistance of the turf, the weight of the ball and its starting velocity, to simulate three thousand random shots from various starting points.
This data is used to train a neural network that can predict how hard and in what direction to hit the ball to get it in the hole from anywhere on the green. While it’s possible to solve this problem by combining the physics-based model with classical optimization, says Junker, that is far more computationally expensive. And training the robot on simulated golf shots takes just five minutes, compared with the 30 to 40 hours it would take to collect data on real-world strokes, she adds.
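In outline, the pipeline looks something like the sketch below: simulate random shots with a physics model, then fit a network to the inverse mapping from ball position and target to the required stroke. This is a toy version on a flat green with made-up friction and dimensions; the team’s actual simulator also folds in the Kinect depth map, the turf’s rolling resistance, and the ball’s weight.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)

def roll(start, speed, angle, mu=0.3):
    """Toy flat-green physics: the ball rolls straight and decelerates uniformly."""
    direction = np.array([np.cos(angle), np.sin(angle)])
    distance = speed**2 / (2 * mu)       # stopping distance, from v^2 = 2*mu*d
    return start + distance * direction  # resting position

# Simulate 3,000 random shots, recording (start, stop) -> hit velocity.
X, y = [], []
for _ in range(3000):
    start = rng.uniform(0.0, 2.0, size=2)    # 2 m x 2 m green
    speed = rng.uniform(0.1, 3.0)            # m/s
    angle = rng.uniform(-np.pi, np.pi)       # radians
    X.append(np.concatenate([start, roll(start, speed, angle)]))
    y.append(speed * np.array([np.cos(angle), np.sin(angle)]))

# Learn the inverse model: where the ball lies and where it should
# stop in, required hit velocity out.
net = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=3000, random_state=0)
net.fit(np.array(X), np.array(y))

# Query: a stroke to sink a putt from (0.3, 0.4) into a hole at (1.5, 1.5).
v = net.predict(np.array([[0.3, 0.4, 1.5, 1.5]]))[0]
speed, angle = np.linalg.norm(v), np.arctan2(v[1], v[0])
```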
Before it can make its shot though, the robot first has to line its putter up with the ball just right, which requires it to work out where on the green both it and the ball are. To do so, it uses a neural network that has been trained to spot golf balls and a hard-coded object detection algorithm that picks out colored dots on the top of the robot to work out its orientation. This positioning data is then combined with a physical model of the robot and fed into an optimization algorithm that works out how to control its wheel motors to navigate to the ball.
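The orientation trick is simple enough to sketch: segment each colored dot by thresholding the overhead image, take centroids, and read the heading off the vector between them. The colors, thresholds, and file name below are hypothetical stand-ins; the paper’s exact marker layout isn’t described here.

```python
import cv2
import numpy as np

def centroid(hsv, lo, hi):
    """Centroid of all pixels whose HSV color falls within [lo, hi]."""
    mask = cv2.inRange(hsv, np.array(lo), np.array(hi))
    m = cv2.moments(mask)
    return np.array([m["m10"] / m["m00"], m["m01"] / m["m00"]])

frame = cv2.imread("overhead.png")            # ceiling-camera color image
hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)

# Two dots of different colors mark the robot's front and rear
# (assumed colors and thresholds, for illustration only).
front = centroid(hsv, (100, 120, 70), (130, 255, 255))   # blue-ish dot
rear = centroid(hsv, (40, 120, 70), (80, 255, 255))      # green-ish dot

position = (front + rear) / 2.0               # robot center, in pixels
d = front - rear
heading = np.arctan2(d[1], d[0])              # orientation, in radians
```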
Junker admits that the approach isn’t flawless. The current set-up relies on a bird’s eye view, which would be hard to replicate on a real golf course, and switching to cameras on the robot would present major challenges, she says. The researchers also didn’t report how often Golfi successfully sinks the putt in their paper, because the figures were thrown off by the fact that it occasionally drove over the ball, knocking it out of position. When that didn’t happen though, Junker says it was successful six or seven times out of ten, and since they submitted the paper a colleague has reworked the navigation system to avoid the ball.
Golfi isn’t the first machine to try its hand at the sport. In 2016, a robot called LDRIC hit a hole-in-one at Arizona’s TPC Scottsdale course, and several devices have been built to test out golf clubs. But Noel Rousseau, a golf coach with a PhD in motor learning, says that typically they require an operator painstakingly setting them up for each shot, and any adjustments take considerable time. “The most impressive part to me is that the golf robot is able to find the ball, sight the hole and move itself into position for an accurate stroke,” he says.
Beyond mastering putting, the hope is that the underlying techniques the researchers have developed could translate to other robotics problems, says Niklas Fittkau, a doctoral student at Paderborn University and co-lead author of the paper. “You can also transfer that to other problems, where you have some knowledge about the system and could model parts of it to obtain some data, but you can’t model everything,” he says.
Intensive clinical collaboration is fueling growth of NYU Tandon’s biomedical engineering program
Dexter Johnson is a contributing editor at IEEE Spectrum, with a focus on nanotechnology.
This is a sponsored article brought to you by NYU’s Tandon School of Engineering.
When Andreas H. Hielscher, the chair of the biomedical engineering (BME) department at NYU’s Tandon School of Engineering, arrived at his new position, he saw raw potential. NYU Tandon had undergone a meteoric rise in its U.S. News & World Report graduate ranking in recent years, skyrocketing 47 spots since 2009. At the same time, the NYU Grossman School of Medicine had shot from the thirties to the #2 spot in the country for research. The two scientific powerhouses, sitting on opposite banks of the East River, offered Hielscher a unique opportunity: to work at the intersection of engineering and healthcare research, with the unmet clinical needs and clinician feedback from NYU’s world-renowned medical program directly informing new areas of development, exploration, and testing.
“There is now an understanding that technology coming from a biomedical engineering department can play a big role for a top-tier medical school,” said Hielscher. “At some point, everybody needs to have a BME department.”
In the early days of biomedical engineering departments nationwide, there was some resistance even to the notion of biomedical engineering: either you were an electrical engineer or a mechanical engineer. “That’s no longer the case,” said Hielscher. “The combining of the biology and medical aspects with the engineering aspects has been proven to be the best approach.”
Proof of this can be seen in the fact that an undergraduate biomedical engineering degree has become one of the most sought-after engineering degrees, according to Hielscher. He also noted that the current Dean of NYU’s Tandon School of Engineering, Jelena Kovačević, has a biomedical engineering background, having just received the 2022 IEEE Engineering in Medicine and Biology Society career achievement award for her pioneering research related to signal processing applications for biomedical imaging.
Mary Cowman, a pioneer in joint and cartilage regeneration, began laying the foundations for NYU Tandon’s biomedical engineering department in the 2010s. Since her retirement in 2020, Hielscher has continued to grow the department through innovative collaborations with the medical school and medical center, including the recently-announced Translational Healthcare Initiative, on which Hielscher worked closely with Daniel Sodickson, the co-director of the medical school’s Tech4Health.
Andreas Hielscher joined NYU Tandon in 2020 as Professor and Chair of the Department of Biomedical Engineering.
“The fundamental idea of the Initiative is to have one physician from Langone Medical School, and one engineer at least—you could have multiple—and have them address some unmet clinical needs, some particular problem,” explained Hielscher. “In many cases they have already worked together, or researched this issue. What this initiative is about is to give these groups funding to do some experimentation to either prove that it won’t work, or demonstrate that it can and prioritize it.”
With this funding of further experimentation, it becomes possible to develop the technology to a point where you could begin to bring investors in, according to Hielscher. “This mitigates the risk of the technology and helps attract potential investors,” added Hielscher. “At that point, perhaps a medical device company comes in, or some angel investor, and then you can get to the next level of investment for moving the technology forward.”
Biophotonics for Cancer Diagnosis
Hielscher himself has been leading research on developing new technologies within the Clinical Biophotonics Laboratory. One of the latest areas of research has been investigating the application of optical technologies to breast cancer diagnosis.
Cross sections of a breast with a tumor during a breath hold, taken with a dynamic optical tomographic breast imaging system developed by Dr. Hielscher. As a patient holds their breath, the blood concentration increases by up to 10 percent (seen in red). Dr. Hielscher’s team found that analyzing the increase and decrease in blood concentrations inside a tumor could help them determine which patients would respond to chemotherapy.
A.H. Hielscher, Clinical Biophotonics Laboratory
Hielscher and his colleagues have built a system that shines light through both breasts at the same time. By measuring how much light is reflected back, it’s possible to generate maps of locations with high levels of oxygen and total hemoglobin, which may indicate tumors.
“We look at where there’s blood in the breast,” explained Hielscher. “Because breast tumors recruit new blood vessels, or, once they grow, they generate their own vascular network requiring more oxygen, wherever there is a tumor you will see an increase in total blood volume, and you will see more oxygenated blood.”
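The bookkeeping behind such measurements is typically the modified Beer-Lambert law used in near-infrared spectroscopy: absorption changes at two wavelengths are solved for changes in oxy- and deoxyhemoglobin. The sketch below is illustrative only: the extinction coefficients and path length are rough stand-ins, and Hielscher’s tomography reconstructs full 3D maps rather than this single-point estimate.

```python
import numpy as np

# Extinction coefficients for [HbO2, Hb] at two near-infrared
# wavelengths, in 1/(mM*cm); approximate literature values.
E = np.array([[0.6, 1.4],    # ~750 nm: deoxyhemoglobin absorbs more
              [1.1, 0.8]])   # ~850 nm: oxyhemoglobin absorbs more

PATH_CM = 6.0  # assumed effective optical path length through tissue

def delta_hemoglobin(d_od_750, d_od_850):
    """Changes in [HbO2, Hb] concentration (mM) from measured changes
    in optical density at the two wavelengths."""
    d_od = np.array([d_od_750, d_od_850])
    return np.linalg.solve(E * PATH_CM, d_od)

d_hbo2, d_hb = delta_hemoglobin(0.02, 0.05)
d_total = d_hbo2 + d_hb   # total hemoglobin tracks blood volume
```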
Initially, this diagnostic tool was targeted at early detection, since mammograms can detect calcification only in the lower-density breast tissue of women over a certain age. But in collaboration with clinical partners, it soon became clear that the tool was also highly effective for monitoring treatment.
“Technology coming from a biomedical engineering department can play a big role for a top-tier medical school”
—Andreas H. Hielscher, Biomedical Engineering Department Chair, NYU Tandon
This realization came in part because of a recent change in cancer treatment that has moved towards what is known as neoadjuvant chemotherapy, in which chemotherapy drugs are administered before surgical extraction of the tumor. One of the drawbacks of this approach is that only around 60 percent of patients respond favorably to the chemotherapy, resulting in a large percentage of patients suffering through a grueling six-month-long chemotherapy treatment with minimal-to-no impact on the tumor.
With the optical technique, Hielscher and his colleagues have found that if they can detect a noticeable decrease of blood in targeted areas after two weeks, it’s very likely that the patient will respond to the chemotherapy. On the other hand, if they see that the amount of blood in that area stays the same, then there’s a very high likelihood that the patient will not respond to the therapy.
This same fundamental technique can also be applied to what is known as peripheral artery disease (PAD), which affects many patients with diabetes and involves the narrowing or blockage of the vessels that carry blood from the heart to the legs. An Israel-based company called VOTIS has licensed the technology for diagnosing and treating PAD.
Example of a frequency-domain image of a finger joint (proximal interphalangeal joint of index finger) affected by lupus arthritis.
A.H. Hielscher, Clinical Biophotonics Laboratory
While Hielscher’s own work is in biophotonics, he notes that the department has also quickly developed a reputation in other emerging areas, including wearables, synthetic biology, and neurorehabilitation and stroke prediction.
Hielscher highlighted the recent work of Rose Faghih, who works on smart wearables and data for mental health; Jef Boeke, a synthetic biology pioneer; and S. Farokh Atashzar, who works on neurorehabilitation and stroke prediction. Atashzar’s work was highlighted last year in the pages of IEEE Spectrum.
“Rose Faghih is leveraging all kinds of sensors to make inferences about the mental state of patients, to determine if someone is depressed or schizophrenic, and then possibly have a feedback loop where you actually also treat them,” said Hielscher. “Jef Boeke is involved in what I term ‘wet engineering,’ and is currently involved in efforts to take cancer cells outside of the body to find a way to attack them, or reprogram them.”
As NYU Tandon’s BME department goes forward, Hielscher’s aim is that the department becomes a trusted source for the medical school, and that partnership enables key technologies to go from an unmet clinical need or an idea in a lab to a patient’s bedside in a 3-5 year timeframe.
“What I really would like,” Hielscher concluded, “is that if somebody in the medical school has a problem, the first thing they would say is, ‘Oh, I’ll call the engineering school. I bet there’s somebody there that can help me.’ We can work together to benefit patients, and we’re starting this already.”