Robots Hallucinate Humans to Aid in Object Recognition

Robots hallucinating humans sounds like a serious problem, but it'll be good for us, we promise

2 min read

Almost exactly a year ago, we posted about how Ashutosh Saxena's lab at Cornell was teaching robots to use their "imaginations" to try to picture how a human would want a room organized. The research was successful, with algorithms that used hallucinated humans (which are the best sort of humans) to influence the placement of objects performing significantly better than other methods. Cool stuff indeed, and now comes the next step: labeling 3D point-clouds obtained from RGB-D sensors by leveraging contextual hallucinated people.

A significant amount of research has been done investigating the relationships between objects and other objects. It's called semantic mapping, and it's very valuable in giving robots what we'd call things like "intuition" or "common sense." However, being humans, we tend to live human-centered lives, and that means that the majority of our stuff tends to be human-centered too, and keeping this in mind can help to put objects in context.

In the above case, a traditional semantic mapping algorithm might take a look at all of the objects on the desk and be able to figure out that it's a desk area, but some of the objects (like the bottle of water or the jacket) don't necessarily fit into the "desk" semantic category. When you imagine a human there, though, it starts to make more sense, because clothing and water often show up where humans tend to spend significant amounts of time.
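To make the idea concrete, here's a minimal sketch (not the paper's actual model — the function names, distance measure, and weights are all illustrative assumptions) of how a labeling score might blend object-object context with proximity to sampled, "hallucinated" human positions:

```python
import math

def dist(a, b):
    """Euclidean distance between two 2D points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def human_context_score(obj_pos, human_poses, scale=1.0):
    """Hypothetical: higher when the object sits near a likely human position."""
    if not human_poses:
        return 0.0
    return max(math.exp(-dist(obj_pos, h) / scale) for h in human_poses)

def label_score(object_object_score, obj_pos, human_poses, w_obj=0.6, w_human=0.4):
    """Illustrative blend: object co-occurrence context plus human context."""
    return (w_obj * object_object_score
            + w_human * human_context_score(obj_pos, human_poses))

# A water bottle on a desk: it scores poorly as a "desk object" on its own
# (0.3), but sits right next to a sampled seated human at (1.2, 0.4), so the
# combined score rises well above the object-context term alone.
score = label_score(0.3, obj_pos=(1.0, 0.5),
                    human_poses=[(1.2, 0.4), (3.0, 2.0)])
```

The point of the sketch is only the structure of the reasoning: an object that looks out of place among other objects can still be well explained by an imagined person nearby.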

The other concept to deal with is that of object affordances. An affordance is some characteristic of an object that allows a human to do something with it. For example, a doorknob is an affordance that lets us open doors, and a handle on a coffee cup is an affordance that lets us pick it up and drink out of it. There's plenty to be learned about the function of an object by how a human uses it, but if you don't have a human handy to interact with the object for you, hallucinating one up out of nowhere can serve a similar purpose.
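A hedged sketch of what that might look like as data — the objects and action verbs below come from the examples in the text, but this table-of-affordances representation is a simplifying assumption, not a real robotics API:

```python
# Hypothetical affordance table: what a human can do with each object.
AFFORDANCES = {
    "doorknob": {"grasp", "turn"},          # affords opening the door
    "coffee cup": {"grasp", "drink-from"},  # the handle affords picking it up
    "cushion": {"sit-on"},                  # affords sitting, hence couches
}

def afforded_actions(obj):
    """Return the set of human actions an object is known to afford."""
    return AFFORDANCES.get(obj, set())

def can(obj, action):
    """Check whether an object affords a given human action."""
    return action in afforded_actions(obj)
```

With something like this, a hallucinated human standing in for a real one lets the robot reason about where an object belongs by asking what a person would do with it there.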

Here are a few videos of Cornell's PR2, Kodiak, demonstrating how this works:

We test our system on [a cushion]. We visualize the sampled human poses and object locations in red and blue heatmaps. We can see that most sampled humans are sitting on the couch, bean bag or chair, and the most likely location for the cushion is on the couch and the desk.


Our robot Kodiak (PR2) placing several objects in a fridge, on a table and on the ground, using hallucinated human skeletons and learned human-object relationships.


This research will be presented next week at RSS in Berlin, and also at CVPR in Portland. Because the researchers are good eggs, they've made their papers fully available ahead of time, and you can read them in full here and here.

[ Cornell ] via [ Txnologist ]
