There are all kinds of apps that will remind you to do things, which is great, if you remember to ask them to remind you to do things. At ICRA yesterday, researchers from Cornell and Stanford presented a project called Watch-Bot, which can independently learn your household activity patterns to provide you with helpful reminders. If you leave the milk out, or forget to turn a monitor off, or leave food in the microwave, the robot will figure out on its own that you forgot to do something and then gently remind you.
Watch-Bot consists of a 3D sensor (a Kinect, in this case), a camera that can pan and tilt, a laptop, and a laser pointer. The robot was set up in a kitchen and an office, and spent a week watching people go about their business. It collected 458 videos of normal human activity, which were then annotated with 21 different actions and 23 types of objects, and in 222 of those videos, someone (deliberately, in the training case) forgot to do something.
For training, Watch-Bot was shown sets of videos of humans mostly remembering to do stuff, followed by sets of videos of humans mostly forgetting to do stuff, but it wasn’t told what order actions were supposed to be in, or what the humans were forgetting to do: it figured that out on its own by using probabilistic learning models capable of detecting patterns and relations directly from the camera and Kinect data. This approach is based on what AI researchers call unsupervised learning. And what does Watch-Bot do once it notices a forgotten thing? It uses its laser pointer to target that thing as a reminder.
Watch-Bot (left) consists of a Kinect v2 sensor that tracks RGB-D frames of human actions, a laptop that detects the forgotten actions and related objects, and a pan/tilt camera with a laser pointer that identifies and points out the object. To detect forgotten actions, the system goes through a series of steps (right). First, it uses the learned model to detect a forgotten action and the related object based on the Kinect’s input. Then it maps the view from the Kinect to the pan/tilt camera so that the bounding box of the object is mapped in the camera’s view. Finally, the camera adjusts pan and tilt until the laser spot hits the target object. Images: Watch-Bot Project
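The last step in that caption, nudging pan and tilt until the laser spot lands on the object, is a feedback loop. The article doesn't say how Watch-Bot implements it, so the following is only a toy sketch under stated assumptions: a simple proportional correction, simulated entirely in image coordinates, with hypothetical names (`Box`, `aim_laser`, `gain`) that don't come from the project.

```python
from dataclasses import dataclass


@dataclass
class Box:
    """Axis-aligned bounding box of the target object in the camera image."""
    x_min: float
    y_min: float
    x_max: float
    y_max: float

    def center(self):
        return ((self.x_min + self.x_max) / 2, (self.y_min + self.y_max) / 2)

    def contains(self, x, y):
        return self.x_min <= x <= self.x_max and self.y_min <= y <= self.y_max


def aim_laser(target, spot, gain=0.5, max_steps=50):
    """Move a simulated laser spot toward the target until it is inside the box.

    Assumption: each pan/tilt adjustment shifts the spot by a fraction (gain)
    of the remaining pixel error. A real pan/tilt head would need calibration.
    Returns the final spot position and the number of adjustments made.
    """
    cx, cy = target.center()
    x, y = spot
    for step in range(max_steps):
        if target.contains(x, y):
            return (x, y), step
        x += gain * (cx - x)  # pan correction
        y += gain * (cy - y)  # tilt correction
    return (x, y), max_steps


milk_box = Box(100, 100, 140, 140)  # made-up bounding box for illustration
final_spot, steps = aim_laser(milk_box, spot=(0.0, 0.0))
print(final_spot, steps)
```

The loop stops as soon as the spot is anywhere inside the bounding box rather than exactly at its center, which matches the caption's description of adjusting "until the laser spot hits the target object."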
During testing, Watch-Bot was able to tell when humans forgot to do something (and successfully remind them) about 60 percent of the time. Tasks that it provided reminders for included putting a book back on a shelf after reading it, turning a monitor off when you’re done with a computer, putting milk back into the fridge, and getting food from the microwave. Most study participants thought that the robot was helpful, according to the researchers.
It’s important to note that, fundamentally, the robot has no idea what’s going on here. It doesn’t know what a microwave is, or that milk goes bad: there’s no semantic understanding of the scene. All it’s doing, using unsupervised learning algorithms, is tracking probabilistic patterns to detect forgotten actions: it sees that when someone takes an object out of the fridge, that same person almost always puts the object back in the fridge, so on the occasions when you don’t, it’ll point the object out for you. In some ways, this lack of understanding is what makes Watch-Bot helpful. By avoiding the problem of knowing what things are, it can adapt to new situations and identify new patterns, becoming more helpful over time, especially in new settings where it may not have any prior experience. In fact, the researchers are now working to allow the robot to extend its learning from online videos, including user-generated content from YouTube, as part of a related project called RoboWatch.
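To make the "takes it out, almost always puts it back" idea concrete, here is a deliberately simplified sketch. It is not the researchers' model: Watch-Bot learns from raw camera and Kinect data, while this toy version assumes the actions have already been given names, and it only counts how often one action is later followed by another. The action labels, the function names, and the 0.75 threshold are all invented for illustration.

```python
from collections import Counter


def learn_followups(sequences):
    """Estimate P(b happens later | a happened) from example action sequences."""
    a_counts = Counter()
    pair_counts = Counter()
    for seq in sequences:
        seen = set()
        for i, a in enumerate(seq):
            if a in seen:
                continue  # count each action once per sequence
            seen.add(a)
            a_counts[a] += 1
            for b in set(seq[i + 1:]):
                if b != a:
                    pair_counts[(a, b)] += 1
    return {(a, b): n / a_counts[a] for (a, b), n in pair_counts.items()}


def forgotten_actions(seq, followups, threshold=0.75):
    """Return actions that usually follow something in seq but are missing."""
    missing = set()
    for (a, b), p in followups.items():
        if p >= threshold and a in seq and b not in seq[seq.index(a) + 1:]:
            missing.add(b)
    return sorted(missing)


# Hypothetical training data: the milk goes back in four sessions out of five.
usual = ["open_fridge", "take_milk", "pour_milk", "return_milk", "close_fridge"]
forgetful = ["open_fridge", "take_milk", "pour_milk", "close_fridge"]
model = learn_followups([usual] * 4 + [forgetful])

print(forgotten_actions(forgetful, model))
```

Nothing in the code knows what milk or a fridge is. It only notices that one label usually follows another, which is the same property that lets the real system move to settings it has never seen.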
As far as robots go, Watch-Bot isn’t exactly a looker. But that’s okay. It’s not meant to be a finished product. It’s a proof of concept of the underlying technology, which can be easily transferred to a wide variety of robots as long as they’re equipped with an RGB-D sensor like a Kinect (which most are) and a laser weapon (which most should be). Even if it doesn’t work all the time, it still seems like a feature that’s worth paying for.