How to Program Sony's Robot Dog Aibo

We take a look at the new visual programming interface and API for Sony’s adorable robot dog


Evan Ackerman is IEEE Spectrum’s robotics editor.

Sony Aibo robot dog
Image: Sony

The Sony Aibo has been the most sophisticated home robot that you can buy for an astonishing 20 years. The first Aibo went on sale in 1999, and even though there was a dozen-year gap between 2005’s ERS-7 and the latest ERS-1000, there was really no successful consumer robot over that intervening time that seriously challenged the Aibo.

Part of what made Aibo special was how open Sony was to user customization and programmability. Aibo served as the RoboCup Standard Platform for a decade, providing an accessible hardware platform that leveled the playing field for robotic soccer. Designed to stand up to the rigors of use by unsupervised consumers (and, presumably, their kids), Aibo offered both durability and versatility that compared fairly well to later, much more expensive robots like Nao.

Aibo ERS-1000: The newest model

The newest Aibo, the ERS-1000, was announced in late 2017 and is now available for US $2,900 in the United States and 198,000 yen in Japan. It’s faithful to the Aibo family, while benefiting from years of progress in robotics hardware and software. However, it wasn’t until last November that Sony opened up Aibo to programmers, by providing visual programming tools as well as access to an API (application programming interface). And over the holidays, Sony lent us an Aibo to try it out for ourselves.

This is not (I repeat not) an Aibo review: I’m not going to talk about how cute it is, how to feed it, how to teach it to play fetch, how weird it is that it pretends to pee sometimes, or how it feels to have it all snuggled up in your lap while you’re working at your computer. Instead, I’m going to talk about how to (metaphorically) rip it open and access its guts to get it to do exactly what you want.

The newest Aibo, the ERS-1000, was announced in late 2017 and is now available for US $2,900 in the United States and 198,000 yen in Japan. Photo: Evan Ackerman/IEEE Spectrum

As you read this, please keep in mind that I’m not much of a software engineer—my expertise extends about as far as Visual Basic, because as far as I’m concerned that’s the only programming language anyone needs to know. My experience here is that of someone who understands (in the abstract) how programming works, and who is willing to read documentation and ask for help, but I’m still very much a beginner at this. Fortunately, Sony has my back. For some of it, anyway.

Getting started with Aibo’s visual programming

The first thing to know about Sony’s approach to Aibo programming is that you don’t have access to everything. We’ll get into this more later, but in general, Aibo’s “personality” is completely protected and cannot be modified:

When you execute the program, Aibo has the freedom to decide which specific behavior to execute depending on his/her psychological state. The API respects Aibo's feelings so that you can enjoy programming while Aibo stays true to himself/herself.

This is a tricky thing for Sony, since each Aibo “evolves” its own unique personality, which is part of the appeal. Running a program on Aibo risks very obviously turning it from an autonomous entity into a mindless robot slave, so Sony has to be careful to maintain Aibo’s defining traits while still allowing you to customize its behavior. The compromise that they came up with is mostly effective, and when Aibo runs a program, it doesn’t disable its autonomous behaviors but rather adds the behaviors you’ve created to the existing ones. 

Aibo’s visual programming system is based on Scratch. If you’ve never used Scratch, that’s fine, because it’s a brilliantly easy and intuitive visual language to use, even for non-coders. Sony didn’t develop it—it’s a project out of MIT, and while it was originally designed for children, it’s great for adults who don’t have coding experience. Rather than having to type in code, Scratch is based around colorful blocks that graphically represent functions. The blocks are different shapes, and only fit together in a way that will yield a working bit of code. Variables appear in handy little drop-down menus, and you can just drag and drop different blocks to build as many programs as you want. You can even read through the code directly, and it’ll explain what it does in a way that makes intuitive sense, more or less:

A sample Aibo visual program from Sony. Screenshot: Evan Ackerman/IEEE Spectrum

Despite the simplicity of the visual programming language, it’s possible to create some fairly complex programs. You have access to control loops like if-then-else and wait-until, and multiple loops can run at the same time. Custom blocks allow you to nest things inside of other things, and you have access to variables and operators. Here’s a program that I put together in just a few minutes to get Aibo to entertain itself by kicking a ball around:

A program I created to make Aibo chase a ball around. Screenshot: Evan Ackerman/IEEE Spectrum

This program directs Aibo to respond to “let’s play” by making some noises and motions, locating and approaching its ball, kicking its ball, and then moving in some random directions before repeating the loop. Petting Aibo on its back will exit the loop.
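For readers who think in code rather than blocks, the logic of that visual program maps onto something like the following Python sketch. Everything here is hypothetical: the `aibo` object and its methods are stand-ins for Sony’s Scratch blocks, which don’t actually expose a Python interface like this.

```python
import random

def run_play_loop(aibo, max_rounds=10):
    """Rough translation of the ball-chasing visual program.
    `aibo` is a hypothetical object standing in for the Scratch blocks."""
    aibo.wait_for_voice_command("let's play")
    rounds = 0
    # Loop until Aibo is petted on the back (the exit condition in the
    # visual program), with a safety cap on iterations.
    while not aibo.is_touched("back") and rounds < max_rounds:
        aibo.bark()                      # make some noises and motions
        aibo.approach("ball")            # locate and walk up to the ball
        aibo.kick("ball")
        aibo.turn(random.choice([-90, -45, 45, 90]))  # wander randomly
        aibo.move_forward()
        rounds += 1
```

The point of the sketch is just to show how little logic is involved: one voice trigger, one loop, one touch-based exit condition.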

Programming Aibo: What you can (and can’t) do

It’s a lot of fun to explore all of Aibo’s different behaviors, although if you’re a new user, it does minimize a bit of the magic to see this big long list of everything that Aibo is capable of doing. The granularity of some of the commands is a little weird—there’s a command for “gets close to” an object, as well as a command for “gets closer to” an object. And rather than give you direct access to Aibo’s servos to convey emotions or subtle movement cues, you’re instead presented with a bewildering array of very specific options, like:

  • Aibo opens its mouth a little and closes it
  • Aibo has an “I get it” look
  • Aibo gives a high five with its right front paw
  • Aibo faces to the left petulantly
  • Aibo has a dream of becoming a human being and runs about

Unfortunately, there’s no way to “animate” Aibo directly—you don’t have servo-level control, and unlike many (if not most) programmable robots, Sony hasn’t provided a way for users to move Aibo’s servos and then have the robot play back those motions, which would have been simple and effective.

Running one of these programs can be a little frustrating at times, because there’s no indication of when (or if) Aibo transitions from its autonomous behavior to your program—you just run the program and then wait. Sony advises you to start each program with a command that puts Aibo’s autonomy on hold, but depending on what Aibo is in the middle of doing when you run your program, it may take it a little bit to finish its current behavior. My solution for this was to start each program with a sneeze command to let me know when things were actually running. This worked well enough I guess, but it’s not ideal, because sometimes Aibo sneezes by itself.


The biggest restriction of the visual programming tool is that, as far as I can tell, there’s no direct method of getting information back from Aibo—you can’t easily query the internal state of the robot. For example, if you want to know how much battery charge Aibo has, there’s a sensing block for that, but the best you seem to be able to do is have Aibo do specific things in response to the value of that block, like yapping a set number of times to communicate its charge. More generally, it can be tough to write more interactive programs, because it’s hard to tell when, if, why, or how such programs are failing. From what I can tell, there’s no way to “step” through your program, or to see which commands are being executed when, making it very hard to debug anything complicated. And this is where the API comes in handy, since it does give you explicit information back.

Aibo API: How it works

There’s a vast chasm between the Aibo visual programming language and the API. Or at least, that’s how I felt about it. The visual programming is simple and friendly, but the API just tosses you straight into the deep end of the programming pool. The good news is that the majority of the stuff that the API allows you to do can also be done visually, but there are a few things that make the API worth having a crack at, if you’re willing to put the work in.

The first step to working with the Aibo API is to get a token, which is sort of like an access password for your Sony Aibo account. There are instructions about how to do this that are clear enough, because it just involves clicking one single button. Step two is finding your Aibo’s unique device ID, and I found myself immediately out of my comfort zone with Sony’s code example of how to do that:

$ curl -X GET https://public.api.aibo.com/v1/devices \
  -H "Authorization:Bearer ${accessToken}"

As it turns out, “curl” (or cURL) is a common command line tool for sending and receiving data via various network protocols, and it’s free and included with Windows. I found my copy in C:\Windows\System32. Being able to paste my token directly into that bit of sample code and have it work would have been too easy—after a whole bunch of futzing around, I figured out that (in Windows) you need to explicitly call “curl.exe” in the command line and that you have to replace “${accessToken}” with your access token, as opposed to just the bit that says “accessToken.” This sort of thing may be super obvious to many people, but it wasn’t to me, and with the exception of some sample code and a reasonable amount of parameter-specific documentation, Sony itself offers very little hand-holding. But since figuring this stuff out is my job, on we go!
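If you’d rather avoid curl’s quoting pitfalls altogether, the same device-ID lookup can be done from Python’s standard library. A caveat on what follows: the endpoint path and the response shape (`devices` list with a `deviceId` field) reflect my reading of Sony’s sample code, so treat them as assumptions and check them against the current developer documentation. The HTTP opener is injectable purely so the function can be exercised without a live Aibo account.

```python
import json
import urllib.request

# NOTE: base URL taken from Sony's sample code at the time of writing;
# verify against the current Aibo developer documentation.
BASE_URL = "https://public.api.aibo.com/v1"

def get_device_id(access_token, opener=urllib.request.urlopen):
    """Fetch the list of devices on the account and return the first
    device's ID. `opener` is injectable so the call can be tested
    without touching the network."""
    req = urllib.request.Request(
        BASE_URL + "/devices",
        headers={"Authorization": "Bearer " + access_token},
        method="GET",
    )
    with opener(req) as resp:
        body = json.loads(resp.read().decode("utf-8"))
    return body["devices"][0]["deviceId"]
```

Note that the token is passed as a plain string argument, which neatly sidesteps the `${accessToken}` substitution confusion from the shell version.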

I don’t have a huge amount of experience with APIs (read: almost none), but the way that the Aibo API works seems a little clunky. As far as I can tell, everything runs through Sony’s Aibo server, which completely isolates you from the Aibo itself. As an example, let’s say we want to figure out how much battery Aibo has left. Rather than just sending a query to the robot and getting a response, we instead have to ask the Aibo server to ask Aibo, and then (separately) ask the Aibo server what Aibo’s response was. Literally, the process is to send an “Execute HungryStatus” command, which returns an execution ID, and then in a second command you request the result of that execution ID, which returns the value of HungryStatus. Weirdly, HungryStatus is not a percentage or a time remaining, but rather a string that ranges from “famished” (battery too low to move) to “hungry” (needs to charge) to “enough” (charged enough to move). It’s a slightly strange combination of letting you get deep into Aibo’s guts while seemingly trying to avoid revealing that there’s a robot under there.

Example of the code required to determine Aibo’s charge. (I blurred areas showing my Aibo’s device ID and token.) Screenshot: Evan Ackerman/IEEE Spectrum
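That two-step dance of execute-then-fetch is easy to get wrong by hand, so here is the general shape of it in Python. The endpoint paths, field names, and status string are my assumptions based on the pattern Sony’s documentation describes, not verified API details; the transport is again injectable so the sketch can be tested without a live robot.

```python
import json
import time
import urllib.request

BASE_URL = "https://public.api.aibo.com/v1"  # assumed base URL

def _call(method, path, token, payload=None, opener=urllib.request.urlopen):
    """Send one authenticated request to the Aibo server, return parsed JSON."""
    data = json.dumps(payload).encode() if payload is not None else None
    req = urllib.request.Request(
        BASE_URL + path,
        data=data,
        headers={"Authorization": "Bearer " + token},
        method=method,
    )
    with opener(req) as resp:
        return json.loads(resp.read().decode("utf-8"))

def get_hungry_status(token, device_id, opener=urllib.request.urlopen,
                      poll_interval=1.0, max_polls=10):
    """Ask the Aibo server to ask Aibo how hungry (low on battery) it is,
    then poll for the answer. Returns a string such as 'enough'."""
    # Step 1: request execution of the HungryStatus capability
    # (hypothetical path, modeled on Sony's execute-by-capability pattern).
    exec_resp = _call(
        "POST",
        f"/devices/{device_id}/capabilities/hungry_status/execute",
        token, payload={}, opener=opener)
    execution_id = exec_resp["executionId"]
    # Step 2: poll the server for the result of that execution.
    for _ in range(max_polls):
        result = _call("GET", f"/executions/{execution_id}", token,
                       opener=opener)
        if result.get("status") == "SUCCEEDED":
            return result["result"]["hungry"]
        time.sleep(poll_interval)
    raise TimeoutError("Aibo server never returned a result")
```

Wrapping the execute/poll pair in one function like this is what makes the server-mediated design tolerable: callers just see a single “ask Aibo a question” call.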

Anyway, back to the API. I think most of the unique API functionality is related to Aibo’s state—how much is Aibo charged, how sleepy is Aibo, what is Aibo perceiving, where is Aibo being touched, that sort of thing. And even then, you can kludge together ways of figuring out what’s going on in Aibo’s lil’ head if you try hard enough with the visual programming, like by turning battery state into some number of yaps.

But the API does also offer a few features that can’t be easily replicated through visual programming. Among other things, you have access to useful information like which specific voice commands Aibo is responding to and exactly where (what angle) those commands are coming from, along with estimates of distance and direction to objects that Aibo recognizes. Really, though, the value of the API for advanced users is the potential of being able to have other bits of software interact directly with Aibo.

API possibilities, and limitations

For folks who are much better at programming than I am, the Aibo API does offer the potential to hook in other services. A programming expert I consulted suggested that it would be fairly straightforward to set things up so that (for example) Aibo would bark every time someone sends you a tweet. Doing this would require writing a Python script and hosting it somewhere in the cloud, which is beyond the scope of this review, but not at all beyond the scope of a programmer with modest skills and experience, I would imagine.
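The glue itself is genuinely small. Below is a minimal sketch of the shape such a bridge might take: `execute_action` stands in for whatever wrapper you build around the Aibo Web API, and the event dictionary format is invented for illustration, not anything Twitter or Sony actually sends.

```python
def on_event(event, execute_action):
    """Hypothetical bridge between an external service and Aibo.
    `event` is a dict delivered by (say) a webhook; `execute_action`
    is your wrapper around the Aibo Web API's execute endpoint."""
    if event.get("type") == "tweet":
        # Bark once for every incoming tweet.
        return execute_action("bark")
    return None  # ignore everything else
```

The real work in such a project is the hosting and authentication plumbing around this function, not the Aibo side of it.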

Fundamentally, the API means that just about anything can be used to send commands to Aibo, and the level of control that you have could even give Aibo a way to interact with other robots. It would just be nice if it was a little bit simpler, and a little more integrated, since there are some significant limitations worth mentioning.

For example, you have only indirect access to the majority of Aibo’s sensors, like the camera. Aibo will visually recognize a few specific objects, or a general “person,” but you can’t add new objects or differentiate between people (although Aibo can do this as part of its patrol feature). You can’t command Aibo to take a picture. Aibo can’t make noises that aren’t in its existing repertoire, and there’s no way to program custom motions. You also can’t access any of Aibo’s mapping data, or command it to go to specific places. It’s unfortunate that many of the features that justify Aibo’s cost, and differentiate it from something that’s more of a toy, aren’t accessible to developers at this point.

Aibo’s API gives users access to, among other things, specific voice commands the robot is responding to and exactly where (what angle) those commands are coming from, along with estimates of distance and direction to objects that Aibo recognizes. Photo: Evan Ackerman/IEEE Spectrum

Aibo’s programmability: The future

Overall, I appreciate the approach that Sony took with Aibo’s programmability, making it accessible to both absolute beginners as well as more experienced developers looking to link Aibo to other products and services. I haven’t yet seen any particularly compelling examples of folks leveraging this capability with Aibo, but the API has only been publicly available for a month or two. I would have liked to have seen more sample programs from Sony, especially more complex visual programs, and I would have really appreciated a gentler transition over to the API. Hopefully, both of these things can be addressed in the near future.

There’s a reluctance on Sony’s part to give users more control over Aibo. Some of that may be technical, and some of it may be privacy-related, but there are also omissions of functionality and limitations that don’t seem to make sense. I wonder if Sony is worried about risking an otherwise careful compromise between a robot that maintains its unique personality, and a robot that can be customized to do whatever you want it to do. As it stands, Sony is still in control of how Aibo moves, and how Aibo expresses emotions, which keeps the robot’s behavior consistent, even if it’s executing behaviors that you tell it to. 

At this point, I’m not sure that the Aibo API is full-featured and powerful enough to justify buying an Aibo purely for its developer potential, especially given the cost of the robot. If you already have an Aibo, you should definitely play with the new programming functions, because they’re free. I do feel like this is a significant step in a very positive direction for Sony, showing that they’re willing to commit resources to the nascent Aibo developer community, and I’m very much looking forward to seeing how Aibo’s capabilities continue to grow.

Aibo deserves a rest! Photo: Evan Ackerman/IEEE Spectrum

Thanks to Sony for lending us an Aibo unit for the purposes of this review. I named it Aibo, and I will miss its blue eyes. And special thanks to Kevin Finn for spending part of his holiday break helping me figure out how Aibo’s API works. If you need help with your Aibo, or help from a professional software engineer on any number of other things, you can find him here.

[ Aibo Developer Site ]
