Steve Mann’s Better Version of Reality
His vision systems have been mediating and augmenting reality for three decades
Steven Cherry: Hi, this is Steven Cherry for IEEE Spectrum’s “Techwise Conversations.”
Google Glass is much in the news these days. So is Steve Mann. He’s the University of Toronto electrical engineer who’s been wearing a computerized headset for more than 30 years now.
People are looking to him to get an idea of what life with Google glasses would be like, but the two are, so far, doing mostly different things. Google Glass will be a kind of information overlay on the physical world, able to answer queries and display information. Steve Mann does some of that as well, but mainly he’s altering vision itself, in part by making different spectral bands available. He can see that an empty classroom seat was recently sat in, for example, by its heat signature. He can stare into car headlights because his headset filters most of the light out. He can zoom in on things far away.
Both systems, Google Glass and Steve Mann’s headset, can record what you see, and for Steve Mann, that means he has a near continuous record of well over half his life now, and that’s something to ask him about. In fact, we have a lot to ask him about to augment, so to speak, a hugely popular article he wrote for Spectrum in our March issue. So we have him on the show today to talk more about his unique life experience of mediating reality 24/7. He joins us by phone.
Steve, welcome to the podcast.
Steve Mann: Hello.
Steven Cherry: You started in the 1970s; you were still a teenager. To set the scene, personal computers barely existed. The Apple II was just coming out. The Commodore 64 wasn’t yet invented. What was your first system like?
Steve Mann: So I started...I guess if I go back even earlier, I learned to weld when I was about 4 years old. You know, my grandfather taught me welding. And so I understood the world of light and looking through glass to see things better. And so the idea of seeing and understanding the world through a piece of glass kind of is something that fascinated me, being able to see those things.
So I kind of experimented with originally kind of a system that you might envision as something that would replace the welder’s glass and help people see better in general. And so I had this kind of photographic vision, if you will, of a camera feeding a display and seeing the world modified by, you know, in an electrical sense by some sort of computation or signal processing, as very rudimentary and early as it would have been in those days.
And then I also, as I was playing around with computers back in the 1970s, I built a wearable computer system that allowed me to see the world photographically and to understand the world that way.
But the idea is that the computer has text and graphics and other elements with video and imagery and the camera image. And kind of I played around a lot with overlays and what I called “mediated reality” or what we might call “augmediated reality,” which is to augment as well as possibly diminish things where sight can be improved by managing what comes in, not just adding, merely adding to it.
Steven Cherry: Your headsets are pretty sleek nowadays, but even well into the 1990s, they were pretty bulky and heavy.
Steve Mann: Well, some of the early systems that I built were in, you know, large helmets, like welding helmets and things like that. But the system, I guess by the 1980s, you know, I had them in eyeglass format. And then by the 1990s, I had what we called digital eyeglass, which was just a strip of aluminum running across the face with two little nose pads for it and a little glasslike thing kind of over the right eye.
Steven Cherry: But the rest of the system was sort of in a backpack, or I guess at one point you had sort of like a belly pack.
Steve Mann: Yeah. I mean, some of the systems, it was a small computer that fits in a shirt pocket or something like that, that processes the information. That’s kind of varied over the years. Some of them were built entirely into the eyeglass device itself, depending on the amount of computation required.
Steven Cherry: Yeah, you’ve been riding the curve of Moore’s Law, I guess experiencing it personally in a way that maybe no one else has. I imagine a new processor or radio technology could change your life quite a bit.
Steve Mann: Yeah. I guess you could almost—I would almost think of this as Less’s Law, you know, the size of things decreases exponentially for the same amount of computation required.
Steven Cherry: Wireless communication has been a big part of the challenge of building your systems?
Steve Mann: Well, it was in the early days. See, in the very early days it wasn’t possible to have that much computation on my body, so I transmitted my video live wirelessly and then processed it on a supercomputer at a remote facility and then had—and then took that data and sent it back to my eye.
Steven Cherry: You’re not seeing the world the same way that people around you are, even your wife, your kids, your students. You called this “mediated reality,” and I think that phrase hints at something. Is there a psychological distance or a loss of social intimacy?
Steve Mann: I guess what happens with mediated reality is that the idea here is to help people see better, not just to overlay material. So you have to understand, I guess, that augmented reality is a proper subset of mediated reality. Mediated reality is a more general concept, which includes augmented reality as one case or one embodiment.
Steven Cherry: But you do kind of...I guess my question is, Does it create a sort of psychological distance even when you’re sort of in maybe an intimate family setting in your living room?
Steve Mann: Well, what it does is it allows you to see better. That’s something that I think the current makers of a lot of these products have neglected to add in. You see, originally I just had something that overlaid material in the 1970s, just overlaying material. And I found that that resulted in a lot of information overload and confusion and so on. So I was looking for something that would actually help people see better in addition to having the capability to overlay. So I was looking for something that has the capacity to a) overlay information and so on, and b) actually help people see better by mediating or managing.
So one of the examples is, things that I came up with, was something called high dynamic range—HDR, which I invented back in the ’80s and filed some patents at MIT in the early ’90s on this, which is to grab differently exposed images of the same subject matter in rapid succession, like dark, medium, light, dark, medium, light, one right after the other in rapid succession, and then calculate the actual amount of light received, and then redraw that onto the retina in a more clear way so that a person could see much better and understand the world better.
So that if you’re welding something or looking into car headlights, you can still see the dark areas of the scene or recognize the driver’s face or see the license plate, so the idea of improving vision in this sense. And also, other things like the wearable face recognizer, I did about 20 years ago, which recognizes a person’s face from a database and prints their name on the glass so that you can see their name hovering as if it was a virtual name tag, so that it helps people recognize others or recognize objects and things like that.
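The HDR idea described in this answer (capture dark, medium, and light exposures of the same scene, then calculate the actual amount of light received) can be illustrated with a minimal sketch. It is not Mann's implementation: it assumes an idealized linear sensor that clips at 1.0, and the names `hat_weight`, `estimate_light`, `tone_map`, and `simulate_capture`, along with the exposure times and light levels, are hypothetical choices made for illustration.

```python
def hat_weight(value):
    """Trust mid-range readings most; weight is 0 at the clipped extremes (value in [0, 1])."""
    return 1.0 - abs(2.0 * value - 1.0)


def estimate_light(exposures):
    """Merge differently exposed readings of one pixel into a single light estimate.

    exposures: list of (pixel_value, exposure_time) pairs, pixel_value in [0, 1].
    Assumption: a linear sensor, so pixel_value = light * exposure_time until it clips at 1.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for value, time in exposures:
        w = hat_weight(value)
        weighted_sum += w * (value / time)  # each reading's own estimate of the light
        weight_total += w
    if weight_total == 0.0:
        # Every reading is clipped at black or white; fall back to the shortest exposure.
        value, time = min(exposures, key=lambda e: e[1])
        return value / time
    return weighted_sum / weight_total


def tone_map(light):
    """Compress an unbounded light estimate into [0, 1) for display."""
    return light / (1.0 + light)


def simulate_capture(light, time):
    """Hypothetical linear sensor that saturates at 1.0."""
    return min(1.0, light * time)


# Dark, medium, and light exposures of the same scene, as in the description above.
times = [0.01, 0.1, 1.0]
headlight = [(simulate_capture(8.0, t), t) for t in times]   # very bright region
shadow = [(simulate_capture(0.05, t), t) for t in times]     # very dark region

print(tone_map(estimate_light(headlight)), tone_map(estimate_light(shadow)))
```

In this sketch the longest exposure of the headlight saturates and receives zero weight, so the estimate comes from the shorter exposures, while the shadow is recovered mostly from the longest one. That is why a bright source and a dark area can both remain visible in the same merged view. Real cameras have a nonlinear response that would need to be estimated or calibrated first.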
Steven Cherry: Yeah. And I think that’s the kind of thing that people are expecting Google Glass to do. And your system also does something that it does, which is I noticed it overlaid street names, for example, when you’re just walking around.
Steve Mann: Yeah. Yeah, that’s one of the things that my glass was doing. I had that running—how long ago did we have that? Well, when [Google] Street View first came out and we had that running on my eyeglass for sure, so you can see what is actually there through the glass.
Steven Cherry: Google is still in the process of working out the design and specs of Google Glass, but I gather you think that Google is making some mistakes, and visual focus is one of them.
Steve Mann: Yeah. I think one issue is that the camera I found back in the ’70s, I had just a camera hooked to a display, Generation-1 glass, just a camera feeding into a display where the camera’s separate from the eye. And in the Generation-2 digital eyeglass, I had the camera as the eye itself, in effect, so that it’s not removed spatially from where your eye is. And that alleviated 90 percent of the dizzying, disorienting sort of problems.
Then there’s the remaining issue of focus, you know, where one eye’s focusing on a different distance than the other eye when you’re looking through the glass. So then the Generation-3 digital eyeglass, I fixed that problem. And then finally, Generation-4 digital eyeglass has infinite depth of focus. So the Gen-4 glass is something that we completed here in 1998, 1999, sort of in the 1990s. And that system is what you see—you’ve seen probably some pictures of that, just an aluminum strip with two little nose pads and a little piece of glass over the right eye.
Steven Cherry: So you’ve published all of your results. It would be kind of odd if Google made some mistakes that you fixed in the ’90s.
Steve Mann: Yeah. Yeah, I’ve written lots of papers on it, written a couple of books on it. So it’s out there, and I’ve been teaching it for the last 14 years, I guess I’ve been teaching this digital eyeglass course.
Steven Cherry: Now, you’ve never tried to commercialize your systems. I’m wondering if you’re thinking about that now, especially now that Google is commercializing theirs.
Steve Mann: Well, we’ve had some commercial projects. As I said, you know, MIT filed for some patents for some of my inventions, which are originally. And then subsequent to that, we’ve had other companies working. We started a number of small companies. For example, we started…the brain-computer interface work we did, we started a little company. Some of my students got together and started a little company called InteraXon. Interaxon.ca is the website. And that company now has headquarters in the U.S. and in Canada, and is selling products internationally that, you know, brain-computer interface products and things like that based on some of this digital eyeglass work.
Steven Cherry: One striking feature of Google’s system is that it relies on the cloud for a lot of its processing. I’m wondering if you’ve thought about that and experimented with it.
Steve Mann: Well, the early systems I built relied entirely on the cloud, basically, because the computers weren’t fast enough locally. But then, as computing became faster locally, I started to actually be able to build the processing power in the glass device itself to do the imaging.
Steven Cherry: In 2005, Spectrum had a cover story profiling Gordon Bell and his MyLifeBits system. He had digitized all the paper in his life, everything from his personal library to old tax returns. But the most interesting part of it was a wearable camera that recorded much of his day. He only wore it for about a year at the point of the article, and you’ve been wearing cameras and taking still pictures and video for most of your life—the only human in the world who can say that. What’s it like to have that kind of record, and how often do you go back to the videotape, so to speak?
Steve Mann: So, it’s useful. I mean, as I say, I was, for example, the victim of a hit-and-run, and it turned out useful. I was a pedestrian. But, you know, we now accept sort of dashcams in cars, very common, and nobody questions that. And so now the camera on the body—I call it the “on-bod” camera instead of the onboard camera you’d have in a car. And so it’s a similar function, and it turned out to be quite useful in that hit-and-run, for example. The idea of cameras on people is an interesting idea in general.
Steven Cherry: The ACLU says that the answer to unpopular speech is more speech. I gather you think the answer to the surveillance society—the CCTV and GPS tracking and cellphone monitoring and black boxes in cars and credit card tracking, and soon there’ll be 10 000 drones with cameras in the skies—I gather you think the answer to the surveillance society is more surveillance with, like, 300 million people recording our daily lives in the U.S.A.
Steve Mann: Well, that’s not surveillance. Don’t forget that surveillance means watching from above. So when somebody’s holding a camera or wearing it, that’s not surveillance; that’s not watching from above. That’s sousveillance.
Steven Cherry: Fair enough. But I think a lot of people use the word more generally to mean that, in particular, the authorities are watching people.
Steve Mann: Yes, that’s watching from above. So the police photographing citizens, that’s surveillance. When citizens photograph the police, that’s more like sousveillance.
Steven Cherry: And you would prefer to live in that world, I gather.
Steve Mann: I would prefer to live in a world which, shall we call a world of “equiveillance,” that is to say, a world in which there’s a balance between surveillance and sousveillance. You see, if you put cameras in only the east end, it pushes crime westward. But if you put cameras in the east and the west and the north and the south, throughout the whole city at street level, the crime has nowhere to move but up the ladder to corruption. In other words, you extinguish low-level street crime, or at least reduce it.
What I don’t think is healthy is where there’s a one-sided-veillance from above, that is to say, where there’s a God’s-eye view, there’s an eye-in-the-sky of the surveillance camera, and that’s a world in which the police take a godlike position, watching over us, but are not subject to any scrutiny. I would have to say that I guess surveillance corrupts, and absolute surveillance corrupts absolutely.
Steven Cherry: Would you want your own children to wear your system, and at what age would it be best to start?
Steve Mann: Well, that sort of depends. I would certainly say that—I sort of predicted that in the future. Well, I guess I predicted some things back in the ’70s. One thing is that everybody would be having a computer with them, a small computer they would carry around or wear, and that certainly came true. And the other thing that I predicted is at some point that eyeglasses will become digital eyeglass. So instead of glasses, we’d have—I predicted years and years ago that we’d all wear a digital eyeglass, you know, glass singular, not glasses plural, which is basically a glass that we look through that helps us see better. And I think that prediction will probably soon come true, and so that more and more people will wear some kind of eyeglass.
Steven Cherry: Even children.
Steve Mann: And, of course, the age threshold may be less and less.
Steven Cherry: You’ve lived your life in various ways different from everyone in the world. Some people enjoy being different, and I suspect you’re one of them, but it must be stressful too. Will it be a relief to you when someday everyone is mediating and augmenting reality?
Steve Mann: It may well be. I mean, there’s some of us, there’s, on the Glogger.mobi site, now we have more than 200 000 of us, I guess, using this—living in this world. But we’ve had small groups, you know, where we’ve had a dozen or so of us together in the same room all wearing digital eyeglass and sort of exchanging viewpoints. And it’s certainly kind of interesting, an interesting dynamic.
One of the things that’s really funny, I thought was really interesting, is we set up a protocol where one glass recognizes other glass. And we had, like, four sitting around a table, and so I would then be watching myself eating from several different viewpoints of all the other people. And one thing I found is it really helped me improve my table manners; all those things my mother used to tell me when I was a child about eating in a dignified manner suddenly made sense, because I could see how dignified or undignified I looked from other vantage points. And I think this will really change society if we have this constant sensibility of seeing the world from different viewpoints.
Steven Cherry: Well, writers, fiction writers, screenwriters spend a lot of time working on point of view, and it didn’t occur to me, but there could be some real societal changes due to having multiple or different points of view.
Steve Mann: Especially in real time. You know, when you see these things in real time—I mean, I call this “point of eye,” you know, because it’s really, you’re seeing out of somebody else’s eyes.
Steven Cherry: Very cool. Well, Steve, the IEEE charter includes the phrase “service to humanity.” I get to meet very few people who seem to have that as a personal charter, but you’re one of them. Thanks for your work, and thanks for joining us today.
Steve Mann: Yes. It’s been an honor to participate, and I hope to see lots of people out at the conference. I’m the general chair for the IEEE ISTAS 2013 conference, and certainly I want to live up to that motto. I think the IEEE is really…that motto’s a great motto to live by, sort of, you know, technology in service of humanity.
Steven Cherry: Very good. Thanks again.
Steve Mann: Great. Thanks for having me on.
Steven Cherry: We’ve been speaking with electrical and computer engineering professor Steve Mann about mediated reality and the vision systems of his that have predated Google Glass by more than three decades.
For IEEE Spectrum’s “Techwise Conversations,” I’m Steven Cherry.
NOTE: Transcripts are created for the convenience of our readers and listeners and may not perfectly match their associated interviews and narratives. The authoritative record of IEEE Spectrum’s audio programming is the audio version.