DIY Data Capture via Web Cam

Automate data collection with a camera and optical character recognition

24 Sep 2013

Photo: Paul Wallich

A while back, I got a cheap wireless LaCrosse brand weather station, with sensors for temperature, humidity, wind speed and direction, and rainfall. The stand-alone display shows weather conditions using seven-segment LCD numerals. The box the station came in promised that it “Connects to your PC!,” which appealed to me because I’d be able to automatically log the data and pass it around my home network. Well, it turns out that this promise could be honored only if I had a PC of the precise operating system and vintage that the station’s USB-to-wireless dongle called for. I’d also need just the right version of some proprietary weather software. Unfortunately, my Macintosh setup met none of these requirements.

Sure, if I had access to an already working installation (and a lot of time), I might have tried reverse engineering the embedded hardware and software. In theory, I could have figured out what the weather-station components were telling the dongle, what the dongle was telling the PC, and what commands might be flowing in the other direction. Then, with some more work and time, I could have reproduced that conversation on a Linux or OS X box. But I didn’t have the access or the time.

So I decided to make do with the station’s stand-alone display. I’ve been learning how to use OpenCV, the comprehensive image-processing library initially created at Intel and now supported mostly by Itseez and the open-source roboticists at Willow Garage. The library is designed to make it easy to extract information about a scene from the raw images coming from a camera. It’s available for many platforms, including Mac.

With such software at hand, and with webcams being cheap to the point of being disposable, I thought it should be a simple matter to take a picture of the screen, extract the text displaying the weather conditions, run the text through an optical character recognition (OCR) system to get data in numeric form, and log the result. Because I needed to make a log entry only every few minutes, I didn’t need a lot of computational horsepower, so I could do the processing using spare background cycles on my Mac.

An early issue was figuring out which of the many possible ways of using OpenCV would do the job most effectively. At first I thought that it would mostly be a matter of correcting the image for uneven lighting and for distortions caused by my camera’s perspective (my webcam is off to one side to let me still read the display by eye). Once I had something to pass to OCR software, the hard work would be done. However, it turns out that almost all free OCR software is designed for printed text, where each letter forms a continuous contour. It’s not at all good for numbers and letters made up of disconnected segments, as in my station’s display.

So then I was entranced by the idea of template matching: comparing small “model” images of digits with those from the webcam feed, seeing whether and where they matched, and collating that with the positions of the temperature, wind speed, and other indicators on the display. But that would have meant waiting until all the digits from 0 to 9 appeared on the local weather display at least once so I could save their images to a file. Or I could have drawn sample digits by hand, but OpenCV’s standard template-matching function is unforgiving of mismatches in size or orientation.
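
To make the idea concrete, here is a minimal pure-Python sketch of template matching by sum of squared differences, the same principle as OpenCV's matchTemplate with the TM_SQDIFF method. It is a stand-in for illustration, not the library's implementation and not code from this project; the function name match_template and the tiny sample arrays are assumptions.

```python
def match_template(image, template):
    """Slide the template over the image and return (score, (row, col))
    for the position with the lowest sum of squared differences.
    Both arguments are lists of rows of grayscale pixel values."""
    image_h, image_w = len(image), len(image[0])
    tmpl_h, tmpl_w = len(template), len(template[0])
    best = None
    for row in range(image_h - tmpl_h + 1):
        for col in range(image_w - tmpl_w + 1):
            score = sum(
                (image[row + dr][col + dc] - template[dr][dc]) ** 2
                for dr in range(tmpl_h)
                for dc in range(tmpl_w)
            )
            if best is None or score < best[0]:
                best = (score, (row, col))
    return best


# Hypothetical 3x3 "model" image pasted into an otherwise blank 8x8 frame.
template = [[0, 9, 0],
            [9, 9, 9],
            [0, 9, 0]]
image = [[0] * 8 for _ in range(8)]
for dr in range(3):
    for dc in range(3):
        image[2 + dr][3 + dc] = template[dr][dc]

score, position = match_template(image, template)
```

A score of zero means a pixel-for-pixel match. Because the comparison is made pixel by pixel at a single scale, a digit that appears slightly larger, smaller, or tilted in the camera frame no longer lines up with the model image, which is why hand-drawn samples are a poor fit for this method.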

Then I found a program called SSOCR, or Seven-Segment Optical Character Recognition, which was developed by Erik Auerswald of the Technical University of Kaiserslautern, in Germany. SSOCR makes it possible to paste one-time-password codes from a security fob into a Web page log-in screen. Optimized for the fob’s single fixed line of six digits, SSOCR turned out to be a bit too specialized. It requires a close-up image under unchanging lighting. However, the light on my weather station’s screen varies depending on the time of day, and the camera has to be far enough back to capture the station’s wider, multiline screen. So I stole some ideas about how to slice up seven-segment images from SSOCR and wrote my own recognizer, based simply on which segments had enough black pixels to be considered “on.”
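
A recognizer of this kind can be sketched in a few lines. The version below is an illustration under stated assumptions, not the code used for this project: the segment positions in SEGMENT_BOXES, the darkness cutoff, and the 50 percent "on" fraction are all hypothetical values, and render_digit exists only to produce synthetic test images.

```python
# Segment boxes as fractions of the digit cell: (x0, y0, x1, y1).
# Order: a top, b upper right, c lower right, d bottom,
# e lower left, f upper left, g middle. This layout is an assumption.
SEGMENT_BOXES = [
    (0.25, 0.0, 0.75, 0.15),     # a
    (0.8, 0.1, 1.0, 0.45),       # b
    (0.8, 0.55, 1.0, 0.9),       # c
    (0.25, 0.85, 0.75, 1.0),     # d
    (0.0, 0.55, 0.2, 0.9),       # e
    (0.0, 0.1, 0.2, 0.45),       # f
    (0.25, 0.425, 0.75, 0.575),  # g
]

# Standard seven-segment patterns, as (a, b, c, d, e, f, g) on/off states.
DIGITS = {
    (1, 1, 1, 1, 1, 1, 0): 0,
    (0, 1, 1, 0, 0, 0, 0): 1,
    (1, 1, 0, 1, 1, 0, 1): 2,
    (1, 1, 1, 1, 0, 0, 1): 3,
    (0, 1, 1, 0, 0, 1, 1): 4,
    (1, 0, 1, 1, 0, 1, 1): 5,
    (1, 0, 1, 1, 1, 1, 1): 6,
    (1, 1, 1, 0, 0, 0, 0): 7,
    (1, 1, 1, 1, 1, 1, 1): 8,
    (1, 1, 1, 1, 0, 1, 1): 9,
}


def box_pixels(box, width, height):
    """List the (row, col) pixels covered by one fractional box."""
    x0, y0, x1, y1 = box
    rows = range(int(y0 * height), int(y1 * height))
    cols = range(int(x0 * width), int(x1 * width))
    return [(r, c) for r in rows for c in cols]


def read_digit(cell, dark_below=128, on_fraction=0.5):
    """Decode one digit cell (rows of grayscale values, 0 = black).
    A segment counts as "on" when enough of its pixels are dark.
    Returns the digit, or None if the pattern is not a known digit."""
    height, width = len(cell), len(cell[0])
    states = []
    for box in SEGMENT_BOXES:
        pixels = box_pixels(box, width, height)
        dark = sum(1 for r, c in pixels if cell[r][c] < dark_below)
        is_on = bool(pixels) and dark / len(pixels) >= on_fraction
        states.append(1 if is_on else 0)
    return DIGITS.get(tuple(states))


def render_digit(states, width=20, height=40):
    """Draw a synthetic digit cell for testing: lit segments are black."""
    cell = [[255] * width for _ in range(height)]
    for lit, box in zip(states, SEGMENT_BOXES):
        if lit:
            for r, c in box_pixels(box, width, height):
                cell[r][c] = 0
    return cell
```

For example, read_digit(render_digit((0, 1, 1, 0, 0, 1, 1))) returns 4. On real images, the cutoff would have to track the lighting, since the display's contrast changes over the day.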

I still had to correct for the camera’s perspective, which I thought would be easy, as software for solving this exact problem is already available. However, the weather-station display is made of gray plastic, and the LCD is dark gray on light gray, so there aren’t any distinctive points for the corner-finding routines normally used for this kind of correction to lock onto. Finally, I just cut out four little circles of red construction paper and glued them to the display; the correction routines have no trouble finding those. Once an image has been acquired, my OCR routine does its work, and the results are saved into a text file for further processing.
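
The geometry behind this kind of correction is a four-point perspective transform. OpenCV provides it through getPerspectiveTransform and warpPerspective; the sketch below instead works through the underlying arithmetic in pure Python so that it runs without any library. It is an illustration only, not the project's code, and the marker coordinates and the 400-by-300 target size are made-up values.

```python
def solve_linear(a, b):
    """Solve a small linear system by Gauss-Jordan elimination
    with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("marker points are degenerate")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def perspective_transform(src, dst):
    """Find the eight coefficients h0..h7 that map each of four source
    points (x, y) onto its destination point (u, v)."""
    a, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        a.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        b.append(u)
        a.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.append(v)
    return solve_linear(a, b)


def apply_transform(h, point):
    """Map one camera-image point into the corrected image."""
    x, y = point
    w = h[6] * x + h[7] * y + 1
    return ((h[0] * x + h[1] * y + h[2]) / w,
            (h[3] * x + h[4] * y + h[5]) / w)


# Hypothetical centers of the four red markers as seen by the camera,
# clockwise from top left, and the rectangle they should map onto.
markers = [(112, 80), (530, 60), (560, 400), (90, 370)]
corners = [(0, 0), (400, 0), (400, 300), (0, 300)]

h = perspective_transform(markers, corners)
mapped = [apply_transform(h, p) for p in markers]
```

Each marker center lands on its target corner, up to floating-point rounding. With the transform in hand, every digit position on the display can be located at fixed coordinates in the corrected image, no matter where the camera sits.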

What am I going to do with all my weather data? Over the long term, I’m going to match it to the data from the nearest official weather station, so that I can figure out how the weather there correlates with the weather here. But the first thing I need to do is to create a weather page on my Mac’s personal Web server. Then I can use my tablet or phone to see just how stormy it is outside while I’m still lying in bed at the other end of the house from the station. The easiest way is to just fire up a browser, but with MIT (formerly Google) App Inventor, which allows drag-and-drop assembly of Android apps, I should need just a couple of hours to write a program—once I get yet another development environment set up.

This article originally appeared in print as “Point-and-Shoot Weather Data.”

Paul Wallich
