Reading The Unreadable

Data forensics tools leave the lab and enter the marketplace


With more and more of our lives recorded on CDs and DVDs--wedding photos, the kid's chorus concert, home movies--it's inevitable that some memories will be irrevocably lost to scratched, chipped, or even broken disks.

Luckily, there are now tools to retrieve the hitherto unretrievable. The husband-and-wife team of Paul and Michelle Crowley sells a suite of data recovery software for damaged disks through their company, InfinaDyne, in Grayslake, Ill.

Most home movies, photos, and data can be recovered with CD/DVD Diagnostic, a US $50 utility. The software can extract files and data directly from the disk, bypassing damaged or incorrectly written portions. Although some data may be irretrievably lost, the software will recover as much as it can from the rest of the disk.
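The article does not describe how CD/DVD Diagnostic works internally, so the following is only a minimal sketch of the general idea of reading around damage. It assumes a hypothetical `read_sector` callable that raises `OSError` for an unreadable sector, and the 2048-byte user-data sector size used by data CDs and DVDs. The names `salvage` and `fake_drive` are invented for illustration.

```python
from typing import Callable, List, Tuple

SECTOR_SIZE = 2048  # user-data bytes per sector on a data CD or DVD


def salvage(read_sector: Callable[[int], bytes],
            sector_count: int) -> Tuple[bytes, List[int]]:
    """Read every sector in order; zero-fill the ones that fail and list them."""
    recovered = bytearray()
    bad_sectors: List[int] = []
    for index in range(sector_count):
        try:
            data = read_sector(index)
        except OSError:
            # Keep going instead of aborting: pad the gap so later
            # sectors stay at their original offsets in the image.
            data = b"\x00" * SECTOR_SIZE
            bad_sectors.append(index)
        recovered += data
    return bytes(recovered), bad_sectors


def fake_drive(index: int) -> bytes:
    """Hypothetical stand-in for a drive whose third sector is scratched."""
    if index == 2:
        raise OSError("unreadable sector")
    return bytes([index]) * SECTOR_SIZE


image, bad = salvage(fake_drive, 4)
print(len(image), bad)
```

Padding the gaps rather than dropping them keeps every surviving sector at its original offset, so files that lie entirely outside the damaged region can still be located.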

Can Diagnostic read a disk that's been cracked in half? It's happened. In desperation, one customer taped the pieces together and got his data back. "If the drive will read the disk," says Paul Crowley, "we'll get more off it than anyone else can."

The company has another product, CD/DVD Inspector, geared to the needs of law enforcement. The data recovery capabilities are the same, but Inspector has additional forensic scanning tools that tell investigators more about the data on a disk. A recently added feature lets the software work with disk loaders, so that a number of disks can be scanned at once.

Crowley says law enforcement agencies increasingly must catalog and manage large numbers of disks. "A police department might seize 1000 disks and find incriminating evidence on 10 of them," he says. "A year from now, it will have to produce just those 10 and say when and where they were captured, and how they were kept in police custody the whole time."

Law enforcement organizations, from the U.S. Federal Bureau of Investigation to police departments in Hong Kong, Germany, and the United Kingdom, are using Inspector. So are the accounting firm Deloitte & Touche, Amtrak, and the U.S. Environmental Protection Agency. Single-user pricing begins at US $349, but most customers buy a site or organization license.

Reading a file is just the first stage in understanding it. PlatinumSolutions Inc., of Reston, Va., takes the next step in data forensics. Instead of focusing on the physical recovery of files from damaged disks, its software tries to recover as much information as possible from healthy ones. It reads obscure and obsolete file formats and creates a database of information about all the files on a disk. It doesn't, however, decode encrypted files, which requires specialized equipment.

Even unencrypted files can pose a challenge. Adam Rossi, president of PlatinumSolutions, points out that simply knowing that the name "Rossi" appears somewhere in a file doesn't tell you much. Knowing that the file consists of e-mail messages tells you a bit more. And knowing that the name occurs in the "From" field of an e-mail message can be critically informative. Such metadata--information about the information being recovered--can be unexpectedly helpful.
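Rossi's distinction between a name that merely appears in a file and a name that appears in a specific field can be illustrated with Python's standard `email` module. This is a sketch, not the company's method: the sample message, its addresses, and the helper `name_locations` are all invented for the example.

```python
from email import message_from_string
from email.utils import parseaddr

# Invented sample message; the addresses are placeholders.
RAW_MESSAGE = """\
From: A. Rossi <rossi@example.com>
To: someone@example.com
Subject: Lunch

Rossi's is a fine place to eat.
"""


def name_locations(raw: str, name: str) -> dict:
    """Report where a name occurs: anywhere, in the From field, in the body."""
    msg = message_from_string(raw)
    needle = name.lower()
    display_name, address = parseaddr(msg.get("From", ""))
    return {
        "anywhere": needle in raw.lower(),
        "in_from_field": needle in display_name.lower()
        or needle in address.lower(),
        "in_body": needle in msg.get_payload().lower(),
    }


result = name_locations(RAW_MESSAGE, "Rossi")
print(result)
```

A plain text search would only ever answer the first question. Parsing the message first is what lets an investigator ask the other two.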

"It turns out, for example, that video files have tons of metadata, down to camera type," Rossi says. "If you're a police detective in a child pornography case, being able to quickly identify every image file as such is helpful. But being able to find every image where the camera type is a Canon 105, because that's what a particular suspect owns, might provide the clue that breaks an investigation wide open."

A 10-gigabyte disk drive, small by today's standards, typically holds about 800 000 files, says Rossi. The average file has about 15 metadata elements, which works out to roughly 12 million elements for such a drive. Documents and images may have as many as 30, and video files can have hundreds of elements that record such things as the time and creator of a file, along with such technical information as image size.

The PlatinumSolutions software, called the Evidentiary Metadata System, has a database of about 3000 file types. Possibly the single largest such collection in the world, it acts as a Rosetta stone in revealing the meaning of file metadata.
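The article does not say how the Evidentiary Metadata System recognizes its file types. One widely used approach, shown here purely as an assumption, is to match the first bytes of a file against a table of known signatures. The five formats below are common, well-documented examples; the table and the function `identify` are illustrative and are not taken from the product.

```python
# Leading byte signatures for a few common formats.
SIGNATURES = [
    (b"\x89PNG\r\n\x1a\n", "PNG image"),
    (b"\xff\xd8\xff", "JPEG image"),
    (b"GIF87a", "GIF image"),
    (b"GIF89a", "GIF image"),
    (b"%PDF-", "PDF document"),
    (b"PK\x03\x04", "ZIP archive"),
]


def identify(header: bytes) -> str:
    """Return a label for the first matching signature, or 'unknown'."""
    for magic, label in SIGNATURES:
        if header.startswith(magic):
            return label
    return "unknown"


print(identify(b"%PDF-1.4\n"))
print(identify(b"plain text"))
```

Identifying the type is only the entry point. Once the format is known, a format-specific parser is still needed to pull out the metadata fields themselves.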

Rossi worries about our diminishing ability to read obsolete file types and media formats, such as old magnetic tapes and floppy disks. He would like to see an organization like the U.S. National Institute of Standards and Technology maintain a public archive of file types and their associated metadata. In the meantime, government entities such as the FBI and the U.S. Department of Defense will have to rely on private vendors like PlatinumSolutions. The company typically customizes its software and will quote prices by request.
