With a new computer network, automated investigative tools, and more channels for sharing information, the FBI hopes to finally know what it knows
What the FBI doesn’t know can kill you. That, at least, is what we’ve been led to believe since 9/11.
Had agents in Minnesota been allowed to search Zacarias Moussaoui’s computer, had the Phoenix office memo warning of Al Qaeda members enrolling at U.S. aviation schools filtered up the chain of command, had the internal computer databases done anything but the most rudimentary searches, maybe, just maybe, things might have gone much differently 18 months ago. But for the failure to connect those proverbial dots, 3000 lives might have been saved.
The idea that 9/11 could have been prevented heightens the tragedy, of course, but also invites all kinds of speculation: if we accept the premise, then there must exist some deliberate course of action that we should now take. Experts and officials have spent the last year and a half trying to figure out what that course should be, but have reached no clear agreement.
Understandably, much attention has focused on the Federal Bureau of Investigation (FBI), both for what it could have done but didn’t, and for what it should do now. The post-9/11 revelations of mistakes and mismanagement only underscored what many had said for years: that the bureau’s fundamental organizational, cultural, and technological deficiencies have bred a swarm of high-profile gaffes [see “FBI Under Fire”] and render it unsuited to the intricacies of fighting terrorism.
As for improving, under Robert S. Mueller III, sworn in as director a week before the September 2001 attacks, the bureau has announced a broad agenda of technological initiatives and long-sought organizational reforms. Some of that is merely catch-up, like upgrading the FBI’s antiquated computational infrastructure to an acceptable, but by no means advanced, standard. More exploratory efforts, though, in investigative data warehousing and information-sharing networks, could, if successful, place the agency for a change on the technological cutting edge. As the nation’s leading agency for domestic terrorism and federal law enforcement, Mueller has said, the FBI “should be the most technologically proficient investigative agency in the world.”
First, though, he’ll have to bring the bureau into the 21st century, technically as well as culturally, and beef up its capabilities in intelligence gathering and analysis. Then he’ll have to fend off a growing chorus of critics intent on divesting the bureau of its domestic intelligence responsibilities. Also to be addressed are the concerns of privacy and civil liberties advocates. With the FBI’s powers greatly expanded under the USA Patriot Act, they fear the innovations being put into place are just the first steps in setting up a police state.
Lastly, there are the many unresolved technical questions: is it really possible to build a system that can precisely identify a crime’s precursors, when the would-be perpetrators are doing their utmost to be untraceable and unpredictable? And is the FBI the right outfit to build such a system?
It’s the leadership
In September, a blue-ribbon panel of business leaders, lawyers, and academics known as the Markle Foundation Task Force on National Security concluded that “among other things, [the FBI] has failed to develop an adequate strategic plan, has no comprehensive strategic human capital plan, has personnel with inadequate language skills, antiquated computer hardware and software, no enterprise architecture, and several disabling cultural traditions.” Three months later, the Department of Justice’s inspector general weighed in with a scathing report about the FBI’s mismanagement of its information technology (IT) programs: “The FBI continues to spend hundreds of millions of dollars on IT projects without adequate assurance that these projects will meet their intended goals.”
For all the finger-pointing, though, nobody should have been surprised by the sorry state of the FBI’s computers. The problems date back at least 10 years. “I wrote a book in 1993 about the FBI and even then I was critical about their computer systems, the fact that they had to double up on computers, and the systems were very backward,” says Ronald Kessler, author of The Bureau: The Secret History of the FBI (St. Martin’s Press, 2002). “Some people say, ‘FBI agents don’t like computers,’ but that’s not true. They all use computers at home that in many cases are better than what they have at work. It’s not the agents, it’s the leadership.”
Kessler is optimistic about Mueller, an ex-Marine and former federal prosecutor who is reputed to be reform minded and tech friendly. He is said to carry a PDA, use e-mail, and make PowerPoint presentations from his laptop—activities his predecessor, Louis Freeh, seldom if ever engaged in. After becoming the U.S. attorney for San Francisco, Mueller led an overhaul of the office’s computer system for tracking cases; the new program, called Alcatraz, is now used by all U.S. attorneys. Upon arriving at the FBI, Mueller asked that Microsoft Office be installed on his desktop. “They told him, ‘We can put it on there, but it won’t be compatible with anything else in the FBI,’” says Kessler. “He hit the roof.”
The fall of 2001 saw the start of an ambitious program of modernization, which seems to recognize that the barriers that prevent the FBI from analyzing and sharing data are as much cultural as technological. As outlined by Mueller and other agency leaders in regular appearances before Congress, its elements include:
- Accelerating a bureau-wide overhaul of basic computer hardware, software, and network infrastructure. The three-year, US $534 million effort known as Trilogy will eventually give each of the 11 400 FBI agents and 16 400 other employees a Dell Pentium desktop PC running Microsoft Office, with secure, high-speed connections to FBI headquarters and hundreds of field and satellite offices. In early March, Mueller announced that the first phase of this upgrade had been completed.
- Replacing the FBI’s ancient DOS-based Automated Case Support (ACS) database with a more user-friendly Windows-based system that can search on not just text but also photos, video, and audio records. Known as the Virtual Case File system, it’s set to come online by the end of 2003.
- Developing Web versions of the bureau’s most commonly used investigative tools for accessing, organizing, and analyzing data.
- Hiring 350 intelligence analysts and 900 special agents, with special emphasis on those trained in the physical sciences, computer science, and engineering, as well as foreign languages, military intelligence, and counterterrorism.
- Initiating a pilot study for an information-sharing network among field offices located in St. Louis; San Diego, Calif.; Seattle, Wash.; Portland, Ore.; Norfolk, Va.; and Baltimore. If successful, it could link all field offices and other agencies, too.
- Creating a corps of reports officers (long part of the Central Intelligence Agency and other agencies) responsible for identifying and collecting intelligence from FBI investigations and sharing that information with the intelligence community.
A billion records and counting
Of the many announced reforms, probably the most provocative is the FBI’s plan to engage in investigative data mining and data warehousing, with a view to detecting and connecting the traces of terrorist and criminal activity. Details are still sketchy, but presumably it would copy the techniques the commercial sector uses to track and predict consumer behavior, prevent IT network break-ins, and so on. (Repeated requests for interviews for this article were declined or went unanswered by bureau press officers; FBI contractors referred all questions to the bureau.)
“Data warehousing involves connecting various datasets from various sources—transactional data from your Web site, demographics data from providers like Acxiom and Experian—and then using analytical software to detect patterns in the data, so that you can personalize the services you offer or detect fraud,” explains data mining expert Jesus Mena.
There are two basic approaches, he says. “In the first, you look for outliers or deviations, things that are way outside normal behavior—somebody trying to access a computer network in the middle of the night, for example. The other is where you have a pattern of known activity and you have a signature that you try to match.”
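The two approaches Mena describes can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the login hours, the event names, the two-standard-deviation threshold, and the "signature" itself are invented, and nothing here reflects any tool the FBI or its vendors actually use.

```python
from statistics import mean, stdev

# Hypothetical login hours (0-23) for one account; invented for illustration.
login_hours = [9, 10, 9, 11, 10, 9, 10, 11, 9, 10, 3]

def find_outliers(values, threshold=2.0):
    """Approach 1: return values lying more than `threshold` sample
    standard deviations from the mean (a simple z-score test)."""
    mu = mean(values)
    sigma = stdev(values)
    if sigma == 0:
        return []
    return [v for v in values if abs(v - mu) / sigma > threshold]

# Hypothetical signature: a known sequence of events to look for.
KNOWN_SIGNATURE = ("failed_login", "failed_login", "password_reset")

def matches_signature(events, signature=KNOWN_SIGNATURE):
    """Approach 2: True if `signature` occurs as a contiguous run in `events`."""
    n = len(signature)
    return any(
        tuple(events[i:i + n]) == signature
        for i in range(len(events) - n + 1)
    )

print(find_outliers(login_hours))  # the 3 a.m. login stands out
print(matches_signature(
    ["login", "failed_login", "failed_login", "password_reset"]
))
```

The first function needs no prior knowledge of what wrongdoing looks like, only a baseline of normal behavior; the second finds only what someone has already thought to describe.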
Mena’s book Investigative Data Mining for Security and Criminal Detection (Digital Press, 2003) discusses how these commercially available techniques can be applied to law enforcement and intelligence. The FBI, he notes, has long been a customer of ChoicePoint (Atlanta, Ga.), which collects and sells consumer information. To detect criminal or terrorist behavior, he says, one would overlay that data with data from law enforcement (for example, arrest records, photographs, and fingerprints), immigration (visa records and border crossings), and intelligence (terrorist watchlists and the like).
At least in theory, this is exactly what the FBI needs in order to know what it knows. It has amassed criminal and intelligence-related data galore—over a billion records, by one bureau estimate, stored in many databases at dozens of sites. Only a fraction of the FBI’s data is in a common format that can be easily searched, analyzed, and shared. The agency’s $680 million Integrated Automated Fingerprint Identification System (IAFIS), for example, contains millions of digital fingerprint records and has cut search time from weeks to hours. But it is not directly linked to the FBI’s main network for handling case files, which is a text-only system. What’s more, many state and local agencies still lack the equipment to access the IAFIS and upload and download prints. Nor, needless to say, is there a universal interface for allowing the databases at all the agencies to talk to one another, although some data exchanging—between, for example, the FBI and the U.S. Immigration and Naturalization Service (now the Bureau of Citizenship and Immigration Services)—has begun since 9/11.
Reportedly, the bureau now maintains a production line of scanners and optical character recognition software to convert some 750 000 paper documents a day into electronic text. Key files relating to counterterrorism going back 10 years, some 40 million or so pages, have already been converted. Still, at the current rate, it will take more than three and one-half years to convert the rest. And more paper is being generated all the time.
Take the FBI’s handling of last fall’s sniper attacks in and around Washington, D.C. As described by William Hooton, assistant director of the FBI’s records management division, in a 14 November speech to the Association for Information and Image Management (Washington, D.C.), the bureau set up a phone center to field tips from the general public. Staff members duly logged each call on paper forms, which were collected every hour and taken to FBI headquarters, where they were scanned and the digital images fed into a bureau-wide database.
All the same, as an article in Federal Computer Week pointed out, a scanned handwritten note is not an electronically searchable file. That may explain why the bureau did not discover until after the fact that eyewitnesses had reported spotting the suspects’ car, including its New Jersey license plates, at a handful of the crime scenes.
“They’re seen at one crime scene and then they’re seen again at another one miles away. That’s an incident in itself—why were they there?” observes Mena. “That’s clearly a failure to connect the dots, to see a recurring pattern of sightings of a car with out-of-state plates.”
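Once tips exist as structured, searchable records, the recurring pattern Mena points to becomes a simple grouping problem. The Python sketch below is illustrative only: the scene identifiers, the plate numbers, and the two-scene threshold are all invented, and it assumes each tip has already been reduced to a (scene, plate) pair, which is exactly the step that scanned handwritten forms do not provide.

```python
from collections import defaultdict

# Hypothetical tip records as (crime_scene_id, license_plate) pairs.
tips = [
    ("scene-1", "NJ-ABC123"),
    ("scene-1", "MD-XYZ789"),
    ("scene-2", "NJ-ABC123"),
    ("scene-3", "VA-QRS456"),
    ("scene-3", "NJ-ABC123"),
    ("scene-3", "NJ-ABC123"),  # second report of the same car at the same scene
]

def recurring_plates(tips, min_scenes=2):
    """Return plates reported at `min_scenes` or more distinct scenes,
    mapped to the sorted list of scenes where they were seen."""
    scenes_by_plate = defaultdict(set)
    for scene, plate in tips:
        scenes_by_plate[plate].add(scene)
    return {
        plate: sorted(scenes)
        for plate, scenes in scenes_by_plate.items()
        if len(scenes) >= min_scenes
    }

print(recurring_plates(tips))
```

Using a set per plate means repeated reports from one scene count once, so the result reflects distinct locations rather than call volume.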
So-called free forms, of the kind used in the sniper attacks, present an enormous obstacle for data analysis, Mena says. “Someone might describe an individual as being tall or having an accent or dressed a certain way, and different investigators will enter that information differently.” The solution, he says, is “to standardize from the beginning, so that you use checklists, as opposed to free forms, to capture the data.” Text-mining software from companies like Autonomy, HNC, and IBM could then be used to categorize and organize the raw data automatically.
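The difference between a free form and a checklist can be made concrete in a few lines of Python. This is a sketch under stated assumptions: the height categories, the synonym table, and the `WitnessDescription` record are hypothetical and are not drawn from any FBI form.

```python
from dataclasses import dataclass
from enum import Enum

class HeightRange(Enum):
    """Hypothetical checklist values for a witness's height estimate."""
    SHORT = "short"
    MEDIUM = "medium"
    TALL = "tall"
    UNKNOWN = "unknown"

# Hypothetical mapping from free-text phrasings to checklist values.
_HEIGHT_SYNONYMS = {
    "tall": HeightRange.TALL,
    "very tall": HeightRange.TALL,
    "over six feet": HeightRange.TALL,
    "average": HeightRange.MEDIUM,
    "medium": HeightRange.MEDIUM,
    "short": HeightRange.SHORT,
}

def normalize_height(free_text):
    """Map a free-text phrase to a checklist value; unrecognized text
    becomes UNKNOWN rather than a guess."""
    return _HEIGHT_SYNONYMS.get(free_text.strip().lower(), HeightRange.UNKNOWN)

@dataclass(frozen=True)
class WitnessDescription:
    """A structured record: every field is a fixed choice, not prose."""
    height: HeightRange
    has_accent: bool

a = WitnessDescription(normalize_height("tall"), has_accent=True)
b = WitnessDescription(normalize_height("Over six feet"), has_accent=True)
print(a == b)  # two investigators, two wordings, one comparable record
```

Capturing the checklist value at intake avoids the synonym table altogether; the table stands in for the cleanup that free-text entries would otherwise require.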
The FBI has not revealed whether or to what extent it has implemented such techniques. Last September, though, Mark Tanner, the FBI’s information resources manager, told Government Executive magazine that he receives “probably 10 to 15 calls or e-mails a day from [vendors] who have solutions to these problems,” but “we’re unable to really implement them... because we don’t have the infrastructure.”
Standards matter all the more when information must be shared across agencies. The Department of Justice, which oversees the FBI, actually has a standards registry for just that purpose (see https://it.ojp.gov/jsr/public/index.jsp). It covers everything from message sets (IEEE 1512) to “the Interchange of Fingerprint, Facial, Scar Mark and Tattoo (SMT) Information.” XML, the Extensible Markup Language, is one of the most widely discussed, in the FBI and elsewhere; there’s now an XML standard for rap sheets and criminal histories. Taking that one step further, a group called the Organization for the Advancement of Structured Information Standards formed a technical committee in January to develop an XML framework for sharing criminal and terrorist evidence.
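To see why an agreed-upon XML format helps, consider the Python sketch below, which reads a criminal-history record using only the standard library. The element and attribute names (`RapSheet`, `Subject`, `Arrest`, and so on) and the record itself are invented for illustration; they are not taken from the actual rap sheet standard or from any schema in the Justice Department registry.

```python
import xml.etree.ElementTree as ET

# Hypothetical record; the element names are illustrative, not a real schema.
RECORD_XML = """
<RapSheet>
  <Subject>
    <Name>Jane Doe</Name>
    <DateOfBirth>1970-01-01</DateOfBirth>
  </Subject>
  <Arrest date="1999-05-04" agency="Example PD">
    <Charge>Burglary</Charge>
  </Arrest>
  <Arrest date="2001-11-12" agency="Example Sheriff">
    <Charge>Fraud</Charge>
  </Arrest>
</RapSheet>
"""

def parse_rap_sheet(xml_text):
    """Extract the subject's name and list of arrests from the record."""
    root = ET.fromstring(xml_text.strip())
    return {
        "name": root.findtext("Subject/Name"),
        "arrests": [
            {
                "date": arrest.get("date"),
                "agency": arrest.get("agency"),
                "charge": arrest.findtext("Charge"),
            }
            for arrest in root.findall("Arrest")
        ],
    }

record = parse_rap_sheet(RECORD_XML)
print(record["name"], len(record["arrests"]))
```

If every agency emits records with the same element names, one parser like this serves them all; without a shared standard, each pairing of agencies needs its own translation.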
Filtering data in hopes of detecting a criminal or terrorist plot is not easy, Mena cautions. [The German federal police’s recent exercise in data mining proved this to be true; see sidebar, “The German Solution.”] Unlike consumers, terrorists are not prone to repetition. “So you have to anticipate new types of attacks—bombings, or bioterrorism, or other activities,” he says. And all the data mining in the world will never trace the hand-written, hand-delivered messages that Osama bin Laden’s Al Qaeda operatives allegedly use. “That’s why a combination of human knowledge and machine learning is the best approach,” Mena says.
Privacy and civil liberties advocates have a larger concern. “The FBI will now be conducting fishing expeditions using the services of the people who decide what catalogs to send you or what spam e-mail you will be interested in,” says James Dempsey, executive director of the Center for Democracy and Technology (Washington, D.C.). “The problem is, the direct marketers can only call you during dinner time or mail you another credit card offer based on that information—the FBI can arrest you.”
“We don’t want to arrive at a situation where individuals are reluctant to, let’s say, purchase a copy of the Koran from Amazon.com,” agrees Steven Aftergood, a senior research analyst at the Federation of American Scientists (Washington, D.C.). “That would be intolerable.” There need to be realistic error-correcting procedures, which in many cases do not now exist, not just for statistical or data-processing errors, but also for those introduced by “willful, deliberate abuse,” he says. “The error-correction process should not be a knock on the door from the FBI.”
Even with perfect data, data mining may yield a completely inaccurate picture. “It is all too easy to do Monday-morning quarterbacking and say ‘Why didn’t you connect the dots to see that stick of dynamite?’ when in fact the same dots could be connected just as well to show a duck or a coffee mug,” one computer expert with extensive training in intelligence work told IEEE Spectrum. Skeptical of both the technical capabilities and the political ramifications of the FBI’s expanded surveillance efforts, he believes the technology will be “totally ineffective in its professed purpose [of catching terrorists] but too effective as a domestic police state tool.”
Tyranny of the case file
Of the many wrenching revelations to emerge after 9/11, none resonated quite like the report that the FBI’s antiquated computer database could perform only single-word searches, on “flight” or “school,” say, but not “flight school.”
The story confirmed everyone’s worst fears about the FBI’s out-of-date abilities. Too bad it wasn’t true. “It’s bullshit,” says Nancy Savage, a 26-year FBI veteran based in Portland, Ore. “I just go crazy reading what I read in the press, it’s absolutely wrong.”
The database in question is a DOS-based system that runs on the FBI’s own secure network, says Savage, who is also president of the FBI Agents Association. Known as the Automated Case Support (ACS) system, it’s used by thousands of agents each day. “I can search on ‘flight schools’ and do Boolean searches, I can pull up the full text of any document that’s created in-house,” Savage says.
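The distinction at issue, between finding documents that contain both words and finding documents that contain the exact phrase, can be shown with a small positional index in Python. This is a generic textbook technique sketched under assumptions; the memo texts are invented, and nothing here describes how the ACS is actually implemented.

```python
import re
from collections import defaultdict

def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())

def build_index(docs):
    """Map each word to {doc_id: [positions where the word occurs]}."""
    index = defaultdict(lambda: defaultdict(list))
    for doc_id, text in docs.items():
        for pos, word in enumerate(tokenize(text)):
            index[word][doc_id].append(pos)
    return index

def search_and(index, *words):
    """Boolean AND: documents containing every one of `words`."""
    doc_sets = [set(index.get(w, {})) for w in words]
    return set.intersection(*doc_sets) if doc_sets else set()

def search_phrase(index, phrase):
    """Documents in which the words of `phrase` appear consecutively."""
    words = tokenize(phrase)
    if not words:
        return set()
    hits = set()
    for doc_id in search_and(index, *words):
        starts = index[words[0]][doc_id]
        if any(
            all(p + i in index[w][doc_id] for i, w in enumerate(words))
            for p in starts
        ):
            hits.add(doc_id)
    return hits

# Hypothetical documents, invented for illustration.
docs = {
    "memo-1": "Subject enrolled at a flight school in Arizona.",
    "memo-2": "The school bus was late; the flight was cancelled.",
    "memo-3": "Interview notes, no travel mentioned.",
}
index = build_index(docs)
print(sorted(search_and(index, "flight", "school")))  # both words, any order
print(sorted(search_phrase(index, "flight school")))  # exact phrase only
```

The AND query returns both memos that mention the two words, while the phrase query keeps only the one where they sit side by side.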
The ACS also shows links to records at other agencies, local, state, and federal. “I can see that Seattle has something in their files that was created in the U.S. attorney’s office on this date,” Savage explains. “If I need it, then I can call or e-mail them, and they’ll send it—it’s pretty easy.” (The notorious FBI spy Robert Hanssen was said to be a devoted ACS user, accessing not only classified records that he then sold to the Russians, but also searching on his own name and address and terms like DEAD DROP and KGB to see if any records pointed to him.)
That said, the ACS has a multitude of problems and is slated for replacement under the Trilogy computer upgrade. Although unveiled in 1995, it’s 1980s-era technology: mainframe-based, user-unfriendly, text-only. Because it is three applications cobbled together, each with its own set of commands, a simple search can mean delving down through several DOS screens and remembering at each step which function key (no pointing and clicking here) corresponds to which command.
The ACS’s fatal flaw, though, is that it simply automated already onerous administrative chores. Over the course of its 95-year history, the FBI’s bureaucracy has devised some 900 standard forms, to be filled out for everything from recording attendance (Form 420) to filing a memorandum (Form 467) to conducting an interview (Form 302). Until very recently, the FBI’s automation approach was “to just build macros for everything,” says Savage. “If I’m working on a fugitive case, I’ve got to remember these seven macros I have to go through. You become a huge bureaucrat, doing one hour of investigation and seven hours of administration.”
The Virtual Case File system looks to be better, she notes. “You can’t just automate, you have to reengineer,” she says. This time around, experienced street agents are being brought into the development process. “Every form is being examined. Can we get rid of it? Can we do this automatically? It should make us incredibly more productive.” Data input into the system is being streamlined, and the extra bandwidth being added through the Trilogy upgrade will allow photos and video to be uploaded and downloaded. Querying the system will yield, among much else, a linked diagram of where each relevant document resides.
Still unanswered is whether the new system could help move the FBI beyond the “tyranny of the case file,” as Senator Richard Shelby (R-Ala.) terms it. In a harsh assessment of the U.S. intelligence community released in December, Shelby noted that “fundamentally, the FBI is a law enforcement organization.” Investigating crimes, though, is quite different from gathering and analyzing intelligence data in hopes of preventing terrorism. Bureau agents are trained to view information in terms of building a case—“a discrete bundle of information the fundamental purpose of which is to prove elements of crimes against specific potential defendants in a court of law.” An agent’s success is measured largely by how many arrests, prosecutions, and convictions he or she achieves.
Intelligence analysts, by contrast, have little concern about the data’s admissibility in court, tending “to reach conclusions based upon disparate fragments of data derived from widely distributed sources and assembled into a probabilistic ’mosaic’ of information.” A good intelligence worker is not looking to arrest a suspect; that would only serve to cut off vital information about the suspect’s plans and contacts and perhaps even the opportunity to recruit the suspect as a double agent.
Shelby is among those unconvinced that the FBI can reform sufficiently to become an effective intelligence agency. Among the options outlined in his report are: placing the bureau’s counterintelligence and counterterrorism programs into a stand-alone agency; creating a semi-autonomous organization that would still report to the FBI but would in every other respect be separate from its law enforcement activities; and moving domestic intelligence to the Department of Homeland Security.
“Whatever the best answer turns out to be,” Shelby concluded, “I believe some kind of radical reform of the FBI is in order—indeed, is long overdue.”
To Probe Further
Recent critiques of the FBI’s IT and intelligence capabilities include the U.S. Department of Justice Inspector General’s Federal Bureau of Investigation’s Information Technology Investments, available online at https://www.usdoj.gov/oig/audit/0309/0309.pdf, and Senator Richard Shelby’s September 11 and the Imperative of Reform in the U.S. Intelligence Community, to be found at https://intelligence.senate.gov/shelby.pdf.