On 4 January 1998, police in London arrested a man, whom court records call “B,” on suspicion of burglary. The police swabbed the inside of the suspect’s cheek to collect a sample of his DNA.
In August, B was acquitted and released. But in September, B’s DNA profile was—accidentally and illegally—entered into the United Kingdom’s national DNA database. The system automatically compares newly loaded DNA profiles against unidentified samples obtained from crime scenes. The system found a match—a sample recovered from a 1997 rape and assault case. The police arrested B, and the government successfully prosecuted him for those crimes.
Is there anything wrong with such a turn of events? Privacy advocates say there is, as do people worried about racial discrimination. Among these are lawyers working with the American Civil Liberties Union (ACLU) and the Council for Responsible Genetics, in the United States, and with GeneWatch and Privacy International, in the United Kingdom. Law-enforcement officials and forensic scientists, on the other hand, say such a tool is invaluable for solving crimes, not only for matching evidence from a recent crime to an individual in the database but also for linking unsolved cases by showing that they share an as-yet-unknown perpetrator.
Since that 1998 incident, governments have been rapidly expanding the collection of DNA for databases, and changes in database-searching technology that target near matches are raising new concerns. As a result, civil libertarians and privacy advocates are lobbying for restrictions, while some scholars are pushing in the opposite direction, arguing that the only fair way of building a DNA database is to create a universal one—that is, to record the genetic profile of each citizen.
The information loaded into such databases reflects a feature of DNA known as short tandem repeats (STRs). DNA contains a sequence of paired bases, or nucleotides, of which there are four types. The human genome contains about 3 billion such base pairs, arranged into 23 pairs of chromosomes. A small subset of the long sequence creates the 20 000 or so human genes, most of which code for the proteins that determine a person’s biochemical makeup and physical characteristics. The rest—about 98 percent—is noncoding DNA. Although scientists are discovering that a surprisingly high fraction of these seemingly useless sequences may affect the body’s functions, some of them seem clearly to be meaningless artifacts of evolution.
In certain sections of the human genome, the noncoding DNA contains repeated patterns of two to five nucleotides, the number of repeats in each sequence varying by person. For forensic typing, scientists consider repeats at several loci, or positions on the genome. The number of repeats at each locus is known as an allele. People have two alleles at each locus, one from each parent, that vary in length depending on the number of repeats.
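To make the idea of an allele concrete, here is a minimal sketch that counts back-to-back repeats of a short motif in a stretch of sequence. The motif, the flanking bases, and the function name are invented for illustration, and laboratory instruments typically infer repeat counts from fragment length rather than by reading sequence as text, so treat this as a toy model rather than a description of any real pipeline.

```python
def longest_tandem_run(sequence: str, motif: str) -> int:
    """Return the largest number of consecutive copies of `motif` in `sequence`."""
    if not motif:
        raise ValueError("motif must be non-empty")
    best = 0
    for start in range(len(sequence)):
        run = 0
        pos = start
        # Walk forward one motif at a time for as long as the copies continue.
        while sequence.startswith(motif, pos):
            run += 1
            pos += len(motif)
        best = max(best, run)
    return best


# Hypothetical locus: one copy inherited from each parent.
MOTIF = "GATA"
maternal_copy = "TTC" + MOTIF * 7 + "CCA"
paternal_copy = "TTC" + MOTIF * 9 + "CCA"

# The pair of repeat counts is what gets recorded for this locus.
alleles = sorted(
    [longest_tandem_run(maternal_copy, MOTIF), longest_tandem_run(paternal_copy, MOTIF)]
)
print(alleles)
```

In this made-up example the person carries a 7-repeat and a 9-repeat allele at the locus; someone who inherited the same count from both parents would show the same number twice.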
In the United States, the Combined DNA Index System (CODIS), established by the FBI in 1990 to link existing local, state, and federal systems, is based on STRs at 13 loci. In London, the Home Office currently relies on STRs at 10 loci. Although the estimated rarity is different for each DNA profile, the estimated rarities of complete profiles can be smaller than one in a trillion.
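A rough sense of where figures like one in a trillion come from can be had by multiplying per-locus genotype frequencies together. The sketch below assumes that loci are statistically independent and that genotypes follow Hardy-Weinberg proportions (p squared for two identical alleles, 2pq for two different ones). The allele frequencies are invented, and real casework applies population-specific data and corrections that are not modeled here.

```python
from math import prod
from typing import Iterable, Optional


def genotype_frequency(p: float, q: Optional[float] = None) -> float:
    """Hardy-Weinberg genotype frequency.

    p and q are population frequencies of the two alleles at one locus.
    Pass only p when both inherited alleles are the same.
    """
    if q is None:
        return p * p
    return 2 * p * q


def profile_rarity(per_locus_frequencies: Iterable[float]) -> float:
    """Multiply per-locus genotype frequencies, assuming independent loci."""
    return prod(per_locus_frequencies)


# Hypothetical profile: 13 loci, each with two different alleles whose
# population frequencies are 0.2 and 0.1 (illustrative values only).
per_locus = [genotype_frequency(0.2, 0.1) for _ in range(13)]
rarity = profile_rarity(per_locus)
print(f"estimated profile frequency: {rarity:.2e}")
print("rarer than one in a trillion:", rarity < 1e-12)
```

Because the per-locus figures multiply, each added locus shrinks the estimate sharply, which is why even 10 or 13 loci can yield extremely small numbers under these assumptions.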
To gather DNA for forensic databases, a law-enforcement official typically swabs inside the cheek of a suspect or criminal to obtain a sample of cells. Although scientists can extract DNA from hair, semen, or blood, a cheek swab is the most efficient and least invasive way to collect a large sample of DNA. The swab goes to a laboratory, where a technician or a robotic instrument isolates the DNA from the other cellular components.
The extracted DNA goes through a second process: polymerase chain reaction, or PCR, a standard method of creating many additional copies of a selected segment of DNA. In this case, the PCR step targets all the relevant sites (10 in the UK, 13 in the United States). A genetic analyzer then separates the resulting 10 or 13 DNA fragments and measures the number of repeats in each. The numbers, one or two for each sequence, typically range from five to 20. There is just one number in some cases because a person can inherit the same number of repeats from both parents.
The DNA databases store those numbers, along with the sex of the individual. In the United States, the federal database alone contains more than 4.6 million such records. The UK’s, which started in 1995 as the world’s first national DNA database, has about the same number, drawn from a population one-fifth the size.
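The stored record and the automatic comparison against crime-scene samples can be sketched in a few lines. The locus names, record identifiers, and field layout below are hypothetical and do not reflect the actual CODIS or UK database formats; the sketch shows only exact matching on whatever loci the crime-scene sample yielded.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

Genotype = Tuple[int, int]  # two repeat counts, smaller one first


def genotype(a: int, b: Optional[int] = None) -> Genotype:
    """Build a genotype; a single number means both parents passed on the same count."""
    b = a if b is None else b
    return (min(a, b), max(a, b))


@dataclass
class Record:
    record_id: str                  # hypothetical identifier
    sex: str                        # stored alongside the numbers
    genotypes: Dict[str, Genotype]  # locus name -> pair of repeat counts


def find_matches(scene: Dict[str, Genotype], records: List[Record]) -> List[str]:
    """Return IDs of records that agree with the scene sample at every typed locus."""
    return [
        r.record_id
        for r in records
        if all(r.genotypes.get(locus) == pair for locus, pair in scene.items())
    ]


# Three made-up records over three made-up loci.
records = [
    Record("R1", "M", {"LOCUS_A": genotype(7, 9), "LOCUS_B": genotype(12), "LOCUS_C": genotype(15, 18)}),
    Record("R2", "M", {"LOCUS_A": genotype(8, 9), "LOCUS_B": genotype(11, 12), "LOCUS_C": genotype(14, 18)}),
    Record("R3", "F", {"LOCUS_A": genotype(7, 9), "LOCUS_B": genotype(12, 13), "LOCUS_C": genotype(15, 18)}),
]

# Alleles may be read in either order; genotype() normalizes them.
scene_sample = {"LOCUS_A": genotype(9, 8), "LOCUS_B": genotype(12, 11), "LOCUS_C": genotype(18, 14)}
print(find_matches(scene_sample, records))
```

Sorting each pair means the order in which the two alleles are read does not affect the comparison. A near-match search of the kind that is raising new concerns would relax the all-loci condition, which this sketch deliberately does not attempt.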
In the British rape and assault case, B demanded that the court exclude the DNA evidence from his trial because the police had added it into the database illegally. The trial judge agreed. The government appealed, but the Court of Appeal backed the trial judge, noting that Parliament, in establishing the national database, had created rules restricting the database to those convicted of certain crimes. Had Parliament wished to do otherwise, the appeals court argued, it could have done so. Parliament took the ruling as a call to action and in 2001 passed the Criminal Justice and Police Act, allowing law-enforcement agencies to retain DNA samples of individuals charged with a crime but not subsequently convicted.
The United States is now following the UK example. FBI agents still cannot legally store data from suspects who were not convicted or from individuals who volunteer their DNA samples for an investigation but are not suspects. But state officials can. Four states (Louisiana, Minnesota, Texas, and Virginia) already mandate arrestee sampling. California voters in 2004 passed a ballot proposition that will establish by 2009 what should be the largest such database in the United States. New York Governor Eliot Spitzer has proposed including in the state database those convicted of all felonies and misdemeanors. In addition, a bill being considered in South Carolina would mandate the most aggressive arrestee-sampling program in the nation, demanding samples from those arrested for even the pettiest misdemeanors, such as shoplifting.
Some states, including California, Florida, Illinois, Missouri, and New York, though they don’t mandate arrestee sampling, already retain data that may not be added to CODIS, such as samples voluntarily given by someone to eliminate himself as a suspect. The legality of such state databases is “a cloudy area,” according to law professor David Kaye of Arizona State University in Tempe. Stephen Saloom, policy director of the Innocence Project, an organization in New York City that assists prisoners who could be exonerated through DNA testing, has called them “rogue databases.”
Meanwhile, Virginia is experiencing an echo of the B case. Members of the state crime laboratory early this year reported that they had matched a crime-scene DNA sample to stored profiles of DNA from individuals who were arrested but not convicted. Because Virginia mandates that the DNA records be expunged if the suspect is not convicted, the samples were in the database illegally. The state legislature is now considering a bill that would facilitate that record clearing but also allow matches to illegally retained samples to be used in court if they were kept in “good faith.”
The UK case and subsequent passage of legislation in other countries illustrate the central paradox of DNA databases: inclusiveness. The more samples in a database, the more useful it potentially is at solving and preventing crimes. If the law requires a criminal conviction to allow officials to record a DNA profile, then crimes such as the rape that B carried out in 1997 go unsolved, and B perhaps goes on to commit other rapes.
The problem with inclusiveness is that there is no obvious end to it. Because people arrested for one offense have a higher-than-average probability of having committed other crimes, the inclusion of samples from all those arrested but not convicted has a crime-fighting utility. But then again, so does the inclusion of a sample of the victim, who could also be the perpetrator of another crime. And, for that matter, why wait until B acquires a burglary arrest to include his DNA sample? If it were loaded into a database at birth, he would have immediately been identified as having committed the 1997 rape.
There is no limit to the theoretical utility of adding anyone’s DNA profile to a database. Presumably, though, at some point the utility of inclusion no longer outweighs a free society’s interest in privacy. But where is that point?
When law-enforcement agencies first developed DNA databases, most country and state statutes that dealt with DNA testing mandated it for specified categories of crimes, typically murder and rape. DNA is particularly useful in solving sexual assaults, because investigators often recover semen as evidence.
As public awareness of DNA databases grew, so did the scope of the databases. Politicians could appear tough on crime by extending DNA sampling to an ever-growing array of offenses. Many such moves, however, were merely statutory; politicians did not allocate funding to enable police to do the sampling and analysis. Law-enforcement agencies, sensibly, continued to focus on the most violent offenders and did not take DNA samples from pickpockets even when the law allowed it.
Recently, however, the inexorable expansion of DNA databases has gone beyond individuals convicted of petty crimes and reached people arrested but never convicted. Meanwhile, the U.S. Justice Department is now authorized to take DNA samples from anyone detained by federal agents—which means, principally, those suspected of immigration violations.
Unlike the laws expanding the reach of DNA databases to those convicted of petty crimes, the new laws extending inclusion to arrestees not only allow such sampling, they mandate it. In California at least, that means maintenance of the arrestee DNA database may divert resources from other important tasks. In particular, many law-enforcement agencies still have backlogs of semen samples recovered from rape victims that have not been subjected to DNA testing.
A 2005 U.S. Bureau of Justice Statistics report estimated that it would take 1900 additional workers and US $70 million to reduce the forensic evidence backlog to a manageable size. And in a February 2007 interview with The New York Times, Robert Fram, chief of the FBI Scientific Analysis Section, decried the mandating of new populations to sample without any increase in resources and noted that the FBI has a backlog of 150 000 samples.
Arrestee sampling can’t possibly be a better use of resources than clearing that backlog. Most likely, such wholesale sampling would also divert money from other pressing needs, such as crime prevention and drug treatment.
Privacy advocates have other reasons for fighting against the inclusion of arrestees in DNA databases. Tania Simoncelli and Barry Steinhardt, both of the ACLU, have been particularly vocal on the subject. In the Journal of Law, Medicine, and Ethics, Simoncelli argued that “the very existence of DNA databases turns the presumption of innocence on its head,” because those included in the database are treated as potential suspects every time a new crime is investigated.
Of course, governments have long maintained databases containing the fingerprints of convicts, arrestees, and various individuals who are not criminals, including teachers and immigrants. The law considers such databases acceptable intrusions into personal liberty. But civil libertarians say that DNA samples, unlike fingerprints, include personally sensitive information to which the state should not have access: people’s ancestry, disease propensity, and perhaps even behavioral characteristics. They argue that such information could be abused by the state, by employers, or by insurance companies.
Scientists can identify weak correlations between fingerprint-pattern types and ethnicity. But people are generally more anxious about disclosing their genetic information than their fingerprints—a concern that typically generates a strong emotional response against broadly inclusive DNA databases.
Sociologist Amitai Etzioni and others who tend to value the interests of the community over those of the individual argue that broad inclusion might be a good thing. “Collecting the DNA of convicted, nonviolent felons,” Etzioni says, “may still be justified, because they have significantly lowered rights.” In his contribution to the essay collection DNA and the Criminal Justice System, Etzioni went further and argued that even “suspects have diminished rights, including much lower rights to privacy,” and therefore he sees “no obvious reason why suspects should not be tested and their DNA included in databases.”
Advocates for DNA databases also contend that because the DNA used for standard forensic profiling is noncoding DNA, genetic privacy is not at issue. Yet noncoding DNA may correlate with disease propensity, even if it does not cause disease, potentially allowing “tracking” of genetic diseases. However useful such information might be, though, what an insurer would really want is not just a profile but a complete biological sample: the original cheek swab.
And there’s the rub. In all U.S. jurisdictions except Wisconsin, law-enforcement officials typically retain the samples themselves. Therefore, all the genetic information of those who are being tracked in the DNA database remains accessible to the state. There are a number of state statutes forbidding such uses of genetic information, but such laws will not necessarily remain in place.
The government could destroy the sample and record only the numeric values of its DNA profile. And that procedure could become the compromise struck between the desire for privacy and the need for crime control. But as yet, data-banking proponents are holding out against it, because such a compromise assumes that the DNA-database technology is mature. If forensic scientists develop a new scheme for DNA matching, they’ll need original samples to re-encode the existing database population. To be sure, the current systems are powerful, robust, and widely accepted, and the existence of today’s large databases is a powerful deterrent to changing the protocols. Nonetheless, the technology has advanced so rapidly during the past two decades that it would be naive to think that the existing systems represent an eternal standard.
Discrimination is another powerful argument against arrestee databases. Even convict-only databases risk being discriminatory. In the United States, courts convict some racial minorities at much higher rates than their proportion of the overall population. Criminologists are divided as to what extent the overrepresentation arises from discrimination in policing and in the courts, as opposed to a higher rate of offending, at least in the case of violent crimes. But when it comes to drug crimes, which constitute a large portion of the criminal caseload in the United States, discrimination is undisputed. And one wouldn’t want the injustice to extend to inclusion in a convict DNA database (although the harm seems far less than the damage that is done in the first place by discrimination in the criminal convictions).
When it comes to arrestee databases, however, the issue becomes more salient. Criminologists agree that racial discrimination is greater at the level of arrest than it is at the level of conviction, because arrest depends so heavily on police discretion. Arrest discrimination is not based merely on race but also on class and geography. For example, you can use, or even sell, narcotics with a far lower risk of arrest if you are rich, white, and live in the suburbs than if you are poor, black, and live in the inner city. Some demographic sectors of American society, such as poor, black, inner-city males, have shockingly low probabilities of getting through adolescence without having at least one run-in with the police. If such encounters trigger inclusion in a DNA database, the database becomes discriminatory.
To glimpse the likely outcome in the United States, look at the United Kingdom, where the database covers a much larger portion of the overall population than in the U.S. There, 37 percent of black men, 13 percent of Asian men, and 9 percent of white men have had their DNA profiles included in the national database. The figures are even starker if one considers only younger males. Approximately 77 percent of black males between 15 and 34 are in the national database, compared with 22 percent of white males in that age bracket.
Such an arrestee database tends to include the maximum number of racial minorities and the smallest number of whites [see charts, “Color Wheels”].
As Kaye and the University of Wisconsin’s Michael Smith put it starkly in their contribution to DNA and the Criminal Justice System, “Such an ‘arrest-only’ database would have the look and feel of a universal DNA database for black males, whose already jaundiced view of law enforcement’s legitimacy is itself a threat to public safety.”
When law-enforcement officials enter new genetic records from unidentified samples recovered at crime scenes into a DNA database, the system compares them with existing profiles. Some legal scholars say that this procedure amounts to daily searches of each person in the database—no different from stopping drivers for pat-downs without warrants. Other experts maintain that because the individuals aren’t aware of the searches, no harm is done.
The risks might seem remote now, but even so, perhaps they should be borne by all citizens equally.
One risk, the possibility of false incrimination, either through DNA planting or laboratory error, is less remote. There simply aren’t good current data on the false-positive error rate for DNA profiling. And although forensic DNA-profiling technology is robust, reports of recent errors abound. It’s not just the laboratories generally considered poor (like the police crime lab in Houston) but also those regarded as among the nation’s finest (such as the FBI’s and the Virginia State Department of Forensic Sciences) that are making mistakes. The errors, documented by Professor William Thompson, of the University of California, Irvine, and others, have led to wrongful convictions.
Planting DNA is possible as well, and it is likely to become increasingly easy and cheap to do, allowing more people to learn how. Of course, the planting of evidence is not new. But because DNA evidence commands such enormous trust and is conceived as scientific, the potential hazards of evidence tampering would be particularly pernicious. Again, perhaps the risks of such mistakes or malfeasance should be borne equally.