THE NIH CATALYST, SEPTEMBER–OCTOBER 2002

NIH Team Harnesses Bioinformatics, Genetics to the Task

by Celia Hooper

On September 13, 2001, with the world stunned by the devastation of the World Trade Center, the Office of the Chief Medical Examiner for the City of New York stood amidst another desperate landscape—perhaps the biggest challenge that has ever faced the U.S. forensics community.

The office had no guide for collecting the data that would be needed in the grim task of identifying unknown thousands of victims, most of whom would not be identified by traditional methods but, rather, by genetic markers amplified from bits of otherwise indistinguishable tissue.

Two-Way Calls: For Help and To Help

Along with frantic inquiries from the victims’ families and politicians calling on their behalf, there were numerous offers of help—DNA sequencing, genetics expertise, metadata analysis, bargain rates for mitochondrial DNA analysis. Even sorting out the offers of help seemed overwhelming amidst the clamor and chaos.

The New York medical examiner’s office (OCME) turned to DNA forensics experts at the National Institute of Justice (NIJ), part of the Department of Justice. The daunting task of assembling a brain trust to advise the OCME fell to NIJ’s Lisa Forman, a Bethesda resident and one-time NIH guest researcher.

In late September, while Forman was describing to a friend the vast, complex, and sometimes marginal genetic data that somehow had to be pieced together, there came a spark of recognition: the problem resembled the Human Genome Project.

At her friend’s suggestion, Forman put in calls to the National Human Genome Research Institute (NHGRI) and the National Center for Biotechnology Information (NCBI), asking their respective directors, Francis Collins and David Lipman, for help.

As it turned out, HHS and NHGRI had already been among those exploring channels for offering expertise, sequencing—whatever was needed. The same was true for the International Genetic Epidemiology Society and the American Society of Human Genetics.

Thus it was that a small group of NIH scientists eventually found their way to Forman’s brain trust—the Kinship and Data Analysis Panel (KADAP). They joined the panel with others from academia, private and public DNA testing labs, the National Institute of Standards and Technology, the Armed Forces Institute of Pathology, the NIJ, and OCME.

[Photo: Leslie Biesecker (left) and Steve Sherry]

A Transforming Experience

[Photos: Elizabeth Pugh; Joan Bailey-Wilson]

NHGRI’s Leslie Biesecker says "scientific advisory committee" doesn’t seem adequate to describe the group. "The intellectual process has been so dynamic, so sober, so focused on this incredible task," Biesecker says. "This group of people has such a commitment and desire to do this as well as it can possibly be done—it goes beyond any group of scientists that I’ve ever been involved with. It’s been an amazing process."

KADAP member Elizabeth Pugh, director of bioinformatics and statistical genetics at the Center for Inherited Disease Research (a center funded by NIH through a contract with the Johns Hopkins School of Medicine in Baltimore), says she has been similarly inspired by the staffs of the OCME and the New York State Police (NYSP), which have worked closely with the KADAP.

"I am continually awed at the knowledge, expertise, dedication, and commitment of the OCME and NYSP personnel who are on the front lines of this unprecedented identification process," Pugh remarks. "It has been a privilege to assist them in whatever small ways I could."

The KADAP has met bi-monthly since October 2001, and the two other NIH-affiliated staff on the panel—Joan Bailey-Wilson and Steve Sherry—have, like Biesecker and Pugh, been struck by the experience.

Bailey-Wilson, head of the statistical genetics section of NHGRI’s Inherited Disease Research Branch, notes that the KADAP was extra work for everyone. "We are incredibly busy, but this was something we felt like we had to do. There isn’t much we could do, but this was something. We had the skills."

NCBI’s Sherry has found both the collaboration with his fellow KADAP members and the fruits of the group’s efforts profoundly rewarding. "It’s just been a real privilege to work with these people," he told The NIH Catalyst.

"Working with SNPs in medical genetics and therapeutics is something we do all the time at NCBI. It leads physicians to provide better health care for someone, sometime," says Sherry. "But we here [in NCBI] don’t see that, except as a cluster of intense use of the databases, where a gene was found in months, not years, resulting in a fast track for treatment.

"But the forensic experience is different. Every conclusion is a connection of a lost loved one to a family. I do it for them. It brings closure."

The Magnitude of the Problem

At the first meeting of the brain trust, the OCME conveyed the magnitude of the problems to the group. "The sheer scale of the project was comparable to 15 simultaneous airplane crashes, but [at the time] we didn’t know how many people were missing," Sherry recalls.

The number of people killed at the Pentagon was known, thanks to sign-in procedures; passengers on the airliner that crashed in Pennsylvania, as in other airline disasters, were listed on a manifest. But at the World Trade Center, "It took months just to figure out who was missing," he says.

The KADAP also realized that in the confusion following the disaster, there were procedural difficulties that could impede victim identification. These included blocked communications between computer systems and a jumble of numbering systems for victim and family samples, storage files, artifacts, and analysis data. The form for identifying relationships between relatives and presumed victims had been ambiguous, the instructions to families unclear.

Making Connections

In its next meetings, the group quickly defined and assumed tasks: creating a booklet for family members of victims to help with a second round of reference sample collections; identifying tools that might be useful for massive genetic matching and sorting; developing programs to straighten out numbering systems and help computer systems talk to one another; and, perhaps most important, organizing all the efforts clearly and logically. John Snyder of the OCME likened the work of the group to "building an airplane while trying to fly it."

[Image: Cover of the booklet for families produced by NHGRI and the National Institute of Justice for the New York City Office of the Chief Medical Examiner]

Biesecker, working with Kathy Hudson, Jane Ades, and Derryl Leja from NHGRI and Robin Wilson-Jones from the NIJ, dove into the task of producing a booklet to help families understand molecular forensic identification—how it works, its limitations, and how they could contribute to the identification process—what Biesecker calls "making the connection between a high-tech process and the emotional needs of a family."

The booklet, written in Spanish and English, simply and with dignity, was produced in record time: 15,000 copies were delivered to New York around New Year’s Day. The pamphlet is already being circulated, adapted, and copied by law enforcement agencies and other groups across the country.

Bailey-Wilson, Pugh, and Sherry joined others in reviewing software that could be adapted for matching and kinship analysis. NIJ’s Forman says the group considered public-domain programs used in airline crashes, in identifying victims of war in Bosnia, and in paternity and criminal cases, as well as a program developed by the U.S. armed forces. "It was pretty clear that none of the programs came close," Forman recalls. "We didn’t have the right tools."

Pugh and Sherry went to work with NIJ’s contractors Amanda Sozer and Steve Niezgoda and with the program developers to adapt the programs to the new demands of the World Trade Center identifications.

Sherry carefully thought through and mapped the information flow needed to coordinate the far-flung parts of the process, from data collection to analysis, reporting, and quality assurance.

Since last autumn, data have been pouring through the complex pipeline, with its architecture heavily influenced by pipeline designs used in the Human Genome Project. Time pressures in getting the system operating were intense as families awaited confirmation that their loved ones had died so that they could hold funerals.

Increasing Complexity

Sherry says that the work plan prioritized the easiest identifications, then moved on to the most challenging. The easy matches were those in which there was a reliable reference sample of a victim’s DNA from before the disaster and an uncompromised sample from the World Trade Center site. The hardest matches are still in the pipeline and heading for some of the most exhaustive and cutting-edge techniques ever applied to forensic DNA.

Forman says scientists working with the identification project developed and built on established tools for high-throughput and nonstandard mitochondrial DNA analysis and for the more stringent extraction of nuclear DNA needed for severely degraded and commingled samples. "The tools were improved because they had to be," Bailey-Wilson says.

These more complex data and more distant kinship matching needs have, in turn, led to new computational challenges. "This analysis being done leads to an incredible informatics challenge," says Forman. "Steve Sherry has just been phenomenal . . . in stretching the power of the technologies."

But Forman says pushing the technology to tease out the most difficult identifications had not initially been envisioned. She had originally imagined the project would be completely finished with the application of standard techniques. Then, at a meeting in June, members of a family who’d lost their brother, James Cartier, in the disaster spoke briefly and poignantly to the group.

"They said that without some identification of their brother, they could not rest. In that 15 minutes, I completely changed my mind about how far we had to go to ‘completely finish,’" Forman says. "There will be a point when science has no more answers, but everything has to be tried."

A New Discipline

Looking toward a September 9–10 meeting of the group in New York, Forman was amazed at the list of "deliverables" that had flowed from the group’s work.

Beyond the daunting task of identifying hundreds of victims were the booklet for families and the improved forensic tools, all of which will be placed in the public domain.

In addition, the group has drafted specific and complete directions for handling comparable natural or manmade disasters in the future and begun to develop guidelines for improving training in forensic science.

In fact, Forman believes the work of the group and Sherry’s insight may have inadvertently spawned a valuable new discipline—computational forensics. "At one point Steve naively said, ‘put your computational forensics people on this problem.’ We all just looked at each other. Until that moment, such a thing didn’t exist."

Collection of victim samples ended on May 31, 2002. By August, the easiest identifications had been made, with a few thousand samples still flowing through the pipeline. Application of some of the new tools to the most challenging samples remained, as did NIH’s participants in the identification project.

"I think all four of us from NIH have a commitment to stick with the process as long as our input is needed," Biesecker says. "We are in it for the long haul. We are pushing the boundaries of the science."
