Is U.S. History Becoming History?

Government is increasingly conducted on the computer, but there's no plan to preserve the history made on a computer. While the National Archives struggles to catch up with the times, historians fear valuable records are being lost. By Jeffrey Benner.

The workings of government in the first decades of the information era have been poorly recorded, archiving experts say. Years of valuable public records may have already been lost, creating a gap in the country's historical record.

Archivists, government watchdog groups and investigative reporters worry that unless the problem is solved, the lack of information could make it more difficult to hold government officials accountable for their decisions and policies.

"We know less about information in the information age," said Patrice McDermott, a records management analyst with government watchdog agency OMB Watch. "It's not just government, corporations have to deal with this too."

Records management experts say the problem started around 1985, when U.S. government agencies began using e-mail and word-processing programs as they changed the way they conducted business. But they did it without a system for preserving electronic files.

Under the old system, multiple copies of correspondence and documents were carefully filed away. But now that nearly all government operations use electronic documents, the old "paper trail" of how policies and regulations developed, and who made them, has been lost.

Now, experts say, only the final draft is saved, making it more difficult to understand how decisions were made, who made them, and why -- the very information most crucial to historians and investigative reporters.

"The way it used to work is that when you created a document, it circulated with five carbons that were filed in different places," McDermott said.

But once the PC came into common use in the early 1990s, things changed.

"People started storing things on their own disks, willy-nilly," McDermott said. "I'm sure agencies have made print copies of the final documents, but the carbons of who had checked the document and how it was marked up are missing. So, reporters and investigators will have no record of how a policy came into being."

The problem doesn't look to be solved anytime soon. Much of the blame is falling on the shoulders of the National Archives and Records Administration (NARA) -- the agency charged with recording the history of government.

More than 15 years after government agencies started to use e-mail, NARA still does not require that e-mails and other important digital records be stored in their original form. Agencies can print them out and then destroy their electronic originals.

As for the estimated 26 million U.S. government webpages, there are no archiving guidelines at all. However, just days before the Bush administration took office, NARA instructed all agencies to take a snapshot of their websites and submit the data to NARA on CD-ROM by March 20.

The lack of an effective system for archiving electronic records troubles Scott Armstrong, an investigative journalist who cut his teeth in the Watergate era.

As founder of the National Security Archive -- a massive collection of declassified U.S. government documents on topics ranging from the Cuban missile crisis to the Iran-Contra affair -- Armstrong has extensive experience reconstructing events from decades-old files.

Armstrong believes that by the time the problem is solved, there will be a 25-year hole in the historical record. He's fond of calling it the "Carlin Gap," in mocking honor of John Carlin, archivist of the United States since 1992.

"The gap is enormous," Armstrong said. "I estimate they're preserving less than 1 percent of the electronic documents, and somewhere between 50 and 75 percent of the kind of records previously (in the paper era) archived are being lost."

Lewis Bellardo, deputy archivist of the United States, said he didn't know how much data has been lost government-wide, but he used a telling example to acknowledge that there was a problem.

Due to a server glitch, NARA lost some of its internal agency records in the summer of 1999. "We had an e-mail loss here ourselves," Bellardo said. "If that's the case with us, it's probably not just us."

Until now, NARA has dealt with computer glitches, obsolete computer systems and other flaws the same way most people do: with a printer. NARA guidelines allow government agencies to print out electronic documents that are worth saving, and then erase the electronic originals.

But NARA critics such as Armstrong argue that hard copies are an impractical and ineffective way to preserve electronic databases. First, there's far too much information; e-mail from President Clinton's White House years alone totaled approximately 40 million pages. Second, like a file cabinet dumped onto a table, files taken out of a searchable, cross-referenced database and put in a box lose much of their value as historical records.

Armstrong, whose successful fight to save the e-mail record of the Iran-Contra scandal from destruction launched the legal battles over electronic records, points to the continued reliance on paper as an indication of how much information has already been lost.

"The problem with printing out," he said, "is that most stuff doesn't really get printed out, and even if it does, you can't use it the way you would use electronic information. We're moving briskly back into the 20th century. We're moving backwards from 1985. We're not using electronic information at all.

"Historians are going to be shocked."

The consensus among advocates is that NARA has waited too long to attack the problem.

"They started about 15 years too late," McDermott said. "They dug in their heels."

David Bearman agrees. He wrote the electronic archiving regulations used at the United Nations and is president of Archives & Museum Informatics, a private consulting firm. "In the last couple years, (NARA) has been making a move toward finding practical solutions, but those moves are late and haven't been very effective."

Bellardo, NARA's second in command, said the agency is trying to get a good electronic records management system in place, but needs at least a few more years. He cited lack of funds, the massive scope of implementing a government-wide management system for electronic data and technical barriers as reasons for the delay.

"We would like to have moved faster," he said. "Our resources are limited, and we have followed a crawl-before-you-walk approach. Only a small percentage of electronic documents need to be kept, but even that small percent is an enormous challenge."

NARA's annual budget is about $300 million.

Even critics of the archives concede that NARA doesn't have an easy job. The agency has to figure out how to translate different kinds of data -- from e-mail to spreadsheets to Web pages -- into a format that will last, and to do so in a way that won't strip the information of its context.

A major part of the problem -- from the archivists' perspective -- is that, when reconstructing history from written records, the original file cabinet is just as important as the files inside. The electronic equivalent of the "file cabinet" is the original database program.

"A basic premise of archiving is, 'Don't destroy the order of the original record,'" Bearman said.

NARA points to its joint project with the San Diego Supercomputer Center as evidence that it takes the issue seriously. The project is to develop a "persistent" electronic archive that will preserve both files and the meta-data needed to make sense of them years from now. NARA hopes to have a prototype of the new archive running by 2004.

But even when it's completed, the government will still have to make major changes in the way it manages information in order to take advantage of the new archive. Both NARA and its critics agree that the ultimate goal is to use desktop publishing software that tags files with standardized meta-data when the files are created, not later on.

Without this kind of system, the records -- even if they are saved -- won't make much sense. No one is sure how long it will be until such a system is in place. Hence, the gap.

"There is a substantial amount of data fundamentally at risk," Bearman said. "Most hardware becomes obsolete in four to five years, and most data is unrecoverable in twice that time. A lot has been lost."