When the Oxford English Dictionary appears on the Web next March, it will be more than just a pixelated version of its stodgy old self.
The dictionary, widely regarded as the ultimate authority on the English language, is undergoing its first complete revision in its 120-year history. But since revisiting each word in the lexicon will take the editors until 2010, they will upload the current edition — along with 1,000 new and revised entries — to the Web next year and incorporate the changes quarterly.
In its online form, the OED will be a living history of the contemporary English language, replete with online citations, an open-source-style word submission process, antedatings culled from Web research, and a number of new entries for words that were born on the Net.
“The Web has had a permeating effect on the editorial side of what we do,” said OED chief editor John Simpson. Online databases, in particular, are playing an important role in the OED revision, changing the nature of lexicographical research.
One of the most difficult tasks in revising the dictionary is antedating, or finding the earliest documented use of each word. Each entry includes a citation, a quote establishing its first documented use. In the past, editors had to sift through every book in the Library of Congress searching for these antedatings.
Now when OED editors research a word, they make use of the full-text historical databases available online — like The Making of America at the University of Michigan — to see whether there are usable early references to the term.
“We wouldn’t have been able to do that 10 to 15 years ago,” said Simpson.
Linguaphile Fred Shapiro of Yale University is using the online American journal archive JSTOR, a database of full-text journal articles from a wide range of disciplines, to help the OED with its antedatings. Shapiro, with the support of a grant from the Andrew W. Mellon Foundation, is checking the earliest uses of terms in JSTOR against the earliest uses of the same terms in the OED. The history of many words will be rewritten based on his findings.
“Pastrami,” for instance, is listed in the OED with a first-use reference of 1940, in a letter written by Groucho Marx. But the word can now be documented as occurring at least 20 years earlier. And the adjective “Byronian,” listed in the OED as having a first reference of 1822, appears to have made its entrance into the language with the help of Byron himself (a renowned egoist) 10 years earlier.
But while the editors are making full use of the Web for research, when it comes to actually citing online material, they are sticking to the dictionary’s 19th-century roots.
“Traditionally, the dictionary has been keen to use texts that people can, in 100 years or so, refer back to,” said Simpson.
Editors worry that online sources may disappear, and so are wary of linking to them.
“We’re being quite conservative about using online sources. At the moment, we’re not citing email correspondence or sites on the Web,” said Simpson.
“That never satisfies the Net freaks, but it does satisfy those who are concerned with people in the future being able to refer back to the sources.”
A few exceptions will be made for newspapers and academic journals that have online versions. Those sources will be cited with the words “online edition” in brackets.
One of the chief editor’s pet peeves about online citations is the unwieldy nature of the URL. Simpson is loath to add what he calls a “string of gobbledygook” to a word’s listing.
“I think it’s a great pity that online sources aren’t more elegantly citable. At the moment, there’s a whole string of messes and slashes, and in book publishing, elegance and practicality need to be considered.”
Over 1,000 revisions and new words will be incorporated into the online version, and the full revision is expected to be double the length of the current edition. Many of those changes will be culled from email submissions sent in by the general public — a kind of open-source approach to the overwhelming revision process the editors face. It’s the same process the editors used to create the first edition over 120 years ago.
The call for new words and new meanings of old words is all part of Simpson’s plan to modernize the OED. Simpson says the revised work will represent a broader coverage of English, incorporating slang, dialect words, and terms that have made their way into the lexicon by way of the Net.
“There is a whole raft of words that either come from the Internet or where the Internet has given them new meanings,” said Simpson. The OED’s editors are on the lookout for Web words that have been around for at least four or five years and for which they can find documentary evidence.
“Words like flame or flaming, something that’s been around since at least the early ’80s, will make it into the dictionary.”
Other Net-born words that are sure to make the final cut: emoticon, bookmark, Perl, browser, search engine, hot-link, gopher, spider, VRML, and Web site.
The Web is doing more than just adding words to the lexicon. It’s affecting usage, said Simpson.
“Those in the computer community are shaking the bars of the cage and trying to move language on.”
Verb usage is tending toward the short and simple, he said. Words like “go” and “get” are used more often than verbs like “surround.” Conversely, the Web may be ushering in more noun and adjective compounds. Consider: “A London-based high-security e-commerce company…”
But Simpson takes care not to come across as a language snob.
“People used to say that email impoverishes language because it tends to not have capitalization, punctuation, and words get spelled wrong. The standards may be different, but it has certainly encouraged writing and communication. And that means a faster development of language change,” he said.
“There are always different levels of language used in the community, and a hubris of differing opinions about what is good and bad,” Simpson said. “There isn’t a unified state view of language.”