Archives and Excavations (slides)

Roger B. Blumberg
Department of Computer Science &
The Sheridan Center for Teaching and Learning
Brown University
rbb@brown.edu

prepared for Excavating the Archive: New Technologies of Memory (archive.parsons.edu) -- June 3, 2000





"Radio, television, the telephone are exclusively methods of achievement; motion pictures, photography, the phonograph -- authentic archives -- are methods of achievement and retention."

from Adolfo Bioy Casares. The Invention of Morel, trans. Ruth L. C. Simms of La invención de Morel (Austin: University of Texas Press, 1985), p. 68.

1.














































"The fascinating part of excavation is that no two jobs are ever precisely alike and very few are completely predictable. Yet, despite this, excavating methods fall into some broad general categories [e.g. Bulk-Pit, Bulk Wide-Area, Loose Bulk, Limited Area Vertical, Trenching, Rock Excavation], and in this volume an attempt is made, for the first time, to classify these types and to indicate the equipment best used, as well as the recommended methods of deploying that equipment to the greatest ... advantage."


from A. Brinton Carson. General Excavation Methods (New York: F.W. Dodge Corporation, 1961), p. v.

2.











































"As archaeologists, we do not believe that there is one past, knowable and acceptable to everyone, but rather we acknowledge that there are many interpretations of the past to which different individuals or groups -- for a wide range of different reasons -- choose to subscribe. For archaeologists, how valid any particular interpretation is obviously depends on how it fits the ever-increasing body of archaeological (and other Western science-based) knowledge. As interpreters, we also believe we have an obligation to base our work on the most up-to-date information and data available."


from Peter G. Stone and Philippe G. Planel. The Constructed Past: Experimental archaeology, education and the public (New York: Routledge, 1999), p. 1.

3.











































"And in reality? -- in reality I'm the archivist at one of America's most prestigious institutions of higher learning, where I oversee a collection of rare books and manuscripts, the notes and letters of dead writers and other prominenti, and boxes of miscellany donated by eccentric graduates. This archive, housed in a quiet wing of the main library, is among the finest anywhere; and I am its guardian."


from Martha Cooley. The Archivist (New York: Little, Brown, 1998), pp. 5-6.

4.











































"Materials not open to the public, however, are another story. Now and then some unscrupulous researcher will ask for a 'quick look' at items that remain under lock and key until a specified date. This pushiness instantly annoys me, though it no longer surprises me. With such researchers I assume a weary, antagonized look as I explain that certain bequests arrive with clear restrictions on accessibility. Violating those limits is a form of grave-robbing. Yes: the images that come to me are those of exhumation, the unearthing of something meant to lie fallow -- something that will appear waxy and lifeless if brought to light too soon.

Of course I don't put it in just those terms. But the message gets across to anyone who thinks I'll pick up the shovel and dig for him."


from Martha Cooley. The Archivist (New York: Little, Brown, 1998), pp. 6-7.

6.











































"4.1 Personal Memex

Returning to the research challenges, the sixth problem is to build a personal Memex. A box that records everything you see, hear, or read. Of course it must come with some safeguards so that only you can get information out of it. But, it should on command, find the relevant event and display it to you. The key thing about this Memex is that it does not do any data analysis or summarization, it just returns what it sees and hears.

6. Personal Memex: Record everything a person sees and hears, and quickly retrieve any item on request.

Since it only records what you see and hear, personal Memex seems not to violate any copyright issues [15]. It still raises some difficult ethical issues. If you and I have a private conversation, does your Memex have the right to disclose our conversation to others? Can you sell the conversation without my permission? But, if one takes a very conservative approach: only record with permission and make everything private, then Memex seems within legal bounds. But the designers must be vigilant on these privacy issues.

Memex seems feasible today for everything but video. A personal record of everything you ever read is about 25 GB. Recording everything you hear is a few terabytes. A personal Memex will grow at 250 megabytes (MB) per year to hold the things you read, and 100 gigabytes (GB) per year to hold the things you hear. This is just the capacity of one modern magnetic tape or 2 modern disks. In three years it should be one disk or tape per year. So, if you start recording now, you should be able to stick with one or two tapes for the rest of your life.

Video Memex seems beyond our technology today, but in a few decades, it will likely be economic. High visual quality would be hundreds times more -- 80 terabytes (TB) per year. That is a lot of storage, eight petabytes (PB) per lifetime. It will continue to be more than most individuals can afford. Of course, people may want very high definition and stereo images of what they see. So, this 8 petabyte could easily rise to ten times that. On the other hand, techniques that recognize objects might give huge image compression. To keep the rate to a terabyte a year, the best we can offer with current compression technology is about ten TV-quality frames per second. Each decade the quality will get at least 100x better. Capturing, storing, organizing, and presenting this information is a fascinating long-term research goal.

4.2 World Memex
What about Bush's vision of putting all professionally produced information into Memex? Interestingly enough, a book is less than a megabyte of text and all the books and other printed literature is about a petabyte in Unicode. There are about 500,000 movies (most very short). If you record them with DVD quality they come to about a petabyte. If you scanned all the books and other literature in the Library of Congress the images would be a few petabytes. There are 3.5 million sound recordings (most short) which add a few more petabytes. So the consumer-quality digitized contents of the Library of Congress total a few petabytes. Librarians who want to preserve the images and sound want 100x more fidelity in recording and scanning the images, thus getting an exabyte.

Recording all TV and radio broadcasts (everywhere) would add 100 PB per year. Michael Lesk did a nice analysis of the question 'How much information is there?' He concludes that there are 10 or 20 exabytes of recorded information (excluding personal and surveillance videotapes) [16].

An interesting fact is that the storage industry shipped exabyte of disk storage in 1999 and about 100 exabytes of tape storage. Near-line (tape) and on-line (disk) storage cost between a 10 k$ and 100 k$ per terabyte. Prices are falling faster than Moore's law - storage will likely be a hundred times cheaper in ten years. So, we are getting close to the time when we can record most of what exists very inexpensively. For example, a lifetime cyberspace cemetery plot for your most recent 1 MB research report or photo of your family should cost about 25 cents. That is 10 cents for this year, 5 cents for next year, 5 cents for the successive years, and 5 cents for insurance.

Where does this lead us? If everything will be in cyberspace, how do we find anything?"

from Jim Gray. What Next? A Dozen Information-Technology Research Goals, Microsoft Technical Report MS-TR-99-50, June 1999. Electronic version at http://www.icsi.berkeley.edu/~nchang/cs182/TuringTalk.txt
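
A rough check of the arithmetic in Gray's passage can be sketched as below. This is only an illustration, not anything from the report: the 100-year lifetime is an assumption inferred from the quoted figures (80 TB per year against 8 PB per lifetime, and 250 MB per year against 25 GB), decimal units are assumed, and the names YEARLY_BYTES, LIFETIME_YEARS, lifetime_bytes, human and PLOT_COST_CENTS are made up for this sketch.

```python
# Decimal storage units (an assumption; the quoted text does not say which it uses).
MB, GB, TB, PB = 10**6, 10**9, 10**12, 10**15

# Yearly growth rates as quoted in the passage.
YEARLY_BYTES = {
    "read": 250 * MB,   # "250 megabytes (MB) per year"
    "heard": 100 * GB,  # "100 gigabytes (GB) per year"
    "video": 80 * TB,   # "80 terabytes (TB) per year"
}

# Assumption: a 100-year lifetime, inferred from 80 TB/year -> 8 PB/lifetime.
LIFETIME_YEARS = 100

# Cost items for the "cyberspace cemetery plot", in cents, as quoted.
PLOT_COST_CENTS = {
    "this year": 10,
    "next year": 5,
    "successive years": 5,
    "insurance": 5,
}


def lifetime_bytes(per_year, years=LIFETIME_YEARS):
    """Total bytes accumulated at a constant yearly rate."""
    return per_year * years


def human(n):
    """Format a byte count using the largest decimal unit that fits."""
    for unit, size in (("PB", PB), ("TB", TB), ("GB", GB), ("MB", MB)):
        if n >= size:
            return f"{n / size:g} {unit}"
    return f"{n} B"


if __name__ == "__main__":
    for kind, rate in YEARLY_BYTES.items():
        print(f"{kind}: {human(rate)}/year, {human(lifetime_bytes(rate))} per lifetime")
    print(f"cemetery plot: {sum(PLOT_COST_CENTS.values())} cents")
```

Under these assumptions the reading and video totals match the quoted 25 GB and 8 PB, and the plot cost sums to the quoted 25 cents. The audio total comes to 10 TB, somewhat more than the "few terabytes" in the passage, which suggests that figure rests on a different rate or time span.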


















































"Dialectical materialism and the Nietzschean doctrine of the will to power succeeded in bringing about a subversion of values that both lightened our burden and tempered our souls. But they have now lost their power of contagion. Both tendencies are essentially a drive for more, but as this awesome energy accelerates, its force decreases. Today the best expression of this drive for more is not thought (art or politics) but technology."

from Octavio Paz. "Nihilism and Dialectics," in Alternating Current, trans. by Helen Lane (Arcade Publishing, 1980; orig. 1967), p. 121.



















© 2000 Roger B. Blumberg