Project SRC: Time Machine

Partner Choice Due Friday, April 7th at 11:59 PM (EST)
Part 1 Due Friday, April 14th at 6:00 PM (EST)
All Parts Due Friday, April 21st at 6:00 PM (EST)


Introduction

Each of us produces immense amounts of digital data today, and we all use software in doing so. In particular, if you use a note-taking app, that app might store your notes in its own proprietary file format, and the app itself will assume that it runs on a current operating system (e.g., iOS, Android, macOS, or Windows) and widely-used hardware architecture (e.g., x86-64).

But what happens if you, old and wise, a few decades from now, decide to look at your old CS 300 notes to relive your college days? The app may be long since defunct, your laptop or smartphone from your college days may have moved on to greener pastures, and you’ll probably be using devices that run a different operating system on a different hardware architecture (e.g., ARM64 or one of its successors).

If your college notes aren’t exciting enough, consider NASA’s lost Apollo guidance software and their hunt for old processors on eBay: despite enabling some of humanity’s greatest technological achievements, NASA faced difficulty maintaining their space shuttles with antiquated software.

The preservation of digital artifacts and the software needed to access them is already an acute challenge, and will only grow in importance as more critical infrastructure comes to depend on software. In this assignment, you’ll explore some of the motivations behind software preservation, and do a hands-on exploration of the challenges of software preservation.

Partners (🚨Action Required🚨)

You will choose a partner for the written portion of this assignment. Alternatively, if you would prefer to be randomly assigned a partner, we will pair you with a classmate. To help us determine groups, everyone should fill out this form by 11:59pm on Friday, March 24! Within the form, you will either write down your chosen partner’s CS login, or opt into random assignment.

Learning Objectives and Answer Expectations

This project will help you:

Answer Expectations: Strong answers to the written questions in this assignment will be characterized by the quality of your arguments. You can ensure you get full credit by making explicit claims and supporting them with evidence along clear lines of reasoning. You should not worry about length—anything between a few sentences and 1-2 short paragraphs is perfectly acceptable.

Partner Expectations: You will work with a partner for the written section of this assignment only. Please note that you should still submit your project individually! Feel free to discuss any of the written questions with your partner. However, you should only submit the same response for questions that you have been instructed to complete together.

Assignment installation

Ensure that your project repository has a handout remote. Type:

$ git remote show handout

If this reports an error, run:

$ git remote add handout https://github.com/csci0300/cs300-s23-projects.git

Then run:

$ git pull
$ git pull handout main

This will merge our timemachine folder with your repository.

Once you have a local working copy of the repository that is up to date with our stencils, you are good to proceed. You’ll be handing in your work for this project to the timemachine directory in the working copy of your projects repository.

Part 1: Reading Old Files

You, a budding digital preservationist, start browsing digital collections (as one does), and stumble upon an archive with an interesting file:

Task: Download the archive.

Eager to make your mark on history, you take a peek at what it is… and are met with almost complete gibberish. What’s going on?

The problem you’re facing is one all too common in the digital preservation world. Until recently, much of the focus has been on data preservation; but preserving data alone can often miss key components of the environment that are necessary to interpret the data meaningfully. Be it notes or 3D modeling, data formats are often integrated with the software used to create, view, and manage them, rendering them useless without the necessary software. Digital preservationists have increased their focus on software to address this problem.

After some research, you discover that the file is actually a WordPerfect file. Since the file doesn’t seem to be working with your current software, you deduce that it’s from an earlier time: the DOS era (mid-1980s to mid-1990s).
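One way to reach that conclusion is to look at the first few bytes of the file. The sketch below assumes the file uses the layout of WordPerfect 5.x and later, which begins with the byte 0xFF followed by the ASCII letters WPC. The function names are hypothetical, and the check says nothing about other formats.

```python
# Signature at the start of WordPerfect 5.x and later documents:
# the byte 0xFF followed by the ASCII letters "WPC".
WPC_MAGIC = b"\xffWPC"


def looks_like_wordperfect(header: bytes) -> bool:
    """Return True if the given leading bytes start with the WPC signature."""
    return header.startswith(WPC_MAGIC)


def sniff_file(path: str) -> bool:
    """Read only the first four bytes of a file and test them (hypothetical helper)."""
    with open(path, "rb") as f:
        return looks_like_wordperfect(f.read(len(WPC_MAGIC)))
```

Calling sniff_file on the downloaded document should return True if it really is a WordPerfect 5.x file. A match only identifies the format; you still need software that understands it to read the contents.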

Assignment

How do we run a program that was written for MS-DOS in the 1980s on today’s computers? First of all, we need a processor and hardware that can run the program. Fortunately, WordPerfect ran on x86 machines (amongst other architectures in use at the time), so as long as we can get a copy of a WordPerfect executable, we should be good, right? Not so fast.

The WordPerfect executable, like all programs, makes assumptions about the syscalls available and the kernel that runs underneath. So, we also need a DOS kernel and the ability to run it on modern hardware! In this instance, we’ll emulate a DOS kernel, rather than actually installing an ancient operating system on your computer.

Emulators are one of the most common methods of interacting with legacy software. Essentially, an emulator recreates the original environment needed by the software, allowing it to run on a modern computer. In our case, we will use DOSBox to emulate a DOS environment. This will give us a platform to run WordPerfect and open our file.

Note: For this, and all following installation steps, you should not use the course container.

Download DOSBox

  1. Install DOSBox on your local machine: follow the installation instructions for your operating system (Mac OS X, Windows).
    (If you want the full retro experience, press Alt-Enter (Opt-Enter on Mac) to enable fullscreen when running DOSBox.)
  2. Create a folder on your host computer (e.g., on Mac, ~/DOSBOX), then start DOSBox. You’ll see an MS-DOS command prompt (Z:\>). Mount the folder you created as the C: drive by running Z:\> MOUNT C <FULL-PATH-TO-FOLDER> inside DOSBox (for instance, Z:\> MOUNT C ~/cs300/DOSBOX). Note that this mounting functionality is something the emulator provides; it wasn’t available in the original DOS running on a physical computer.
  3. Now, any file within that folder will automatically appear in the DOSBox emulated environment. Switch to the C: drive with the command Z:\> C:.
  4. DOSBox rebinds the Ctrl-Fn shortcuts to special actions, such as Screenshot, Record Video/Audio, etc. These are helpful for DOSBox’s configurations, but unfortunately, they interfere with the WordPerfect program. You will need to remap all of the special actions so that you can use the Ctrl-Fn buttons freely. To start, open the DOSBox Keymapper with Ctrl-F1 (or Ctrl-Shift-F1 / Ctrl-Cmd-F1).
You should now see a screen like this!

Help, my KeyMapper is not showing up!
Alternatively: Help, I messed up my KeyMapper!
  1. If you cannot open the KeyMapper using any of the three shortcuts listed in Step 4, you likely have a problem with your KeyMapper configuration. These steps will also help if you did manage to get to the KeyMapper, but changed something and now can’t get back.
  2. We will reset the KeyMapper to its default values. Please follow the instructions that correspond to your OS.
  3. IF YOU ARE ON MAC: In your terminal, cd into the mounted .dmg volume containing your download of DOSBox. Then, type in cd dosbox.app, then cd Contents, and then cd MacOS. From here, you can reset your KeyMapper by entering ./DOSBox -resetmapper.
  4. IF YOU ARE ON WINDOWS: Locate the folder where DOSBox is installed (typically C:\Program Files (x86)\DOSBox-0.74-3). Then, run the Reset KeyMapper.bat script by double-clicking it.
  5. Now, your KeyMapper should be reset! Go ahead and try the three shortcuts in Step 4 again.
  5. One by one, select each of the items in the bottom-right table (containing ShutDown, Cap Mouse, Fullscreen, etc.). For each item, except for Fullscreen and Mapper, you should ensure that both “mod1” and “mod2” are selected. Refer to the image below for an example of what this looks like.
Remapping Example

  6. Save and exit out of the Keymapper; your keys should behave normally now.

Download WordPerfect

  1. Time to download WordPerfect 5.1! WordPerfect was first developed in 1979, and in its day it would normally have been installed from physical floppy disks. Nowadays, we can cut some corners and download the software online as a floppy disk .img file. Unfortunately, many modern computers are not able to easily interact with this file type. Instead, we’ve provided you with the pre-extracted installation files.

Task: Download ALL of the installation files.

  1. Make a directory titled “WP” inside your mounted DOSBOX folder, and copy all of the installation files to that directory. Additionally, move the secret-files you downloaded earlier into your DOSBOX (not your DOSBOX/WP) folder.
  2. DOSBox does not always reflect changes to your mounted folder immediately. Restart DOSBox at this point, and mount the folder again.
  3. Inside DOSBox, navigate to your WP directory and run INSTALL. This will start the WordPerfect installation process. Press y or <Enter> when prompted. You only need to install the core program and the help files. Make sure you respond y to features described as essential (including help/utility files), and n to all questions about installing extra features (printer drivers, graphics, etc.).
  4. During installation, a new directory called WP51 should be created in your DOSBOX folder. This is your WordPerfect directory! Copy CS300REF.WPD into this folder.

Success! WordPerfect should now be installed.

  1. Navigate into your WordPerfect directory, WP51, by running cd WP51 from within your mounted folder. You can open a file in WordPerfect by typing WP <file-name> at the C:\WP51> prompt.
  2. Familiarize yourself with WordPerfect’s interface. Then, complete the tasks listed in the CS300REF.WPD document.

Your instructions are inside the document. You should copy any files produced into the timemachine directory of your CS 300 projects repo. Enjoy!

Reflection

Wahoo! We did it; the old documents were saved from disappearing forever into the void, and you now have a working DOS environment to experiment with more legacy software. (If you’re interested, websites like DOSGames and WinWorld host a large collection of DOS-era software and games; try one out, and experience what technology felt like in the 90s[1].) Now, if someone discovers a stash of George R. R. Martin’s unfinished novels, you’ll be adequately equipped to handle his WordStar files[2].

The process you went through, albeit very simplified, is a real problem digital archivists face every day when handling decades-old data formats. In the next part of the assignment, as we explore additional challenges to and alternative methods of software preservation, keep in mind some of the difficulties you faced while setting up preservation software and interacting with legacy applications.

Part 2: Hardware Emulation

In Part 1, we explored a purely software-level emulator, DOSBox. DOSBox emulates a DOS kernel on an x86 machine, but emulation is not the only option here: it is certainly possible to boot up DOS directly on modern machines, as DOS environments (e.g., MS-DOS or IBM PC DOS) are compatible with today’s Intel x86 CPUs.

However, it is not always so straightforward. For example, Library of Congress archivists who worked to preserve the digital data of molecular biologist Nina Fedoroff faced some serious challenges: her data was created with the MacDraw Plus and HyperCard programs, which require an Apple Mac OS 9 environment. Macs of the Mac OS 9 era used the PowerPC instruction set, and software for them is incompatible with modern x86 or ARM64 computers. Indeed, with new hardware like Apple’s M1 chips and the accompanying transition to the ARM64 architecture, it is entirely possible that in a few years or decades, software applications written for our current x86 architectures will be incompatible with the hardware of the day. Apple M1 Mac users are already acutely aware of this issue[3].

Of course, one option for accessing applications built for a specific architecture is to purchase a physical retro-computer; for instance, old PowerPC Macintoshes can be found on eBay. However, this is neither durable nor feasible in the long run: physical hardware eventually breaks, and rare retro-hardware may be difficult for digital archivists to access (even NASA has had difficulty). What if, just as we emulated a software environment in Part 1, we could emulate the hardware components themselves (e.g., a different CPU architecture)?

Enter hardware emulation. In this part, we’ll explore one such example: the Atari 2600! Released in 1977, the Atari 2600 quickly dominated the market, becoming synonymous with video games and sparking the growth of the entire industry; following its decline, its games have become favorites in retro gaming communities (and have even found use in a rather surprising modern application: deep reinforcement learning).

Assignment

The Atari 2600 was built around the MOS Technology 6507, a cut-down variant of the 6502 that uses the 6502 instruction set. You won’t find this CPU in today’s computers; thus, we’ll use the Stella emulator.
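To get a feel for what a CPU emulator does at its core, here is a minimal sketch of a fetch-decode-execute loop for a handful of 6502 opcodes. It is not how Stella is implemented: the function name is hypothetical, code and data are kept in separate arrays, only the carry flag is tracked, and timing is ignored entirely.

```python
def run_6502_subset(program: bytes, max_steps: int = 10_000) -> dict:
    """Interpret a few 6502 opcodes and return the final machine state.

    Assumptions of this sketch: code and data live in separate arrays,
    only the zero page exists, decimal mode and every flag except carry
    are ignored, and cycle timing is not modelled.
    """
    memory = bytearray(256)  # zero-page RAM only
    a = x = carry = 0        # accumulator, X register, carry flag
    pc = 0                   # program counter (index into `program`)

    for _ in range(max_steps):
        if pc >= len(program):
            break
        opcode = program[pc]          # fetch
        if opcode == 0x00:            # BRK: treated here as "halt"
            break
        elif opcode == 0x18:          # CLC: clear carry
            carry = 0
            pc += 1
        elif opcode == 0xA9:          # LDA #imm: load accumulator
            a = program[pc + 1]
            pc += 2
        elif opcode == 0x69:          # ADC #imm: add with carry
            total = a + program[pc + 1] + carry
            carry = 1 if total > 0xFF else 0
            a = total & 0xFF
            pc += 2
        elif opcode == 0x85:          # STA zp: store accumulator
            memory[program[pc + 1]] = a
            pc += 2
        elif opcode == 0xAA:          # TAX: copy accumulator to X
            x = a
            pc += 1
        elif opcode == 0xE8:          # INX: increment X (wraps at 8 bits)
            x = (x + 1) & 0xFF
            pc += 1
        else:
            raise ValueError(f"unsupported opcode 0x{opcode:02X} at {pc}")

    return {"a": a, "x": x, "carry": carry, "pc": pc, "memory": memory}
```

For example, the byte sequence A9 05 69 03 85 10 00 loads 5, adds 3, and stores the result at zero-page address 0x10. A real emulator must additionally reproduce every opcode, every flag, the exact timing, and the console's graphics and sound hardware.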

Apple M1/ARM64 notes

The Stella emulator for this part of the assignment is not officially compatible with M1 machines, and the download page says that it is “Intel only”.

However, our testing on M1 devices suggests that it works just fine. The reason for this is that macOS on M1 machines includes an x86-64 translation layer (“Rosetta 2”). Rosetta 2 translates the machine code of x86-64 executables into ARM64 machine code, which the M1 processor then runs (though this does come with some slowdown). So, if you have an M1, you’re running one emulator (Rosetta 2) to run another emulator (Stella), and there are no fewer than three architectures involved: ARM64 (hardware), x86-64 (translated by Rosetta 2), and MOS-6502 (emulated by Stella)!
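The translation idea can be illustrated with a toy example. The sketch below is not how Rosetta 2 works internally. It only shows the general technique of translating guest machine code once and then running the translated result directly, instead of decoding every instruction each time it executes. It uses three 6502 opcodes as the guest, generated Python source as the target, and handles straight-line code only; all names are hypothetical.

```python
from typing import Callable, Tuple


def translate_6502_subset(program: bytes) -> Callable[[], Tuple[int, bytearray]]:
    """Translate straight-line 6502 code into a Python function.

    Assumptions of this sketch: only LDA #imm, ADC #imm, STA zp and BRK are
    supported, there are no branches, and only the carry flag is tracked.
    """
    lines = [
        "def translated():",
        "    a = 0",
        "    carry = 0",
        "    mem = bytearray(256)",
    ]
    pc = 0
    while pc < len(program):
        opcode = program[pc]
        if opcode == 0x00:            # BRK: end of the translated block
            break
        operand = program[pc + 1]     # every supported opcode has one operand byte
        if opcode == 0xA9:            # LDA #imm
            lines.append(f"    a = {operand}")
        elif opcode == 0x69:          # ADC #imm
            lines.append(f"    total = a + {operand} + carry")
            lines.append("    carry = 1 if total > 0xFF else 0")
            lines.append("    a = total & 0xFF")
        elif opcode == 0x85:          # STA zp
            lines.append(f"    mem[{operand}] = a")
        else:
            raise ValueError(f"unsupported opcode 0x{opcode:02X} at {pc}")
        pc += 2
    lines.append("    return a, mem")

    # Compile the generated source once; the result can be called many times
    # without looking at the original guest bytes again.
    namespace: dict = {}
    exec("\n".join(lines), namespace)
    return namespace["translated"]
```

Translating A9 05 69 03 85 10 00 yields a function that returns an accumulator value of 8 and a memory array with 8 stored at address 0x10. The translation cost is paid once up front, which is one reason translated code can run faster than interpreted code.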

Task:

  1. Download the emulator for your operating system.
  2. Acquire some ROMs for Atari 2600 systems (Stella provides some guidelines here, or you can choose from this list).
  3. Choose any game, and play it for a bit. If you are having trouble starting your game, don’t be afraid to check the Stella documentation!

Now answer the following question:

Q1: In the README.md file in timemachine in your project repository, describe your experience with the game. Here are some guiding questions (you don’t need to follow them strictly, but do demonstrate that you’ve explored a game):

Hardware emulation provides a complete infrastructure for digital preservation, but it is also technically difficult to execute: all technical specifications and digital logic, down to precise clock cycles and analog elements, must be recreated by a hardware emulator. Moreover, hardware is constantly changing.

For example, Apple switched from Intel processors to Apple Silicon in 2020. If you have an M1 or M2 MacBook, your computer uses this new architecture! This means that many emulators designed for Intel hardware will not run natively on your device.

Task: Answer the following question (again, in timemachine/README.md).

Q2: What happens to hardware emulators when new architectures come along? How does this affect the feasibility of hardware emulation as a method of software preservation?

Part 3: Social Context (Partner)

Society has plenty of experience with preserving valuable historical artifacts outside of the digital realm: whether art, architecture, film, or archaeological conservation, the importance and process of preserving our cultural heritage is well established. But preservation in the digital context is less well understood.

Task: First, familiarize yourselves with approaches to preserving our cultural heritage.

  1. Read this article about ethical considerations surrounding preservation; then, read this case study about modern historical preservation techniques, priorities, and cultural and financial shortcomings.

Next, consider the efforts and cultural implications in the digital domain.

  1. Read pages 12-22 of this Library of Congress report (An Executable Past: The Case for a National Software Registry). As you’re reading, consider the justifications for and approaches to software preservation, and compare them with your notions of traditional preservation.

In the assignment, you explored two common ways emulation assists software preservation; while not the only option, emulation has seen the most success as a preservation tool. As interest grows and we continue to improve our toolbox, our ability to tackle technical challenges as a software preservation community increases. However, technical difficulties are just one aspect to consider when dealing with preservation.

Representation

As we reckon with our digital legacy, it’s important to consider what we choose to represent.

Task: With your partner, discuss the following and write up some key points from your conversation (again, in timemachine/README.md).

Q3: How does software preservation compare to other forms of preservation that society already engages in today? What standards should we apply, and how does this compare to the standards used for other types of preservation? For example, you could consider the preservation of art or architecture in your answer.

Legality

Another challenge in software preservation is legality. If you look into it, WordPerfect and the Atari ROMs were commercial software when they were produced; to use them, you had to obtain a license and/or purchase the software. Yet, preservation efforts have made these programs freely available on the internet. Digital archivists who host such artifacts face legal uncertainties every day.

Archivists frequently work with abandonware: software that, while technically still proprietary and protected by copyright, has been ignored by a potentially defunct manufacturer. Some manufacturers (if they’re still around) actively help abandonware sites, or at least tolerate them. However, this is not always the case; some manufacturers, like Microsoft or Nintendo, have taken legal action against digital archives, which resulted in some major sites shutting down.

Task: For the following question, coordinate with your partner to take on opposing, or at least conflicting views. Then, respond individually in timemachine/README.md. When you are done, come together and discuss your responses! Together, write up 3-4 bullet points from your conversation (also in timemachine/README.md). If you are able to come to a consensus, make sure to include your conclusion and explain how you reached it. If not, explain where your positions clashed.

Q4: Should digital libraries and preservationists receive legal protections, and if so, what might this look like? When supporting your answer, be sure to consider:

Tying it Together

So far, companies and government institutions have been rather uninvolved in software preservation efforts. As software proliferates and grows in complexity, and more of it moves to web services that have proprietary server-side code, it will become increasingly difficult for individuals and non-profit organizations to preserve digital artifacts on their own.

Task: For this activity, each partner should play the role of a different stakeholder in software preservation (e.g. the government, a company/software developer, an individual/non-profit, or a consumer). Your response should go in timemachine/README.md.

Q5: Take turns sharing what you think “your” responsibility towards creating and preserving digital content should be, and what your partner’s should be. Then, see if you are able to come to a consensus. If you are, write your conclusions about what each stakeholder’s responsibility should be. If not, write a summary of your conversation, and explain where your positions clash.

Here are some potential dimensions to consider (although feel free to explore different directions):

Maintaining legacy software is an expensive, laborious, and ever-growing task, and the financial incentive for effective preservation is not always high. Furthermore, thoroughly maintaining all of our old software requires substantial online resources and programming ability. As such, it is worth discussing the true importance of software preservation, weighed against these high costs.

One way to look at this conversation is to define where software preservation falls between public good and expensive taste. In Michael J. Rushton’s paper on Expensive Tastes and Public Funding for the Arts, he defines expensive taste as:

“…those held by a person who, compared with the general population, in order to achieve a given level of welfare, needs to have available for consumption a good (or a few goods) that is only available at a high price. Suppose, for example, George only enjoys an art form that is expensive to experience, when most of the population is satisfied by cultural offerings more cheaply obtained, and further assume that this art deeply matters to George in terms of his wellbeing and capability for enjoying a fully satisfying life.”

Meanwhile, public goods are commodities that benefit all citizens, and should therefore be made publicly available. Services that qualify vary by country, and might include public education, national defense, and healthcare. Furthermore, once an item is a public good, it will be:

“… made available to all members of a society. Typically, these services are administered by governments and paid for collectively through taxation. …The two main criteria that distinguish a public good are that it must be non-rivalrous and non-excludable. Non-rivalrous means that the goods do not dwindle in supply as more people consume them; non-excludability means that the good is available to all citizens.” (Investopedia)

Task: As with Question 4, coordinate with your partner to take on opposing, or at least conflicting views. After you respond individually in timemachine/README.md, come together and share what you wrote! You should then write a few sentences responding to your partner’s position, and include that in your README.md as well.

Q6: Is legacy software an expensive taste or a public good? Should this impact our approach to software preservation? Be sure to explain your reasoning.

Final Remarks

Congratulations 🎉! You’ve completed the SRC assignment for CS 300! We hope you’ve developed an appreciation for the difficulty and importance of software preservation, as well as some common techniques and considerations, and how they relate to the technical operating systems and hardware topics we discuss in the course.

Handing In

Please hand in the files and answers for this assignment via Git in your cs300-s23-projects-YOURNAME repository. Put your answers into the README.md file in the timemachine/ subdirectory of your project repository, and also put all other files from this assignment into that directory.

By 6:00 PM on Friday, April 21st, you must have filled in the file README.md in the timemachine directory in your projects repo, and pushed the files produced by Part 1 of this assignment.

Grading breakdown

This assignment is worth 3% of your total course grade.


This assignment was created for CS 300.


  1. One application that we found particularly interesting was Sid Meier’s Civilization 1; even in the 90s, games were truly quite sophisticated! ↩︎

  2. WordStar was the first word processor that offered textual WYSIWYG functionality; it preceded WordPerfect, and dominated the market until WordPerfect eventually took over. ↩︎

  3. Indeed, our course Docker container originated in part to support the new M1 Macs; the old virtualization software, VirtualBox, worked only on x86 machines, and was incompatible with ARM. ↩︎