PS1: Network Coordinates with PlanetLab

Due: night of September 24 (midnight)

In this problem set, you will write a set of tools to construct a network coordinate system, and investigate the ability of this system to predict geographic coordinates for oblivious hosts.
We have broken your task into three stages:

Preliminaries.

PlanetLab uses ssh as its remote login protocol, so you will need ssh set up and an RSA (v2) key available to access PlanetLab sites. We recommend you create a separate RSA key for PlanetLab instead of using your CS dept key. The PlanetLab infrastructure takes some time to distribute ssh keys, so we recommend you sign up early.

To create an ssh key, type the following on any CS dept machine:

ssh-keygen -t rsa -f pl_key

This creates your keypair: the private key in pl_key and the public key in pl_key.pub. Following this, visit the PlanetLab website and enter your details for a new account.

Select Brown University as your site, and make sure you state you are not a PI (Principal Investigator) or technical contact.

After you've registered, you'll need to log into your account. You can do this with the login button at the top left of the PlanetLab homepage. Click manage keys on the left toolbar, and upload your newly created public key (pl_key.pub). Soon after, you should be able to log in to PlanetLab nodes in the brown_lsns slice. To see the list of nodes, visit the "manage nodes" option and select the brown_lsns slice.

For example: ssh -i pl_key -l brown_lsns earth.cs.brown.edu

After logging in, you will all be using the same home directory (you are the "user" brown_lsns). Anything you do that creates files should do so in a subdirectory named after your (Brown) username.

1) Collect pairwise latencies.

Gather the n x n matrix of ping-times between all members of the brown_lsns slice. You may find a list of these sites here.

Write a program that accepts a file of foreign nodes to ping. It should ping each node in turn, skipping itself and pausing briefly between pings, and record a latency figure for each. It should also accept a number indicating how many times to go through the list. The output should contain all ping times (not just the lowest ping time to each host).
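As a starting point, here is a minimal Python sketch of such a pinger. It assumes the node has Python 3.7+ and the Linux iputils ping (whose replies contain a field like "time=12.3 ms"); the function names, the 0.5 second pause, and the output format (source, target, round, latency or "lost") are illustrative choices, not requirements.

```python
import re
import socket
import subprocess
import sys
import time

# Assumed reply format: Linux iputils ping prints "time=12.3 ms".
RTT_RE = re.compile(r"time[=<]([\d.]+) ?ms")

def parse_rtt(ping_output):
    """Return the round-trip time in ms from ping's output, or None if lost."""
    match = RTT_RE.search(ping_output)
    return float(match.group(1)) if match else None

def ping_once(host, timeout_s=2):
    """Send one echo request; return the RTT in ms, or None on loss or error."""
    try:
        result = subprocess.run(
            ["ping", "-c", "1", "-W", str(timeout_s), host],
            capture_output=True, text=True, timeout=timeout_s + 3)
    except (subprocess.TimeoutExpired, OSError):
        return None
    return parse_rtt(result.stdout)

def read_hosts(path, me):
    """Read one hostname per line, skipping blank lines and this machine."""
    with open(path) as f:
        hosts = [line.strip() for line in f if line.strip()]
    return [h for h in hosts if h != me]

def main(argv):
    if len(argv) != 3:
        print("usage: pinger.py HOSTS_FILE ROUNDS", file=sys.stderr)
        return
    me = socket.gethostname()
    hosts = read_hosts(argv[1], me)
    for round_no in range(int(argv[2])):
        for host in hosts:
            rtt = ping_once(host)
            # Record every sample, including losses, so nothing is discarded.
            print(me, host, round_no, "lost" if rtt is None else rtt, flush=True)
            time.sleep(0.5)  # brief pause between pings

if __name__ == "__main__":
    main(sys.argv)
```

Sending one echo request per ping invocation keeps each sample independent and makes a lost packet easy to record as its own line.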

Next, you should write another program that coordinates the execution of the first program on all members of the slice, and then pulls the gathered data from each site to a "base station".
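One way to structure the coordinator is sketched below in Python. It shells out to ssh and scp with the key created in the preliminaries. The script name pinger.py, the file nodes.txt, the subdirectory name myuser, and the local data/ directory are all hypothetical, and the sketch assumes the pinger and node list have already been copied to each node and that a python interpreter is available there.

```python
import os
import subprocess
from concurrent.futures import ThreadPoolExecutor

KEY = "pl_key"          # private key created in the preliminaries
SLICE = "brown_lsns"    # the slice name doubles as the login name
WORKDIR = "myuser"      # hypothetical: subdirectory named after your username
SSH_OPTS = ["-i", KEY, "-o", "BatchMode=yes", "-o", "ConnectTimeout=10"]

def ssh_cmd(node, remote_command):
    """Build the argument list that runs remote_command on node."""
    return ["ssh", *SSH_OPTS, "-l", SLICE, node, remote_command]

def scp_pull_cmd(node, remote_path, local_path):
    """Build the argument list that copies a remote file to the base station."""
    return ["scp", *SSH_OPTS, f"{SLICE}@{node}:{remote_path}", local_path]

def run(cmd, timeout_s=600):
    """Run a command; True on exit status 0, False on failure or timeout."""
    try:
        return subprocess.run(cmd, timeout=timeout_s).returncode == 0
    except (subprocess.TimeoutExpired, OSError):
        return False

def collect_from(node, rounds=10):
    """Run the pinger on one node, then pull its output file back."""
    measure = f"cd {WORKDIR} && python pinger.py nodes.txt {rounds} > pings.txt"
    ok = run(ssh_cmd(node, measure)) and run(
        scp_pull_cmd(node, f"{WORKDIR}/pings.txt", f"data/{node}.txt"))
    return node, ok

def collect_all(nodes, workers=8):
    """Measure from all nodes in parallel; returns {node: succeeded}."""
    os.makedirs("data", exist_ok=True)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return dict(pool.map(collect_from, nodes))
```

The returned dictionary tells the base station which nodes failed, so they can be retried or left out of the matrix.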

While we do not require you to write in any particular programming language, we recommend the use of a scripting language such as Perl, Python, or even shell script. Please note that PlanetLab slices by default contain very limited software (i.e. next to none). It is your responsibility to set up whatever you'll need to run your program and mirror it across all members of the slice. Avoid using PlanetLab as a compile farm. Compile your programs at Brown, and copy them over to members of the brown_lsns slice.

Finally, note that your program will have to be tolerant of network and control errors. Examples include lost ping packets, asymmetric ability to ping hosts, failures to log in to PlanetLab sites (some sites implement a login quota for a slice), and file transfer failures or timeouts.
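Many of these failures are transient, so a small retry wrapper can help. The Python sketch below is one possible shape: with_retries is a hypothetical helper, and the attempt count and backoff delays are arbitrary defaults you should tune.

```python
import time

def with_retries(action, attempts=3, base_delay_s=1.0, sleep=time.sleep):
    """Call action() until it returns a non-None value or attempts run out.

    A None result and the listed exceptions both count as failures.
    Returns the first successful result, or None if every attempt failed.
    """
    for attempt in range(attempts):
        try:
            result = action()
        except (OSError, TimeoutError):
            result = None
        if result is not None:
            return result
        if attempt < attempts - 1:
            sleep(base_delay_s * 2 ** attempt)  # exponential backoff
    return None
```

A caller might wrap a single ping or a file transfer in with_retries and record the pair as missing when it returns None, rather than aborting the whole run.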

2) Compute network coordinates

Once you have gathered your ping data at the base station, write a program to compute network coordinates for all members of the slice using the "centralized Vivaldi algorithm". Here you will use the lowest ping-time between any two particular machines.
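The Python sketch below shows one reading of this step: reduce the samples to a minimum-latency matrix, then relax a spring system in which each pair of nodes pushes or pulls according to the difference between measured latency and coordinate distance. It follows the simple centralized algorithm as we understand it from the Vivaldi paper; the fixed timestep t, the fixed sweep count (instead of an error tolerance), the symmetric treatment of latency, and the omission of height vectors are simplifying assumptions.

```python
import math

def norm(v):
    return math.sqrt(sum(c * c for c in v))

def min_latency_matrix(samples, nodes):
    """samples: iterable of (src, dst, rtt_ms). Missing pairs stay None."""
    index = {name: k for k, name in enumerate(nodes)}
    n = len(nodes)
    L = [[None] * n for _ in range(n)]
    for src, dst, rtt in samples:
        i, j = index[src], index[dst]
        if L[i][j] is None or rtt < L[i][j]:
            L[i][j] = L[j][i] = rtt  # assumption: treat latency as symmetric
    return L

def relax(L, x, t=0.05, sweeps=500):
    """Move each node along the net spring force for a fixed number of sweeps."""
    n = len(x)
    for _ in range(sweeps):
        for i in range(n):
            force = [0.0] * len(x[i])
            for j in range(n):
                if i == j or L[i][j] is None:
                    continue
                diff = [a - b for a, b in zip(x[i], x[j])]
                dist = norm(diff)
                if dist == 0:
                    continue  # no defined direction; start nodes apart
                e = L[i][j] - dist  # positive: spring is compressed, push away
                force = [f + e * c / dist for f, c in zip(force, diff)]
            x[i] = [a + t * f for a, f in zip(x[i], force)]
    return x
```

Too large a timestep makes the system oscillate and too small a one slows convergence, so it is worth experimenting with t.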

Using the same data, simulate the distributed Vivaldi algorithm by supplying random samples from all ping times. You should begin your simulation with all nodes near (but not on!) the origin with a small (but not zero!) height vector. Experiment with varying values of the adaptive timestep, and try to find one that works well in practice.
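To make the update rule concrete, here is a hedged Python sketch of one distributed step with an adaptive timestep and a height vector, based on our reading of the Vivaldi paper. The constants C_C and C_E, the minimum height, the initial ranges, and the Node class are illustrative assumptions, and your own simulation may be organized quite differently.

```python
import math
import random

C_C = 0.25         # timestep constant (assumed value; experiment with it)
C_E = 0.25         # error-averaging constant (assumed value)
MIN_HEIGHT = 1e-3  # keep heights strictly positive

class Node:
    def __init__(self, dims=2, rng=random):
        # Start near, but not on, the origin with a small non-zero height.
        self.pos = [rng.uniform(-0.01, 0.01) for _ in range(dims)]
        self.height = 0.01
        self.error = 1.0  # local error estimate; 1.0 means "no confidence"

def predicted_rtt(a, b):
    """Planar distance plus both heights (height-vector distance)."""
    planar = math.sqrt(sum((p - q) ** 2 for p, q in zip(a.pos, b.pos)))
    return planar + a.height + b.height

def update(me, other, rtt):
    """Adjust me after observing one latency sample to other."""
    if rtt <= 0:
        return
    dist = predicted_rtt(me, other)  # > 0 because heights are positive
    total = me.error + other.error
    w = me.error / total if total > 0 else 0.5
    sample_error = abs(dist - rtt) / rtt
    me.error = sample_error * C_E * w + me.error * (1 - C_E * w)
    delta = C_C * w  # adaptive timestep: uncertain nodes move more
    scale = delta * (rtt - dist) / dist
    me.pos = [p + scale * (p - q) for p, q in zip(me.pos, other.pos)]
    me.height = max(MIN_HEIGHT, me.height + scale * (me.height + other.height))

def simulate(samples, names, steps=5000, rng=random):
    """samples: list of (src, dst, rtt_ms); feed random samples to the nodes."""
    nodes = {name: Node(rng=rng) for name in names}
    for _ in range(steps):
        src, dst, rtt = rng.choice(samples)
        update(nodes[src], nodes[dst], rtt)
    return nodes
```

Passing a seeded random.Random instance as rng makes runs repeatable, which helps when comparing different values of the timestep constant.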

Repeat the last two experiments, but use the latitude and longitude of each machine as its initial coordinates (again, with a small height vector). Latitude and longitude for each site may be found here in brown_lsns.txt.

Compare the results from all your experiments, in terms of convergence times and the squared error metric.
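For the comparison, a squared error function along the following lines may be handy. This Python sketch sums over each unordered pair once and skips pairs with no measurement; both are assumptions about how you choose to define the metric, and it ignores height vectors.

```python
import math

def squared_error(L, x):
    """Sum of (measured latency - coordinate distance)^2 over pairs i < j.

    L is a matrix of latencies with None for missing pairs;
    x is a list of coordinate lists, one per node.
    """
    total = 0.0
    for i in range(len(x)):
        for j in range(i + 1, len(x)):
            if L[i][j] is None:
                continue
            dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(x[i], x[j])))
            total += (L[i][j] - dist) ** 2
    return total
```

Evaluating this after every sweep or every batch of samples gives an error-versus-time curve, from which a convergence time can be read off.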

3) Map back to geography

Propose a way to map an arbitrary network coordinate back to geographic coordinates.

Write a program that obtains network coordinates for an oblivious host. It should coordinate a set of pings from brown_lsns hosts to the address in question, and derive network coordinates for it. Note that your mapping mechanism above should be persistent, i.e. you should simply reuse the all-pairs ping data you collected previously.
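One possible way to derive the coordinates, sketched in Python below, is to keep the slice members fixed at the coordinates you already computed and relax only the new host against its measured latencies. The function name locate, the centroid starting point, and the timestep and iteration count are illustrative assumptions; heights are again left out for brevity.

```python
import math

def locate(landmarks, rtts, t=0.05, iterations=500):
    """Place one new host given fixed landmark coordinates.

    landmarks: list of coordinate lists for the slice members;
    rtts: measured latency from each landmark to the host (None if unreachable).
    """
    dims = len(landmarks[0])
    # Start at the landmark centroid, nudged so we do not sit on a landmark.
    pos = [sum(lm[d] for lm in landmarks) / len(landmarks) + 1e-3
           for d in range(dims)]
    for _ in range(iterations):
        force = [0.0] * dims
        for lm, rtt in zip(landmarks, rtts):
            if rtt is None:
                continue  # this landmark could not reach the host
            diff = [p - q for p, q in zip(pos, lm)]
            dist = math.sqrt(sum(c * c for c in diff))
            if dist == 0:
                continue
            force = [f + (rtt - dist) * c / dist for f, c in zip(force, diff)]
        pos = [p + t * f for p, f in zip(pos, force)]
    return pos
```

Because the landmarks never move, the all-pairs data gathered earlier does not need to be recollected for each new host.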

Use the "map back" procedure you proposed to guess the geographic coordinates of the oblivious machine.

Resources

You might find these tools useful in completing your assignment.

Handing in your assignment

You should hand in:

Email us (jj and cce) a .tar.gz or .zip file containing the above on Wednesday night.