We recommend opening an incognito or private browsing window to complete all of the steps in this guide. You may have multiple Google accounts, and by opening a private browsing window, you ensure that you are logged into exactly one account (your brown.edu account).
IMPORTANT: Please use your brown.edu account, since cs.brown.edu accounts do not have access to GCP.
IMPORTANT: When you open the page, make sure it actually has you logged into your brown.edu account. If you're not using a private browsing window, it's likely that after clicking the link, you will be logged into your default, non-Brown Google account (if you have one). This will cause issues! So make sure to switch to your brown.edu account.
Note: This screenshot doesn't include the billing account option, but for you, it may appear.
Make sure your project is active before proceeding:
In the "Service" drop-down menu, click "None" to select none of the options, then select "Compute Engine API". If this option does not appear in the drop-down menu, follow these instructions:
- Open the navigation menu (the three bars by the Google Cloud Platform logo)
- Hover over "APIs & Services"
- Click "Dashboard"
- Click the "+ ENABLE APIS AND SERVICES" button.
- Search for "Compute Engine API"
- Click on "Compute Engine API"
- Wait for the "ENABLE" button to appear, then click on it.
- Once the API is enabled, go back to step 1, and you should now be able to find the Compute Engine API quota option.
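If you prefer the command line, the same API can be enabled with the `gcloud` tool. This is a sketch, and assumes you have already installed the gcloud CLI and authenticated it against your project:

```shell
# Enable the Compute Engine API for the currently configured project
# (equivalent to the console steps above):
gcloud services enable compute.googleapis.com
```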
In the "Metric" drop-down menu, again click "None", then select "GPUs (all regions)". It might be easiest to use the search bar in the drop-down menu to search for this option.
Go to the navigation menu (again, the three bars next to the Google Cloud Platform logo).
Hover over "Compute Engine", then click "VM instances".
Click the "CREATE INSTANCE" button.
Fill out the beginning of the form as follows:
Region: us-east1 (South Carolina)
Machine configuration: set to 4 vCPUs and 15 GB of memory.
The choice of GPU is ultimately up to you. The Tesla K80 is powerful enough for this assignment, and is one of the cheapest options. You can choose to use another, more powerful GPU, but be wary of the hourly cost of the VM instance. You have $50 of credits at your disposal. You can view the hourly cost of this instance in the top-right corner of the form.
The VM instance will take some time to be created, and then it will be launched.
On the "VM instances" page, you can start and stop instances by clicking the three dots at the end of the instance's row and selecting (surprise) "Start" or "Stop".
As you can see in this image, the "Start" option is not available, since the instance is already started. This is indicated by the green checkmark.
Make sure to always shut down your VM instance when you're done using it. You should have plenty of credits to complete the project, unless you leave your instance running for a day or two straight. The typical VM setup costs around $0.50/hour.
You can SSH into your VM instance by clicking its "SSH" button on the VM instances page. This will pop up a window that will eventually load a shell. From the moment you create your VM, you have access to the virtual environment built specifically for Project 4.
Do not use your own virtual environment. Use the one we provide, as it provides the correct TensorFlow version that will work with the VM's GPU.
You can activate this environment by entering:
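The exact activation command is course-specific, but the general pattern for any Python virtual environment looks like this. The `demo_env` path below is purely illustrative; on the course VM you would source the pre-built environment's path instead of creating one:

```shell
# Create a throwaway environment just to show the pattern
# (on the course VM, skip this and use the provided environment's path):
python3 -m venv ~/demo_env

# Activation -- this is the key step:
source ~/demo_env/bin/activate

# Leave the environment when you're done:
deactivate
```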
If you find SSH-ing through the browser clunky or annoying, you can explore the gcloud command line tool for interfacing with your GCP instance. This includes both ssh and scp programs for editing/transferring files to your instance via your own computer's command line.
You can use `git` to clone your project code to the VM. If you haven't done this from the command line before, it's very simple. On your GitHub project's webpage, there should be a big green button that says "Clone or Download". When you click it, a link will show up that you can copy to your clipboard. In your VM's command line, enter `git clone <link>` to clone your project into the current directory. Git will likely ask you for your GitHub credentials; follow the steps it provides. Once all of this is done, there are a couple of options for actually editing your code:
- Edit your code directly on the VM using a terminal editor.
- Edit your code locally on your own computer, and use `git` to transfer code changes to the VM, where training happens.
For part 2 of Project 4, you will have to download the pre-trained weights for VGG16. You can do this by running the following command in your project's "code/" directory:
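The course's own download command may differ (check the handout), but for reference, the standard Keras-distributed VGG16 weights are hosted in the fchollet deep-learning-models GitHub release and can be fetched with `wget`:

```shell
# One commonly used source of pre-trained VGG16 weights
# (your course's download command may point elsewhere):
wget https://github.com/fchollet/deep-learning-models/releases/download/v0.1/vgg16_weights_tf_dim_ordering_tf_kernels_notop.h5
```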
You will also have to transfer model weights from GCP to your computer. This can be done several different ways:
- In the browser SSH window, you can download files by clicking the gear icon and selecting "Download file"; you'll need to enter the file's full path, which you can get with `readlink -f <file's name>` (don't include the brackets). To upload files, you can click the gear icon again and select "Upload file". This option for transferring files can be a little slow, but it's simple and straightforward.
- You can use Git's large file storage to track `.h5` files (the file extension of saved model weights). This should give you about 1 GB of large file storage in your GitHub repo. You can use `git add <model weight file>` to add just your best weights for storage in your repo, which can then be pulled to your local machine.
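Assuming the large-file storage mentioned above refers to Git LFS (whose free GitHub quota is about 1 GB), the setup would look like this; the weights filename is a placeholder:

```shell
# One-time setup per machine:
git lfs install

# Route .h5 files through LFS instead of regular Git storage:
git lfs track "*.h5"

# Commit the tracking config along with your best weights
# ("weights.h5" is a placeholder filename):
git add .gitattributes weights.h5
git commit -m "Save best model weights"
git push
```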
- If you set up the `gcloud` command line tool, you can use `gcloud compute scp` to transfer files between your instance and your computer. This is probably the fastest way to transfer files between GCP and your computer, but it takes some effort to set up.
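As a sketch of the `gcloud compute scp` route, where the instance name, zone, and paths are all placeholders for your own values:

```shell
# Instance -> your computer (e.g., pulling down saved weights):
gcloud compute scp my-instance:~/project/code/weights.h5 . --zone us-east1-b

# Your computer -> instance:
gcloud compute scp ./weights.h5 my-instance:~/project/code/ --zone us-east1-b
```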
Tensorboard is a super useful tool for visualizing the progress of your training. We have built Project 4 to create Tensorboard logs automatically. You can see graphs of training/test set loss, training/test set accuracy, visualizations of your network classifying preprocessed images, and (if you use the `--confusion` flag when running `run.py`) a confusion matrix.
Important note: While training in Project 4, your Tensorboard logs will be written to a directory named `logs/`. If you do not copy these logs to a directory of another name after training, the next training session will store model logs alongside your old logs in `logs/`, and Tensorboard will visualize this in a somewhat weird way. So make sure to save your `logs/` directory under a different name after training if you'd like to display it in Tensorboard later.
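Setting a finished run's logs aside is a single rename; the new directory name below is just an example:

```shell
# After training finishes, move the run's logs out of the way
# so the next run writes to a fresh logs/ directory:
mv logs/ logs_run1/
```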
There are a couple ways to view your Tensorboard logs on GCP.
Once you are finished training a model, go to the VM instances page on the GCP console.
Click on the "Activate Cloud Shell" icon in the top-right corner of the page. It should look like this:
This should pop up a console at the bottom of the page. Type in the following command, filling in the appropriate fields as they pertain to your setup:
Make sure it's using the correct zone/region of your instance if it asks.
Next, launch Tensorboard on port 8080:
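Put together, the two Cloud Shell steps are likely something like the following sketch, where the instance name, zone, and log directory are placeholders for your own values:

```shell
# From Cloud Shell, open an SSH session to your instance:
gcloud compute ssh my-instance --zone us-east1-b

# Then, on the instance (with the virtual environment activated),
# serve the logs on port 8080 for the Web Preview:
tensorboard --logdir logs/ --port 8080
```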
Now click the "Web Preview button" at the top-right of the Cloud Shell window:
Select "Preview on port 8080". This should launch the Tensorboard webpage in a new tab in your browser:
As an example of this method, here are the commands I had to use, given that my VM instance's name is
This method is especially useful if you'd like to view your training in real time.
Go to the VM instances page on the GCP console.
At the far right of where your instance is listed, click the three dots, then select "View network details".
Click "Firewall rules" at the left of the page:
Click "Create Firewall Rule" at the top of the page.
Everything in the form that appears can be left as default except for the following:
Now you should be able to launch Tensorboard in an SSH session with your VM instance and view it online using the external IP address of your instance (which can be found on the VM instances page) and the port number 6006. For example, in my SSH session, I've navigated to my `code/` directory. Here (with the Python virtual environment activated) I launch Tensorboard with the command:
The `--bind_all` flag is important to have here.
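That launch command is likely something like this sketch (the log directory is an assumption; point `--logdir` at your own logs):

```shell
# --bind_all makes Tensorboard listen on all network interfaces,
# so it is reachable via the instance's external IP on port 6006:
tensorboard --logdir logs/ --bind_all
```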
Now in my browser, I type in `<external IP>:6006`, and voila! Tensorboard loads.
This method can be used live during training by either having two SSH sessions running at once, with one training the network and the other running Tensorboard, or by using a program like tmux (should come pre-installed on your VM instance) to run both the network training routine and Tensorboard in the same SSH session.
You don't have to visualize your Tensorboard logs on GCP; if you have Tensorboard installed on your computer, you can simply transfer your logs to your computer and run Tensorboard from there. In your browser, you'd enter `localhost:<port>` to access the webpage.
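The local route might look like this, where the instance name, zone, and paths are placeholders:

```shell
# Pull the logs down to your computer:
gcloud compute scp --recurse my-instance:~/project/code/logs_run1 . --zone us-east1-b

# Run Tensorboard locally, then browse to localhost:6006:
tensorboard --logdir logs_run1/
```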
Many thanks to the TA staff of cs1470 (Fall 2019), specifically Josh Levin, Patrick Zhang, and Eleonora Kiziv, whose GCP guide this is modeled after.