This coursework makes new use of Google Cloud Platform, so please expect a few bumps in the mechanics. If you get stuck, please post on Piazza or visit TA hours, and we will help you through.
We will design and train convolutional neural networks (CNNs) for scene recognition using the TensorFlow system. Remember scene recognition with bag of words, which achieved 50 to 70% accuracy on 15-way scene classification? We're going to complete the same task on the 15 scenes database with deep learning and obtain a higher accuracy.
Task 1: Train a CNN to recognize scenes with a provided architecture and a small dataset of 1,500 training examples. This isn't really enough data, so we will try to:
Note: do not use sklearn.preprocessing; please implement standardization yourself.
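Since sklearn.preprocessing is off-limits, standardization is straightforward to write by hand. Below is a minimal numpy sketch under assumed image shapes and variable names (they are illustrative, not from the starter code); the key point is that the statistics come from the training set only.

```python
import numpy as np

def standardize(train_images, test_images, eps=1e-8):
    """Zero-mean, unit-variance scaling using statistics
    computed from the TRAINING set only."""
    mean = train_images.mean()
    std = train_images.std()
    train_std = (train_images - mean) / (std + eps)
    # Apply the same training statistics to the test set;
    # never recompute them from test data.
    test_std = (test_images - mean) / (std + eps)
    return train_std, test_std

# Illustrative usage with random stand-in data (1,500 training images).
train = np.random.rand(1500, 64, 64, 1)
test = np.random.rand(300, 64, 64, 1)
train_s, test_s = standardize(train, test)
```

Using a single global mean and standard deviation is the simplest choice; per-channel statistics are a common variant for color images.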
Task 2: Fine-tune the pre-trained VGG16 CNN to recognize scenes; the network was pre-trained on ImageNet. Fine-tuning is the process of starting from pre-trained weights and applying only a few gradient updates to solve your problem, which may have a different output space. In our case, VGG16 was trained for 1000-way ImageNet classification, but we wish to use it for 15-way scene recognition. You can find the pre-trained weights at
/course/cs143/pretrained_weights/vgg16.npy, or download a copy onto your Google Cloud Platform instance with
$> wget http://cs.brown.edu/courses/csci1430/proj4/vgg16.npy.
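As a concrete picture of what fine-tuning involves here: vgg16.npy stores a Python dictionary mapping layer names to [weights, biases] arrays. To repurpose the network for 15 classes, you keep every pre-trained layer except the final 1000-way classifier, which is re-initialized at the new output size and trained from scratch. A hedged numpy sketch of that weight surgery follows; the tiny fake dictionary stands in for the real file, and the layer name 'fc8' follows the usual VGG16 naming convention, which may differ from the starter code's.

```python
import numpy as np

def retarget_vgg_weights(weights, num_classes=15, last_layer='fc8'):
    """Keep all pre-trained layers; replace the final classifier
    with a freshly initialized num_classes-way layer."""
    new_weights = dict(weights)            # shallow copy of the layer dict
    w_old, _ = weights[last_layer]
    fan_in = w_old.shape[0]
    # Small random init for the new 15-way layer, to be trained from scratch.
    new_w = np.random.normal(0.0, 0.01, size=(fan_in, num_classes))
    new_b = np.zeros(num_classes)
    new_weights[last_layer] = [new_w, new_b]
    return new_weights

# Stand-in for loading vgg16.npy; only the final layer is shown.
fake_vgg = {'fc8': [np.zeros((4096, 1000)), np.zeros(1000)]}
retargeted = retarget_vgg_weights(fake_vgg)
print(retargeted['fc8'][0].shape)   # (4096, 15)
```

In practice you would also choose which earlier layers to freeze and which to update with a small learning rate; that choice is part of the task.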
These are the two most common approaches to recognition problems in computer vision today: either train a deep network from scratch, if you have enough data, or fine-tune a pre-trained network.
We provide a writeup template at writeup/writeup.tex. Please compile it into a PDF and submit it along with your code. TA grading is anonymous, so please do not include your name or ID in your writeup or code.
Be sure to analyze in your writeup.pdf whether your extra credit has improved classification accuracy. Each item is "up to" some number of points because trivial implementations may not be worthy of full extra credit. Some ideas:
We will use Google Cloud Platform; please follow the GCP Tutorial to set it up. Each account can receive a coupon only once, so please make sure you are using your brown.edu account, not your cs.brown.edu account or an account on another domain. You will receive $50 of credit. To avoid consuming your free credit unnecessarily, please shut down your VM instance after each use.
If you have used up your GCP coupon, you can also use Google Colab, which provides free GPUs; please follow the Colab Tutorial to set it up.
We have two additional Python virtual environments with TensorFlow and TensorPack installed. These can be used like our existing
/course/cs143/cs1430_env environment, but they reside instead at
/course/cs143/tf_gpu. BUT! Very few department lab machines have GPUs, and we will not tell you which ones, because using them is not recommended. You risk all kinds of trouble, such as someone else remotely logging into your machine, launching their own GPU job, and killing yours when the card runs out of memory.
We will not help you with these issues! Please use Google Cloud Platform!
We will use TensorFlow through Python. This is all set up on the departmental machines. For a personal machine, please visit the TensorFlow Website for installation instructions. We will also use an additional library called TensorPack, which provides convenience functions.
For example, here is James' installation process on his Windows laptop through PowerShell:
Use pip3 to install TensorFlow:
PS> pip3 install --upgrade tensorflow
Use pip3 to install TensorPack:
PS> pip3 install --upgrade tensorpack
Then verify the installation in a Python interpreter:
>>> import tensorflow as tf
>>> hello = tf.constant('Hello, TensorFlow!')
>>> sess = tf.Session()
>>> print(sess.run(hello))
b'Hello, TensorFlow!'
If you have an NVIDIA GPU on your local machine and want to use it, setup is more complicated; please investigate this on your own.
Before we start building our own deep convolutional networks, please look at Getting Started with TensorFlow. Please also go through the basic classification tutorial here, and the CNN on MNIST example here.
Project description and code by Aaron Gokaslan, James Tompkin, and James Hays. Originally written solely by James Hays, then translated from MatConvNet to TensorFlow by Aaron. GCP and Colab setup by Ruizhao Zhu, Zhoutao Lu, and Jiawei Zhang. We also referenced materials from Brown's CSCI 1470 Deep Learning course and welcomed suggestions from its previous TA, Philip Xu.