This is new coursework, so please expect a few bumps in the mechanics. TensorFlow code (with TensorPack functions) will look very different from MATLAB, and much of this project is about familiarizing yourself with these systems. If you get stuck, please post on Piazza or ask a TA, and we will do our best to help you through! Further, this project involves some waiting around, as CNNs take a long time to train.
We will design and train convolutional neural networks (CNNs) for scene recognition using the TensorFlow system. Remember scene recognition with bag of words, which achieved 50 to 70% accuracy on 15-way scene classification? We're going to complete the same task on the 15 scenes database with deep learning and obtain higher accuracy. We will try the two most common approaches to recognition problems in computer vision today: training a deep network from scratch (if you have enough data) and fine tuning a pre-trained network.
Task 0: Install TensorFlow and TensorPack, and familiarize yourself with the stencil code. This will take time, so take it slow and learn to follow the code flow.
Task 1: Train a CNN to recognize scenes with the provided architecture and a dataset of 1,500 training examples. This isn't really enough data to gain high accuracy given the number of parameters, so we will try to:
class Scene15
._build_graph()
get_data()
._build_graph()
Training for this part might take 20-30 minutes, at 40 seconds per epoch on James' laptop CPU.
Possibly-helpful links: http://tensorpack.readthedocs.io/en/latest/modules/dataflow.imgaug.html, https://www.tensorflow.org/tutorials/layers
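The first link above points to TensorPack's image augmentation module. As a library-independent illustration of what an augmentor does, here is a minimal sketch of a random horizontal flip in plain Python. It does not use TensorPack's actual API; the function names `horizontal_flip` and `random_flip` and the list-of-rows image format are assumptions made for this example only.

```python
import random


def horizontal_flip(image):
    """Mirror an image left to right. `image` is a list of rows of pixel values."""
    return [row[::-1] for row in image]


def random_flip(image, probability=0.5, rng=random):
    """Flip with the given probability; otherwise return the image unchanged."""
    if rng.random() < probability:
        return horizontal_flip(image)
    return image


# A tiny 2x3 "image", for illustration only.
tiny_image = [[1, 2, 3],
              [4, 5, 6]]
print(horizontal_flip(tiny_image))
```

Applying a random transformation like this each time an image is drawn means the network rarely sees exactly the same input twice, which can help when the training set is small.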
+50pts: Achieve at least 50% test accuracy (for any training epoch) on the 15 scene database, and complete each of the requested features.
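Because the target counts for any training epoch, it helps to track the best test accuracy seen so far rather than only the final one. The sketch below shows one way to do this in plain Python; the helper names and the accuracy values are hypothetical, and the stencil code may report accuracy differently.

```python
def best_epoch(test_accuracies):
    """Return (epoch_number, accuracy) for the highest test accuracy, 1-indexed."""
    best_index = max(range(len(test_accuracies)), key=lambda i: test_accuracies[i])
    return best_index + 1, test_accuracies[best_index]


def meets_threshold(test_accuracies, threshold):
    """True if the test accuracy of any epoch reaches the threshold."""
    return any(acc >= threshold for acc in test_accuracies)


# Hypothetical per-epoch test accuracies, for illustration only.
example_accuracies = [0.21, 0.38, 0.47, 0.52, 0.49]
epoch, accuracy = best_epoch(example_accuracies)
print("Best epoch:", epoch, "accuracy:", accuracy)
print("Meets 50% target:", meets_threshold(example_accuracies, 0.50))
```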
Task 2: Fine tune the VGG-16 pre-trained CNN to recognize scenes, where the CNN was pre-trained on ImageNet. For this, begin by downloading the VGG-16.npy model and placing it in your code directory. Or, if you're running on a department machine, then please feel free to uncomment the department file system location of this file in run.py __main__.
Training for this part will take many hours—two hours per epoch on James' laptop CPU. Leave it to run overnight, and use the function to resume training.
+20pts: Achieve at least 85% test accuracy (for any training epoch) on the 15 scene database.
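To plan around the per-epoch timings quoted for Tasks 1 and 2, a rough estimate is simply time per epoch multiplied by the number of epochs. The sketch below works through that arithmetic. The 30-epoch count for Task 1 matches the unmodified starter code described on this page; the 5-epoch count for Task 2 is a hypothetical value chosen only for illustration, and real timings will vary by machine.

```python
def estimate_training_minutes(seconds_per_epoch, num_epochs):
    """Estimate total training time in minutes from a per-epoch timing."""
    return seconds_per_epoch * num_epochs / 60.0


# Task 1: 40 seconds per epoch for 30 epochs.
task1_minutes = estimate_training_minutes(40, 30)

# Task 2: two hours per epoch; 5 epochs is a hypothetical count.
task2_hours = estimate_training_minutes(2 * 60 * 60, 5) / 60.0

print("Task 1 estimate (minutes):", task1_minutes)
print("Task 2 estimate (hours):", task2_hours)
```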
Task 3: TensorBoard. TensorBoard is a locally-hosted Web-based interface to assess models. Use TensorBoard to visualize your loss and error. Each training session is saved as a set of logs and model weights (which you can then use on new examples).
$> tensorboard --logdir=train_log/run
+5 pts: Use this information and the visual outputs in your write up.
Task 4: Write up. We provide you with a LaTeX template, writeup/writeup.tex. Please compile it into a PDF and submit it along with your code. We conduct anonymous TA grading, so please don't include your name or ID in your writeup or code.
+5 pts: For your write up.
-5*n pts: Lose 5 points each time you do not follow the hand-in instructions.
The following is an outline of the stencil code:
run.py: The top level function for data loading and network training. This is what you will run, e.g.,
$> python run.py --task 1 --gpu -1
Arguments are explained at the top of the file, plus you can inspect them in __main__. If you run this starter code unmodified, then it will train a simple network that achieves ~40% accuracy after 30 epochs: somewhat better than the tiny images and nearest neighbor baseline, but not as good as HOG + bag of words + linear SVM.
parameters.py: Contains all the tweakable parameters; feel free to add your own if you wish.
your_model.py: This is your model for Task 1, to be trained from scratch.
vgg_model.py: This is the VGG-16 model for Task 2, to be fine tuned.
Be sure to analyze in your writeup.pdf whether your extra credit has improved classification accuracy. Each item is "up to" some amount of points because trivial implementations may not be worthy of full extra credit. Some ideas:
A GPU isn't required to complete the project. If you're happy to multitask, then you can complete this project by 'checking in' on your training. TensorPack even has a function to send you an email or SMS when your training is done (add to the TrainConfig callbacks).
However, a GPU will speed up training dramatically. CIT Sun Lab has machines in the 6th row with GPUs that can be used for training. Further, you can investigate how to schedule jobs on the department's grid GPU machines. These are a limited resource, so please play nice.
We will use TensorFlow through Python. This is set up on the departmental machines via Python virtual environments.
$> source /course/cs1430/tf_cpu/bin/activate
$> source /course/cs1430/tf_gpu/bin/activate
For a personal machine, please visit the TensorFlow Website for installation instructions. We will also use an additional library called TensorPack, which provides convenience functions.
Windows: James' installation process on Windows with Python 3.6, through PowerShell:
Use pip3 to install TensorFlow:
$> pip3 install --upgrade tensorflow
Use pip3 to install TensorPack:
$> pip3 install --upgrade tensorpack
$> python
>>> import tensorflow as tf
>>> hello = tf.constant('Hello, TensorFlow!')
>>> sess = tf.Session()
>>> print(sess.run(hello))
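If the session above fails at the import step, it can be useful to check whether Python can locate both packages at all. The following is a minimal sketch using only the standard library; the helper name `check_installed` is hypothetical and not part of the stencil code. It only looks the packages up and does not import them, so it will not catch a broken installation.

```python
import importlib.util


def check_installed(package_names):
    """Map each top-level package name to True if Python can locate it."""
    return {name: importlib.util.find_spec(name) is not None
            for name in package_names}


for name, found in check_installed(["tensorflow", "tensorpack"]).items():
    print(name, "found" if found else "NOT found")
```

Run this inside the same virtual environment or interpreter that you plan to use for training, since each environment has its own set of installed packages.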
If you have a personal NVIDIA GPU and want to use it to speed up processing, then it's a little more complicated to set up; please venture for yourself using the TensorFlow documentation.
Recommended:
Project description and code by Aaron Gokaslan, James Tompkin, James Hays. Originally solely by James Hays, but translated to TensorFlow from MatConvNet by Aaron.