Getting Started with Neural Models

The aim of this assignment is to get you started with PyTorch, one of the most popular frameworks for deep learning in robotics. Please install PyTorch and follow the TorchVision Object Detection tutorial to fine-tune a person detection system on one of the existing PyTorch datasets. The tutorial is here. After you complete the tutorial and train the model, please hook it up to your computer’s web camera to make a live demo that takes an image from the camera and draws bounding boxes around the people in it.
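
As a starting point for the live demo, here is a minimal sketch of the camera loop. It assumes OpenCV (the opencv-python package) is installed for camera access and drawing, that `model` is a TorchVision detection model already loaded on the CPU, and that the person class has label 1, as it does when the only classes are background and person. The names `keep_people` and `run_demo`, and the 0.5 score threshold, are illustrative choices and not part of the tutorial.

```python
def keep_people(boxes, labels, scores, person_label=1, min_score=0.5):
    """Return the boxes labeled as person whose score is at least min_score.

    person_label=1 and min_score=0.5 are assumptions; adjust them to your model.
    """
    return [
        [float(v) for v in box]
        for box, label, score in zip(boxes, labels, scores)
        if int(label) == person_label and float(score) >= min_score
    ]


def run_demo(model, camera_index=0):
    """Show live webcam frames with boxes drawn around detected people."""
    # Imported here so keep_people can be used without a camera or GPU stack.
    import cv2
    import torch

    model.eval()
    capture = cv2.VideoCapture(camera_index)
    try:
        while True:
            ok, frame = capture.read()
            if not ok:
                break
            # OpenCV gives BGR uint8 (H, W, 3); the model expects
            # RGB float (3, H, W) with values in [0, 1].
            rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
            image = torch.from_numpy(rgb).permute(2, 0, 1).float() / 255.0
            with torch.no_grad():
                output = model([image])[0]
            people = keep_people(
                output["boxes"].tolist(),
                output["labels"].tolist(),
                output["scores"].tolist(),
            )
            for x1, y1, x2, y2 in people:
                cv2.rectangle(frame, (int(x1), int(y1)), (int(x2), int(y2)),
                              (0, 255, 0), 2)
            cv2.imshow("people", frame)
            # Press q in the window to stop the demo.
            if cv2.waitKey(1) & 0xFF == ord("q"):
                break
    finally:
        capture.release()
        cv2.destroyAllWindows()
```

Calling `run_demo(model)` with your fine-tuned model should open a window showing the annotated camera feed. If your model lives on a GPU, move each image to the same device before the forward pass.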

Give a demo of this capability to the course staff in class on Thursday 1/29, running on your laptop. We will also ask you to address the following:

  • What are the hardware specs of your laptop?
  • How long does the model take to train on your laptop?
  • How well does it work on data from the web camera, qualitatively?
  • Report the frame rate of the demo. Can you run it at the camera frame rate of 30 Hz, or does it run slower, at 1 Hz or even once per minute?
  • Please add the above information to the table on the class wiki. You will have to create a GitHub account and request to join the GitHub project to have edit permissions on the wiki.
  • Optional: Collect a dataset of 5 images from your web camera. Annotate the images with ground-truth bounding boxes. Report the quantitative performance on this dataset.
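
For the frame rate and the optional quantitative evaluation, the sketch below may help. It assumes boxes are given as (x1, y1, x2, y2) tuples in pixels and counts a detection as correct when its intersection over union (IoU) with an unmatched ground-truth box is at least 0.5. That threshold is a common convention, not a requirement of this assignment, and the function names are illustrative.

```python
import time


def iou(box_a, box_b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def precision_recall(predicted, ground_truth, iou_threshold=0.5):
    """Greedily match predictions to ground-truth boxes for one image."""
    unmatched = list(ground_truth)
    true_positives = 0
    for pred in predicted:
        # Pick the remaining ground-truth box that overlaps this prediction most.
        best = max(unmatched, key=lambda gt: iou(pred, gt), default=None)
        if best is not None and iou(pred, best) >= iou_threshold:
            unmatched.remove(best)
            true_positives += 1
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(ground_truth) if ground_truth else 0.0
    return precision, recall


def measure_fps(process_frame, num_frames=30):
    """Average frames per second over num_frames calls to process_frame."""
    start = time.perf_counter()
    for _ in range(num_frames):
        process_frame()
    elapsed = time.perf_counter() - start
    return num_frames / elapsed if elapsed > 0 else float("inf")
```

Here `process_frame` stands for whatever callable grabs one camera frame, runs the model, and draws the boxes, so the measured rate covers the whole demo loop and not only the forward pass.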