Lab 6: Web Scraping

Overview:

For this week’s lab, we’ll be scraping the Brown academic calendar for event names and dates!

Part 1: Getting Started

To start this lab, go to the academic calendar at this link. In your browser, pull up the web inspector. To do this, right-click the page and select Inspect Element (we recommend that you use Firefox or Google Chrome). Take a second to go through the tag information and notice which tags hold the event names and dates.
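
As a rough illustration of how what you see in the inspector maps onto code, here is a minimal sketch using the same BeautifulSoup library that Part 2 imports. The HTML snippet, its class names, and its text are made up for this example; the real calendar page will likely be structured differently, so treat it only as a guide to what to look for (tag names, attributes such as class, and the text inside each tag).

```python
from bs4 import BeautifulSoup

# Hypothetical snippet for illustration only; NOT the real calendar markup.
snippet = (
    '<div class="event">'
    '<span class="date">September 1</span> '
    '<a href="/sample">Sample event</a>'
    "</div>"
)
soup = BeautifulSoup(snippet, features="html.parser")

div = soup.find("div")
tag_name = div.name      # the tag name shown in the inspector
classes = div["class"]   # class can hold several values, so this is a list
date_text = div.find("span", class_="date").get_text(strip=True)
link_target = div.find("a")["href"]

print(tag_name, classes, date_text, link_target)
```

While you browse the inspector, note the same three things for the calendar entries: which tag wraps each entry, which attributes identify it, and where the text you want sits.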

Part 2: Scraping the Site

Create a file called lab6.py and complete the following tasks:

  • Copy down the following code:
from bs4 import BeautifulSoup
import requests

# Download the academic calendar page and parse its HTML.
calendar_url = "https://www.brown.edu/about/administration/registrar/academic-calendar"
calendar_page = BeautifulSoup(
    requests.get(calendar_url).content, features="html.parser"
)


def scrape_events(page: BeautifulSoup) -> dict:
    # TODO: Write function that creates a dict (key is date and value is event name)
    pass  # placeholder so the file runs; replace it with your implementation
  • Write scrape_events (we’re expecting this function to return a dictionary built from the tags you identified in the inspector, with each date as a key and the event name as its value)
  • Run scrape_events in your terminal, passing calendar_page (the BeautifulSoup object containing the academic calendar page) as the argument.
  • Verify that some of the dates and event names in the dictionary match what you saw in the inspector
  • Notify your TA that you have completed the lab
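
The general pattern behind scrape_events can be sketched on a small, made-up page. Everything here is an assumption for illustration: the table layout, the "date" and "event" class names, the sample rows, and the helper name scrape_sample_events are hypothetical and not taken from the real calendar. Your own function will need the tags and attributes you found in the inspector.

```python
from bs4 import BeautifulSoup

# Hypothetical markup and data for illustration only; the real calendar
# page may use entirely different tags and class names.
SAMPLE_HTML = """
<table>
  <tr><th>Date</th><th>Event</th></tr>
  <tr><td class="date">September 1</td><td class="event">Sample event A</td></tr>
  <tr><td class="date">October 2</td><td class="event">Sample event B</td></tr>
</table>
"""


def scrape_sample_events(page: BeautifulSoup) -> dict:
    """Map each date string to its event name in the sample table."""
    events = {}
    for row in page.find_all("tr"):
        date_cell = row.find("td", class_="date")
        event_cell = row.find("td", class_="event")
        # Skip rows (such as the header row) that lack either cell.
        if date_cell is None or event_cell is None:
            continue
        events[date_cell.get_text(strip=True)] = event_cell.get_text(strip=True)
    return events


sample_page = BeautifulSoup(SAMPLE_HTML, features="html.parser")
sample_events = scrape_sample_events(sample_page)
print(sample_events)
```

One design point to keep in mind: a dictionary holds only one value per key, so if two events share a date, the later one overwrites the earlier one in this sketch.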

Part 3: You’re all set!

That’s all for lab this week! Be sure to ask any questions you have, since Project 3 follows this same web scraping format and it may be helpful in the final project as well.