Lab 6: Web Scraping
Overview:
For this week’s lab, we’ll scrape the Brown academic calendar for event names and dates!
Part 1: Getting Started
To start this lab, go to the academic calendar at this link. In your browser, pull up the web inspector: right-click the page and select Inspect Element (we recommend Firefox or Google Chrome). Take a moment to look through the tags and note which ones hold the event dates and names.
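The tag structure the inspector shows can also be explored from Python. The sketch below parses a small, made-up HTML snippet (not the real calendar markup) with BeautifulSoup and counts how often each tag occurs. The `sample_html` string and the `count_tags` helper are hypothetical names used only for illustration.

```python
from collections import Counter

from bs4 import BeautifulSoup

# Made-up markup for illustration only; the real calendar page
# may be structured differently.
sample_html = """
<table>
  <tr><td>Sept. 4</td><td>Classes begin</td></tr>
  <tr><td>Oct. 14</td><td>Fall holiday</td></tr>
</table>
"""


def count_tags(html: str) -> Counter:
    """Count how many times each tag name appears in the given HTML."""
    soup = BeautifulSoup(html, features="html.parser")
    # find_all(True) matches every tag in the document.
    return Counter(tag.name for tag in soup.find_all(True))


tag_counts = count_tags(sample_html)
print(tag_counts)
```

A tag that repeats once per event is usually a good candidate for the element to loop over when scraping.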
Part 2: Scraping the Site
Create a file called lab6.py and complete the following tasks:
- Copy down the following code:
from bs4 import BeautifulSoup
import requests
calendar_url = "https://www.brown.edu/about/administration/registrar/academic-calendar"
calendar_page = BeautifulSoup(
    requests.get(calendar_url).content, features="html.parser"
)

def scrape_events(page: BeautifulSoup) -> dict:
    # TODO: Write a function that creates a dict (key is the date, value is the event name)
    pass  # placeholder so the file runs; replace with your implementation
- Write scrape_events. We expect this function to return a dictionary that maps each date to its event name, built from the tags you identified in the inspector.
- Run scrape_events in your terminal, passing it calendar_page (the BeautifulSoup object containing the academic calendar page) as the argument.
- Verify that some of the dates and event names in the dictionary match what you saw in the inspector.
- Notify your TA that you have completed the lab.
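As a rough illustration of the tasks above, here is a minimal sketch of what scrape_events could look like. It assumes each event is a table row with a date cell followed by an event-name cell. That layout, the `sample_html` string, and the `sample_page` variable are assumptions made for this example, so check the inspector for the tags the real page uses.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: assumes each event is a table row with a date cell
# followed by an event-name cell. The real page may use different tags.
sample_html = """
<table>
  <tr><th>Date</th><th>Event</th></tr>
  <tr><td>Sept. 4</td><td>Classes of the first semester begin</td></tr>
  <tr><td>Nov. 27</td><td> Thanksgiving recess begins </td></tr>
</table>
"""


def scrape_events(page: BeautifulSoup) -> dict:
    """Return a dict mapping each date string to its event name."""
    events = {}
    for row in page.find_all("tr"):
        cells = row.find_all("td")
        # Skip header rows and any row that is not a date/event pair.
        if len(cells) != 2:
            continue
        date = cells[0].get_text(strip=True)
        name = cells[1].get_text(strip=True)
        events[date] = name
    return events


sample_page = BeautifulSoup(sample_html, features="html.parser")
events = scrape_events(sample_page)

# Print each pair so it can be compared against the inspector by eye.
for date, name in events.items():
    print(f"{date}: {name}")
```

Because the dictionary is keyed by date, two events on the same date would overwrite each other in this sketch. If the real calendar has such cases, one option is to store a list of event names per date instead.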
Part 3: You’re all set!
That’s all for lab this week! Be sure to ask any questions you have: we will follow the same web scraping format in Project 3, and it may be helpful in the final project as well.