We'll start by defining a small program that doesn't work how we want.
from dataclasses import dataclass
from typing import List
from datetime import date
L = [1, 2, 3, 4]
for e in L:
    if e == 2 or e == 3:
        L.remove(e)
# expect [1, 4]
print(L)
Huh... we're getting a list of [1, 3, 4], but we thought we told the program to remove both 2 and 3!
We can wrap this weird behavior in a function to better see what is going on. Students had a couple of hypotheses about what was happening.
true
(due to or?).remove
from dataclasses import dataclass
from typing import List
from datetime import date
#L = [1, 2, 3, 4]
def mystery(L: List):
    for e in L:
        if e == 2 or e == 3:
            L.remove(e)
    return L
# expect [1, 4]
print(mystery([1, 2, 3, 4]))
We first tried writing some test cases:
print(mystery([1, 2, 3, 4]))
print(mystery([1, 3, 2, 4]))
print(mystery([1, 2, 3, 4, 3, 2, 1, 3]))
Python Memory
label | slot |
---|---|
1021 | 1 |
1022 | 2 |
1023 | 3 |
1024 | 4 |
Now when we run remove, we could imagine it looking something like:
Python Memory
label | slot |
---|---|
1021 | 1 |
1022 | |
1023 | 3 |
1024 | 4 |
But Python can't leave the list like this! We need to fix up our list by shifting things up.
Python Memory
label | slot |
---|---|
1021 | 1 |
1022 | 3 |
1023 | 4 |
1024 | |
However, the loop has already visited location 1022, so it moves on to 1023, and the 3 (now sitting at 1022) gets skipped over.
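One way to watch the skip happen is to emulate the for loop with an explicit index. This is only a sketch of the idea: `trace_mystery` is a hypothetical helper (not part of the original notes) that records which elements the loop actually visits.

```python
def trace_mystery(items):
    """Emulate `for e in items` with an explicit index and log each visit."""
    visited = []
    i = 0
    # A for loop over a list effectively advances an index until it
    # reaches the current length of the list.
    while i < len(items):
        e = items[i]
        visited.append(e)
        if e == 2 or e == 3:
            items.remove(e)  # everything after e shifts one slot left
        i += 1  # the index still moves on, so the shifted element is skipped
    return visited, items


visited, result = trace_mystery([1, 2, 3, 4])
print(visited)  # [1, 2, 4]
print(result)   # [1, 3, 4]
```

The 3 never shows up in `visited`, which matches the memory picture above.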
Tl;dr: Don't modify a list while a for loop is iterating over it. To get around this, build up a new list instead.
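A minimal sketch of the "build up a new list" approach might look like the following. The function name `remove_2_and_3` is made up for illustration.

```python
def remove_2_and_3(items):
    """Return a new list without 2s and 3s; the input list is left untouched."""
    kept = []
    for e in items:
        if e != 2 and e != 3:
            kept.append(e)
    return kept


original = [1, 2, 3, 4]
print(remove_2_and_3(original))  # [1, 4]
print(original)                  # [1, 2, 3, 4]
```

Because the loop only reads from `items` and only writes to `kept`, nothing shifts underneath the loop and no element is skipped.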
Suppose we want to write code that gathers votes entered at a prompt while the program is running. We first need to introduce a couple of new ideas that will help with this. The first of these is input().
This will create a prompt for a user to input data. This gets collected as a string.
input("enter a number \n")
This creates a problem if we want to do math with user input. The line below raises a TypeError, since it tries to add a number to a string.
4 + input("enter a number \n")
We can get around this using type conversions. You can use a type name to convert from one type to another. See below:
4 + int(input("enter a number \n"))
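To try the conversion without typing at a prompt, here is a small sketch that uses a fixed string as a stand-in for what input() would return. The variable name `typed` is an assumption for illustration.

```python
typed = "5"  # stand-in for what input() would return; it is always a str

# Adding a number to a string fails.
try:
    4 + typed
except TypeError as err:
    print("TypeError:", err)

# Converting first makes the arithmetic work.
print(4 + int(typed))    # 9
print(4 + float("2.5"))  # 6.5

# The conversion itself can fail if the text is not a number.
try:
    int("five")
except ValueError as err:
    print("ValueError:", err)
```

Note that a conversion such as int() raises a ValueError on text it cannot parse, so real programs often need to decide what to do with bad input.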
Alright. We know how to gather one input from a user, but our hypothetical program needs to collect and collate votes from a number of different users. When you repeatedly ask people for input, you generally give them a way to stop. Below we illustrate doing this with the string "done".
votes = []
def cast_vote():
    """record vote and repeat, unless user enters 'done'"""
    v = input("Enter your vote: \n")
    if v != "done":
        votes.append(v)
        cast_vote()
cast_vote()
print(votes)
This is all stuff we've seen before, using a repeated call to the function. The difference here is that we needed some special signal to indicate "we're done with data input".
However, this is not how a Python programmer would write this. Python programmers would rather use something similar to a for loop. We want to have this program run for a while before it terminates (see what we did there?)
while loop
If we don't know the length of our data up front, the while loop is a good choice: it lets us emulate the behavior above in a more elegant way. We can do this like so:
votes2 = []
v = input("Enter your vote: \n")
while v != "done":
    votes2.append(v)
    v = input("Enter your vote: \n")
print(votes2)
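The same loop can be exercised without a live prompt by passing in the function that supplies each line. This is a sketch under that assumption: `collect_votes` and `read_line` are hypothetical names, and the "typed" values are simulated.

```python
def collect_votes(read_line):
    """Collect votes from read_line() until it returns 'done'."""
    collected = []
    v = read_line()
    while v != "done":
        collected.append(v)
        v = read_line()
    return collected


# Simulate a user typing three votes and then "done".
typed = iter(["a", "b", "a", "done"])
print(collect_votes(lambda: next(typed)))  # ['a', 'b', 'a']
```

To use it interactively, one could pass `lambda: input("Enter your vote: \n")` as `read_line`.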
csv.reader
In the real world, people will not give you nicely formatted data. You'll often have to deal with .csv files. To work with this kind of data, you'll need to be able to read it into your code.
Roughly what we want to do is tell Python to feed us rows one at a time so we can construct data as we want it. We can use this to illustrate the difference between for and while loops:
A .csv file holds a fixed, finite amount of data: the reader hands us one row at a time until the file runs out, and there is no special "done" signal for us to check. We will therefore use a for loop to read in a .csv file.
The .csv file we are using in these notes can be found here. Make sure you put this file in the same folder as your code.
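To see rows arriving one at a time without needing the file on disk, here is a sketch that feeds csv.reader from an in-memory string. The sample rows and most header names are made up for illustration, so the real weather.csv may look different.

```python
import csv
import io

# Made-up sample; the real weather.csv may have different headers and values.
sample = io.StringIO(
    "Date,Location,Type,Data\n"
    "1/18/16,Providence,temp,41.5\n"
    "1/19/16,Boston,temp,38.0\n"
)

rows = []
for row in csv.reader(sample):
    rows.append(row)  # each row arrives as a list of strings

print(len(rows))  # 3 (header plus two data rows)
print(rows[1])    # ['1/18/16', 'Providence', 'temp', '41.5']
```

Every cell comes back as a string, including the numbers, and the header row is treated like any other row.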
from dataclasses import dataclass
from datetime import date
import csv
@dataclass
class Reading:
    type: str
    when: date
    level: float
    location: str
# open csv file
with open("weather.csv", newline='') as csvfile:
    weatherreader = csv.reader(csvfile, dialect='excel', delimiter=",")
    # print each row to check that the file is being read
    for row in weatherreader:
        print(row)
The above is a proof of concept: Python recognizes the file and can show us the data contained within. We now need to do some cleanup. Chiefly this includes storing each row as a Reading, converting the level to a float, skipping the header row, and converting the date.
from dataclasses import dataclass
from datetime import date
import csv
@dataclass
class Reading:
    type: str
    when: date
    level: float
    location: str
# need a list to store readings in
readings = []
# open csv file
with open("weather.csv", newline='') as csvfile:
    weatherreader = csv.reader(csvfile, dialect='excel', delimiter=",")
    # convert each row to a Reading
    for row in weatherreader:
        # we're putting off doing that date for now, just set something up to test
        readings.append(Reading(row[2], date(2000, 1, 2), row[3], row[1]))
# print out first 2 elements to look
print(readings[0])
print(readings[1])
This illustrates one of the downsides of Python not enforcing type annotations: we declared level as a float, but it is being read in as a string! We can fix this by converting the value with float(). However, we will have an issue trying to convert "Data" (from the header row) to a float. We can get rid of the header by skipping over it with the line next(weatherreader).
from dataclasses import dataclass
from datetime import date
import csv
@dataclass
class Reading:
    type: str
    when: date
    level: float
    location: str
# need a list to store readings in
readings = []
# open csv file
with open("weather.csv", newline='') as csvfile:
    weatherreader = csv.reader(csvfile, dialect='excel', delimiter=",")
    # skip over header row
    next(weatherreader)
    # convert each row to a Reading
    for row in weatherreader:
        # we're putting off doing that date for now, just set something up to test
        readings.append(Reading(row[2], date(2000, 1, 2), float(row[3]), row[1]))
# print out first 2 elements to look
print(readings[0])
print(readings[1])
Now we can focus on reading in and converting the date value. The date constructor expects a year, month, and day, as in date(2016, 1, 18); however, our data provides dates as 1/18/16. We can pull out the pieces using .split(), which, given a separator character, splits a string at each occurrence of that character. An example would be:
"1/5/16".split("/")
from dataclasses import dataclass
from datetime import date
import csv
@dataclass
class Reading:
    type: str
    when: date
    level: float
    location: str
# need a list to store readings in
readings = []
# open csv file
with open("weather.csv", newline='') as csvfile:
    weatherreader = csv.reader(csvfile, dialect='excel', delimiter=",")
    # skip over header row
    next(weatherreader)
    # convert each row to a Reading
    for row in weatherreader:
        # convert dates from M/D/YY, assuming years are in the 2000s
        date_list = row[0].split("/")
        new_date = date(int("20" + date_list[2]), int(date_list[0]), int(date_list[1]))
        # build the Reading using the converted date
        readings.append(Reading(row[2], new_date, float(row[3]), row[1]))
# print out first 2 elements to look
print(readings[0])
print(readings[1])
Now, we should also know how to write out a .csv file, using our votes list from before.
import csv
# we will continue this function tomorrow
votes = ["a", "b", "a", "a", "c", "d", "b"]
def votes_for(cand, voteslist):
    # placeholder (an assumption): votes_for is not defined in these notes yet
    return voteslist.count(cand)
with open('election.csv', 'w', newline='') as csvfile:
    # election.csv indicates the name of the new file
    # 'w' indicates we will be writing to the file
    # newline='' lets the csv module control the line endings
    voteswriter = csv.writer(csvfile, delimiter=',')
    for cand in list(set(votes)):
        # list(set(some_list)) == L.distinct(some_list) from Pyret
        # writerow takes a single list holding the cells of the row
        voteswriter.writerow([cand, votes_for(cand, votes)])
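One detail worth checking is that writerow takes a single sequence holding all the cells of a row. The sketch below writes a tally to an in-memory buffer instead of a file, then reads it back. It counts with list.count as a stand-in, since the notes have not yet defined how votes are tallied, and it sorts the candidates only to make the output order predictable.

```python
import csv
import io

votes = ["a", "b", "a", "a", "c", "d", "b"]

buffer = io.StringIO()  # in-memory stand-in for a file opened with 'w'
writer = csv.writer(buffer, delimiter=",")
for cand in sorted(set(votes)):
    # writerow takes ONE sequence holding all the cells of the row
    writer.writerow([cand, votes.count(cand)])

# Read the text back to check what was written.
buffer.seek(0)
tallies = [row for row in csv.reader(buffer)]
print(tallies)  # [['a', '3'], ['b', '2'], ['c', '1'], ['d', '1']]
```

The counts come back as strings, just as they did when reading the weather data, so they would need int() before doing any math with them.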