# More program performance

## Big-O notation

Last time, we talked about analyzing the performance of programs. When
deciding whether a function would run in constant, linear, or quadratic
time, we ignored the constants and just looked at the biggest term. We can
define this somewhat more formally using *big-O* notation.

Let’s look again at the last function we considered last time:

def distinct(l: list) -> list: d = [] for x in l: if x not in d: d.append(x) return d

We decided that this function’s running time is quadratic in its input (in the worst case). For a list of length \(n\), we calculated the number of operations as \(((n * (n - 1)) / 2) + n + 2\).

A computer scientist would say that on a list of \(n\) elements, this function runs in \(O(n^2)\) time. The formal definition looks like this:

If we have (mathematical, not Python!) functions \(f(x)\), then \(f(x)\) is in \(O(g(x))\) if and only if there are constants \(x_0\) and \(C\) such that for all \(x > x_0\), \(f(x) < C*g(x)\).

For the function above, our constants could be, say, \(x_0=4\) and \(C=4\).

In this class, we won’t expect you to rigorously prove that a function’s running time is in a particular big-O class. We will, though, use (for instance) \(O(n)\) as a shorthand for “linear.”

As a bit of practice, here are our rainfall solutions:

# version 1 def average_rainfall(sensor_input: lst) -> float: number_of_readings = 0 total_rainfall = 0 for reading in sensor_input: if reading == -999: return number_of_readings / total_rainfall elif reading >= 0: number_of_readings += 1 total_rainfall += rainfall # version 2 def list_before(l: list, item) -> list: result = [] for element in l: if element == item: return result result.append(element) return result def average_rainfall(sensor_input: lst) -> float: readings_in_period = list_before(sensor_input, -999) good_readings = [reading for reading in period if reading >= 0] return sum(good_readings) / len(good_readings)

What are the worst-case running times of each solution?

See the lecture capture for details; for inputs of size \(n\), both solutions run in \(O(n)\) time.