Foundations of Mathematics
Summer Program , Columbia University
July 2001 -- Notes for Week #3
Roger B. Blumberg

Session I -- Session II

For the first two weeks of the program we've talked about finding analytic solutions to various problems, and proving that those solutions are true.

By way of review, we do both a direct proof and a proof by mathematical induction of problem 5a on the first homework set.

In the last weeks of the program we want to develop a mathematical theory of counting and discrete probability, building on the ideas from set theory that you already know. We'll start by reviewing those ideas.

1. We start by giving a rough definition of "set" as "a well-defined collection of elements listed in any order, without repetition." We can define sets in different ways; e.g.:

A = { 0, 2, 4, 6, 8, 10 }
B = { x | x is a perfect square less than 30}

We then make clear what we mean by "subset", "union", "intersection", "complement", "cardinality", "Cartesian product" and the "empty set".

After a brief digression about the questions that arise when counting elements in transfinite sets, we prove, by induction, the following:

A set with n elements has 2^n subsets.

Proof:

I. For a set with a single element, A, the subsets are: ø and {A}.

II. If a set with n elements has 2^n subsets, then a set with (n+1) elements has 2^(n+1) subsets. Here it is sufficient to recognize that when we introduce an additional element, each of the 2n subsets will exist with and without that element. Therefore, the number of subsets will be doubled; there will be 2 * 2^n subsets or 2^(n+1) subsets.

Can you think of another way to prove this? If we represent a subset as a vector of n binary elements, and we show that there are obviously 2^n subsets, is that a proof?

The set of subsets is often called the "power set."

We now move on the the development of a theory of counting:

Consider a box with 17 ants, another box with 23 bees, and a third box with 1,437 cockroaches. If the insects are all distinguishable (e.g. all the ants have different names and personalities), how many different ways are there to choose a combination of 1 ant, 1 bee and 1 roach? The point of the question is to see why we multiply 17, 23 and 1,437 to get the answer, rather than add them.

(A handout with the theorems is distributed.)

Theorem 1: (The Fundamental Theorem of Counting): Suppose the set A1 contains n1 members, that the set A2 contains n2 members, the set A3 contains n3 members, ......., and the set A(m) contains n(M) members. Then the number of ways to choose one member from each of the m sets is equal to:

n1 x n2 x n3 x ........... x n(M)

Suppose we have a set of letters: { B, A, T, H, E }. How many different ways can we arrange these letters (into five-letter "words")?

Theorem 2: The number of permutations of n members of a set is equal to n!.

This is clear enough in cases like BATHE and COLUMBIA, but what about FUNNY or MISSISSIPPI? We need to modify the theorem to deal with the problem of "equivalence classes".

Example: Given the seven letters: E,V,E,R,E,S, and T, how many distinct arrangements are possible? First we note that if all the elements of the set were different letters, the answer would be: 7! But in this case 7! is too large, because it counts each legitimate arrangement more than once. In fact, by considering the different ways we can arrange the Es in a single 7-letter string, we see that there are 3! equivalent arrangements for each one arrangement we wish to count. Therefore, the set of 7! permutations is too big by a factor of 3!. Therefore, the number of distinct arrangements will be: 7!/3!

We next look at the case of counting permutations of n elements in subsets of r elements. For example, how many ways are there to arrange the letters in BATHE into three-letter words?

Theorem 3: The number of permutations of n members of a set arranged in r places is equal to:

We can prove this by simple expansion, showing that the formula yields a product equal to the product we get using the first two theorems to solve the problem for a set with n elements arranged in r places. In other words, we can show that the formula expands to:

n x (n-1) x (n-2) x .......... x (n-r+1)

What about cases that involve equivalence classes, when r < n? How many distinct permutations of three letters can be made from GENERAL?

Finally, and because we recognize that counting problems do not always (or even usually) consider order a significant factor, we define the combination of n elements taken r at a time.

Theorem 4: The number of combinations of n members of a set, chosen r at a time, is equal to:

We can derive this formula from Theorems 2 and 3, by recognizing that for every combination of r elements there are r! permutations of those elements.

We can see the use of the combination in a variety of letter and card problems. For example, how many different 5-card hands can be dealt from a deck with 52 cards? The answer: C(52,5). How many different combinations of 3 letters can be chosen from an alphabet of 26 letters? C(26,3) -- note that this is not the same question as asking how many different 3-letter words can be made from this alphabet.

The last problem: How many 5-letter words can be made using the English alphabet, provided that there are no repeated letters and exactly two of the letters are vowels? This is a problem with involves both permutations and combinations: permutations to deal with the arrangements of the consonants, and the arrangement of the vowels; and combinations to deal with the choice of two positions to hold the vowels.

For next time: How many different ways are there to divide a group of 20 people in half (i.e. into two committees of 10)?

Last time, we built up a theory of counting from the basics of set theory. We derived formulas for permutations and combinations, and today the goal is to continue and broaden the interpretation of permutations and combinations. One thing to recognize immediately, is that in the course of making permutations or combinations we are really just making different kinds of subsets.

1. How many ways are there to divide a group of 20 people into two groups of 10? into four groups of 5?

Here we emphasize that choosing and making subsets are really the same operation. C(n,r) is sometimes referred to as the "Binomial number" because when we choose r elements from a set of n elements, we create two subsets: one with r elements and the other with (n-r) elements. It follows, of course, that C(n,r) is equal to C(n, n-r).

2. Given the 26 letters of the English alphabet, how many different arrangements of 7 different letters can we make that contain exactly 2 vowels?

Here we have a use for both combinations and permutations, since we first need to choose 2 places from 7 to hold the vowels, and then figure out the number of ways to arrange the vowels and the five consonants. The answer, then, is C(7,2) x P(5,2) x P(21,5).

Consider the famous diagram, due to Polya, and figure out how many different ways there are to spell ABRACADABRA:

```                              A
B B
R R R
A A A A
C C C C C
A A A A A A
D D D D D
A A A A
B B B
R R
A
```

Here the trick is to recognize that in figuring out the different paths, we are actually figuring out the number of ways we can divide the ten "moves" into two sets of five: the set of "soutwest" moves and the set of "southeast" moves.

4. How many different ways are there to choose 3 letters from BANANA?

This is perhaps trickier than the previous examples of permutations with repeaters, because not all duplicate combinations will be counted in C(6,3) the same number of times. Is there an alternative to taking C(3,3) and adding the combinations with 2 As, 2 Ns and 3 As?

A. We begin the next phase of our investigation of combinatorials with the following questions:

1. How many ways are there to make subsets of 5 from a set with 10 members?
2. In a binary coded message of length 10, how many different messages can we make that have exactly 5 1's?
3. In a 2-D plane graph, how many non-decreasing paths are there connecting (0,0) and the point (10,5)?

Perhaps the most interesting thing about C(n,r) is the various interpretations to which it can be put; the answers show that C(n,r) can be thought of as:

• the number of possible subsets of r members from a set of n members.
• the number of possible binary messages of length n with exactly r 1's.
• the number of non-decreasing 2-D paths from (0,0) to (r,n-r).

Yet another interpretation of C(n,r) is:

• the coefficient of the a^rb^(n-r) term in the expansion of (a+b)^n. In other words:

In fact, we can think of the binomial expansion as a demonstration of the different ways to distribute n objects into two subsets: a and b. For example, consider the expansion of (a+b)^3:

(a+b)^3 = 1a^3b^0 + 3a^2b^1 + 3a^1b^2 + 1a^0b^3

Each term in the expansion shows a different way to distribute the 3 objects into two subsets, with the exponent of a showing the number of objects in one subset, and the exponent of b showing the number of objects in the second subset. The sum of the exponents always equals n, of course. The coefficient of each term then gives us the number of ways we can make those subsets; for example, the coefficient of the a^2b^1 term is 3, or C(3,2).

Next, we want to look at the connection between the combination and the properties of the Pascal's Triangle:

```
1
1 1
1 2 1
1 3 3 1
1 4 6 4 1
1 5 10 10 5 1
1 6 15 20 15 6 1
1 7 21 35 35 21 7 1
1 8 28 56 70 56 28 8 1

```

We notice:

1) Every term is the sum of the terms flanking it in the row above.
2) Each row is a palindrome.
3) Second number in each row tells you the row number.
4) Every term is C(n,r) where n is the row number and r is the place in the row.
5) The terms in each row are the coefficients of (a+b)^n.
6) Every term except the first and the last is a multiple of n iff n is prime.
7) The sum of the terms in each row is 2^n.
8) Distribution of odd and even numbers seems predictable.

Once we recognize that each term in the Triangle has a combinatorial interpretation (observation #4), many of the patterns are easily explained:

A) Combining 1 and 4 we get: C(n,r) = C(n-1,r-1) + C(n-1,r). We can prove this by induction, of course.

B) 2 is explained by the equality C(n,r) = C(n,n-r), which we can show by expansion.

C) 3 is explained by noting that C(n,1) = C(n,n-1) = n

D) If we let a=1 and b=1, then 7 follows from 5. We also know why the sum of the terms in each row, n, should equal the number of subsets that can be made from a set with n elements.

E) Use 4 to prove 6.

F) If we know 6 and 7, what do we know about 2^n -2 whenever n is prime?

G) What happens when we investigate #8?

Finally, we derive the multinomial coefficient from the binomial coefficient. We find:

B. How does the mathematics of counting relate to the calculation of probabilities?

Example: Using the 26 letters of the English alphabet, I generate a 6-letter "word" (or string) at random. What is the chance that the word has no repeated letters and contains exactly two vowels?

We are used to figuring out the number of ways to do such things, but how do we go from there to probability statements? For now, we will adopt the basic definition of the probability of an event, E:

where "E" can be a simple or complex event.

Next time we'll begin with a discussion of conceptualizing and computing probabilities.