How to Read, Understand and Study Proofs

(This talk was given on March 31, 2014, at the University of Toronto to a class of mostly MAT 137 students. It was standing room only!)

In my first year of undergrad I was bad at proofs. In my second year of undergrad I was terrible at proofs. In my third year I was okay at proofs, but I was terrible at studying proofs.

The way I used to learn proofs was by memorizing the words in the textbook’s proof, word by word, with almost no understanding. I knew math, and I was fairly good at problems, but I just couldn’t get any purchase when it came to learning proofs.

Eventually I started to pick up various “tricks” and strategies for learning proofs. This talk is aimed at me in first year, and what I needed to hear so that I could have studied proofs better. (“I no proof good.”)

We’ll look at the basics of proof reading, the idea of definition unwinding and clever ideas, and finally we’ll present a general method for reading proofs.

Gloops and Snicks

Let’s prove the following theorem:

Theorem: A Gloop added to a Snick is a Snick.

Proof. Well we have no idea how to proceed because we don’t know what these things are! Before you start a proof you need to make sure that you understand the precise definitions of all of the words being used. How can you expect to manipulate things if you don’t know what they are?

So here are the definitions:

Definition. A Gloop is something of the form $\vert_a^0$ where $a$ is an integer and a Snick is something of the form $\vert_b^1$ where $b$ is an integer.

Okay, now back to the proof. Let $x = \vert_a^0$ be a Gloop and let $y = \vert_b^1$ be a Snick (where $a,b$ are integers). Now $x + y = \vert_a^0 + \vert_b^1$ and we don’t know how to continue, because we don’t know how to add these things. So we need to understand that first.

Definition. An integer of the form $2a + i$ (where $i$ is 0 or 1) can be written as $\vert_a^i$ .

Now $x + y = \vert_a^0 + \vert_b^1 = (2a + 0) + (2b + 1) = 2(a + b) + 1 = \vert_{a+b}^1$ , which is a Snick.

[END OF PROOF]

I’m sure you’ve all seen this before. This is just the proof that an even number added to an odd number is an odd number.
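If you want to poke at the definitions yourself, here is a small Python sketch of the unwinding. It is only a sanity check on a handful of integers, not a replacement for the proof, and the names `gloop`, `snick` and `unwind` are made up for this illustration.

```python
def gloop(a):
    """The integer written |_a^0, i.e. 2a + 0."""
    return 2 * a + 0


def snick(b):
    """The integer written |_b^1, i.e. 2b + 1."""
    return 2 * b + 1


def unwind(n):
    """Write the integer n as |_a^i: return (a, i) with n == 2a + i, i in {0, 1}."""
    return divmod(n, 2)


# Check the theorem on a small grid of integers a, b.
for a in range(-5, 6):
    for b in range(-5, 6):
        # The proof says the sum is the Snick |_{a+b}^1.
        assert unwind(gloop(a) + snick(b)) == (a + b, 1)
```

Notice that the check mirrors the proof exactly: unwind the definitions, add, and wind the result back up.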

Lesson

You need to understand the words before you go through proofs. Look up the relevant definitions!
This proof required no insight or clever ideas; it is called “a proof by unwinding definitions”.
You need to justify each step. (In principle) proofs should be air-tight.
Don’t make any assumptions about how you can manipulate objects.

Definition Unwinding and the Clever Idea

Our goal when looking at proofs is partly to identify which parts of the proof are definition unwinding, and which parts actually required a new idea. In your first couple of years in math most proofs will require only one or zero clever ideas. Let’s try to identify the clever idea (if there is one) in the proof of the (modified) Squeeze theorem for functions.

Theorem. Suppose $l(x), u(x)$ are differentiable functions (‘l’ is for lower, and ‘u’ is for upper) and $f(x)$ is any function such that $l(x) < f(x) < u(x)$ for all $x \neq a$ , and suppose
$\displaystyle \lim_{x \rightarrow a} l(x) = L = \lim_{x \rightarrow a} u(x)$

THEN $\lim_{x \rightarrow a} f(x) = L$ .

Proof. Let’s go through this proof as brainlessly as possible, trying only to unwind definitions and make obvious algebraic manipulations.

We want to prove that $\lim_{x \rightarrow a} f(x) = L$ , so we start by fixing an $\epsilon > 0$ , and we start looking for a $\delta >0$ such that
$\displaystyle 0 < \vert x-a \vert < \delta \Rightarrow \vert f(x) - L \vert < \epsilon.$

Let’s find that $\delta$ ! The only places we know to look are at $\lim_{x \rightarrow a} l(x) = L = \lim_{x \rightarrow a} u(x)$ , so let’s say what those mean:

There is a $\delta_1 >0$ such that
$\displaystyle 0 < \vert x-a \vert < \delta_1 \Rightarrow \vert l(x) - L \vert < \epsilon.$
In this case we could also conclude that
$\displaystyle - \epsilon < l(x) - L < \epsilon$
and so
$\displaystyle L - \epsilon < l(x) < L + \epsilon$
(Nice! By being totally brainless we have found an upper and a lower bound for $l(x)$ .)

Now let’s look at the other limit we know: $\lim_{x \rightarrow a} u(x) = L$ . This says:

There is a $\delta_2 >0$ such that
$\displaystyle 0 < \vert x-a \vert < \delta_2 \Rightarrow \vert u(x) - L \vert < \epsilon.$
In this case we could also conclude that
$\displaystyle - \epsilon < u(x) - L < \epsilon$
and so
$\displaystyle L - \epsilon < u(x) < L + \epsilon$

Now it looks like we’ve done all the unwinding we can, but we do have things that we can combine. Let’s try to combine some of our inequalities to get what we want.

For $x \neq a$ we have $l(x) < f(x) < u(x)$ , and for $0 < \vert x-a \vert < \delta_1$ we get $L-\epsilon < l(x) < f(x) < u(x)$ .

For $x \neq a$ we have $l(x) < f(x) < u(x)$ , and for $0 < \vert x-a \vert < \delta_2$ we get $l(x) < f(x) < u(x) < L+ \epsilon$ .

If we had all three of these assumptions ( $x \neq a$ , $0 < \vert x-a \vert < \delta_1$ and $0 < \vert x-a \vert < \delta_2$ ) we would have
$\displaystyle L- \epsilon < f(x) < L + \epsilon$

which would mean $\vert f(x) - L \vert < \epsilon$ . So what should we set $\delta$ to be so that if $0 < \vert x - a \vert < \delta$ then we get the three assumptions we want?

Hmmm… Seems like I have to start thinking now. After a bit of thought we see that we can let $\delta$ be any number that is positive, but less than or equal to both $\delta_1$ and $\delta_2$ . (If we want to get fancy we can set $\delta$ to be the smaller of the two numbers, i.e. $\delta = \min \{\delta_1, \delta_2 \}$ .)

And that’s it! We’re done!

[END OF PROOF]

So… was there a clever idea in that proof? Probably the only thing that required thought was our choice of $\delta$ . Even then it was staring us right in the face.

The heart of this proof (i.e. the only thing that required any thinking) is the choice of $\delta = \min \{\delta_1, \delta_2 \}$ . This is what I would write in my notes, and this is how I would think of this proof. In fact, if you talk to any third-year math student and remind them that this is a “$\min \{\delta_1, \delta_2 \}$” proof then they should instantly be able to start deriving it.
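To see the choice of $\delta$ in action, here is a rough numerical sketch in Python. The three functions are my own picks for illustration (with $a = 0$ and $L = 0$), and `delta_for` is a made-up helper name; checking sample points is of course not a proof.

```python
import math


def l(x):
    return -3 * x**2  # lower function, limit 0 at a = 0


def u(x):
    return x**2  # upper function, limit 0 at a = 0


def f(x):
    # Squeezed function: l(x) < f(x) < u(x) for every x != 0.
    return 0.5 * x**2 * math.sin(1 / x)


def delta_for(eps):
    delta_1 = math.sqrt(eps / 3)  # 0 < |x| < delta_1 gives |l(x) - 0| < eps
    delta_2 = math.sqrt(eps)      # 0 < |x| < delta_2 gives |u(x) - 0| < eps
    return min(delta_1, delta_2)  # the one clever idea


for eps in (1.0, 0.1, 0.001):
    delta = delta_for(eps)
    for k in range(1, 1000):
        # Sample points with 0 < |x| < delta on both sides of a = 0.
        for x in (delta * k / 1000, -delta * k / 1000):
            assert l(x) < f(x) < u(x)
            assert abs(f(x) - 0) < eps
```

Everything except the `min` line is bookkeeping, which is exactly the point.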

Notice how thinking about this proof in this way takes almost no space in your brain? That’s how we want to start thinking of proofs: Find the idea hidden in them. This also helps us to think of some proofs as being easy even though they might take an entire page of (brainless) calculations. In some ways, this is the key to learning science: discovering and identifying the key, important, clever ideas or perspectives for problems.

Here’s an exercise for you to try. Prove the following theorem, knowing that it is a “$\delta = \min \{\delta_1, \delta_2 \}$” proof:

Theorem. If $\lim_{x \rightarrow a} f(x) = L$ and $\lim_{x \rightarrow a} g(x) = M$ then $\lim_{x \rightarrow a} (f(x) + g(x)) = L + M$ .

Lesson

Identify the one (or zero or two) clever idea(s) in a proof. Think of the rest as brainless manipulation.
Judge the difficulty of a proof based on the cleverness of the ideas, not the length of brainless calculations.

Playing with Proofs

Before we get into a general method for reading proofs, let’s learn how to play around with proofs and theorems.

In our version of the Squeeze Theorem we assumed that our functions $u(x)$ and $l(x)$ were differentiable; did we use this anywhere? Is it possible that we could instead prove this theorem by only assuming that $u(x)$ and $l(x)$ were continuous? Maybe we can drop that requirement too? Are there other assumptions that we can weaken that still allow us to prove the same conclusion?

When we see a proof of the form: “some assumptions” imply “some conclusions” we should always ask “Are some of these assumptions unnecessary? Can I get the same conclusions with fewer assumptions? Can I get better conclusions with the same assumptions?”

In our Squeeze Theorem we can see that assuming that $u(x)$ and $l(x)$ are merely functions (not necessarily continuous or differentiable) still allows us to get the same conclusion.

What assumptions cannot be weakened? In the case of the squeeze theorem we see that if we only assume that $\lim_{x \rightarrow a} l(x) = L < U = \lim_{x \rightarrow a} u(x)$ and $l(x) < f(x) < u(x)$ , then we can’t conclude anything about $\lim_{x \rightarrow a} f(x)$ . It could be that the limit doesn’t even exist! In this case we should try to find examples where we can’t get the conclusion. For example, take $a = 0, l(x) = -2, u(x) = 2$ and $f(x) = \sin (\frac{1}{x})$ (for $x \neq 0$ , and $f(0) = 0$ ). (The bounds are $\pm 2$ rather than $\pm 1$ so that the inequalities stay strict, since $\sin (\frac{1}{x})$ does reach $\pm 1$ .)
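Here is a quick Python sketch of why $\sin (\frac{1}{x})$ has no limit at $0$ : it hits $1$ and $-1$ at points as close to $0$ as you like. The helper names `peak` and `trough` are made up for this illustration, and the points come from solving $\frac{1}{x} = \frac{\pi}{2} + 2\pi k$ and $\frac{1}{x} = \frac{3\pi}{2} + 2\pi k$ .

```python
import math


def f(x):
    return math.sin(1 / x) if x != 0 else 0.0


def peak(k):
    """A point where sin(1/x) = 1; peak(k) tends to 0 as k grows."""
    return 1 / (math.pi / 2 + 2 * math.pi * k)


def trough(k):
    """A point where sin(1/x) = -1; trough(k) tends to 0 as k grows."""
    return 1 / (3 * math.pi / 2 + 2 * math.pi * k)


for k in (1, 10, 100, 1000):
    print(f"x = {peak(k):.6f}: f(x) = {f(peak(k)):+.6f}    "
          f"x = {trough(k):.6f}: f(x) = {f(trough(k)):+.6f}")
```

No matter how small a $\delta$ you pick, some `peak(k)` and some `trough(k)` land inside $0 < \vert x \vert < \delta$ , so no single value of the limit can work once $\epsilon < 1$ .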

In problem sets this year we’ve tried to get you to think about theorems and proofs this way, but in the future you will need to be proactively thinking about theorems and proofs like this. We won’t hold your hand forever!

Exercise. Play around with the theorem “A Gloop number of Snicks added together is a Gloop”. Prove it and then see that you can get a better theorem by using weaker assumptions.

Lesson

Look for unneeded assumptions in theorems and try to use the same (or a similar) proof to get the same conclusion.
If you can’t use the same proof with weaker assumptions then try to find an example where the conclusion fails.
Look through proofs to see if you can get stronger conclusions with the same assumptions.

Reading Through a Proof

As you progress through your mathematics education you will start to notice that proofs will get longer, more subtle and generally harder to understand. In your first calculus course you often saw 2 or 3 proofs in a single lecture. In some third-year and fourth-year courses, some proofs will take an entire lecture, an entire week, or (in some painful cases) an entire month. In these cases it becomes increasingly important to break down a proof into manageable chunks. There is an old saying: “How do you eat an elephant? One bite at a time.”

I’m going to lay out a couple of guidelines. Don’t think of this as an algorithm for reading proofs, but as a series of useful strategies.

Understand what the words in the theorem mean. (Look up definitions!)
Understand what the conclusion of the theorem says.
Reword the theorem into everyday language.
Try to draw a picture of the theorem (if possible).
Decide if you think the assumptions are reasonable or too strong. (The theorem “Every pet hedgehog that belongs to Mike is pokey.” has assumptions that are too strong; hedgehogs should probably be pokey even if they don’t belong to Mike.)
Think of examples that satisfy the assumptions of the theorem. Do they also satisfy the conclusion?
Think of really weird examples that satisfy the assumptions of the theorem. Do they also satisfy the conclusion?
Think of even weirder examples. (Seriously, think of a ton of diverse examples before you go to a proof.)

The thing to notice is that we have spent a long time just thinking about the statement of the theorem, even before we start trying to understand a proof.

Our next step is to make guesses (again before having looked at the proof)

Make a guess as to the proof format. (Direct proof, contrapositive, proof by contradiction, induction.)
Write down what you think are the first two or three lines of the proof.
Keep writing down lines until you get stuck. (Hopefully now you understand more about the theorem.)
If possible, try to prove the theorem in special cases (while trying to avoid using anything special about the special case).

Now we can take a look at the proof. (Notice how much work we’ve already done!)

Take a quick overview of the proof.
Identify what the format of the proof is.
Identify the key proof-specific definitions used.
Identify which portions of the proof are brainless definition unwinding and which parts are clever ideas.
Break the proof down into smaller chunks. (Usually chunks will be of the form “Claim-Proof of Claim”.)
Try again to prove each of the claims on your own.
Go through each claim and do a line-by-line analysis. Try to identify which lines you don’t believe. (These are good questions to ask!)

Now let’s say you’ve done this and you’re still stuck. You just don’t understand what’s going on. Here’s the strategy:

Start with a specific example that you know fails the conclusion of the theorem.
Go through the proof line-by-line and try to do what the proof tells you to do with your specific example.
When the proof stops working you will have a better understanding of the proof.

After the fact:

Wait a couple days then try to prove the theorem again.
Try writing the theorem in a different format. Don’t like the way things are positioned? See if you can switch it up.
Explain the theorem and proof to your grandmother.

A Specific Example – The Alternating Series Test

Wow, that’s a big list of things. Let’s see how this applies to a specific example. Recall the Alternating Series Test, which says:

Theorem. If $(a_n)$ is a positive, decreasing sequence of real numbers that converges to $0$ , then $\sum (-1)^n a_n$ converges.

Ok, so let’s see: Do we understand what a positive, decreasing sequence is? Do we know what it means for a sequence to converge to $0$ ? What are some good examples?

$\sum \frac{(-1)^n}{n^2}$ we know converges and satisfies all the requirements.
$\sum \frac{(-1)^n}{n}$ is a weird example because we know that $\sum \frac{1}{n}$ diverges.
$\sum (-1)^n$ is a mischievous example that we know does not converge. (The terms don’t go to $0$ though.)

The assumptions in the theorem seem reasonable because we know many examples of positive, decreasing sequences converging to $0$ .

This theorem is saying that for an alternating series whose terms are decreasing in size, to check convergence we only need to check that the terms go to $0$ . This is in sharp contrast to $\sum \frac{1}{n}$ which is a series where the terms go to $0$ , but the series itself diverges. Why does the $(-1)^n$ matter? Maybe it is giving us some cancellation?

We think the proof will probably just be a direct proof. Writing down the first couple lines of a proof will either lead us to getting stuck, or we might realize that even partial sums and odd partial sums matter.

Trying to prove a special case is a bit tricky, and we will often produce false proofs. Remember the “proof” that
$\displaystyle 0 = (1-1)+(1-1)+ \ldots = 1-1+1-1 + \ldots = 1 + (-1+1) + (-1+1) + \ldots = 1$
So we need to be careful.
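One way to see what goes wrong is to look at the actual partial sums of $1 - 1 + 1 - 1 + \ldots$ . Here is a tiny Python sketch; the variable names are mine.

```python
from itertools import accumulate

# The terms 1, -1, 1, -1, ... of the series.
terms = [(-1) ** n for n in range(10)]

# Running totals: the sum of the first 1, 2, 3, ... terms.
partial_sums = list(accumulate(terms))
print(partial_sums)  # [1, 0, 1, 0, 1, 0, 1, 0, 1, 0]
```

The bracketing $(1-1)+(1-1)+ \ldots$ only ever looks at the partial sums that equal $0$ , and the bracketing $1 + (-1+1) + \ldots$ only ever looks at the ones that equal $1$ . The full sequence of partial sums does not converge, so neither bracketing is telling us the “sum” of the series.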

Now looking at the proof we see that we can break the proof up roughly into:

Look at the partial sums with even index.
- Show that these converge using the monotonic bounded sequence theorem.
  - Show that it is decreasing.
  - Show that it is positive.
Look at the partial sums with odd index.
- Use the fact that consecutive partial sums only differ by an $a_n$ .
- Look at the limit of differences of consecutive partial sums.

The clever idea really seems to be to break the series up into even-index partial sums and odd-index partial sums.

When I went through this proof I got really stuck. I just couldn’t believe it! It seemed like the proof proved that $\sum (-1)^n$ converges! When I went through the proof slowly I saw exactly where the $\lim a_n = 0$ was used.
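Here is a Python sketch of that experiment, comparing a sequence that satisfies the hypotheses with the mischievous one. I am assuming the series starts at $n = 0$ , the sequences $a_n = \frac{1}{n+1}$ and $a_n = 1$ are my own choices, and the function names are made up for this illustration.

```python
from fractions import Fraction
from itertools import accumulate


def partial_sums(a, count):
    """Partial sums S_0, ..., S_{count-1} of the series sum_{n>=0} (-1)^n a(n)."""
    return list(accumulate((-1) ** n * a(n) for n in range(count)))


def even_odd_gaps(a, count):
    """The gaps S_{2m} - S_{2m+1}; each one equals a(2m+1)."""
    s = partial_sums(a, count)
    return [s[k] - s[k + 1] for k in range(0, count - 1, 2)]


def harmonic(n):
    return Fraction(1, n + 1)  # positive, decreasing, tends to 0


def constant(n):
    return Fraction(1)  # positive, but does NOT tend to 0


s = partial_sums(harmonic, 20)
evens, odds = s[0::2], s[1::2]
assert all(x > y for x, y in zip(evens, evens[1:]))  # even-index sums decrease
assert all(x > 0 for x in evens)                     # and stay positive
assert all(x < y for x, y in zip(odds, odds[1:]))    # odd-index sums increase

print([float(g) for g in even_odd_gaps(harmonic, 20)][-3:])  # gaps shrink
print([float(g) for g in even_odd_gaps(constant, 20)][-3:])  # gaps stay at 1
```

For $a_n = 1$ the even-index and odd-index partial sums each converge on their own (to $1$ and to $0$ ), so most of the proof goes through. The step that breaks is the last one: the gap between consecutive partial sums is $a_n$ , and it only closes when $\lim a_n = 0$ .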

The Proof Philosophy

Be skeptical. Be ornery. Be obstinate. Be careful. Be stubborn.

Every proof you see should be air-tight and you should be extremely skeptical while learning them. Assume that the professor, or textbook, or website that is showing you the proof is filled with lies, some subtle, some overt. Don’t accept arguments based on authority (“I’m a smart person. Believe me.”) and only accept a proof if you understand every line of the proof.

Keep a checklist of every theorem and fact you’ve used in your first year calculus course. There should be checkmarks next to all of the theorems. Shockingly there aren’t! There won’t be a checkmark next to one thing: The Intermediate Value theorem. “Pfft! So what?” So what?! You use the Intermediate Value Theorem to prove the Extreme Value Theorem, the Mean Value Theorem and these are used to prove various other theorems in calculus. What if the Intermediate Value Theorem is false?!

Well you’ll be happy to know that it is indeed true (“I’m a smart person. I’ve seen it. Believe me.”) but the tools needed to prove it are developed in a second-year calculus course.

This extremely skeptical attitude seems bizarre, but it is very liberating. All of a sudden you are no longer the powerless victim. You, yes you, are now the arbiter of truth. You get to decide if you should believe something. You get to ask for more evidence. You get to confront the establishment.

Now, go and be free. Get stuck. Struggle. Succeed. Prove some stuff. You can do it.
