VI. Infinity: recursion, induction, infinite descent
Mathematical induction – i.e. proof by recurrence – is... imposed on us, because it is... the affirmation of a property of the mind itself.
Henri Poincaré (1854–1912)
Allez en avant, et la foi vous viendra. (Press on ahead, and understanding will follow.)
Jean le Rond d’Alembert (1717–1783)
Mathematics has been called “the science of infinity”. However, infinity is a slippery notion, and many of the techniques which are characteristic of modern mathematics were developed precisely to tame this slipperiness. This chapter introduces some of the relevant ideas and techniques.
There are aspects of the story of infinity in mathematics which we shall not address. For example, astronomers who study the night sky and the movements of the planets and stars soon note its immensity, and its apparently ‘fractal’ nature – where increasing the detail or magnification reveals more or less the same level of complexity on different scales. And it is hard then to avoid the question of whether the stellar universe is finite or infinite.
In the mental universe of mathematics, once counting, and the process of halving, become routinely iterative processes, questions about infinity and infinitesimals are almost inevitable. However, mathematics recognises the conceptual gulf between the finite and the infinite (or infinitesimal), and rejects the lazy use of “infinity” as a metaphor for what is simply “very large”. Large finite numbers are still numbers; and long finite sums are conceptually quite different from sums that “go on for ever”. Indeed, in the third century BC, Archimedes (c. 287–212 BC) wrote a small booklet called The Sand Reckoner, dedicated to King Gelon, in which he introduced the arithmetic of powers (even though the ancient Greeks had no convenient notation for writing such numbers), in order to demonstrate that – contrary to what some people had claimed – the number of grains of sand in the known universe must be finite (he derived an upper bound of approximately 8×1063 The influence wielded by ideas of infinity on mathematics has been profound, even if we now view some of these ideas as flights of fancy –
- from Zeno of Elea (c. 495 BC – c. 430 BC), who presented his paradoxes to highlight the dangers inherent in reasoning sloppily with infinity,
- through Giordano Bruno (1548–1600), who declared that there were infinitely many inhabited universes, and who was burned at the stake when he refused to retract this and other “heresies”,
- to Georg Cantor (1845–1918) whose groundbreaking work in developing a true “mathematics of infinity” was inextricably linked to his religious beliefs.
In contrast, we focus here on the delights of the mathematics, and in particular on how an initial doorway into “ideas of infinity” can be forged from careful reasoning with finite entities. Readers who would like to explore what we pass over in silence could do worse than to start with the essay on “infinity” in the MacTutor History of Mathematics archive:
http://www-history.mcs.st-and.ac.uk/HistTopics/Infinity.html.
The simplest infinite processes begin with recursion – a process where we repeat exactly the same operation over and over again (in principle, continuing for ever). For example, we may start with 0, and repeat the operation “add 1”, to generate the sequence:
Or we may start with 20 = 1 and repeat the operation “multiply by 2”, to generate:
Or we may start with 1.000000 ⋯, and repeat the steps involved in “dividing by 7” to generate the infinite decimal for :
We can then vary this idea of “recursion” by allowing each operation to be “essentially” (rather than exactly) the same, as when we define triangular numbers by “adding n” at the nth stage to generate the sequence:
In other words, the sequence of triangular numbers is defined by a recurrence relation:
and
when
We can vary this idea further by allowing more complicated recurrence relations – such as that which defines the Fibonacci numbers:
and
when
All of these “images of infinity” hark back to the familiar counting numbers.
- We know how the counting numbers begin (with 0, or with 1); and
- we know that we can “add 1” over and over again to get ever larger counting numbers.
The intuition that this process is, in principle, never–ending (so is never actually completed), yet somehow manages to count all positive integers, is what Poincaré called a “property of the mind itself”: that is, the idea that we can define an infinite sequence, or process, or chain of deductions (involving digits, or numbers, or objects, or statements, or truths) by
- specifying how it begins, and by then
- specifying in a uniform way “how to construct the next term ”, or “how to perform the next step”.
This idea is what lies behind “proof by mathematical induction”, where we prove that some assertion P(n) holds for all – so demonstrating infinitely many separate statements at a single blow. The validity of this method of proof depends on a fundamental property of the positive integers, or of the counting sequence
namely:
The Principle of Mathematical Induction: If a subset S of the positive integers
- contains the integer “1”,
and has the property that - whenever an integer k is in the set S, then the next integer is always in S too,
then S contains all positive integers.
6.1. Proof by mathematical induction I
When students first meet “proof by induction”, it is often explained in a way that leaves them feeling distinctly uneasy, because it appears to break the fundamental taboo:
never assume what you are trying to prove.
This tends to leave the beginner in the position described by d'Alembert's quote at the start of the chapter: they may “press on” in the hope that “understanding will follow”, but a doubt often remains. So we encourage readers who have already met proof by induction to take a step back, and to try to understand afresh how it really works. This may require you to study the solutions (Section 6.10), and to be prepared to learn to write out proofs more carefully than, and rather differently from, what you are used to.
When we wish to prove a general result which involves a parameter n, where n can be any positive integer, we are really trying to prove infinitely many results all at once. If we tried to prove such a collection of results in turn, “one at a time”, not only would we never finish, we would scarcely get started (since completing the first million, or billion, cases leaves just as much undone as at the start). So our only hope is:
- to think of the sequence of assertions in a uniform way, as consisting of infinitely many different, but similar–looking, statements P(n), one for each n separately (with each statement depending on a particular n); and
- to recognise that the overall result to be proved is not just a single statement P(n), but the compound statement that “P(n) is true, for all ”.
Once the result to be proved has been formulated in this way, we can
- use bare hands to check that the very first statement is true (usually P(1)); and
- try to find some way of demonstrating that,
– as soon as we know that “P(k) is true, for some (particular, but unspecified) ”,
– we can prove in a uniform way that the next result is then automatically true.
Having implemented the first of the two induction steps, we know that P(1) is true.
The second bullet point above then comes into play and assures us that (since we know that P(1) is true), P(2) must be true.
And once we know that P(2) is true, the second bullet point assures us that P(3) is also true.
And once we know that P(3) is true, the second bullet point assures us that P(4) is also true.
And so on for ever.
We can then conclude that the whole sequence of infinitely many statements are all true – namely that:
“every statement P(n) is true”,
or that
“P(n) is true, for all .”
In other words, if we define S to be the set of positive integers n for which the statement P(n) is true, then S contains the element “1”, and whenever k is in S, so is ; hence, by the Principle of Mathematical Induction we know that S contains all positive integers.
At this stage we should acknowledge an important didactical (rather than mathematical) ploy in our recommended layout here. It is important to underline the distinction between
(i) the individual statements P(n) which are the separate ingredients in the overall statement to be proved, namely:
“P(n) is true, for all ”,
where infinitely many individual statements have been compressed into a single compound statement, and
(ii) the induction step, where we
– assume some particular P(k) is known to be true, and
– show that the next statement is then automatically true.
To underline this distinction we consistently use a different “dummy variable” (namely “k”) in the latter case. This distinction is a psychological ploy rather than a logical necessity. However, we recommend that readers should imitate this distinction (at least initially).
6.2. ‘Mathematical induction’ and ‘scientific induction’
The idea of a “list that goes on for ever” arose in the sequence of powers of 4 back in Problem 16, where we asked
Do the two sequences arising from successive powers of 4:
- the leading digits:
and
- the units digits:
really “repeat for ever” as they seem to?
This example illustrates the most basic misconception that sometimes arises concerning mathematical induction – namely to confuse it with the kind of pattern spotting that is often called ‘scientific induction’.
In science (as in everyday life), we routinely infer that something that is observed to occur repeatedly, apparently without exception (such as the sun rising every morning; or the Pole star seeming to be fixed in the night sky) may be taken as a “fact”. This kind of “scientific induction” makes perfect sense when trying to understand the world around us – even though the inference is not warranted in a strictly logical sense.
Proof by mathematical induction is quite different. Admittedly, it often requires intelligent guesswork at a preliminary stage to make a guess that allows us to formulate precisely what it is that we should be trying to prove. But this initial guess is separate from the proof, which remains a strictly deductive construction. For example,
the fact that “1”, “”, “”, “”, etc. all appear to be successive squares gives us an idea that perhaps the identity
P(n):
is true, for all .
This guess is needed before we can start the proof by mathematical induction. But the process of guessing is not part of the proof. And until we construct the “proof by induction” (Problem 231), we cannot be sure that our guess is correct.
The danger of confusing ‘mathematical induction’ and ‘scientific induction’ may be highlighted to some extent if we consider the proof in Problem 76 above that “we can always construct ever larger prime numbers”, and contrast it with an observation (see Problem 228 below) that is often used in its place – even by authors who should know better.
In Problem 76 we gave a strict construction by mathematical induction:
- we first showed how to begin (with say);
- then we showed how, given any finite list of distinct prime numbers , it is always possible to construct a new prime (as the smallest prime number dividing ).
This construction was very carefully worded, so as to be logically correct.
In contrast, one often finds lessons, books and websites that present the essential idea in the above proof, but “simplify” it into a form that encourages anti–mathematical “pattern–spotting” which is all–too–easily misconstrued. For example, some books present the sequence
as a way of generating more and more primes.
(a) Are 3, 7, 31, 211 all prime?
(b) prime?
(c) Is prime? A
We have already met two excellent historical examples of the dangers of plausible pattern–spotting in connection with Problem 118. There you proved that:
“if is prime, then n must be prime.”
You then showed that , , , are all prime, but that is not. This underlines the need to avoid jumping to (possibly false) conclusions, and never to confuse a statement with its converse.
In the same problem you showed that:
“if is to be prime and , then a must be even, and b must be a power of 2.”
You then looked at the simplest family of such candidate primes namely the sequence of Fermat numbers :
It turned out that, although fo, are all prime, and although Fermat (1601–1665) claimed (in a letter to Marin Mersenne (1588–1648)) that all Fermat numbers are prime, we have yet to discover a sixth Fermat prime!
There are times when a mathematician may appear to guess a general result on the basis of what looks like very modest evidence (such as noticing that it appears to be true in a few small cases). Such “informed guesses” are almost always rooted in other experience, or in some unnoticed feature of the particular situation, or in some striking analogy: that is, an apparent pattern strikes a chord for reasons that go way beyond the mere numbers. However those with less experience need to realise that apparent patterns or trends are often no more than numerical accidents.
Pell’s equation (John Pell (1611–1685)) provides some dramatic examples.
- If we evaluate the expression “” for , we may notice that the outputs never give a perfect square. And this is to be expected, since the next square after is
and this is always greater than .
- However, if we evaluate “” for we may observe that the outputs never seem to include a perfect square. But this time there is no obvious reason why this should be so – so we may anticipate that this is simply an accident of “small” numbers. And we should hesitate to change our view, even though this accident goes on happening for a very, very, very long time: the smallest value of n for which gives rise to a perfect square is apparently
6.3. Proof by mathematical induction II
Even where there is no confusion between mathematical induction and ‘scientific induction', students often fail to appreciate the essence of “proof by induction”. Before illustrating this, we repeat the basic structure of such a proof.
A mathematical result, or formula, often involves a parameter n, where n can be any positive integer. In such cases, what is presented as a single result, or formula, is a short way of writing an infinite family of results. The proof that such a result is correct therefore requires us to prove infinitely many results at once. We repeat that our only hope of achieving such a mind–boggling feat is
- to formulate the stated result for each value of n separately: that is, as a statement P(n) which depends on n;
- then to use bare hands to check the “beginning” – namely that the simplest case (usually P(1)) is true;
- finally to find some way of demonstrating that,
– as soon as we know that P(k) is true, for some (unknown) ,
– we can prove that the next result is then automatically true.
We can then conclude that
“every statement P(n) is true”,
or that
“P(n) is true, for all ”.
Problem 229 Prove the statement:
“52n+2 – 24n – 25 is divisible by 576, for all ”.
When trying to construct proofs in private, one is free to write anything that helps as ‘rough work'. But the intended thrust of Problem 229 is two–fold:
to introduce the habit of distinguishing clearly between
(i) the statement P(n) for a particular n, and
(ii) the statement to be proved – namely “P(n) is true, for all ”; and
to draw attention to the “induction step” (i.e. the third bullet point above), where
(i) we assume that some unspecified P(k) is known to be true, and
(ii) seek to prove that the next statement must then be true.
The central lesson in completing the “induction step” is to recognize that:
to prove that is true, one has to start by looking at what says.
In Problem 229 says that
“ is divisible by 576”.
Hence one has to start the induction step with the relevant expression
and look for some way to rearrange this into a form where one can use P(k) (which we assume is already known to be true, and so are free to use).
It is in general a false strategy to work the other way round – by “starting with P(k), and then fiddling with it to try to get ”. (This strategy can be made to work in the simplest cases; but it does not work in general, and so is a bad habit to get into.) So the induction step should always start with the hypothesis of .
The next problem invites you to prove the formula for the sum of the angles in any polygon. The result is well–known; yet we are fairly sure that the reader will never have seen a correct proof. So the intention here is for you to recognise the basic character of the result, to identify the flaws in what you may until now have accepted as a proof, and to try to find some way of producing a general proof.
Problem 230 Prove by induction the statement:
“for each , the angles of any n–gon in the plane have sum equal to radians.”
The formulation certainly involves a parameter ; so you clearly need to begin by formulating the statement P(n). For the proof to have a chance of working, finding the right formulation involves a modest twist! So if you get stuck, it may be worth checking the first couple of lines of the solution.
No matter how P(n) is formulated, you should certainly know how to prove the statement P(3) (essentially the formula for the sum of the angles in a triangle). But it is far from obvious how to prove the “induction step”:
“if we know that P(k) is true for some particular , then must also be true”.
When tackling the induction step, we certainly cannot start with P(k)! The statement says something about polygons with sides: and there is no way to obtain a typical –gon by fiddling with some statement about polygons with k sides. (If you start with a k–gon, you can of course draw a triangle on one side to get a –gon; but this is a very special construction, and there is no easy way of knowing whether all –gons can be obtained in this way from some k–gon.) The whole thrust of mathematical induction is that we must always start the induction step by thinking about the hypothesis of – that is in this case, by considering an arbitrary –gon and then finding some guaranteed way of “reducing” it in order to make use of P(k).
The next two problems invite you to prove some classical algebraic identities. Most of these may be familiar. The challenge here is to think carefully about the way you lay out your induction proof, to learn from the examples above, and (later) to learn from the detailed solutions provided.
Problem 231 Prove by induction the statement:
“ holds, for all ”
The summation in Problem 231 was known to the ancient Greeks. The mystical Pythagorean tradition (which flourished in the centuries after Pythagoras) explored the character of integers through the ‘spatial figures’ which they formed. For example, if we arrange each successive integer as a new line of dots in the plane, then the sum “” can be seen to represent a triangular number. Similarly, if we arrange each odd number in the sum “” as a “k-by-k reverse L-shape”, or gnomon (a word which we still use to refer to the L-shaped piece that casts the shadow on a sundial), then the accumulated L-shapes build up an n by n square of dots - the “1” being the dot in the top left hand corner, the “3” being the reverse L-shape of 3 dots which make this initial “1” into a 2 by 2 square, the “5” being the reverse L-shape of 5 dots which makes this 2 by 2 square into a 3 by 3 square, and so on. Hence the sum “” can be seen to represent a square number.
There is much to be said for such geometrical illustrations; but there is no escape from the fact that they hide behind an ellipsis (the three dots which we inserted in the sum between “5” and “”, which were then summarised when arranging the reverse L-shapes by ending with the words “and so on”). Proof by mathematical induction, and its application in Problem 231, constitute a formal way of avoiding both the appeal to pictures, and the hidden ellipsis.
Problem 232 The sequence
is defined by its first two terms , and by the recurrence relation:
.
(a) Guess a closed formula for the nth term un.
(b) Prove by induction that your guess in (a) is correct for all .
Problem 233 The sequence of Fibonacci numbers
is defined by its first two terms , and by the recurrence relation:
when .
Prove by induction that, for all ,
Problem 234 Prove by induction that
is divisible by 19, for all .
Problem 235 Use mathematical induction to prove that each of these identities holds, for all :
(a)
(b)
(c)
(d)
(e) .
Problem 236 Prove by induction the statement:
“, for all ”.
We now know that, for all :
.
And if we sum these “outputs” (that is, the first n natural numbers), we get the nth triangular number:
.
The next problem invites you to find the sum of these “outputs”: that is, to find the sum of the first n triangular numbers.
Problem 237
(a) Experiment and guess a formula for the sum of the first n triangular numbers:
(b) Prove by induction that your guessed formula is correct for all .
A We now know closed formulae for
“”
and for
“”.
The next problem hints firstly that these identities are part of something more general, and secondly that these results allow us to find identities for the sum of the first n squares:
for the first n cubes:
and so on.
Problem 238
(a) Note that
.
Use this and the result of Problem 237 to derive a formula for the sum:
.
(b) Guess and prove a formula for the sum
.
Use this to derive a closed formula for the sum:
.
It may take a bit of effort to digest the statement in the next problem. It extends the idea behind the “method of undetermined coefficients” that is discussed in Note 2 to the solution of Problem 237(a).
Problem 239
(a) Given distinct real numbers
find all possible polynomials of degree n which satisfy
for some specified number b.
(b) For each , prove the following statement:
Given two labelled sets of real numbers
,
and
,
where the ai are all distinct (but the b need not be), there exists a polynomial fn of degree n, such that
We end this subsection with a mixed bag of three rather different induction problems. In the first problem the induction step involves a simple construction of a kind we will meet later.
Problem 240 A country has only 3 cent and 4 cent coins.
(a) Experiment to decide what seems to be the largest amount, N cents, that cannot be paid directly (without receiving change).
(b) Prove by induction that “n cents can be paid directly for each ”.
Problem 241
(a) Solve the equation . Calculate , and check that is also an integer.
(b) Solve the equation . Calculate , and check that is also an integer.
(c) Solve the equation . Calculate , and check that is also an integer.
(d) Solve the equation , where k is an integer. Calculate z2, and check that is also an integer.
(e) Prove that if a number z has the property that is an integer, then is also an integer for each .
Problem 242 Let p be any prime number. Use induction to prove:
“ is divisible by p for all ”.
6.4. Infinite geometric series
Elementary mathematics is predominantly about equations and identities. But it is often impossible to capture important mathematical relations in the form of exact equations. This is one reason why inequalities become more central as we progress; another reason is because inequalities allow us to make precise statements about certain infinite processes.
One of the simplest infinite process arises in the formula for the “sum” of an infinite geometric series:
(for ever).
Despite the use of the familiar-looking “+” signs, this can be no ordinary addition. Ordinary addition is defined for two summands; and by repeating the process, we can add three summands (thanks in part to the associative law of addition). We can then add four, or any finite number of summands. But this does not allow us to “add” infinitely many terms as in the above sum. To get round this we combine ordinary addition (of finitely many terms) and simple inequalities to find a new way of giving a meaning to the above “endless sum”. In Problem 116 you used the factorisation
to derive the closed formula:
.
This formula for the sum of a finite geometric series can be rewritten in the form
.
At first sight, this may not look like a clever move! However, it separates the part that is independent of n from the part on the RHS that depends on n; and it allows us to see how the second part behaves as n gets large:
when , successive powers of r get smaller and smaller and converge rapidly towards 0,
so the above form of the identity may be interpreted as having the form:
— (an “error term”).
Moreover if , then the “error term” converges towards 0 as . In particular, if , the error term is always positive, so we have proved, for all :
and
the difference between the two sides tends rapidly to 0 as .
We then make the natural (but bold) step to interpret this, when , as offering a new definition which explains precisely what is meant by the endless sum
(for ever),
declaring that, when ,
(for ever) = .
More generally, if we multiply every term by a, we see that
(for ever) = .
Problem 243 Interpret the recurring decimal 0.037037037 ··· (for ever) as an infinite geometric series, and hence find its value as a fraction.
Problem 244 Interpret the following endless processes as infinite geometric series.
(a) A square cake is cut into four quarters, with two perpendicular cuts through the centre, parallel to the sides. Three people receive one quarter each - leaving a smaller square piece of cake. This smaller piece is then cut in the same way into four quarters, and each person receives one (even smaller) piece - leaving an even smaller residual square piece, which is then cut in the same way. And so on for ever. What fraction of the original cake does each person receive as a result of this endless process?
(b) I give you a whole cake. Half a minute later, you give me half the cake back. One quarter of a minute later, I return one quarter of the cake to you. One eighth of a minute later you return one eighth of the cake to me. And so on. Adding the successive time intervals, we see that
(for ever) = 1,
so the whole process is completed in exactly 1 minute. How much of the cake do I have at the end, and how much do you have?
Problem 245 When John von Neumann (1903-1957) was seriously ill in hospital, a visitor tried (rather insensitively) to distract him with the following elementary mathematics problem.
Have you heard the one about the two trains and the fly? Two trains are on a collision course on the same track, each travelling at 30 km/h. A super-fly starts on Train A when the trains are 120 km apart, and flies at a constant speed of 50 km/h - from Train A to Train B, then back to Train A, and so on. Eventually the two trains collide and the fly is squashed. How far did the fly travel before this sad outcome?
6.5. Some classical inequalities
The fact that our formula for the sum of a geometric series gives us an exact sum is very unusual - and hence very precious. For almost all other infinite series - no matter how natural, or beautiful, they may seem - you can be fairly sure that there is no obvious exact formula for the value of the sum. Hence in those cases where we happen to know the exact value, you may infer that it took the best efforts of some of the finest mathematical minds to discover what we know.
One way in which we can make a little progress in estimating the value of an infinite series is to obtain an inequality by comparing the given sum with a geometric series.
Problem 246
(a)(i) Explain why
,
so
.
(ii) Explain why are all , so
(b) Use part (a) to prove that
, for all .
(c) Conclude that the endless sum
(for ever)
has a definite value, and that this value lies somewhere between and 2.
The next problem presents a rather different way of deriving a similar equality. Once the relevant inequality has been guessed, or given (see Problem 247(a) and (b)), the proof by mathematical induction is often relatively straightforward. And after a little thought about Problem 246, it should be clear that much of the inaccuracy in the general inequality arises from the rather poor approximations made for the first few terms (when n = 1, when n = 2, when n = 3, etc.); hence by keeping the first few terms as they are, and only approximating for , or , or , we can often prove a sharper result.
Problem 247
(a) Prove by induction that
, for all .
(b) Prove by induction that
for all .
The infinite sum
(for ever)
is a historical classic, and has many instructive stories to tell. Recall that, in Problems 54, 62, 63, 236, 237, 238 you found closed formulae for the sums
and for the sums
.
Each of these expressions has a “natural” feel to it, and invites us to believe that there must be an equally natural compact answer representing the sum. In Problem 235 you took this idea one step further by finding a beautiful closed expression for the sum
When we began to consider infinite series, we found the elegant closed formula
.
We then observed that the final term on the RHS could be viewed as an “error term”, indicating the amount by which the LHS differs from , and noticed that, for any given value of r between −1 and +1, this error term “tends towards 0 as the power n increases”. We interpreted this as indicating that one could assign a value to the endless sum
(for ever) = .
In the same way, in the elegant closed formula
the final term on the RHS indicates the amount by which the finite sum on the LHS differs from 1; and since this “error term” tends towards 0 as n increases, we may assign a value to the endless sum
(for ever) = 1.
It is therefore natural to ask whether other infinite series, such as
(for ever)
may also be assigned some natural finite value. And since the series is purely numerical (without any variable parameters, such as the “r” in the geometric series formula), this answer should be a strictly numerical answer. And it should be exact - though all we have managed to prove so far (in Problems 246 and 247) is that this numerical answer lies somewhere between and 1.68.
This question arose naturally in the middle of the seventeenth century, when mathematicians were beginning to explore all sorts of infinite series (or “sums that go on for ever”). With a little more work in the spirit of Problems 246 and 247 one could find a much more accurate approximate value. But what is wanted is an exact expression, not an unenlightening decimal approximation. This aspiration has a serious mathematical basis, and is not just some purist preference for elegance. The actual decimal value is very close to
.
But this conveys no structural information. One is left with no hint as to why the sum has this value. In contrast, the eventual form of the exact expression suggests connections whose significance remains of interest to this day.
The greatest minds of the seventeenth and early eighteenth century tried to find an exact value for the infinite sum - and failed. The problem became known as the Basel problem (after Jakob Bernoulli (1654-1705) who popularised the problem in 1689 - one of several members of the Bernoulli family who were all associated with the University in Basel). The problem was finally solved in 1735 - in truly breathtaking style - by the young Leonhard Euler (1707-1783) (who was at the time also in Basel). The answer
illustrates the final sentence of the preceding paragraph in unexpected ways, which we are still trying to understand.
In the next problem you are invited to apply similar ideas to an even more important series. Part (a) provides a relatively crude first analysis. Part (b) attacks the same question; but it does so using algebra and induction (rather than the formula for the sum of a geometric series) in a way that is then further refined in part (c).
Problem 248
(a)(i) Choose a suitable r and prove that
.
(ii) Conclude that
for every ,
and hence that the endless sum
(forever)
can be assigned a value “e” satisfying .
(b) (i) Prove by induction that
, for all .
(ii) Use part (i) to conclude that the endless sum
(forever)
can be assigned a definite value “e”, and that this value lies somewhere between 2.5 and 3.
(c) (It may help to read the Note at the start of the solution to part (c) before attempting parts (c), (d).)
(i) Prove by induction that
for all .
(ii) Use part (i) to conclude that the endless sum
(for ever)
can be assigned a definite value “e”, and that this value lies somewhere between 2.6 and 2.75.
(d) (i) Prove by induction that
.
(ii) Use part (i) to conclude that the endless sum
(for ever)
can be assigned a definite value “e”, and that this value lies somewhere between 2.708 and 2.7222... (forever).
We end this section with one more inequality in the spirit of this section, and two rather different inequalities whose significance will become clear later.
Problem 249 Prove by induction that
, forall .
Problem 250 Let a, b be real numbers such that , and . Prove by induction that
, for all .
Problem 251 Let x be any real number . Prove by induction that
, for all
6.6. The harmonic series
The great foundation of mathematics is the principle of contradiction, or of identity, that is to say that a statement cannot be true and false at the same time, and that thus A is A, and cannot be not A. And this single principle is enough to prove the whole of arithmetic and the whole of geometry, that is to say all mathematical principles.
Gottfried W. Leibniz (1646-1716)
We have seen how some infinite series, or sums that go on for ever, can be assigned a finite value for their sum:
We say that these series converge (meaning that they can be assigned a finite value).
This section is concerned with another very natural series, the so-called harmonic series
It is not entirely clear why this is called the harmonic series. The natural overtones that arise in connection with plucking a stretched string (as with a guitar or a harp) have wavelengths that are the basic wavelength, or of the basic wavelength, and so on. It is also true that, just as each term of an arithmetic series is the arithmetic mean of its two neighbours, and each term of a geometric series is the geometric mean of its two neighbours, so each term of the harmonic series after the first is equal to the harmonic mean (see Problems 85., 89.) of its two neighbours:
Unlike the first two series above, there is no obvious closed formula for the finite sum
Certainly the sequence of successive sums
does not suggest any general pattern.
Problem 252 Suppose we denote by S the “value” of the endless sum
(i) Write out the endless sum corresponding to “”.
(ii) Remove the terms of this endless sum from the endless sum S, to obtain another endless sum corresponding to “” .
(iii) Compare the first term of the series in (i) (namely ) with the first term in the series in (ii) (namely 1); compare the second term in the series in (i) with the second term in the series in (ii); and so on. What do you notice?
The Leibniz quotation above emphasizes that the reliability of mathematics stems from a single principle – namely the refusal to tolerate a contradiction. We have already made explicit use of this principle from time to time (see, for example, the solution to Problem 125.). The message is simple: whenever we hit a contradiction, we know that we have “gone wrong” – either by making an error in calculation or logic, or by beginning with a false assumption. In Problem 252. the observations you were expected to make are paradoxical: you obtained two different series, which both correspond to “”, but every term in one series is larger than the corresponding term in the other! What one can conclude may not be entirely clear. But it is certainly clear that something is wrong: we have somehow created a contradiction. The three steps ((i), (ii), (iii)) appear to be relatively sensible. But the final observation “” (since , , etc.) makes no sense. And the only obvious assumption we have made is to assume that the endless sum
can be assigned a value “S”, which can then be manipulated as though it were a number.
The conclusion would seem to be that, whether or not the endless sum has a meaning, it cannot be assigned a value in this way. We say that the series diverges. Each finite sum
has a value, and these values “grow more and more slowly” as n increases:
- the first term immediately makes the sum = 1
- it takes 4 terms to obtain a sum > 2;
- it takes 11 terms to obtain a sum > 3; and
- it takes terms before the series reaches a sum > 10.
However, this slow growth is not enough to guarantee that the corresponding endless sum corresponds to a finite numerical value.
The danger signals should already have been apparent in Problem 249., where you proved that
The term tends to 0 as n increases; so the finite sums grow ever more slowly as n increases. However, the LHS can be made larger than any integer K simply by taking K2 terms. Hence there is no way to assign a finite value to the endless sum
Problem 253.
(a)(i) Explain why
(ii) Explain why
(iii) Extend parts (i) and (ii) to prove that
(iv) Finally use the fact that, when ,
to modify the proof in (iii) slightly, and hence show that
(b)(i) Explain why
(ii) Explain why
(iii) Extend parts (i) and (ii) to prove that
(c) Combine parts (a) and (b) to show that, for all , we have the two inequalities
Conclude that the endless sum
cannot be assigned a finite value.
The result in Problem 253.(c) has an unexpected consequence.
Problem 254 Imagine that you have an unlimited supply of identical rectangular strips of length 2. (Identical empty plastic CD cases can serve as a useful illustration, provided one focuses on their rectangular side profile, rather than the almost square frontal cross-section.) The goal is to construct a ‘stack’ in such a way as to stick out as far as possible beyond a table edge. One strip balances exactly at its midpoint, so can protrude a total distance of 1 without tipping over.
(a) Arrange a stack of n strips of length 2, one on top of the other, with the bottom strip protruding distance beyond the edge of the table, the second strip from the bottom protruding beyond the leading edge of the bottom strip, the third strip from the bottom protruding beyond the leading edge of the strip below it, and so on until the strip from the bottom protrudes distance beyond the leading edge of the strip below it, and the top strip protrudes distance 1 beyond the leading edge of the strip below it (see Figure 10). Prove that a stack of n identical strips arranged in this way will just avoid tipping over the table edge.
(b) Conclude that we can choose n so that an arrangement of n strips can (in theory) protrude as far beyond the edge of the table as we wish – without tipping.
The next problem illustrates, in the context of the harmonic series, what is in fact a completely general phenomenon: an endless sum of steadily decreasing positive terms may converge or diverge; but provided the terms themselves converge to 0, then the the corresponding “alternating sum” – where the same terms are combined but with alternately positive and negative signs – always converges.
Figure 10: Overhanging strips, n = 10.
Problem 255
(a) Let
(where the final operation is “+” if n is odd and “-” if n is even).
(i) Prove that
for all .(ii) Conclude that the endless alternating sum
can be assigned a value s that lies somewhere between and .
(b) Let
be an endless, decreasing sequence of positive terms (that is, for all ). Suppose that the sequence of terms converges to 0 as .
(i) Let
(where the final operation is “+” if n is odd and “−” if n is even). Prove that
(ii) Conclude that the endless alternating sum
can be assigned a value s that lies somewhere between and .
Just as with the series
we can show relatively easily that
can be assigned a value s. It is far less clear whether this value has a familiar name! (It is in fact equal to the natural logarithm of 2: “”.) A similarly intriguing series is the alternating series of odd terms from the harmonic series:
You should be able to show that this endless series can be assigned a value somewhere between and ; but you are most unlikely to guess that its value is equal to . This was first discovered in 1674 by Leibniz (1646-1716). One way to obtain the result is using the integral of from 0 to 1: on the one hand the integral is equal to evaluated when (that is, ); on the other hand, we can expand the integrand as a power series , integrate term by term, and prove that the resulting series converges when . (It does indeed converge, though it does so very, very slowly.)
The fact that the alternating harmonic series has the value seems to have been first shown by Euler (1707-1783), using the power series expansion for .
6.7. Induction in geometry, combinatorics and number theory
We turn next to a mixed collection of problems designed to highlight a range of applications.
Problem 256 Let , . Prove by induction that has at least k distinct prime factors.
Problem 257
(a) Prove by induction that n points on a straight line divide the line into parts.
(b)(i) By experimenting with small values of n, guess a formula for the maximum number of regions which can be created in the plane by an array of n straight lines.
(ii) Prove by induction that n straight lines in the plane divide the plane into at most regions.
(c)(i) By experimenting with small values of n, guess a formula for the maximum number of regions which can be created in 3-dimensions by an array of n planes.
(ii) Prove by induction that n planes in 3-dimensions divide space into at most regions.
Problem 258 Given a square, prove that, for each , the initial square can be cut into n squares (of possibly different sizes).
Problem 259 A tree is a connected graph, or network, consisting of vertices and edges, but with no cycles (or circuits). Prove that a tree with n vertices has exactly edges.
The next problem concerns spherical polyhedra. A spherical polyhedron is a polyhedral surface in 3-dimensions, which can be inflated to form a sphere (where we assume that the faces and edges can stretch as required). For example, a cube is a spherical polyhedron; but the surface of a picture frame is not. A spherical polyhedron has
- faces (flat 2-dimensional polygons, which can be stretched to take the form of a disc),
- edges (1-dimensional line segments, where exactly two faces meet), and
- vertices (0-dimensional points, where several faces and edges meet, in such a way that they form a single cycle around the vertex).
Each face must clearly have at least 3 edges; and there must be at least three edges and three faces meeting at each vertex.
If a spherical polyhedron has V vertices, E edges, and F faces, then the numbers V, E, F satisfy Euler's formula
For example, a cube has vertices, edges, and faces, and .
Problem 260
(a) (i) Describe a spherical polyhedron with exactly 6 edges.
(ii) Describe a spherical polyhedron with exactly 8 edges.
(b) Prove that no spherical polyhedron can have exactly 7 edges.
(c) Prove that for every , there exists a spherical polyhedron with n edges.
Problem 261 A map is a (finite) collection of regions in the plane, each with a boundary, or border, that is `polygonal' in the sense that it consists of a single sequence of distinct vertices and – possibly curved – edges, that separates the plane into two parts, one of which is the polygonal region itself. A map can be properly coloured if each region can be assigned a colour so that each pair of neighbouring regions (sharing an edge) always receive different colours. Prove that the regions of such a map can be properly coloured with just two colours if and only if an even number of edges meet at each vertex.
Problem 262 (Gray codes) There are sequences of length n consisting of 0s and 1s. Prove that, for each , these sequences can be arranged in a cyclic list such that any two neighbouring sequences (including the last and the first) differ in exactly one coordinate position.
Problem 263 (Calkin-Wilf tree) The binary tree in the plane has a distinguished vertex as `root', and is constructed inductively. The root is joined to two new vertices; and each new vertex is then joined to two further new vertices – with the construction process continuing for ever (Figure 11). Label the vertices of the binary tree with positive fractions as follows:
- the root is given the label
- whenever we know the label of a ‘parent’ vertex, we label its `left descendant' as , and its `right descendant' .
(a) Prove that every positive rational occurs once and only once as a label, and that it occurs in its lowest terms.
(b) Prove that the labels are left-right symmetric in the sense that labels in corresponding left and right positions are reciprocals of each other.
Problem 264 A collection of n intervals on the x-axis is such that every pair of intervals have a point in common. Prove that all n intervals must then have at least one point in common.
6.8. Two problems
Problem 265 Several identical tanks of water sit on a horizontal base. Each pair of tanks is connected with a pipe at ground level controlled by a valve, or tap. When a valve is opened, the water level in the two connected tanks becomes equal (to the average, or mean, of the initial levels). Suppose we start with tank T which contains the least amount of water. The aim is to open and close valves in a sequence that will lead to the final water level in tank T being as high as possible. In what order should we make these connections?
Figure 11: A (rooted) binary tree.
Problem 266 I have two flasks. One is ‘empty’, but still contains a residue of a dangerous chemical; the other contains a fixed amount of solvent that can be used to wash away the remaining traces of the dangerous chemical. What is the best way to use the fixed quantity of solvent? Should I use it all at once to wash out the first flask? Or should I first wash out the flask using just half of the solvent, and then repeat with the other half? Or is there a better way of using the available solvent to remove as much as possible of the dangerous chemical?
6.8. Infinite descent
In this final section we touch upon an important variation on mathematical induction. This variation is well-illustrated by the next (probably familiar) problem.
Problem 267 Write out for yourself the following standard proof that is irrational.
(i) Suppose to the contrary that is rational. Then for some positive integers m, n. Prove first that m must be even.
(ii) Write , where is also an integer. Show that n must also be even.
(iii) How does this lead to a contradiction?
Problem 267. has the classic form of a proof which reaches a contradiction by infinite descent.
- We start with a claim which we wish to prove is true. Often when we do not know how to begin, it makes sense to ask what would happen if the claim were false. This then guarantees that there must be some counterexample, which satisfies the given hypothesis, but which fails to satisfy the asserted conclusion.
- Infinite descent becomes an option whenever each such counterexample gives rise to some positive integer parameter n (such as the denominator in Problem 267.(i)).
- Infinite descent becomes a reality, if one can prove that the existence of the initial counterexample leads to a construction that produces a counterexample with a smaller value of the parameter n, since repeating this step then gives rise to an endlessly decreasing sequence of positive integers, which is impossible (since such a chain can have length at most n).
- Hence the initial assumption that the claim was false must itself be false – so the claim must be true (as required).
Proof by “infinite descent” is an invaluable tool. But it is important to realise that the method is essentially a variation on proof by mathematical induction. As a first step in this direction it is worth reinterpreting Problem 267. as an induction proof.
Problem 268 Let be the statement:
“ cannot be written as a fraction with positive denominator ”.
(i) Explain why is true.
(ii) Suppose that is true for some . Use the proof in Problem 267. to show that must then be true as well.
(iii) Conclude that is true for all , whence must be irrational.
Problem 268. shows that, in the particular case of Problem 267. one can translate the standard proof that “ is irrational” into a proof by induction. But much more is true. The contradiction arising in step 3. above is an application of an important principle, namely
The Least Element Principle: Every non-empty set S of positive integers has a smallest element.
The Least Element Principle is equivalent to The Principle of Mathematical Induction which we stated at the beginning of the chapter:
The Principle of Mathematical Induction: If a subset S of the positive integers
then S contains all positive integers.
- contains the integer “1”, and has the property that
- whenever an integer k is in the set S, then the next integer is always in S too,
Problem 269
(a) Assume the Least Element Principle. Suppose a subset T of the positive integers contains the integer “1”, and that whenever k is in the set T, then is also in the set T. Let S be the set of all positive integers which are not in the set T. Conclude that S must be empty, and hence that T contains all positive integers.
(b) Assume the Principle of Mathematical Induction. Let T be a non-empty set of positive integers, and suppose that T does not have a smallest element. Let S be the set of all positive integers which do not belong to the set T. Prove that “1” must belong to S, and that whenever the positive integer k belongs to S, then so does . Derive the contradiction that T must be empty, contrary to assumption. Conclude that T must in fact have a smallest element.
To round off this final chapter you are invited to devise a rather different proof of the irrationality of .
Problem 270 This sequence of constructions presumes that we know – for example, by Pythagoras' Theorem – that, in any square , the ratio
“diagonal: side” .
Let be a square. Let the circle with centre Q and passing through P meet at . Construct the perpendicular to at , and let this meet at .
(i) Explain why . Construct the point to complete the square .
(ii) Join . Explain why .
(iii) Suppose . Conclude that .
(iv) Prove that , and hence that , .
(v) Explain how, if m, n can be chosen to be positive integers, the above sequence of steps sets up an “infinite descent” – which is impossible.
6.10. Chapter 6: Comments and solutions
Note: It is important to separate the underlying idea of “induction” from the formal way we have chosen to present proofs. As ever in mathematics, the ideas are what matter most. But the process of struggling with (and slowly coming to understand why we need) the formal structure behind the written proofs is part of the way the ideas are tamed and organised.
Readers should not be intimidated by the physical extent of the solutions to this chapter. As explained in the main text it is important for all readers to review the way they approach induction proofs: so we have erred in favour of completeness – knowing that as each reader becomes more confident, s/he will increasingly compress, or abbreviate, some of the steps.
(a) Yes.
(b) Yes.
and , so we only need to check prime factors up to .(c) No.
and so we might have to check all possible prime factors up to . However, we only have to start at [Why?], and checking with a calculator is very quick. In fact factorises rather prettily as .
229. Note: The statement in the problem includes the quantifier “for all ”.
What is to be proved is the compound statement
“ is true for all ”.
In contrast, each individual statement refers to a single value of n.
It is essential to be clear when you are dealing with the compound statement, and when you are referring to some particular statement , or , or .
Let be the statement:
“ is divisible by ”.
• is the statement:
“ is divisible by ”.
That is:
“ is divisible by ”,
which is evidently true.
- Now suppose that we know is true for some . We must show that is then also true.
To prove that is true, we have to consider the statement . It is no use starting with . However, since we know that ) is true, we are free to use it at any stage if it turns out to be useful.
To prove that is true, we have to show that
“ is divisible by ”.
So we must start with and try to “simplify” it (knowing that “simplify” in this case means “rewrite it in a way that involves ”).
The first term on the RHS is a multiple of , so is divisible by (since we know that P(k) is true); the second term on the RHS is a multiple of 242 = 576; and the third term on the RHS is the expression arising in P(1), which we saw is equal to 576.
the whole RHS is divisible by
the LHS is divisible by , so is true.
Hence
- is true; and
- whenever is true for some , we have proved that must be true.
is true for all .
230. Let be the statement:
“the angles of any p-gon, for any value of p with , have sum exactly radians”.
is the statement:
“the angles of any triangle have sum radians”.
This is a known fact: given triangle ΔABC, draw the line through A parallel to , with X on the same side of as B and Y on the same side of as C. Then and (alternate angles), so
- Now we suppose that is known to be true for some . We must show that is then necessarily true.
To prove that is true, we have to consider the statement : that is,
“the angles of a p-gon, for any value of p with , have sum exactly radians”.
This can be reworded by splitting it into two parts:
“the angles of any p-gon for have sum exactly radians;”
and
“the angles of any (k + 1)-gon have sum exactly radians”.
The first part of this revised version is precisely the statement , which we suppose is known to be true. So the crux of the matter is to prove the second part – namely that the angles of any (k + 1)-gon have sum .
Let be any (k + 1)-gon.
[Note: The usual move at this point is to say “draw the chord to cut the polygon into the triangle (with angle sum (by ), and the k-gon (with angle sum (by ), whence we can add to see that has angle sum . However, this only works
- if the triangle “sticks out” rather than in, and
- if no other vertices or edges of the (k+ 1)-gon encroach into the part that is cut off – which can only be guaranteed if the polygon is “convex”.
So what is usually presented as a “proof” does not work in general.
If we want to prove the general result – for polygons of all shapes – we have to get round this unwarranted assumption. Experiment may persuade you that “there is always some vertex that sticks out and which can be safely “cut off”; but it is not at all clear how to prove this fact (we know of no simple proof). So we have to find another way.]
Consider the vertex , and its two neighbours and .
Imagine each half-line, which starts at , and which sets off into the interior of the polygon. Because the polygon is finite, each such half-line defines a line segment , where X is the next point of the polygon which the half line hits (that is, X is one of the vertices , or a point on one of the sides ).
Consider the locus of all such points X as the half line swings from (produced) to (produced). There are two possibilities: either
(a) these points X all belong to a single side of the polygon; or
(b) they don't.
(a) In the first case none of the vertices or sides of the polygon intrude into the interior of the triangle , so the chord separates the (k + 1)-gon into a triangle and a k-gon . The angle sum of is then equal to the sum of (i) the angle sum of the triangle and (ii) the angle sum of – which are equal to and respectively (by ). So the angle sum of the (k + 1)-gon is equal to as required.
(b) In the second case, as the half-line rotates from to , the point X must at some instant switch from lying on one side of the polygon to lying on another side; at the very instant where X switches sides, must be a vertex of the polygon.
Because of the way the point X was chosen, the chord does not meet any other point of the (k + 1)-gon , and so splits the (k + 1)-gon into an m-gon (with angle sum by ) and a (k - m + 3)-gon (with angle sum by ). So the (k + 1)-gon has angle sum as required.
Hence is true.
is true for all .
231. Let be the statement
.
- LHS of ; RHS of . Since these two are equal, is true.
Suppose that is true for some particular (unspecified) ; that is, we know that, for this particular k,
.
We wish to prove that must then be true.
Now is an equation, so we start with the LHS of and try to simplify it in an appropriate way to get the RHS of :
If we now use , which we are supposing to be true, then the first bracket is equal to k2, so this sum is equal to:
Hence holds.
Combining these two bullet points then shows that “ holds, for all ”.
(a) The only way to learn is by trying and failing; then trying again and failing slightly better! So don't give up too quickly. It is natural to try to relate each term to the one before. You may then notice that each term is slightly less than 3 times the previous term.
(b) Note: The recurrence relation for involves the two previous terms. So when we assume that is known to be true and try to prove , the recurrence relation for will involve and , so needs to be formulated to ensure that we can use closed expressions for both these terms. For the same reason, the induction proof has to start by showing that both and are true.
Let be the statement:
“ for all m, ”.
- LHS of ; RHS of . Since
these two are equal, is true.
combines , and the equality of and ; since these two are equal, is true.
- Suppose that is true for some particular (unspecified) ; that is,
we know that, for this particular k,
“ for all m, .”
We wish to prove that must then be true.
Now requires us to prove that
“ for all m, .”
Most of this is guaranteed by , which we assume to be true. It remains for us to check that the equality holds for . We know that
And we may use ), which we are supposing to be true, to conclude that:
Hence holds.
Combining these two bullet points then shows that “ holds, for all ”.
233. Let be the statement:
“ for all m, ”,
where and .
LHS of ; RHS of . Since these two are equal, is true.
LHS of ; RHS of . Since these two are equal, is true.
Suppose that is true for some particular (unspecified) ; that is, we know that, for this particular k,
“ for all m, .”
We wish to prove that must then be true.
Now requires us to prove that
“ for all m, .”
Most of this is guaranteed by , which we assume to be true. It remains to check this for . We know that
And we may use , which we are supposing to be true to conclude that:
(since and are roots of the equation ) Hence holds.
Combining these two bullet points then shows that “ holds, for all ”.
Note: You may understand the above solution and yet wonder how such a formula could be discovered. The answer is fairly simple. There is a general theory about linear recurrence relations which guarantees that the set of all solutions of a second order recurrence (that is, a recurrence in which each term depends on the two previous terms) is “two dimensional” (that is, it is just like the familiar 2D plane, where every vector (p, q) is a combination of the two “base vectors” (1, 0) and (0, 1):
Once we know this, it remains:
- to find two special solutions (like the vectors (1, 0) and (0, 1) in the plane), which we do here by looking for sequences of the form “” that satisfy the recurrence, which implies that , so , or ;
- then to choose a linear combination of these two power solutions to give the correct first two terms: , , so , .
234. Let be the statement:
“ is divisible by ”.
- is the statement: “ is divisible by 19”, which is true.
- Now suppose that we know that is true for some . We must show that is then also true.
- To prove that is true, we have to show that
“ is divisible by ”.
The bracket in the first term on the RHS is divisible by 19 (by ), and the bracket in the second term is equal to 19. Hence both terms on the RHS are divisible by 19, so the RHS is divisible by 19. Therefore the LHS is also divisible by 19, so is true.
is true for all .
235.
Note: The proofs of identities such as those in Problem 235., which are given in many introductory texts, ignore the lessons of the previous two problems. In particular,
• they often fail to distinguish between
– the single statement for a particular n, and
– the “quantified” result to be proved (“for all ”),
and
• they proceed in the `wrong' direction (e.g. starting with the identity and “adding the same to both sides”).
This latter strategy is psychologically and didactically misleading – even though it can be justified logically when proving very simple identities. In these very simple cases, each statement to be proved is unusual in that it refers to exactly one configuration, or equation, for each . And since there is exactly one configuration for each n, the configuration or identity for can often be obtained by fiddling with the configuration for k. In contrast, in Problem 230., for each value of n, there is a bewildering variety of possible polygons with n vertices, ranging from regular polygons to the most convoluted, re-entrant shapes: the statement makes an assertion about all such configurations, and there is no way of knowing whether we can obtain all such configurations for in a uniform way by fiddling with some configuration for k.
Readers should try to write each proof in the intended spirit, and to learn from the solutions – since this style has been chosen to highlight what mathematical induction is really about, and it is this approach that is needed in serious applications.
(a) Let be the statement:
“”.
- LHS of ; RHS of . Since these two are equal, is true.
- Suppose that is true for some particular (unspecified) ; that is,
we know that, for this particular k,
“ ”.
We wish to prove that must then be true.
Now is an equation, so we start with the LHS of and try to simplify it in an appropriate way to get the RHS of ):
If we now use , which we are supposing to be true, then the first bracket is equal to , so the complete sum is equal to:
Hence holds.
If we combine these two bullet points, then we have proved that “ holds for all ”.
(b) Let be the statement:
“”.
- LHS of ; RHS of . Since these two are equal, is true.
- Suppose that is true for some particular (unspecified) ; that is,
we know that, for this particular k,
“”.
We wish to prove that must then be true.
Now is an equation, so we start with the LHS of and try to simplify it in an appropriate way to get the RHS of :
If we now use , which we assume is true, then the first bracket is equal to , so the complete sum is equal to:
Hence holds.
If we combine these two bullet points, we have proved that “ holds for all ”.
(c) Note: If , then the LHS is equal to n, but the RHS makes no sense. So we assume .
Let be the statement:
“ ”.
- LHS of ; RHS of . Since these two are equal, is true.
- Suppose that is true for some particular (unspecified) ; that is,
we know that, for this particular k,
“”.
We wish to prove that must then be true.
Now is an equation, so we start with the LHS of and try to simplify it in an appropriate way to get the RHS of :
If we now use , which we assume is true, then the first bracket is equal to
so the complete sum is equal to:
Hence holds.
If we combine these two bullet points, we have proved that “ holds for all ”.
(d) Note: The statement to be proved starts with a term involving “0!”. The definition
does not immediately tell us how to interpret “”. The correct interpretation emerges from the fact that several different thoughts all point in the same direction.
(i) When , then to get from to we multiply by (n + 1). If we extend this to , then “to get from to , we have to multiply by 1” – which suggests that .
(ii) When , counts the number of permutations of n symbols, or the number of different linear orders of n objects (i.e. how many different ways they can be arranged in a line). If we extend this to , we see that there is just one way to arrange 0 objects (namely, sit tight and do nothing).
(iii) The definition of as a product suggests that involves a “product with no terms” at all. Now when we “add no terms together” it makes sense to interpret the result as “= 0” (perhaps because if this “sum of no terms” were added to some other sum, it would make no difference). In the same spirit, the product of no terms should be taken to be “= 1” (since if this empty product were included at the end of some other product, it would make no difference to the result).
Let be the statement:
“”.
• LHS of ; RHS of . Since these two are equal, is true.
• Suppose that is true for some particular (unspecified) ; that is, we know that, for this particular k,
“”.
We wish to prove that must then be true.
Now is an equation, so we start with the LHS of and try to simplify it in an appropriate way to get the RHS of ):
If we now use , which we assume is true, then the first bracket is equal to , so the complete sum is equal to:
Hence holds.
If we combine these two bullet points, we have proved that “ holds for all ”.
(e) Let be the statement:
“ ”
• LHS of ; RHS of . Since these two are equal, is true.
• Suppose that is true for some particular (unspecified) ; that is, we know that, for this particular k,
“ ”.
We wish to prove that must then be true.
Now is an equation, so we start with the LHS of and try to simplify it in an appropriate way to get the RHS of :
If we now use , which we assume is true, then the first bracket is equal to , so the complete expression is equal to:
Hence holds.
If we combine these two bullet points, we have proved that “ holds for all ”.
236. Let be the statement:
“ ”.
• LHS of ; RHS of . Since these two are equal, is true.
• Suppose that is true for some particular (unspecified) ; that is, we know that, for this particular k,
“ ”.
We wish to prove that must then be true.
Now is an equation, so we start with one side of and try to simplify it in an appropriate way to get the other side of . In this instance, the RHS of is the most promising starting point (because we know a formula for the triangular number, and so can see how to simplify it):
If we now use , which we assume is true, then the first bracket is equal to
so the complete RHS is:
Hence holds.
If we combine these two bullet points, we have proved that “ holds for all ”.
Note: A slightly different way of organizing the proof can sometimes be useful. Denote the two sides of the equation in the statement by and respectively. Then
• ; and
• simple algebra allows one to check that, for each ,
It then follows (by induction) that for all .
(a) , , . These may not be very suggestive. But
and
may eventually lead one to guess that
Note 1: This will certainly be easier to guess if you remember what you found in Problem 17. and Problem 63.
Note 2: There is another way to help in such guessing. Suppose you notice that
– adding values for up to of a polynomial of degree 0 (such as ) gives an answer that is a “polynomial of degree 1”,
and
– adding values for up to of a polynomial of degree 1 (such as ) gives an answer that is a “polynomial of degree 2”,
Then you might guess that the sum
will give an answer that is a polynomial of degree 3 in n. Suppose that
We can then use small values of n to set up equations which must be satisfied by A, B, C, D and solve them to find A, B, C, D:
– when , we get ;
– when , we get ;
– when , we get ;
– when , we get .
This method assumes that we know that the answer is a polynomial and that we know its degree: it is called “the method of undetermined coefficients”.
There are various ways of improving the basic method (such as writing the polynomial in the form
which tailors it to the values that one intends to substitute).
(b) Let be the statement:
“”.
• LHS of ; RHS of . Since these two are equal, is true.
• Suppose that is true for some particular (unspecified) ; that is, we know that, for this particular k,
We wish to prove that must then be true.
Now is an equation, so we start with the LHS of and try to simplify it in an appropriate way to get the RHS of :
If we now use , which we assume is true, then the first bracket is equal to
so the complete sum is equal to:
Hence holds.
If we combine these two bullet points, we have proved that “ holds for all ”.
Note: The triangular numbers are also equal to the binomial coefficients . And the sum of these binomial coefficients is another binomial coefficient , so the result in Problem 237. can be written as:
You might like to interpret Problem 237. in the language of binomial coefficients, and prove it by repeated use of the basic Pascal triangle relation (Pascal (1623--1662)):
Start by rewriting
(a) We know from Problem 237.(b) that
Also
Therefore
(b) Guess:
The proof by induction is entirely routine, and is left for the reader.
Therefore
239.
(a) Let be any such polynomial. If , then we know (by the Remainder Theorem) that has as a factor. Since the are all distinct, and for each k, , we have
for some polynomial . And since we are told that has degree n, has degree 0, so is a constant. Hence every such polynomial of degree n has the form
Since , we can substitute to find C:
(b) Let be the statement:
“Given any two labelled sets of real numbers , and , where the are all distinct (but the need not be), there exists a polynomial of degree n, such that , , , …, .”
• When , we may choose to be the constant polynomial. Hence is true.
• Suppose that is true for some particular (unspecified) ; that is, we know that, for this particular k:
“Given any two labelled sets of real numbers , and , where the are all distinct (but the need not be), there exists a polynomial of degree k, such that , , .”
We wish to prove that must then be true.
Now is the statement:
“Given any two labelled sets of real numbers , and , where the are all distinct (but the need not be), there exists a polynomial of degree , such that , , , …, .”
So to prove that holds, we must start by considering
any two labelled sets of real numbers where the are all distinct (but the need not be).
We must then somehow construct a polynomial function of degree with the required property.
Because we are supposing that is known to be true, we can focus on the first numbers in each of the two lists – , and – and we can then be sure that there is a polynomial of degree k such that
The next step is slightly indirect: we make use of the polynomial which we are still trying to construct, and focus on the polynomial
satisfying
Part (a) guarantees the existence of such a polynomial of degree and tells us exactly what this polynomial function is equal to. Hence we can construct the required polynomial ) by setting it equal to , which proves that is true.
If we combine these two bullet points, we have proved that “ holds for all ”.
240.
(a) 5 cents cannot be made; ; ; ; ; etc.
Guess: Every amount > N = 5 can be paid directly (without receiving change).
(b) Let be the statement:
“n cents can be paid directly (without change) using 3 cent and 4 cent coins”.
– is the statement: “6 cents can be paid directly”. And , so is true.
– Now suppose that we know is true for some . We must show that must then be true.
To prove we consider the statement :
“ cents can be paid directly”.
We know that is true, so we know that “k cents can be paid directly”.
– If a direct method of paying k cents involves at least one 3 cent coin, then we can replace one 3 cent coin by a 4 cent coin to produce a way of paying cents.
Hence we only need to worry about a situation in which the only way to pay k cents directly involves no 3 cent coins at all – that is, paying k cents uses only 4 cent coins. But then there must be at least two 4 cent coins (since ), and we can replace two 4 cent coins by three 3 cent coins to produce a way of paying cents directly.
Hence
• is true; and
• whenever is true for some , we know that is also true.
is true for all .
(a) , so (these are the two primitive sixth roots of unity).
(the two primitive cube toots of unity), and
(b) , so (repeated root). and .
(c) (d) , so .
As soon as one starts calculating z2 and , it becomes clear that it is time to p-a-u-s-e and think.
so whenever is an integer,
is also an integer.
(e) Let be the statement:
“if z has the property that is an integer, then is also an integer for all m, ”.
• and are clearly both true; and was proved in part (d) above.
• Suppose that is true for some particular (unspecified) ; that is, we know that, for this particular k:
“if z has the property that is an integer, then is also an integer for all m, ”.
We wish to prove that must then be true.
If is an integer, then, by ,
“ is also an integer for all m, ”.
So to prove that holds, we only need to show that
“ is an integer”.
By the Binomial Theorem:
The LHS is an integer (since is an integer), and (by ) every term on the RHS is an integer except possibly the first. Hence the first term is the difference of two integers, so must also be an integer.
Hence is true.
If we combine these two bullet points, we have proved that “ holds for all ”. QED
Note: If is even, the expansion of has an odd number of terms, so the RHS of the above re-grouped expansion ends with the term , which is also an integer.
242.
Note: In the solution to Problem 241. we included the condition on z as part of the statement .
In Problem 242. the result to be proved has a similar background hypothesis – “Let p be a prime number”. It may make the induction clearer if, as in the statement of the Problem, this hypothesis is stated before starting the induction proof.
Let p be any prime number. We let be the statement:
“ is divisible by p ”.
• is true (since , which is divisible by p).
• Suppose that is true for some particular (unspecified) ; that is, we know that, for this particular k:
“ is divisible by p”.
We wish to prove – that is,
“ is divisible by p”
must then be true. Using the Binomial Theorem again we see that
By , the first bracket on the RHS is divisible by p; and in each of the other terms each of the binomial coefficients , ,
• is an integer, and
• has a factor “p” in the numerator and no such factor in the denominator.
Hence each term in the second bracket is a multiple of p. So the RHS (and hence the LHS) is divisible by p.
Hence is true.
If we combine these two bullet points, we have proved that “ holds for all ”. QED
243.
This is a geometric series with first term and common ratio , and so has sum
244.
(a) Each person receives in total:
(here ).
(b) You have
(here , ); I have
(here , ).
245. The trains are 120 km apart, and the fly travels at 50 km/h towards Train B, which is initially 120 km away and travelling at 30 km/h.
The relative speed of the fly and Train B is 80 km/h, so it takes hours before they meet. In this time Train A and Train B have each travelled 45 km, so they are now 30 km apart. The fly then turns right round and flies back to Train A.
The relative speed of the fly and Train A is then also 80 km/h, so it takes just hours (or 22.5 minutes) for the fly to return to Train A. Train A and Train B have each travelled km in this time, so they are now km apart. The fly then turns round and flies straight back to Train B.
Train B is km away and the relative speed of the fly and Train B is again 80 km/h, so the journey takes hours (or 5.625 minutes).
Continuing in this way, we see that the fly takes
Hence the fly travels 100 km before being squashed.Note: The two trains are approaching each other at 60 km/h, so they crash in exactly 2 hours – during which time the fly travels 100 km.
246.
(a) (i) ; therefore
so
(ii) ; therefore
so
(b) The argument in part (a) gives an upper bound for each bracketed expression in the sum
Replacing each bracket by its upper bound, we see that the sum is
(c) The finite partial sums
• increase steadily as we take more and more terms, and
• every partial sum is less than . It is clear that these partial sums form a sequence
It follows that there is some (unknown) number to which the partial sums converge as , and we take this (unknown) exact value S to be the exact value of the endless sum
To see, for example, that , notice that
Note 1: The claim that
“an increasing sequence of partial sums , all less than 2, must converge to some number ”
is a fundamental property of the real numbers – called completeness.
Note 2: Just as one can obtain better and better lower bounds for S – like “”, so one can improve the upper bound “”. For example, if in part (b) we avoid replacing the third term by , we get a better upper bound “”, which is less than 2.
247.
(a) Let be the statement:
“”.
• Then LHS of , and RHS of . Hence is true.
• Suppose we know that is true for some . We want to prove that holds.
Hence holds.
holds; and whenever is known to be true, is also true.
is true, for all . QED
(b) Let be the statement:
“”.
• Then
and RHS of . Hence is true.
• Suppose we know that is true for some . We want to prove that holds.
Hence holds.
holds; and whenever is known to be true, is also true.
is true, for all .
(a) (i) whenever .
for all .
for all .
(ii) As we go on adding more and more terms, each finite sum
is bigger than the previous sum. Since every finite sum
the sums increase, but never reach 3, so they accumulate closer and closer to a value “e” . Moreover, this value “e” is certainly larger than the sum of the first two terms , so .
(b) (i) Let be the statement:
“”.
• . Hence is true.
• Suppose we know that is true for some . We want to prove that holds.
Hence holds.
holds; and whenever is known to be true, is also true.
is true, for all .
(ii) [The reasoning here uses the constant “3” while ignoring the refinement “”, and so sounds exactly like part (a)(ii).] As we add more terms, each finite sum
is bigger than the previous sum. Since every finite sum
the partial sums increase, but never reach 3; so they accumulate closer and closer to a value “e” . Moreover, this value “e” is certainly larger than the sum of the first three terms , so .
(c) Note: Examine carefully the role played by the number “3” in the above induction proof in (b)(ii). It is needed precisely to validate the statement : since ”. But the number “3” plays no active part in the second induction step, and could be replaced by any other number we choose.
The exact value “e” of the infinite series is not really affected by what happens when . Suppose we ask: “What number should replace “3” if we only want to prove that
The only part of the induction proof where then matters is when we try to check that holds; so we must choose the smallest possible to satisfy
that is, .
(i) Let be the statement:
“”.
• LHS of ; RHS of . Hence is true.
• Suppose we know that is true for some . We want to prove that holds.
Hence holds.
holds; and whenever is known to be true, is also true.
is true, for all . QED
(ii) As we add more terms, each finite sum
is bigger than the previous sum.
Since every finite sum
the finite sums increase, but never reach 2.75, so they accumulate closer and closer to a value “e” . Moreover, this value “e” is certainly larger than the sum of the first four terms
so .
(d) (i) Let be the statement:
“”.
• ; . Hence is true.
• Suppose we know that is true for some . We want to prove that holds.
Hence holds.
holds; and whenever is known to be true, is also true.
is true, for all .
(ii) As we add more terms, each finite sum
is bigger than the previous sum.
Since every finite sum
the finite sums increase, but never reach (for ever), so they accumulate closer and closer to a value “e” (for ever). Moreover, this value “e” is certainly larger than the sum of the first five terms
so .
Note: This process of refinement can continue indefinitely. But we only have to go one further step to pin down the value of “e” with surprising accuracy.
The next step uses the same proof to show that
“, for all ”,
and to conclude that the endless sum
has a definite value “e” that lies somewhere between and .
We could then repeat the same proof to show that
and use the lower bound from the first seven terms to conclude that the endless sum
has a definite value “e” that lies somewhere between and (for ever). And so on.
249. Let be the statement:
“”.
• LHS of RHS of . Hence is true.
• Suppose we know that is true for some . We want to prove that holds.
Hence holds.
holds; and whenever is known to be true, is also true.
is true, for all . QED
250. Let a, b be real numbers such that , and .
One of a, b is then the greater, and we may suppose this is a – so that . If , then for all n; if , then implies that , so for all n.
Let be the statement:
“”.
• LHS of . Hence is true.
• Suppose we know that is true for some . We want to prove that holds.
Hence holds.
holds; and whenever is known to be true, is also true.
is true, for all .
251. Let x be any real number .
If , then , for all .
Thus we may assume that , so .
Let be the statement: “ ”.
• LHS of . Hence is true.
• Suppose we know that is true for some . We want to prove that holds.
Hence holds.
holds; and whenever is known to be true, is also true.
is true, for all .
252. The problem is discussed after the statement of Problem 252. in the main text.
253.
(a) (i) , so . .
(ii) ; hence are all . .
(iii) Let be the statement:
“ ”.Then
• is true by (i), since
• Suppose that is true for some .
The first bracket is (by ); and each of the terms in the second bracket is , so the whole bracket is .
Hence the LHS of , so is true.
Hence is true for all .
(iv) At the very first stage (part (i)) we replaced by . Hence when , we know that the two sides of differ by at least . This difference is greater than when , so (iv) follows.
(b)
(i) , so .
.
(ii) ; hence are all .
.
(iii) Let be the statement:
“”.
Then
• is true by (i), since
• Suppose that is true for some .
LHS of .
The first bracket is (by ); and each of the terms in the second bracket is , so the whole bracket is . Hence the LHS of , so is true.
Hence is true for all .
254.
(a) We use induction. Let be the statement:
“n identical rectangular strips of length 2 balance exactly on the edge of a table if the successive protrusion distances (first beyond the edge of the table, then beyond the leading edge of the strip immediately below, and so on) are the terms of the finite harmonic series in reverse order.”
• When , a single strip which protrudes distance 1 beyond the edge of the table has its centre of gravity exactly over the edge of the table. Hence is true.
• Suppose that we know that is true for some .
Let identical strips be arranged as described in the statement .
The statement guarantees that the top k strips would exactly balance if the leading edge of the bottom strip were in fact the edge of the table; hence the combined centre of gravity of the top k strips is positioned exactly over the leading edge of the bottom strip.
Let M be the mass of each strip; since the leading edge of the bottom strip is distance beyond the edge of the table, the top k strips produce a combined moment about the edge of the table equal to .
The centre of gravity of the bottom strip is distance from the edge of the table in the opposite direction; hence it contributes a moment about the edge of the table equal to .
the total moment of the whole stack about the edge of the table is equal to zero, so the centre of gravity of the combined stack of strips lies exactly over the edge of the table. Hence is true.
Hence is true for all .
(b) Problem 253.(b)(iii) now guarantees that a stack of strips can protrude a distance beyond the edge of the table.
Note: Ivars Petersen's Mathematical Tourist blog contains an entry in 2009
which explores how one can obtain large overhangs with fewer strips if one is allowed to use strips to counterbalance those that protrude beyond the edge of the table.
255.
(a) (i) Let be the statement:
“ for all such that ”.
• is true (since is the empty sum, so
• Suppose that is true for some . Then most of the inequalities in the statement are part of the statement ; the only outstanding inequalities which remain to be proved are:
which are true, since
and
Hence is true.
Hence is true for all .
(ii) The “even sums” are increasing, but all are less than , so they approach some value .
The “odd sums” are decreasing, but all are greater than , so they approach some value .
The “even sums” are increasing, but all are less than , so they approach some value .
The “odd sums” are decreasing, but all are greater than , so they approach some value .
Moreover, the difference between successive sums is , and this tends towards zero, so the difference between each “odd sum” and the next “even sum” tends to zero, so .
(b) The proof from part (a) carries over word for word, with “” replaced at each stage by “”.
256. Let be the statement:
“ has at least k distinct prime factors”.
• has exactly 1 prime factor, so is true.
• Suppose that is true for some .
Then . The first factor has at least k distinct prime factors.
And the second factor , so has at least one prime factor.
Moreover , so the second bracket has no factor in common with .
Hence has at least distinct prime factors, so is true.
Hence is true for all .
Note: This problem [suggested by Serkan Dogan] gives a different proof of the result (Problem 76.(d)) that the list of prime numbers goes on for ever.
257.
(a) Let be the statement: “n distinct points on a straight line divide the line into intervals”.
• 0 points leave the line in pristine condition – namely a single interval – so is true.
• Suppose that is true for some .
Consider an arbitrary straight line divided by points .
Then the k points divide the line into intervals (by ).
The additional point is distinct from , …, and so must lie inside one of these intervals, and divides it in two – giving intervals altogether.
Hence is true.
Hence is true for all .
(b) (i) We want a function R satisfying
If we notice that in part (a) then we might guess that
(ii) Let be the statement:
“n distinct straight lines in the plane divide the plane into at most regions”.
• 0 lines leave the plane in pristine condition – namely a single region – so is true, provided that
• Suppose that is true for some .
Consider the plane divided by straight lines .
Then the k lines divide the plane into at most
regions (by ).
The additional line is distinct from and so meets each of these lines in at most a single point – giving at most k points on the line . These points divide into at most intervals, and each of these intervals corresponds to a cut-line, where the line cuts one of the regions created by the lines into two pieces – giving at most
regions altogether.
Hence is true.
Hence is true for all .
(c) (i) We want a function S satisfying
After thinking about the differences between successive terms in part (b), we might guess that
(ii) Let be the statement:
“n distinct planes in 3-space divide space into at most
regions”.
• 0 planes leave our 3D space in pristine condition – namely a single region – so is true – provided that
• Suppose that is true for some .
Consider 3D divided by planes .
Then the k planes divide 3D into at most
regions (by ).
The additional plane is distinct from and so meets each of these planes in (at most) a line – giving rise to at most k lines on the plane . This arrangement of lines on the plane divides into at most
regions, and each of these regions on the plane is the “cut” where the plane cuts an existing region into two pieces – giving rise to at most
regions altogether. (There is no need for any algebra here: one only needs to use the Pascal triangle condition.)
Hence is true whenever is true.
Hence is true for all .
258. Notice that, given a dissection of a square into k squares, we can always cut one square into four quarters (by lines through the centre, and parallel to the sides), and so create a dissection with squares.
Let be the statement:
“Any given square can be cut into m (not necessarily congruent) smaller squares, for each m, ”.
• Let . Given any square of side s (say). We may cut a square of side from one corner, leaving an L-shaped strip of width , which we can then cut into 5 smaller squares, each of side . Hence is true.
Let . Given any square, we can divide the square first into four quarters; then divide one of these smaller squares into four quarters to obtain a dissection into 7 smaller squares. Hence is true.
Let . Given a square of side s (say). We may cut a square of side from one corner, leaving an L-shaped strip of width , which we can then cut into 7 smaller squares, each of side . Hence is true.
• Suppose that is true for some .
Then , so any given square can be dissected into smaller squares (by ). Taking this dissection and dividing one of the smaller squares into four quarters gives a dissection of the initial square into squares. Hence is true.
Hence is true for all .
259. Let be the statement:
“Any tree with n vertices has exactly edges”.
• A tree with 1 vertex is simply a vertex with 0 edges (since any edge would have to be a loop, and would then create a cycle). Hence is true.
• Suppose that is true for some .
Consider an arbitrary tree T with vertices.
[Idea: We need to find some way of reducing T to a tree with k vertices. This suggests “removing an end vertex”. So we must first prove that “any tree T has an end vertex”.]
Definition The number of edges incident with a given vertex v is called the valency of v.
Lemma Let S be a finite tree with vertices. Then S has a vertex of valency 1 – that is an “end vertex”.
Proof Choose any vertex . Then must be connected to the rest of the tree, so has valency at least 1.
If is an “end vertex”, then stop; if not, then choose a vertex which is adjacent to .
If is an “end vertex”, then stop; if not, choose a vertex which is adjacent to .
If is an “end vertex”, then stop; if not, choose a vertex which is adjacent to .
Continue in this way.
All of the vertices must be different (since any repeat with would define a cycle
in the tree S). Since we know that the tree is finite (having precisely s vertices), the process must terminate at some stage. The final vertex is then an “end vertex”, of valency 1.
If we apply the Lemma to our arbitrary tree T with vertices, we can choose an “end vertex” v and remove both it and the edge e incident with it to obtain a tree having k vertices. By we know that has exactly edges, so when we reinstate the edge e, we see that T has exactly edges, so is true.
Hence is true for all .
260.
Note: All the polyhedra described in this solution are “spherical” by virtue of having their vertices located on the unit sphere.
(a) (i) A regular tetrahedron.
(ii) A square based pyramid with its apex at the North pole.
(b) If a spherical polyhedron has V vertices, E edges, and F faces, then
Now each edge has exactly two end vertices, so counts the exact number of ordered pairs (v, e), where e is an edge, and v is a vertex “incident with e”.
On the other hand, in a spherical polyhedron, each vertex v has valency at least 3; so each vertex v occurs in at least 3 pairs (v, e) of this kind. Hence .
In the same way, each edge e lies on the boundary of exactly 2 faces, so counts the exact number of ordered pairs (f, e), where e is an edge of the face f.
On the other hand, in a spherical polyhedron, each face f has at least 3 edges; so each face f occurs in at least 3 pairs (f, e) of this kind. Hence .
If , then , and ; now V and F are integers, so and . Hence . However . This contradiction shows that no such polyhedron exists.
(c) We show by induction how to construct certain “spherical” polyhedra, with at most one non-triangular face. Let be the statement:
“There exists a spherical polyhedron with at most one non-triangular face, and with e edges for each ”.
• We know that there exists a such a spherical polyhedron with edges – namely the regular tetrahedron (with four faces, which are all equilateral triangles).
We know there is no such polyhedron with edges (by part (b)).
When , there is no spherical polyhedron with edges and all faces triangular (since we would then have , as in part (b)). However, there exists a spherical polyhedron with edges and just one non-triangular face – namely the square based pyramid with its apex at the North pole.
When , we can join three points on the equator to the North and South poles to produce a triangular bi-pyramid (the dual of a triangular prism), with all faces triangular, and with edges.
When , there is no spherical polyhedron with edges and with all faces triangles (since we would then have to have , as in part (b)); but there exists a spherical polyhedron with edges and just one face which is not an equilateral triangle – namely the pentagonal based pyramid with its apex at the North pole.
This provides us with a starting point for the inductive construction. In particular , , and are all true.
• Suppose that is true for some . The only part of the statement that remains to be demonstrated is the existence of a suitable polyhedron with edges.
Since , we know that , so (by ) there exists a polyhedron with all its vertices on the unit sphere, with at most one non-triangular face, and with edges. Take this polyhedron and remove a triangular face . Now add a new vertex X on the sphere, internal to the spherical triangle , and add the edges , , and the three triangular faces , , , to produce a spherical polyhedron with edges, and with at most one non-triangular face. Hence is true.
Hence is true for all .
261. To prove a result that is given in the form of an “if and only if” statement, we have to prove two things: “if”, and “only if”.
We begin by proving the “only if” part:
“a map can be properly coloured with two colours only if every vertex has even valency”.
Let M be a map that can be properly coloured with two colours. Let v be any vertex of the map M.
The edges incident with v form parts of the boundaries of the sequence of regions around the vertex v (with , bordering one region; , bordering the next; and so on). Since we are assuming that the regions of the map M can be “properly coloured” with two colours, the succession of regions around the vertex v can be properly coloured with just two colours. Hence the colours of the regions around the vertex v must alternate (say black-white-black- …). And since the map is finite, this sequence must return to the start – so the number of such regions at the vertex v (and hence the number of edges incident with v – that is, the valency of v) must be even.
We now prove the “if” part:
“a map can be properly coloured with two colours if every vertex has even valency”.
Suppose that we have a map M in which each vertex has even valency. We must prove that any such map M can be properly coloured using just two colours.
Let be the statement:
“any map with m edges, in which each vertex has even valency, can be properly coloured with two colours whenever m satisfies ,”.
• If , a map in which every vertex has even valency, and which has just one edge , must consist of a single vertex v, with as a loop from to (so has valency 2, since the edge is incident with twice). This creates a map with two regions – the “island” inside the loop, and the “sea” outside; so we can colour the “island” black and the “sea” white. Hence is true.
• Suppose that is true for some
Most of the contents of the statement are already guaranteed by . To prove that is true, all that remains to be proved is that
any map with exactly edges, in which every vertex has even valency, can be properly coloured using just two colours.
Consider an arbitrary map M with edges, in which each vertex has even valency.
[Idea: We need to find some way of reducing the map M to a map with edges, in which every vertex still has even valency.]
Since , the map M has at least 2 edges. Choose any edge e of M, with (say) vertices , as its endpoints, and with regions R, S on either side of e.
Suppose first that , so the boundary of the region R (say) consists only of the edge e. Hence e is a loop, and S is the only region neighbouring R. The edge e contributes 2 to the valency of ; so if we delete the edge e, we obtain a map in which every vertex again has even valency, in which the regions R and S have been amalgamated into a region . Since has just k edges, can be properly coloured with just two colours. If we now reinstate the edge e and the region R, we can give S the same colour as (in the proper colouring of ) and give R the opposite colour to to obtain a proper colouring of the map M with just two colours.
Hence we may assume that , so that e is not the complete boundary of R. We may then slowly shrink the edge e to a point – eventually fusing the old vertices , together to form a new vertex , where two new regions , meet. The result is then a new map , in which all other vertices are unchanged (and so have even valency), and in which
which is also even.
Hence every vertex of the new map has even valency. Moreover, has at most k edges, so (by ) we know that the map can be properly coloured with just two colours. And in this colouring of , there are an odd number of colour changes as one goes from to through the other regions that meet around the old vertex of M, so receives the opposite colour to . The guaranteed proper two-colouring of therefore extends back to give a proper two-colouring of the original map M. Hence is true.
Hence is true for all .
262. Let be the statement:
• When , the required cycle is obvious: So is true.“The sequences of length n consisting of 0s and 1s can be arranged in a cyclic list such that any two neighbouring sequences (including the last and the first) differ in exactly one coordinate position.”
• The general construction is perhaps best illustrated by first showing how leads to .
The above cycle for sequences of length 2 gives rise to two disjoint cycles for sequences of length 3:
– first by adding a third coordinate “0”:
– then by adding a third coordinate “1”:
Now eliminate the final join in each cycle ( and ) and instead link the two cycles together by first reversing the order of the first cycle, and then inserting the joins and to form a single cycle.
In general, suppose that is true for some . Then we construct a single cycle for the sequences of length as follows:
Take the cycle of the sequences of length k guaranteed by , and form two disjoint cycles of length
• first by adding a final coordinate “0“
• then by adding a final coordinate “1”.
Then link the two cycles into a single cycle of length , by eliminating the final step
in the first cycle, andin the second cycle, reversing the first cycle, and inserting the joins
to produce a single cycle of the required kind joining all sequences of length . Hence is true.
Hence is true for all .
Note: The significance of what we call Gray codes was highlighted in a 1953 patent by the engineer Frank Gray (1887-1969) – where they were called reflected binary codes (since the crucial step in their construction above involves taking two copies of the previous cycles, reversing one of the cycles, and then producing half of the required cycle by traversing the first copy before returning backwards along the second copy). Their most basic use is to re-encode the usual binary counting sequence
where a single step can lead to the need to change arbitrarily many binary digits (e.g. the step from ” to ” changes 3 digits, and the step from ” to “1000” changes 4 digits, etc.) – a requirement that is inefficient in terms of electronic “switching”, and which increases the probability of errors. In contrast, the Gray code sequence changes a single binary digit at each step. However, the physical energy which is saved through reducing the amount of “switching” in the circuitry corresponds to an increase in the need for unfamiliar mathematical formulae, which re-interpret each vector in the Gray code as the ordinary integer in the counting sequence to which it corresponds.
263.
(a) The whole construction is inductive (each label derives from an earlier label). So let be the statement:
“if and , then the positive rational occurs once and only once as a label, and it occurs in its lowest terms”.
• By construction the root is given the label , so occurs. And it cannot occur again, since the numerator and denominator of each parent vertex are both positive, neither i nor j can ever be 0. Hence is true.
Notice that the basic construction:
“if is a `parent' vertex, then we label its `left descendant' as , and its `right descendant' ”
guarantees that, since we start by labelling the root with the positive rational , all subsequent `descendants' are positive.
Moreover, if any `descendant' were suddenly to appear not “in lowest terms”, then either
– , in which case , so at the previous stage; or
– , in which case , so at the previous stage.
Since we begin by labelling the root , where , it follows that all subsequent labels are positive rationals in lowest terms.
• Suppose that is true from some .
Most parts of the statement are guaranteed by . To show that is true, it remains to consider cases where and (). Either (i) , or (ii) .
(i) Suppose that . Then arises in this (fully cancelled) form only as a direct (right) descendant of . So occurs. Moreover, every label occurs in its lowest terms, so cannot occur again.
(ii) Suppose that . Then arises in this (fully cancelled) form only as a direct (left) descendant of . So occurs. Moreover, every label occurs in its lowest terms, so cannot occur again.
Hence is true.
Hence is true for all .
(b) The fact that the labels are left-right symmetric is also an inductive phenomenon. We note that the one fully “left-right symmetric” label, namely , occurs in the only fully “left-right symmetric” position – namely at the root.
All other labels occur in reciprocal pairs: and , where we may assume that . The fact that these occur as labels of “left-right symmetric” vertices derives from the fact that
is the `right descendant' of and
is the `left descendant' of .
So if we know that the earlier reciprocal pair reciprocal pair and occur as labels of symmetrically positioned vertices, then it follows that the same is true of the descendant reciprocal pair and . We leave the reader to write out the proof by induction – for example, using the statement
“: if , and , then the reciprocal pair , occur as labels of vertices at the same level below the root, with the two labelled vertices being mirror images of each other about the vertical mirror through the root vertex.”
264. The intervals in this problem may be of any kind (including finite or infinite). Each interval has two “endpoints”, which are either ordinary real numbers, or (signifying that the interval goes off to infinity in one or both directions).
Let be the statement:
“if a collection of n intervals on the x-axis has the property that any two intervals overlap in an interval (of possibly zero length – i.e. a point), then the intersection of all intervals in the collection is a non-empty interval”.
When , the hypothesis of is the same as the conclusion. So is true.
Suppose that is true for some . We seek to prove that is true.
So consider a collection of intervals with the property that any two intervals in the collection intersect in a non-empty interval. If this collection includes one interval that is listed more than once, then the required conclusion follows from . So we may assume that the intervals in our collection are all different.
Among the intervals, consider first those with the largest right hand endpoint. If there is only one such interval, denote it by ; if there is more than one interval with the same largest right hand endpoint, let be the interval among those with the largest right hand endpoint that has the largest left hand endpoint. In either case, put aside for the moment, leaving a collection S of k intervals with the required property.
By we know that the intervals in the collection S intersect in a non-empty interval I, with left hand endpoint a and right hand endpoint b (say).
We have to show that the intersection is non-empty.
The proof that follows works if the endpoint b is included in the interval I. The slight adjustment needed if b is not included in the interval I is left to the reader.
Since the right hand endpoint of is the largest possible, and since points between a and b belong to all the intervals of S, we can be sure that the right hand endpoint of is .
Moreover, for each point , we know that there must be some interval in the collection S which does not stretch as far to the right as x. Since, by hypothesis, the intersection is non-empty, the left hand endpoint of lies to the left of every such point x, so must overlap the interval I, whence it follows that is a non-empty interval as required.
Hence is true.
Hence is true for all .
265. If one experiments a little, it should become clear
• that if tank contains more than tank , then linking tank T to leads to a better immediate outcome (i.e. a larger amount in tank T) than linking T to ;
• that if, at some stage in the linking sequence, tank T contains an interim amount of a litres, and is about to link successively to a tank containing b litres, and then to the tank containing c litres, this ordered pair of changes alters the amount in the tank T to litres; so for a better final outcome we should always choose the sequence so that ;
• once the tap linking tank T to another tank has been opened, so that the two levels become equal, there is no benefit from opening the tap linking these two tanks ever again, so tank T should be linked with each other tank at most once.
These three observations essentially determine the answer – namely that tank T should be joined to the other tanks in increasing order of their initial contents.
For a proof by induction, let be the statement:
“given n tanks containing litres respectively, where
if T is the tank containing the smallest amount litres, then the optimal sequence for linking the other tanks to tank T (optimal in the sense that it transfers the maximum amount of water to tank T) is the sequence that links T successively to the other tanks in increasing order of their initial contents”.
• When , there is only one possible sequence, which is the one described, so is true.
• Suppose that is true for some , and consider an unknown collection of tanks containing litres respectively, where
and where T is the tank which initially contains litres.
Suppose that in the optimal sequence of k successive joins to the other k tanks (that is, that transfers the largest possible amount of water to tank T), the succession of joins is to join T first to tank , then to tank , and so on up to tank (where tank is not necessarily the tank containing litres). There are now two possibilities: either
(i) is the tank containing litres, or
(ii) contains less than litres.
(i) Suppose tank is the tank containing litres. Then the first joins involve the k tanks containing litres. And we know (by the very first bullet point above) that, in order to maximize the final amount in tank T, the amount in tank T after linking it to tank must be “as large as it could possibly be”. Hence, by , the first joins link T successively to the other tanks in increasing order of their contents – before finally linking to the tank containing litres. Hence the conclusion of holds.
(ii) Now suppose that tank contains litres.
By the very first bullet point, in order to guarantee the optimal overall outcome of the final linking with tank the amount in tank T after it has been linked to tank must be “as large as it can possibly be” (given the amounts in the tanks ). Hence statement applies to the initial sequence of joins (of T to , then to , and so on up to ), and guarantees that these tanks must be in increasing order of their initial contents. In particular, the last tank in this sequence, , must be the one containing litres. But if we denote by a litres the amount in tank T just before it links with tank (containing litres), then the last two linkings, with and contradict the second bullet point at the start of this solution. Hence case (ii) cannot occur.
Hence is true.
Hence is true for all .
266.
Note: Like all practical problems, this one requires an element of initial “modelling” in order to make the situation amenable to mathematical analysis.
`Residue' clings to surfaces; so the total amount of `residue' will depend on
(a) the viscosity of the chemical (how `thick', or `sticky' it is), and
(b) the total surface area of the inside of the flask.
Since we are given no information about quantities, we may fix the amount of residue remaining in the `empty' flask at “1 unit”, and the amount of solvent in the other flask as “s units”.
If we add the solvent, we get a combined amount of units of solution – which we may assume (after suitable shaking) to be homogeneous, with the chemical concentration reduced to “1 part in ”.
The first modelling challenge arises when we try to make mathematical sense of what remains at each stage after we empty the flask. The internal surface area of the flask, to which any diluted residue may adhere, is fixed. If we make the mistake of thinking of the original chemical as “thick and sticky” and the solvent as “thin”, then the viscosity of the diluted residue will change relative to the original, and will do so in ways that we cannot possibly know. Hence the only reasonable assumption (which may or may not be valid in a particular instance) is to assume that the viscosity of the original chemical is roughly the same as the viscosity of the chemical-solvent mixture. This then suggests that, on emptying the diluted mixture, roughly the same amount (1 unit) of diluted mixture will remain adhering to the walls of the flask. So we will be left with 1 unit of residue, with a concentration of . In particular, if , using all the solvent at once reduces the amount of toxic chemical residue to unit (with the other unit consisting of solvent).
But what if we use only half of the solvent first, empty the flask, and then use the other half? Adding units of solvent (and shaking thoroughly) produces units of homogeneous mixture, with a concentration of 1 part per . When we empty the flask, we expect roughly 1 unit of residue with this concentration – so just units of the chemical, with units of solvent.
If we then add the other units of solvent, this produces units of mixture with a concentration of 1 part per . When we empty the flask, we expect roughly 1 unit of residue with concentration 1 part per . In particular, if , this strategy reduces the amount of toxic chemical in the 1 unit of residue to units. Since this two-stage strategy seems more effective than the previous “all at once” strategy.
Suppose that we use a four-stage strategy – using first one quarter of the solvent, then another quarter, and so on. We then land up with roughly 1 unit of residue with concentration 1 part per . In particular, if , we land up with the amount of toxic chemical in the 1 unit of residue equal to units, and . More generally, if we use th of the solvent, n times, the final amount of toxic chemical in the 1 unit of residue is equal to . And as n gets larger and larger, this expression gets closer and closer to . In particular, when , this strategy leaves a final amount of chemical in the 1 unit of residue approximately equal to .
Note: The situation here is similar to that faced by a washing machine designer, who wishes to remove traces of detergent from items that have been washed, without using unlimited amounts of water. The idea of having a “fixed amount of solvent” corresponds to the goal of “water efficient” rinsing. However, the washing machine cycle, or programme, clearly cannot repeat the rinsing indefinitely (as would be required in the limiting case above).
267.
(i) If , then
Hence m2 is even.
It follows that m must be even.
Note: It is a fact that, if is even, then is also even. But this is completely irrelevant here. In order to conclude that “m must be even”, we have to prove:
Claim m cannot be odd.
Proof Suppose m is odd.
for some integer k.
But then
would be odd, contrary to “m2 must be even”.
Hence m cannot be odd.
(ii) Since m is even, we may write for some integer . Equation (*) in (i) above then becomes , so n2 is even. Hence, as in the Note above, n must be even, so we can write for some integer .
(iii) If , then , and are both even.
.
In the same way, it follows that and are both even, so we may write , for some integers , .
Continuing in this way then produces an endless decreasing sequence of positive denominators
contrary to the fact that such a sequence can have length at most (or in fact, at most ).
268.
(i) If and , then .
If , then (multiplying by ) it follows that , which is false. Hence .
We now know that , so multiplying by gives . Hence . In particular, cannot be written as a fraction with denominator 1, so is true.
(ii) Suppose is true for some . Most of the statement is implied by : all that remains to be proved is that cannot be written as a fraction with denominator .
Suppose , where and m are positive integers.
Then and are both even (as in Problem 267).
So with , contrary to . Hence holds.
is true for all .
269.
(a) Suppose that S is not empty. Then by the Least Element Principle the set S must contain a least element k: that is, a smallest integer k which is not in the set T. Then (since we are told that T contains the integer 1). Hence .
Therefore is a positive integer which is smaller than k. So is not an element of S, and hence must be an element of T. But if is an element of T, then we are told that must also be a member of T. This contradiction shows that S must be empty, so T contains all positive integers.
(b) Suppose that T does not have a smallest element. Clearly 1 does not belong to the set T (or it would be the smallest element of T). Hence 1 must be an element of the set S.
Now suppose that, for some , all positive integers are elements of S, but is not an element of S. Then would be an element of T, and would be the smallest element of T, which is not possible. Hence S has the property that
“whenever is an element of S, we can be sure that is also an element of S.”
The Principle of Mathematical Induction then guarantees that these two observations (that 1 is an element of S, and that whenever k is an element of S, so is ) imply that S contains all the positive integers, so that the set T must be empty – contrary to assumption.
Hence T must have a smallest element.
270.
(i) Triangle is a right angled triangle with . Hence the two base angles (at O and at ) are equal, so the triangle is isosceles: .
(ii) Triangles and are congruent by RHS (since they share the hypotenuse , and have equal sides (). Hence .
(iii) If , we may choose a unit so that has length m units and has length n units. Then
and
(iv) , so . Hence , and .
(v) In the square , the ratio “diagonal: side” . If the ratio , with m, n positive integers, then the construction here replaces the positive integers m, n by smaller positive integers and , with . And the process can be repeated indefinitely to generate an endless sequence of decreasing positive integers
Zarathustra's last, most vital lesson: “Now do without me.”
George Steiner (1929-)