A A A
OPB logo

The Essence of Mathematics Through Elementary Problems
More info and resources at: https://doi.org/10.11647/obp.0168

IV. Algebra

The first rule of intelligent tinkering is to save all the parts.

Paul R. Ehrlich (1932– )

Many important aspects of serious mathematics have their roots in the world of arithmetic. However, when we implement an arithmetical procedure by combining numbers with very different meanings to produce a single numerical output, it becomes almost impossible to see how the separate ingredients contribute to the final answer. In other words, calculating exclusively with numbers contravenes Paul Ehrlich’s “first rule of intelligent tinkering”. This is why in Chapters 1 and 2 we stressed the need to move beyond blind calculation, and to begin to think structurally - even when calculating purely with numbers. Algebra can be seen as a remarkable way of “tinkering with numbers”, so that we not only “keep all the parts”, but manage to keep them separate (by giving them different names), and hence can see clearly what contribution each ingredient variable makes to the final output. To benefit from this feature of algebra, we need to learn to “read” algebraic expressions, and to interpret what they are telling us - in much the same way that we learn to read numbers (so that, where appropriate, 100 is seen as 102, and 10 is seen as 1 + 2 + 3 + 4).

Before algebra proper was invented (around 1600), the ability to extract the general picture lying hidden inside each calculation was restricted to specialists. The ancient Babylonians (1700–1500 BC) described their general procedures as recipes, presented in the context of problems involving particular numbers. But they did this in such a way as to demonstrate convincingly that whoever formulated the procedure had managed to see “the general in the particular”. The ancient Greeks used a geometrical setting to reveal generality, and encoded what we would see as “algebraic” methods in geometrical language. In the 9th century AD, Arabs such as Al-Khwarizmi (c.780–c.850), managed to encapsulate generality using a very limited kind of algebra, without the full symbolical language that would emerge later. The abacists, such as Paolo dell’Abbaco (1282–1374) who featured in Chapter 3, clearly saw that the power and spirit of mathematics was rooted in this generality. But modern algebraic symbolism - in particular, the idea that to express generality we need to use letters to represent not only variables, but also important parameters (such as the coefficients a, b, c in a general quadratic ax2 + bx + c) - had to wait for the inscrutable writings of Viète (1540–1603), and especially for Fermat (1601–1665) and Descartes (1596–1650) who simplified and extended Viète’s ideas in the 1630s.

Within a generation, the huge potential of this systematic use of symbols was revealed by the triumphs of Newton (1642–1727), Leibniz (1646–1716), and others in the years before 1700. Later, the refinements proposed by Euler (1707–1783) in his many writings throughout the 18th century, made this new language and its discoveries accessible to us all - much as Stevin’s (1548–1620) version of place value for numbers made calculation accessible to Everyman.

Our coverage of algebra is necessarily selective. We focus on a few ideas that are needed in what follows, and which should ideally be familiar - but with an emphasis that may be less familiar. When working algebraically, the key mathematical messages are mostly implicit in the manipulations themselves. Hence many of the additional comments in this chapter are to be found as part of the solutions, rather than within the main text.

4.1. Simultaneous linear equations and symmetry

Problem 92 Dad took our new baby to the clinic to be weighed. But the baby would not stay still and caused the needle on the scales to wobble. So Dad held the baby still and stood on the scales, while nurse read off their combined weight: 78kg. Then nurse held the baby, while Dad read off their combined weight: 69kg. Finally Dad held the nurse, while the baby read off their combined weight: 137kg. How heavy was the baby?

The situation described in Problem 92 is representative of a whole class of problems, where the given information incorporates a certain symmetry, which the solver would be wise to respect. Hence one should hesitate before applying systematic brute force (as when using the information from one weighing to substitute for one of the three unknown weights – a move which effectively reduces the number of unknowns, but which fails to respect the symmetry in the data).

A similar situation arises in certain puzzles like the following.

Problem 93 Numbers are assigned (secretly) to the vertices of a polygon. Each edge of the polygon is then labelled with the sum of the numbers at its two end vertices.

(a) If the polygon is a triangle ABC, and the labels on the three sides are c (on AB), b (on AC), and a (on BC), what were the numbers written at each of the three vertices?

(b) If the polygon is a quadrilateral ABCD, and the labels on the four sides are w (on AB), x (on BC), y (on CD), and z (on DA), what numbers were written at each of the four vertices?

(c) the polygon is a pentagon ABCDE, and the labels on the five sides are d (on AB), e (on BC), a (on CD), b (on DE), and c (on EA), what numbers were written at each of the five vertices?

In case any reader is inclined to dismiss such problems as “artificial puzzles”, it may help to recall two familiar instances (Problems 94 and 96) which give rise to precisely the above situation.

Problem 94 In the triangle ABC with sides of lengths a (opposite A), b (opposite B), and c (opposite C), we want to locate the three points where the incircle touches the three sides - at point P (on BC), Q (on CA), and R (on AB). To this end, let the two tangents to the incircle from A (namely AQ and AR) have length x, the two tangents from B (namely BP and BR) have length y, and the two tangents from C (namely CP and CQ) have length z. Find the values of x, y, z in terms of a, b, c.

The second instance requires us first to review the basic properties of midpoints in terms of vectors.

Problem 95

(a) Write down the coordinates of the midpoint M of the line segment joining Y = (a, b) and Z = (c, d). Justify your answer.

(b) Position a general triangle XYZ so that the vertex X lies at the origin (0,0). Suppose that Y then has coordinates (a, b) and Z has coordinates (c, d). Let M be the midpoint of XY, and N be the midpoint of XZ. Prove the Midpoint Theorem, namely that

MN is parallel to YZ and half its length”.

(c) Given any quadrilateral ABCD, let P be the midpoint of AB, let Q be the midpoint of BC, let R be the midpoint of CD, and let S be the midpoint of DA. Prove that PQRS is always a parallelogram.

Problem 96

(a) Suppose you know the position vectors p, q, r corresponding to the midpoints of the three sides of a triangle. Can you reconstruct the vectors x, y, z corresponding to the three vertices?

(b) Suppose you know the vectors p, q, r, s corresponding to the midpoints of the four sides of a quadrilateral. Can you reconstruct the vectors w, x, y, z corresponding to the four vertices?

(c) Suppose you know the vectors p, q, r, s, t corresponding to the midpoints of the five sides of a pentagon. Can you reconstruct the vectors v, w, x, y, z corresponding to the five vertices?

The previous five problems explore a common structural theme - namely the link between certain sums (or averages) and the original, possibly unknown, data. However this algebraic link was in every case embedded in some practical, or geometrical, context. The next few problems have been stripped of any context, leaving us free to focus on the underlying structure in a purely algebraic, or arithmetical, spirit.

Problem 97 Solve the following systems of simultaneous equations.

(a) (i) x + y = 1, y + z = 2, x + z = 3

(ii) uv = 2, vw = 4, uw = 8

(b) (i) x + y = 2, y + z = 3, x + z = 4

(ii) uv = 6, vw = 10, uw = 15

(iii) uv = 6, vw = 10, uw = 30

(iv) uv = 4, vw = 8, uw = 16

Problem 98 Use what you know about solving two simultaneous linear equations in two unknowns to construct the general positive solution to the system of equations:

u a v b =m, u c v d =n.

Interpret your result in the language of Cramer’s Rule. (Gabriel Cramer (1704–1752)).

Problem 99

(a) For which values b, c does the following system of equations have a unique solution?

x+y+z=3,xy+yz+zx=b, x 2 + y 2 + z 2 =c

(b) For which values a, b, c does the following system of equations have a unique solution?

x+y+z=a,xy+yz+zx=b, x 2 + y 2 + z 2 =c

4.2. Inequalities and modulus

The transition from school to university mathematics is in many ways marked by a shift from simple variables, equations and functions, to conditions and analysis involving inequalities and modulus.

Problem 100 What is | − x| equal to: x or −x? (What if x is negative?)

4.2.1 Geometrical interpretation of modulus, of inequalities, and of modulus inequalities

Problem 101

(a) Mark on the coordinate line all those points x in the interval [0,1) which have the digit “1” immediately after the decimal point in their decimal expansion. What fraction of the interval [0,1) have you marked?

Note: “[0,1)” denotes the set of all points between 0 and 1, together with 0, but not including 1. [0,1] denotes the interval including both endpoints; and (0,1) denotes the interval excluding both endpoints.

(b) Mark on the interval [0,1) all those points x which have the digit “1” in at least one decimal place. What fraction of the interval [0,1) have you marked?

(c) Mark on the interval [0,1) all those points x which have a digit “1” in at least one position of their base 2 expansion. What fraction of the interval [0,1) have you marked?

(d) Mark on the interval [0,1) all those points x which have a digit “1” in at least one position of their base 3 expansion. What fraction of the interval [0,1) have you marked?

Problem 102 Mark on the coordinate line all those points x for which two of the following inequalities are true, and five are false:

x>1,x>2,x>3,x>4,x>5,x>6,x>7.

Problem 103 Mark on the coordinate line all those points x for which

|x5|=3.

Problem 104

(a) Mark on the coordinate line all those points x for which two of the following inequalities are true, and five are false:

|x|>1,|x|>2,|x|>3,|x|>4,|x|>5,|x|>6,|x|>7.

(b) Mark on the coordinate line all those points x for which two of the following inequalities are true, and five are false:

|x1|>1,|x2|>2,|x3|>3,|x4|>4,|x5|>5,|x6|>6,|x7|>7.

Problem 105 Mark on the coordinate line all those points x for which

|x+1|+|x+2|=2.

Problem 106 Find numbers a and b with the property that the set of solutions of the inequality

|xa|<b

consists of the interval (−1, 2).

Problem 107

(a) Mark on the coordinate plane all points (x, y) satisfying the inequality

|xy|<3.

(b) Mark on the coordinate plane all points (x, y) satisfying the inequality

|xy+5|<3.

(c) Mark on the coordinate plane all points (x, y) satisfying the inequality

|xy|<|x+y|.

4.2.2 Inequalities

Problem 108 Suppose real numbers a, b, c, d satisfy a b < c d .

(i) Prove that

a b < ( a b + c d ) 2 < c d .

(ii) If b, d > 0, prove that

a b < a+c b+d < c d .

Problem 109 (Farey series) When the fully cancelled fractions in [0,1] with denominator ≤ n are arranged in increasing order, the result is called the Farey series (or Farey sequence) of order n.

Order 1: 0 1 < 1 1

Order 2: 0 1 < 1 2 < 1 1

Order 3: 0 1 < 1 3 < 1 2 < 2 3 < 1 1

Order 4: 0 1 < 1 4 < 1 3 < 1 2 < 2 3 < 3 4 < 1 1

(a) Write down the full Farey series (or sequence) of order 7.

(b) (i) Imagine the points 0.1, 0.2,0.3,..., 0.9 dividing the interval [0,1] into ten subintervals of length 1 10 . Now insert the eight points corresponding to

1 9 , 2 9 , 3 9 ,, 8 9 .

Into which of the ten subintervals do they fall?

(ii) Imagine the n points

1 n+1 , 2 n+1 , 3 n+1 ,, n n+1

dividing the interval [0,1] into n + 1 subintervals of length 1 n+1 . Now insert the n − 1 points

1 n , 2 n , 3 n ,, n1 n .

Into which of the n + 1 subintervals do they fall?

(iii) In passing from the Farey series of order n to the Farey series of order n + 1, we insert fractions of the form k n+1 between certain pairs of adjacent fractions in the Farey series of order n. If a b < c d are adjacent fractions in the Farey series of order n, prove that, when adding fractions for the Farey series of order n + 1, at most one fraction is inserted between a b and c d .

(c) Note: It is worth struggling to prove the two results in part (c). But do not be surprised if they prove to be elusive – in which case, be prepared to simply use the result in part (c)(ii) to solve part (d).

(i) In the Farey series of order n the first two fractions are 0 1 < 1 n , and the last two fractions are n1 n < 1 1 . Prove that every other adjacent pair of fractions a b < c d in the Farey series of order n satisfies bd > n.

(ii) Let a b < c d be adjacent fractions in the Farey series of order n. Prove (by induction on n) that bcad = 1.

(d) Prove that if

a b < c d < e f

are three successive terms in any Farey series, then

c d = a+e b+f .

Problem 110 Solve the following inequalities.

(a) x+ 1 x <2

(b) x1+ 2 x

(c) x <x+ 1 4

Problem 111

(a) The sum of two positive numbers equals 5. Can their product be equal to 7?

(b) (Arithmetic mean, Geometric mean, Harmonic mean, Quadratic mean) Prove that, if a,b > 0, then

2 [ 1 a + 1 b ] = 2ab a+b ab a+b 2 a 2 + b 2 2 (HMGMAMQM)

Problem 112 The two hundred numbers

1,2,3,4,5,,200

are written on the board. Students take turns to replace two numbers a, b from the current list by their sum divided by 2 . Eventually one number is left on the board. Prove that the final number must be less than 2000.

4.3. Factors, roots, polynomials and surds

Problem 113

(a) (i) Find a prime number which is one less than a square.

(ii) Find another such prime.

(b) (i) Find a prime number which is one more than a square.

(ii) Find another such prime.

(c) (i) Find a prime number which is one less than a cube.

(ii) Find another such prime.

(d) (i) Find a prime number which is one more than a cube.

(ii) Find another such prime.

Problem 114 Factorise x4 + 1 as a product of two quadratic polynomials with real coefficients.

4.3.1 Standard factorisations

The challenge to factorise unfamiliar expressions, may at first leave us floundering. But if we assume that each such problem is solvable with the tools at our disposal, we then have no choice but to fall back on the standard tools we have available (in particular, the standard factorisation of a difference of two squares, in which “cross terms” cancel out). The next problem extends this basic repertoire of standard factorisations.

Problem 115

(a)(i) Factorise a3b3.

(ii) Factorise a4b4 as a product of one linear factor and one factor of degree 3, and as a product of two linear factors and one quadratic factor.

(iii) Factorise anbn as a product of one linear factor and one factor of degree n − 1.

(b)(i) Factorise a3 + b3.

(ii) Factorise a5 + b5 as a product of one linear factor and one factor of degree 4.

(iii) Factorise a2n+1 + b2n+1 as a product of one linear factor and one factor of degree 2n.

Problem 115 develops the ideas that were implicit in Problem 113. The clue lies in Problem 113(a), and in the comment made in the main text in Chapter 1 (after Problem 4 in Chapter 1), which we repeat here:

“The last part [of Problem 113(a)] is included to emphasise a frequently neglected message:

Words and images are part of the way we communicate.
But most of us cannot calculate with words and images.

To make use of mathematics, we must routinely translate words into symbols. So “numbers” need to be represented by symbols, and points in a geometric diagram need to be properly labelled before we can begin to calculate, and to reason, effectively.”

As soon as one reads the words “one less than a square”, one should instinctively translate this into the form “x2 1”. Bells will then begin to ring; for it is impossible to forget the factorisation

x 2 1=(x1)(x+1).

And it follows that:

for a number that factorises in this way to be prime, the smaller factor x − 1 must be equal to 1;
x=2 , so there is only one such prime.

The integer factorisations in Problem 113(c) − namely

3 3 1=2×13, 4 3 1=3×21, 5 3 1=4×31, 6 3 1=5×43,

may help one to remember (or to discover) the related factorisation

x 3 1=(x1)( x 2 +x+1 ).

∴ For a number that factorises in this way to be prime, the smaller factor “x − 1” must be equal to 1;
x = 2, so there is only one such prime.

Problem 113 parts (a) and (c) highlight the completely general factorisation (Problem 115(a)(iii)):

x n 1=(x1)( x n1 + x n2 ++ x 2 +x+1 ).

This family of factorisations also shows that we should think about the factorisation of x2 − 1 as (x1)(x + 1), with the uniform factor (x1) first (rather than as (x + 1)(x − 1)). Similarly, the results of Problem 115 show that we should think of the familiar factorisation of a2b2 as (a − b)(a + b), (not as (a ' b)(a − b), but always with the factor (a − b) first).

The integer factorisations in Problem 113(d) − namely

3 3 +1=4×7, 4 3 +1=5×13, 5 3 +1=6×21, 6 3 +1=7×31, 7 3 +1=8×43,

may help one to remember (or to discover) the related factorisation

x 3 +1=(x+1) x 2 x+1

∴ For such a number to be prime, one of the factors must be equal to 1.

This time one has to be more careful, because the first bracket may not be the “smaller factor” – so there are two cases to consider:

(i) if x + 1 = 1 , then x = 0, and x3 + 1 = 1 is not prime;

(ii) if x2x = 1 then x = 0 or x = 1, so x = 1 and we obtain the prime 2 as the only solution.

The factorisation for x3 + 1 works because “3 is odd”, which allows the alternating +/− signs to end in a “+” as required. Hence Problem 113(d)(iii) highlights the completely general factorisation for odd powers:

x 2n+1 +1=(x+1)( x 2n x 2n1 + x 2n2 + x 2 x+1 )

You probably know that there is no standard factorisation of x2 + 1, or of x4 + 1 (but see Problem 114 above).

Problem 116

(a) Derive a closed formula for the sum of the geometric series

1+r+ r 2 + r 3 ++ r n .

(The meaning of closed formula was discussed in the Note to the solution to Problem 54(b) in Chapter 2.)

(b) Derive a closed formula for the sum of the geometric series

a+ar+a r 2 +a r 3 ++a r n

We started this subsection by looking for prime numbers of the form x2 1. A simple-minded approach to the distribution of prime numbers might look for formulae that generate primes - all the time, or infinitely often, or at least much of the time. In Chapter 1 (Problem 25) you showed that no prime of the form 4k + 3 can be “represented” as a sum of two squares (i.e. in the form “x2 + y2”), and we remarked that every other prime can be so represented in exactly one way. It is true (but not obvious) that roughly half the primes fall into the second category; so it follows that substituting integers for the two variables in the polynomial x2 + y2 produces a prime number infinitely often.

Problem 117 Experiment suggests, and Goldbach (1690–1764) showed in 1752 that no polynomial in one variable, and with integer coefficients, can give prime values for all integer values of the variable. But Euler (1707–1783) was delighted when he discovered the quadratic

f(x)= x 2 +x+41.

Clearly f (0) = 41 is prime. And f (1) = 43 is also prime. What is the first positive integer n for which f (n) is not prime?

Problem 117 should be seen as a particular instance of the question as to whether prime numbers can be captured by a polynomial with integer coefficients, and in particular by a quadratic. The next two problems consider the simplest instances of representing prime numbers by expressions involving exponentials (that is, where the variable is in the exponent).

Problem 118

(a)(i) Suppose an 1 = p is a prime. Prove that a = 2 and that n must itself be prime.

(ii) How many primes are there among the first five such numbers

2 2 1, 2 3 1, 2 5 1, 2 7 1, 2 11 1?

(b) (i) Suppose an + 1 = p is a prime. Prove that either a = 1, or a must be even and that n must then be a power of 2.

(ii) In the simplest case, where a = 2, how many primes are there among the first five such numbers

2 1 +1, 2 2 +1, 2 4 +1, 2 8 +1, 2 16 +1?

Primes of the form 2p − 1 are called Mersenne primes (after Marin Mersenne (1588–1648)). We now know at least fifty such primes (with the exponent p ranging up to around 80 million). Finding new primes is not in itself important, but the search for Mersenne primes has been used as a focus for many new developments in programming, and in number theory.

Primes of the form 2n + 1 are called Fermat primes (after Pierre de Fermat (1601–1665)). The story here is very different. We now refer to the number 2n + 1 with n = 2k as the kth Fermat number fk. You showed in Problem 118 (as Fermat did himself) that f0, f1, f2, f3, f4 are all prime. Fermat then rather rashly claimed that fn is always prime. However, Euler showed (100 years later) that the very next Fermat number f5 fails to be prime. And despite all the power of modern computers, we have still not found another Fermat number that is prime!

4.3.2 Quadratic equations

The general solution of quadratic equations dates back to the ancient Babylonians ( 1700BC ). Our modern understanding depends on two facts:

  • an equation of the form x2 = a where a > 0, has exactly two solutions: x=± a ;
  • any product X . Y is equal to 0 precisely when one of the two factors X, Y is equal to 0.

Problem 119 Solve the following quadratic equations:

(a) x 2 3x+2=0

(b) x 2 1=0

(c) x 2 2x+1=0

(d) x 2 + 2 x1=0

(e) x 2 +x 2 =0

(f) x 2 +1=0

(g) x 2 + 2 x+1=0

Problem 120 Let

p(x)= x 2 + 2 x+1.

Find a polynomial q(x) such that the product p(x)q(x) coefficients.

Problem 121

(a) I am thinking of two numbers, and am willing to tell you their sum s and their product p. Express the following procedure algebraically and explain why it will always determine my two unknown numbers.

Halve the sum s, and square the answer.

Then subtract the product p and take the square root of the result, to get the answer.

Add “the answer” to half the sum and you have one unknown number; subtract “the answer” from half the sum and you have the other unknown number.

(b) I am thinking of the length of one side of a square. All I am willing to tell you are two numbers b and c, where when I add b times the side length to the area I get the answer c. Express the following procedure algebraically and explain why it will always determine the side length of my square.

Take one half of b, square it and add the result to c.

Then take the square root.

Finally subtract half of b from the result.

(c) A regular pentagon ABCDE has sides of length 1.

(i) Prove that the diagonal AC is parallel to the side ED.

(ii) If AC and BD meet at X, explain why AXDE is a rhombus.

(iii) Prove that triangles ADX and CBX are similar.

(iv) If AC has length x, set up an equation and find the exact value of x.

Problem 121(a), (b) link to Problem 111(a) (and to Problem 129 below), in relating the roots and the coefficients of a quadratic. If we forget for the moment that the coefficients are usually known, while the roots are unknown, then we see that if α and β are the roots of the quadratic

x 2 +bx+c,

then

(xα)(xβ)= x 2 +bx+c,

so

α+β=bandαβ=c.

In other words, the two coefficients b, c are equal to the two simplest symmetric expressions in the two roots α and β. Part (a) of the next problem is meant to suggest that all other symmetric expressions in α and β (that is, any expression that is unchanged if we swap α and β) can then be written in terms of b and c. The full result proving this fact is generally attributed to Isaac Newton (1642–1727). Part (b) suggests that, provided one is willing to allow case distinctions, something similar may be true of anti-symmetric expressions (where the effect of swapping a and ft is to multiply the expression by “−1”).

Problem 122 Let α and β be the roots of the quadratic equation

x 2 +bx+c=0.

(a) (i) Write α2 + β2 in terms of b and c only.

(ii) Write α2β + β2α in terms of b and c only.

(iii) Write α3 + β3 − 3αβ in terms of b and c only.

(b) (i) Write αβ in terms of b and c only.

(ii) Write α2ββ2α in terms of b and c only.

(iii) Write α3β3 in terms of b and c only.

Problem 123 (Nested surds, simplification of surds)

(a)(i) For any positive real numbers a, b, prove that a + b = a+b+ 4ab

(ii) Simplify 5+ 24 .

(b) (i) Find a similar formula for a b .

(ii) Simplify 5 16 and 6 20 .

Problem 124 (Integer polynomials with a given root) We know that α = 1 is a root of the polynomial equation x2 − 1 = 0; that α= 2 is a root of x2 − 2 = 0; and that α= 3 is a root of x2 − 3 = 0.

(a) Find a quadratic polynomial with integer coefficients which has

α=1+ 2

as a root.

(b) Find a quadratic polynomial with integer coefficients which has

α=1+ 3

as a root.

(c) Find a polynomial with integer coefficients which has

α= 2 + 3

as a root. What are the other roots of this polynomial?

(d) Find a polynomial with integer coefficients which has

α= 2 + 1 3

as a root. What are the other roots of this polynomial?

Problem 125

(a) Prove that the number 2 + 3 is irrational.

(b) Prove that the number 2 + 3 + 5 is irrational.

Problem 126 (Polynomial long division) Find

(i) the quotient and the remainder when we divide x10 + 1 by x3 − 1

(ii) the remainder when we divide x2013 + 1 by x2 − 1

(iii) the quotient and the remainder when we divide xm + 1 by xn − 1, for m>n1 .

Problem 127 Find the remainder when we divide x2013 + 1 by x2 + x + 1.

4.4. Complex numbers

Up to this point, the chapter and solutions have largely avoided mentioning complex numbers. However, the present chapter would be incomplete were we not to interpret some of the earlier material in terms of complex numbers. Readers who have already met complex numbers will probably still find much in the next two sections that is new. Those for whom complex numbers are as yet unfamiliar should muddle through as best they can, and may then be motivated to learn more in due course.

We already know that the square x2 of any real number x is ⩾ 0.

  • If a = 0, then the equation x2 = a has exactly one root, namely x = 0;
  • if a > 0, then the equation x2 = a has exactly two roots – namely ± a , where a denotes the root that is positive;
  • if a < 0, then the equation x2 = a has no real roots.

And that is where the matter would have rested.

From a modern perspective, we can see that complex numbers are implicit in the formula for the roots of a quadratic equation: complex numbers become explicit as soon as the coefficients of a quadratic ax2 + bx + c give rise to a negative discriminant b2 − 4ac < 0.

But this may not have been quite how complex numbers were discovered. Contrary to oft-repeated myths, complex numbers may not have forced themselves on our attention by someone asking about “solutions to the quadratic equation x2 = −1”. As long as we inhabit the domain of real numbers, we can be sure that no known number x could possibly have such a square, so we are unlikely to go in search of it.

New ideas in the history of mathematics tend to emerge when a fresh analysis of familiar entities forces us to consider the possible existence of some previously unsuspected universe. In the time from the ancient world up to the fifteenth century, the idea of “number”, and of calculation, was restricted to the world of real (usually positive) numbers. In such a world, quadratic equations with non-real solutions simply could not arise.

However, in the Brave New World of the Renaissance, where novelty, exploration, and discovery were part of the Zeitgeist, a general method for solving cubic equations was part of the as-yet-undiscovered “wild west” of mathematics, part of the mathematical New World which invited exploration. Notice that this was not a wildly speculative venture (like trying to solve the meaningless equation “x2 = − 1”), since a cubic polynomial always has at least one real root. After three thousand years in which little progress had been made, the first half of the sixteenth century witnessed an astonishing burst of progress, resulting in the solution not only of cubic equations, but also of quartic equations. We postpone the details until Section 4.5. All we note here is that,

the general method for solving cubic equations published in 1545, was given as a procedure, illustrated by examples, that showed how to find genuinely real solutions to equations of the third degree having genuinely real (and positive) coefficients.

The procedure clearly worked. And it proceeded as follows:

Construct the real solution x as the sum x = u + v of two intermediate answers u and v - where the two summands u and v sometimes turned out to be what we would call “conjugate complex numbers”, whose imaginary parts cancelled out, leaving a real result for the required root x.

Those who devised the procedure had no desire to leave the real domain: they were focused on a problem in the real domain (a cubic equation with real coefficients, having a real root), and devised a general procedure to find that genuinely real root. But the procedure they discovered led the solver on a journey that sometimes “passed into the complex domain”, before returning to the real domain! (See Problem 129.)

Working with complex numbers depends on two skills - one very familiar, and one less so.

  • The familiar skill is a willingness to work with a “number” in terms of its properties only, without wishing to evaluate it.

We are thoroughly familiar with this when we work with 2 3 and other fractions: we know that 2 3 =2× 1 3 ; and all we know about 1 3 is that “whenever we have 3 copies of 1 3 , we can simplify this to 1”. Much the same happens when we first learn to work with 2 , where we carry out such calculations as (1+ 2 ) 2 =3+2 2 , based only on collecting up like terms and the fact that 2 × 2 can always be replaced by 2.

  • The less familiar skill is easily overlooked. When, for whatever reason, we decide to allow solutions to the equation x2 = − 1, three things need to be understood.

    − First, these new solutions come in pairs: if i is one solution of x2 = −1 then −i is another (because (−1) x (−1) = 1 means that (−x)2 = x2 for all “numbers” x.

    − Second, the equation x2 = − 1 has exactly two solutions - one the negative of the other (if x and y are both solutions, then x2 = y2, so x2y2 = (xy)(x + y) = 0, so either x = y, or x = −y).

    − Third, we have no way of telling these two solutions apart: we know that each is the negative of the other, but there is no way of singling out one of them as “the main one” (as we could when defining the square root of a positive real such as 2). We can call them ±i, but they are each as good as the other. This important fact is often undermined by referring to one of these roots as -1 (as if it were the dominant partner), and to the other as − -1 (as if it were somehow just the “negative” of the main root).

The truth is that “ 1 ” is a serious abuse of notation, because there is no way to extend the definition of the function ” in the way that this implies: when we try to “take square roots” of negative (or complex) numbers, the output is inescapably “two-valued”, so “ ” is no longer a function. The two roots of x2 = −1 are like Tweedledum and Tweedledee: we know there are two of them, and we know how they are related; but we have no way of distinguishing them, or of singling one of them out.

Once we accept this, we can write complex numbers in the form a + bi, where a and b are real numbers (just as we used to write numbers in the form a+b 2 where a and b are rational numbers). And we can proceed to add, subtract, multiply, and divide such expressions, and then collect up the “real” and “imaginary” parts to tidy up the answer.

Problem 128

(a) Write the inverse (a + bi)−1 in the form c + di.

(b) Write down a quadratic equation with real coefficients, which has a + bi as one root (where a and b are real numbers).

Problem 129 Divide 10 into two parts, whose product is 40.

Problem 129 appears in Chapter XXXVII of Girolamo Cardano’s (1501–1576) book Ars Magna (1545). Having previously presented the general methods for solving quadratic, cubic, and quartic equations, he honestly confronts the phenomenon that his method for solving cubic equations (see Problem 135) produces the required real (and positive) solution x as a sum of complex conjugates u and v - involving not only negative numbers, but square roots of negative numbers. After presenting the formal solution of Problem 129, and having shown that the calculation works exactly as it should, he adds the bemused remark:

So progresses arithmetic subtlety,
the end of which … is as refined as it is useless.”

Arithmetic with complex numbers in the form a + bi is done by carrying out the required operations, and then collecting up the “real” and “imaginary” parts as separate components - just as with adding vectors (a, b). We treat the two parts as Cartesian coordinates, and so identify the complex number a + bi with the point (a, b) in the complex plane.

The “Cartesian” representation a + bi is very convenient for addition. But the essential definition (and significance) of complex numbers is rooted in multiplication. And for multiplication it is often much better to work with complex numbers written in polar form. Suppose we mark the complex number w = a + bi in the complex plane.

The modulus |w| of w (often denoted by r) is the distance r= a 2 + b 2 of the complex number a + 6i from the origin in the complex plane.

The angle θ, measured anticlockwise from the positive real axis to the line joining the complex number w to the origin, is called the argument, Arg (w)=θ , of w.

It is then easy to check that a = r cos θ, b = r sin θ, and that

w=r(cosθ+isinθ)

This is the polar form for w. Instead of focusing on the Cartesian coordinates a, b, the polar form pinpoints w in terms of

  • its length, or modulus, r (which specifies the circle, with centre at the origin, on which the complex number w lies), and
  • the argument 0 (which tells us where on this circle w is to be found).

Problem 130

(a) Given two complex numbers in polar form:

w=r(cosθ+isinθ),z=s(cosϕ+isinϕ)

show that their product is precisely

wz=rs(cos(θ+ϕ)+isin(θ+ϕ)).

(b) (de Moivre’s Theorem: Abraham de Moivre (1667–1754)) Prove that

(cosθ+isinθ) n =cos(nθ)+isin(nθ).

(c) Prove that, if

z=r(cosθ+isinθ)

satisfies zn = 1 for some integer n, then r = 1.

The last three problems in this subsection look more closely at “roots of unity” − that is, roots of the polynomial equation xn = 1. In the real domain, we know that:

(i) when n is odd, the equation xn = 1 has exactly one root, namely x = 1; and

(ii) when n is even, the equation xn = 1 has just two solutions, namely x=±1 .

In contrast, in the complex domain, there are n “nth roots of unity”. Problem 130(c) shows that these “roots of unity” all lie on the unit circle, centered at the origin. And if we put nθ=2kπ in Problem 130(b) we see that the n nth roots of unity include the point “1 = cos0 + i sin0”, and are then equally spaced around that circle with θ= 2kπ n (1kn1) , and form the vertices of a regular n-gon.

Problem 131

(a) Find all the complex roots of unity of degree 3 (that is, the roots of x3 = 1) in surds form.

(b) Find all the complex roots of unity of degree 4 in surd form.

(c) Find all the complex roots of unity of degree 6 in surd form.

(d) Find all the complex roots of unity of degree 8 in surd form.

Problem 132 Use Problem 131(d) to factorise x4 + 1 as a product of four linear factors, and hence as a product of two quadratic polynomials with real coefficients.

Problem 133

(a) Find all the complex roots of unity of degree 5 in surd form.

(b) Factorise x5 1 as a product of one linear and two quadratic polynomials with real coefficients.

4.5. Cubic equations

The first recorded procedure for finding the positive roots of any given quadratic equation dates from around 1700 BC (ancient Babylonian). A corresponding procedure for cubic equations had to wait until the early sixteenth century AD. The story is a slightly complicated one - involving public contests, betrayal, and much else besides.

In Section 4.4 we saw that the cubic equation x3 = 1 has three solutions - two of which are complex numbers. But in the sixteenth century, even negative numbers were viewed with suspicion, and complex numbers were still unknown. Moreover, symbolical algebra had not yet been invented, so everything was carried out in words: constants were “numbers”; a given multiple of the unknown was referred to as so many “things”; a given multiple of the square of the unknown was simply referred to as “squares”; and so on.

In short, we know that an improved method for sometimes finding a (positive) unknown which satisfied a cubic equation was devised by Scipione del Ferro (1465–1526) around 1515. He kept his method secret until just before his death, when he told his student Antonio del Fiore (1506-??). Niccolo Tartaglia (1499–1557) then made some independent progress in solving cubic equations. At some stage (around 1535) Fiore challenged Tartaglia to a public “cubic solving contest”. In preparing for this event, Tartaglia managed to improve on his method, and he seems to have triumphed in the contest. Tartaglia naturally hesitated to divulge his method in order to preserve his superiority, but was later persuaded to communicate what he knew to Girolamo Cardano (1501–1576) after Cardano promised not to publish it (either never, or not before Tartaglia himself had done so). Cardano improved the method, and his student Ferrari (1522–1565) extended the idea to give a method for solving quartic equations - all of which Cardano then published, contrary to his promise, but with full attribution to the rightful discoverers, in his groundbreaking book Ars Magna (1545 -just two years after Copernicus (1473–1543) published his De revolutionibus ...). Problem 134 illustrates the necessary first move in solving any cubic equation. Problem 135 then illustrates the general method in a relatively simple case.

Problem 134

(a) Given the equation x3 + 3x2 − 4 = 0, choose a constant a, and then change variable by substituting y = x + a to produce an equation of the form y3 + ky = constant.

(b) In general, given any cubic equation ax3 + bx2 + cx + d = 0 with a0 , show how to change variable so as to reduce this to a cubic equation with no quadratic term.

Problem 135 The equation x3 + 3x2 − 4 = 0 clearly has “x = 1” as a positive solution. (The other two solutions are x = −2, and x = −2 - a repeated root; however negatives were viewed with suspicion in the sixteenth century, so this root might well have been ignored.) Try to understand how the following sequence of moves “finds the root x = 1”:

(i) substitute y = x + 1 to get a cubic equation in y with no term in y2;

(ii) imagine y = u + v and interpret the identity for

(u+v) 3 = u 3 +3uv(u+v)+ v 3

as your cubic equation in y;

(iii) solve the simultaneous equations “3uv = 3”, “u3 + v3 = 2” (not by guessing, but by substituting v= 1 u from the first equation into the second to get a quadratic equation in “u3”, which you can then solve for u3 before taking cube roots);

(iv) then find the corresponding value of v, hence the value of y = u + v, and hence the value of x.

The simple method underlying Problem 135 is in fact completely general. Given any cubic equation

a x 3 +b x 2 +cx+d=0(witha0)

we can divide through by a to reduce this to

x 3 +p x 2 +qx+r=0

with leading coefficient = 1. Then we can substitute y=x+ p 3 and reduce this to a cubic equation in y

y 3 3 ( p 3 ) 2 y+qy+[ r+2 ( p 3 ) 3 q( p 3 ) ]=0

which we can treat as having the form

y 3 myn=0.

So we can set y = u + v (for some unknown u and v yet to be chosen), and treat the last equation as an instance of the identity

(u+v) 3 3uv(u+v)( u 3 + v 3 )=0

which it will become if we simply choose u and v to solve the simultaneous equations

3uv=m, u 3 + v 3 =n.

We can then solve these equations to find u, then v - and hence find y = u + v and x=y p 3 .

4.6. An extra

Back in Chapter 1, Problem 6 we introduced the Euclidean algorithm for integers. The same idea was extended to polynomials with integer coefficients in Problem 126. In both these settings one starts with a domain (whether the set of integers, or the set of all polynomials with integer coefficients) where there is a notion of divisibility: given two elements m, n in the relevant domain, we say

n divides m” if there exists an element q in the domain such that m = qn.

The next problem invites you to think how one might extend the Euclidean algorithm to a new domain, namely the Gaussian integers ℤ[i] − the set of all complex numbers a + bi in which the real and imaginary “coordinates” a and b are integers.

Problem 136 Complex numbers a + bi, where both a and b are integers, are called Gaussian integers. Try to formulate a version of the “division algorithm” for “division with remainder” (where the remainder is always “less than” the divisor in some sense) for pairs of Gaussian integers. Extend this to construct a version of the Euclidean algorithm to find the HCF of two given Gaussian integers.

It is a profoundly erroneous truism … that we should cultivate the habit of thinking what we are doing.

The precise opposite is the case. Civilisation advances by extending the number of important operations which we can perform without thinking about them.

Alfred North Whitehead (1861–1947)

4.7. Chapter 4: Comments and solutions

92. Answer: Humour aside, this is a common situation.

We know d + b , n + b , d + n , rather than the values of d, b, n.

The key is to exploit the symmetry in the given data, rather than solving blindly. Adding all three two-way totals gives 2 ( d + b + n ) = 284 whence d + b + n = 142 . We can then subtract the given value of d + n = 137 to get the value of b = 5 .

93.

(a) Let the numbers at the three vertices be A, B, C. Adding shows that

a + b + c = 2 ( A + B + C )

so

A = a + b + c 2 - ( B + C ) = b + c - a 2

etc.

(c) Note: We postpone the “solution” of part (b), and address part (c) first. Let the numbers at the five vertices be A, B, C, D, E. Adding shows that

d + e + a + b + c = 2 ( A + B + C + D + E )

so

A = d + e + a + b + c 2 ( B + C ) ( D + E ) = d e a b c 2

etc.

(b) The second part is different. The four given edge-values do not determine the four unknown vertex-values. It may look as though four pieces of information should suffice to find four unknowns; but there is a catch: the sum of the numbers on the two opposite edges AB and CD is just the sum of the numbers at the four vertices, and so is equal to the sum of the numbers on the edges BC and DA. Hence one of the given edge-values is determined by the other three.

Note: This distinction between polygons with an odd and an even number of vertices would arise in exactly the same way if each edge was labelled with the average (“half the sum”) of the numbers at its two end vertices.

94. a = B C = B P + P C = y + z ; b = x + z ; c = x + y . Hence

a + b + c = 2 ( x + y + z )

so

x + y + z = a + b + c 2 .

So

x = a + b + c 2 ( y + z ) = b + c a 2

etc.

95.

(a) Let

M = a + c 2 , b + d 2 .

The shift, or vector, from (a, b) to (c, d) goes

“along ca in the x-direction” and “up db in the y-direction”.

Draw the ordinate through Y and the abscissa through Z, to meet at P, so creating a right angled triangle with legs Y P of length | c - a | and PZ of length | d - b | . The midpoint of Y P clearly lies halfway along Y P at

S = a + c a 2 , b

and the midpoint of PZ clearly lies halfway up PZ at

T = c , d d b 2 .

Then YSM and △MTZ are both right-angled triangles and are congruent (by RHS congruence). Hence Y M = M Z , so M is the midpoint of Y Z.

(b)

M = a 2 , b 2 , N = c 2 , d 2

so vector

MN = c a 2 , d b 2 = 1 2 BC .

(c) Note: We use the result from part (b), but not the method from part (b). By part (b) applied to △BAC , PQ is half the length of AC and parallel to AC. By part (b) applied to △DAC , SR is half the length of AC and parallel to AC. Hence PQ is parallel to SR.

Similarly one can prove (applying part (b) twice – first to △ABD , and then to △CBD ) that PS is parallel to QR.

Hence PQRS is a parallelogram.

96.

(a) p = 1 2 ( x + y ), q = 1 2 ( y + z ), r = 1 2 ( z + x ) , so

p + q + r = x + y + z.

Hence

x = ( p + q + r ) ( y + z ) = p q + r y = ( p + q + r ) ( x + z ) = p + q r z = ( p + q + r ) ( x + y ) = q + r p .

(b)

p = 1 2 ( w + x ), q = 1 2 ( x + y ), r = 1 2 ( y + z ), s = 1 2 ( z + w )

so

p + q + r + s = w + x + y + z .

Hence

w = ( p + q + r + s ) ( x + y + z );

but there is no obvious way to pin down (x + y + z).

In fact different quadrilaterals may give rise to the same four “midpoints”. (It is an interesting exercise to identify the family of quadrilaterals corresponding to a given set of four midpoints.)

(c) As in parts (a) and (b),

p= 1 2 (v+w),q= 1 2 (w+x),r= 1 2 (x+y),s= 1 2 (y+z),t= 1 2 (z+v)

Hence

p+q+r+s+t=v+w+x+y+z

so

v=(p+q+r+s+t)(w+x)(y+z)=pq+rs+t w=(p+q+r+s+t)(x+y)(z+v)=p+qr+st x=(p+q+r+s+t)(v+w)(y+z)=p+q+rs+t y=(p+q+r+s+t)(w+x)(z+v)=pq+r+st z=(p+q+r+s+t)(v+w)(x+y)=p+qr+s+t .

97.

(a)(i) As in Problems 93-95 we instinctively add to get

2(x+y+z)=6

so

x+y+z=3.

Hence

x=3(y+z)=1 y=3(x+z)=0 z=3(x+y)=2.

(ii) The same idea (replacing addition by multiplication) leads to

2×4×8=64=uvvwwu= (uvw) 2

so uvw=±8 . Hence

u= uvw vw = ±8 4 =±2 v= uvw uw = ±8 8 =±1 w= uvw uv = ±8 2 =±4.

(u,v,w)=(2,1,4)or(2,1,4).

Note: Alternatively, we may notice that u, v, w are either all positive, or all negative. If we restrict in the first instance to purely positive solutions, then we may set u = 2x, v = 2y, w = 2z, translate (ii) into (i), and conclude that (x, y, z) = (1, 0, 2), so that (u, v, w) = (2, 1, 4). We must then remember the negative solution (u, v, w) = (−2, −1, −4).

(b) (i) As in (a)(i) we add to get 2(x + y + z) = 9, so x+y+z= 9 2 . Hence

x= 9 2 (y+z)= 3 2 y= 9 2 (x+z)= 1 2 z = 9 2 (x+y)= 5 2 .

(ii) The same idea leads to

6×10×15=900=uvvwwu= (uvw) 2 ,

so uvw=±30 .

Hence

u= uvw vw = ±30 10 =±3 v= uvw uw = ±30 15 =±2 w= uvw uv = ±30 6 =±5.

Either u, v, w are all positive, or all negative.

(u,v,w)=(3,2,5)or(3,2,5).

(iii) The same idea leads to

6×10×30=2×900=vwwu= (uvw) 2 ,

so uvw=±30 2 . Hence

u= uvw vw = ±30 2 10 =±3 2 v= uvw uw = ±30 2 15 =±2 2 w= uvw uv = ±30 2 6 =±5 2 .

Either u, v, w are all positive, or all negative.

(u,v,w)=(3 2 ,2 2 ,5 2 )or(3 2 ,2 2 ,5 2 ).

(iv) We could of course repeat the same method.

Or we could again look in the first instance for positive solutions, notice that 4 = 22, 8 = 23, 16 = 24, and take logs (to base 2). Then

log 2 u+ log 2 v=2 log 2 v+ log 2 w=3 log 2 u+ log 2 w =4,

so (from part (i)) any positive solution satisfies

log 2 u= 3 2 , log 2 v= 1 2 , log 2 w= 5 2 ,

so

(u,v,w)=(2 2 , 2 ,4 2 ).

We must then remember to include

(u,v,w)=(2 2 , 2 ,4 2 ).

98. The simplest idea is to take logs, and reduce the system to a familiar linear system:

alogu+blogv =logm clogu+dlogv =logn.

Multiplying the first equation by c and subtracting it from the second equation multiplied by a gives:

logv= alognclogm adbc .

Multiplying the first equation by d and subtracting b times the second equation gives:

logu= dlogmblogn adbc .

If the numerators and denominators are expressed in determinant form, we get the 2 × 2 version of Cramer’s Rule. The original unknowns u, v can then be obtained by taking suitable powers.

What emerges looks interesting:

u= m d adbc n b adbc v = m c adbc n a adbc

but it is not clear how it generalises.

99.

(a) x + y + z = 3 is the equation of a plane through the three points (3, 0, 0), (0, 3, 0), (0,0, 3).

x2 + y2 + z2 = c is the equation of a sphere, centered at the origin, with radius c . . The sphere misses the plane completely when c < 3, meets the plane in a single point when c = 3, and cuts the plane in a circle C when c > 3 (the circle lying in the positive octant provided c < 9).

If xy + yz + zx = b meets this intersection at all, then any permutation of the three coordinates x, y, z produces another point which also satisfies the other two equations (since they are both symmetrical). Hence for the system to have a unique solution, the circle C must contain a point with x = y = z. Hence c = 3, and b = 3, and the unique solution is

(x,y,z)=(1,1,1).

(b) We must have c0 for any solution. If c = 0, then for a unique solution, we must have x = y = z = 0, so a = b = 0. If we exclude this case, then we may assume that c > 0.

x+y+z=a

is the equation of a plane through the three points (a, 0, 0), (0, a, 0), (0, 0, a).

x 2 + y 2 + z 2 =c

is the equation of a sphere, centre the origin, with radius c , which misses the 2 2 plane completely when c< a 2 3 , meets the plane in a single point when c= a 2 3 , and cuts the plane in a circle C when c> a 2 3 (the circle lying in the positive octant provided c < a2).

If

xy+yz+zx=b

meets this intersection at all, then any permutation of the three coordinates x, y, z produces another point which also satisfies the other two equations (since they are both symmetrical). Hence for the system to have a unique solution, the circle C must contain a point with x = y = z. Hence that point is x=y=z= a 3 , so c= a 2 3 =b , and the unique solution is

(x,y,z)=( a 3 , a 3 , a 3 ).

100. |x| is never negative. If x0 , then |x|=x ; if x is negative, then − x is positive, so |x|=x .

Note: We need to learn to see both x and − x as algebraic entities, with x as a placeholder (which may well be negative, in which case − x would be positive).

101.

(a) The interval [0.1, 0.2). We have marked exactly 1 10 of the interval [0,1).

(b) This needs a little thought. First we mark the interval [0.1, 0.2), of length 1 10 . Then we mark 9 smaller intervals

[0.01,0.02),[0.21,0.22),,[0.91,0.92)

of total length 9 ( 1 10 ) 2 . Then 92 smaller intervals

[0.001,0.002),[0.021,0.022),,[0.991,0.992)

of total length 9 2 ( 1 10 ) 3 . And so on.

[ 0.1 ,0 .2 ) [ 0.01 ,0 .02 ) [ 0.21 ,0 .22 ) [ 0.31 ,0 .32 ) [ 0.41 ,0 .42 ) [ 0.51 ,0 .52 ) [ 0.61 ,0 .62 ) [ 0.71 ,0 .72 ) [ 0.81 ,0 .82 ) [ 0.91 ,0 .92 ) [ 0.001 ,0 .002 ) [ 0.021 ,0 .022 ) [ 0.031 ,0 .032 )

It would seem that a vast number of points are left unm,a,rked - namely, every point whose decimal representation uses only 0s, 2s, 3s, 4s, 5s, 6s, 7s, 8s, and 9s. However, the total length of the marked intervals is given by adding:

1 10 +9 ( 1 10 ) 2 + 9 2 ( 1 10 ) 3 + 9 3 ( 1 10 ) 4 + 9 4 ( 1 10 ) 5 + 9 5 ( 1 10 ) 6 +

That is an infinite geometric series with first term a= 1 10 and common ratio r= 9 10 , and hence with sum = 1. In other words, the total length of what remains unmarked is zero.

(c) (0, 1): every real number except 0 has an expansion in base 2 with a “1” in some position. So this time nothing is left unmarked, (except 0). Hence the complement of the set of marked points consists simply of one point, namely {0}. So it is not surprising that the total of all the marked intervals has length 1.

(d) First we mark the interval [0.1,0.2), of length 1 3 . Then we mark 2 smaller intervals

[0.01,0.02),[0.21,0.22)

of total length 2 ( 1 3 ) 2 . Then 22 smaller intervals

[0.001,0.002),[0.021,0.022),[0.201,0.202),[0.2211,0.222)

of total length 2 2 ( 1 3 ) 3 . And so on.

[0.1,0.2) [0.01,0.02)[0.21,0.22) [0.001,0.002)[0.021,0.022))[0.201,0.202)[0.2201,0.02202)

The set of marked points is the complement of the famous Cantor set (Georg Cantor (1845–1918)) and has total length

1 3 +2 ( 1 3 ) 2 + 2 2 ( 1 3 ) 3 + 2 3 ( 1 3 ) 4 + 2 4 ( 1 3 ) 5 + 2 5 ( 1 3 ) 6 +

This is an infinite geometric series with first term a= 1 3 and common ratio r= 2 3 , and so has sum = 1.

Hence, the total length of what remains unmarked is zero.

Note: The set described in (d) leaves as its complement a collection of points – the Cantor set – which consists of the “endpoints” of the intervals that have been removed; these are points whose base 3 expansion involves only 0s and 2s. This complement:

(i) is “the same size” as the whole interval [0,1] (since if we interpret the 2s in the base 3 expansion as 1s, we get a correspondence between the set of “endpoints” and the set of all possible base 2 expansions for real numbers in [0,1]);

(ii) is “nowhere dense” (since every pair of points in the complement is separated by some interval)

(iii) has total length = 0.

102. (2, 3]. Each inequality implies all the ones before it. Hence the two which are true must be the first two. Hence x3 , and x > 2.

103. If x50 , then we must solve x5=3 ; so x=8 ; if x5<0 , we must solve x5=3 , so x=2 .

Note: The fact that |x| denotes the positive value of the pair {x, − x} can be rephrased as: |x| is equal to the distance from x to 0.

In the same way, |x5| denotes the positive member of the pair

{x5,(x5)}

so |x5| is equal to the distance from x5 to 0 (i.e. the distance from x to 5). This is a very important way to think about expressions like |x5| .

104.

(a) [3,2)(2,3] . (Each inequality implies all those that go before it. So we need solutions to |x|>1 and |x|>2 , which satisfy |x|3

(b) (4, 6]. (Each inequality implies the one before it. To see this, think in terms of distances: we want points x whose distance from 1 is > 1, whose distance from 2 is > 2, etc.. So we need to find points x which solve the first two inequalities, but not the third. Points in the half line (,0) satisfy all seven inequalities, so we are left with (4, 6].)

105. { 5 2 , 1 2 } . (We need all points x for which

“the distance from x to −1” plus “the distance from x to −2”

equals 2. This excludes all points between −2 and −1, for which the sum is equal to 1; for points between 5 2 and 1 2 the sum is < 2; for points in ( , 5 2 ) or ( 1 2 , ) the sum is > 2.)

106. a= 1 2 , b= 3 2 . (For solutions to exist, we must have b > 0. The solutions of the given inequality then consist of all x such that

“the distance from x to a is less than b

that is, all x in the interval (ab, a + b). Hence a − b = −1, a + b = 2.)

107.

(a) “The difference between the x- and y-coordinates is < 3”, means that the point (x, y) lies in the infinite strip between the lines xy = −3 and xy = 3.

(b) Shifting the origin of coordinates to (−5, 0) changes the x-coordinate to “X = x + 5” and leaves the y-coordinate unaffected (so Y = y). In this new frame we want “|XY| < 3”, so the required points lie in the infinite strip between the lines XY = −3 and XY = 3; that is, between the lines xy+5=3 and xy+5=3 .

(b) x > 0 and y > 0, or x < 0 and y < 0. (For any solution at all, we must have |x+y|>0 , which excludes points on the line x + y = 0. Divide both sides by |x+y| and simplify to get

| 1 2y x+y |<1

In other words:

0< 2y x+y <2 .

If y > 0, then x + y > 0, so 2x + 2y > 2y, whence x > 0 (so “x > 0 and y > 0”).

If y < 0, then x + y < 0, so 2x + 2y < 2y, whence x < 0 (so “x < 0 and y < 0”).

If x > 0 and y > 0, or x < 0 and y < 0, then clearly |xy| < |x+y| .)

108. Let

x= a b < c d =y.

(i) Since x < y, it follows that

x x 2 = x 2 < y 2 ,

so x< x+y 2 ; moreover x 2 <y y 2 , so x+y 2 <y .

(ii) Since a b < c d and b,d>0 , we can multiply both sides by bd to get ad < bc. Therefore

a(b+d)=ab+ad<ba+bc=b(a+c),

and

(a+c)d=ad+cd<bc+dc=(b+d)c.

a b < a+c b+d , and a+c b+d < c d (since b, d, and b + d are all > 0, so we can divide the first inequality by b(b + d) and the second by d(b + d)).

109.

(a)

0 1 < 1 7 < 1 6 < 1 5 < 1 4 < 2 7 < 1 3 < 2 5 < 3 7 < 1 2 < 4 7 < 3 5 < 2 3 < 5 7 < 3 4 < 4 5 < 5 6 < 6 7 < 1 1

(b) (i) It is tempting simply to consider the decimals

1 9 =0.111, 2 9 =0.222, 3 9 =0.333,, 8 9 =0.888

in order to conclude that these fractions miss the first and last subinterval, and then fall one in each of the remaining subintervals. In preparation for part (ii), it is better to observe that

* 1 10 < 1 9 and 8 9 < 9 10 , so none of the 9ths land up in the first or last subintervals;

* then rewrite

1 9 = 1 10 + 1 90 , 2 9 = 2 10 + 2 90 , 3 9 = 3 10 + 3 90 ,, 8 9 = 8 10 + 8 90

and notice that

0 10 < 1 90 << 8 90 < 1 10 ,

so that, for 1m9 ,

m 10 < m 9 < m+1 10 ;

hence exactly one 9th goes in each of the other eight subintervals.

(ii) Notice that

* 1 n+1 < 1 n and n1 n < n n+1 , so none of the nths land up in the first or last subintervals;

* then rewrite

1 n = 1 n+1 + 1 n(n+1) , 2 n = 2 n+1 + 2 n(n+1) ,, n1 n = n1 n+1 + n1 n(n+1)

and notice that

0 n+1 < 1 n(n+1) << n1 n(n+1) < 1 n+1 ,

so, for 1mn ,

m n+1 < m n < m+1 n+1 ;

hence exactly one nth goes in each of the other n − 1 subintervals.

(iii) Suppose two (or more) fractions are inserted between a b and c d . Then these two fractions would have to be successive multiples of 1 n+1 ; but then they would have a multiple of 1 n between them (by part (ii)), and this would be a term of the Farey series of order n sitting between a b and c d . Since there is no such term, at most one fraction can be inserted between a b and c d .

(c) Note 1: This problem was included because the idea of Farey series seems so simple, and their curious properties are so intriguing. While this remains true, it turns out that Farey series also have something different, and slightly unexpected, to teach us about “the essence of mathematics”. Part of us expects that simple-looking results should have short and accessible proofs - even though we know that Fermat’s Last Theorem shows otherwise. In the case of Farey series, the relevant properties can be proved in ways that should be accessible (in principle); but the proofs are not easy. So do not be upset if, after all your efforts, you land up trying to absorb the solution given here - and the underlying idea that

simple objects and “elementary” proofs can sometimes be more intricate than one anticipates.

Note 2: If

a b < c d

are consecutive terms in a Farey series, then “bc − ad” must be an integer > 0. The fact that this difference is always equal to 1 is easily checked in any particular case, but it is unclear exactly why this is necessarily true (rather than an accident) - or even how one would go about proving it. Every treatment of Farey series has to find its own way round this difficulty. We give the simplest proof we can (in the sense that it assumes no more than we have already used: a little about numbers and some algebra). But it is not at all “easy”. We indicate a different approach in the “Notes” at the end of part (d).

(i) [The fact that, except for the two end intervals, we have bd > n will be needed in the proof of part (c)(ii).]

We proceed by mathematical induction on n (the “order” of the Farey series) - a technique which we have already met in Chapter 2 (Problems 54-59, 76) and which will be addressed more fully in Chapter 6.

* When n = 1, the Farey series of order n is just:

0 1 < 1 1

and this subinterval is both the first and the last, so the claim is “vacuously true” (because there is “nothing to check”). When n = 2, the Farey series of order n is:

0 1 < 1 2 < 1 1 ,

and again the only subintervals are the first and the last, so again there is nothing to check.

* We now suppose that we know the claim is true for the Farey series of order k, for some k > 1, and show that it must then also be true for the Farey series of order k + 1. (Since we know it is true for n = 2, it will then be true for n = 3; and once we know it is true for n = 3, it must then be true for n = 4; and so on.)

To show that the claim is true for the Farey series of order k + 1, we consider any adjacent pair of fractions

a b < c d

(other than the first pair and the last pair) in the Farey series of order k + 1.

Claim bd > k + 1.

Proof Note first that, since we are avoiding the two end subintervals, both b and d are > 1.

Suppose first that the pair a b < c d are not adjacent in the previous Farey series of order k. Then at least one of the two fractions has been inserted in creating the Farey series of order k + 1, and so has denominator = k + 1. (The fractions inserted are precisely those with denominator “k + 1” which cannot be reduced by cancelling.) Hence the product

bd2(k+1)>k+1.

Thus we may assume that the pair a a b < c d were already adjacent in the Farey series of order k. But then by our “induction hypothesis” (namely that the desired result is already known to be true for the Farey series of order k), we know that bd > k. If bd > k + 1, then the pair a b < c d satisfies the required condition. Hence we only have to worry about the possibility that bd = k + 1. Suppose that bd = k + 1. Then the interval a b < c d has length

bcad bd = r k+1

for some positive integer r = bcad.

If r > 1, then the interval would have length > 1 k+1 , so a b < c d would not be successive terms in the series (for we would have inserted some additional term when moving from the Farey series of order k to the Farey series of order k + 1).

Hence we can be sure that r = 1, that the subinterval a b < c d has length exactly 1 k+1 . Now successive fractions with denominator k + 1 differ by exactly 1 k+1 , so some fraction with denominator k + 1 must lie in this subinterval. Since no additional fraction is inserted between them in passing from the series of order k to the series of order k + 1, and a b and c d must both be “cancelled versions” of two successive fractions with denominator k + 1. But then, by part (b)(ii), there would have to be a fraction with denominator k in the interval a b < c d - which is not the case.

Therefore the possibility bd = k + 1 does not in fact occur.

So we can be sure that in every case, bd > k + 1.

Hence whenever the result is true for the Farey series of order k, it must then also be true for the Farey series of order k + 1.

It follows that the result is true for the Farey series of order n, for all n1 .

(ii) We proceed by induction on n.

* If a b < c d are adjacent fractions in the Farey series of order 1, then a b = 0 1 and c d = 1 1 , so bc − ad = 1.

* Now suppose that, for some k1 , we already know that the result holds for any adjacent pair in the Farey series of order k.

Let a b < c d be adjacent fractions in the Farey series of order k + 1.

If a b < c d were already adjacent fractions in the Farey series of order k (i.e. if no fraction has been inserted between a b and c d in passing from the series of order k to the series of order k + 1), then we already know (by the induction hypothesis) that bc − ad = 1.

Thus we may concentrate on the case where a b < c d are not adjacent fractions in the Farey series of order k. By (b)(iii), at most one fraction with denominator k + 1 is inserted between any two adjacent fractions in the Farey series of order k, so we have either

a b < c d < e f ,

with a b < e f , being adjacent fractions in the Farey series of order k (so be − af = 1), or

e f < a b < c d ,

with e f < c d being adjacent fractions in the Farey series of order k (so fc − ed = 1). We consider the first of these possibilities (the second is entirely similar).

Suppose

a b < c d < e f ,

with a b < e f being adjacent fractions in the Farey series of order k. By part (i) we know that bfk+1 ; and by induction we know that be − af = 1. Hence the interval a b < e f has length at most 1 k+1 . We have to prove that bcad = 1.

Let bc − ad = r > 0, and ed − fc = s > 0.

Then sa + re = c, and sb + rf = d.

In particular, HCF(r,s) = 1 (since HCF(c,d) = 1).

Hence c d belongs to the family

S = { x a + y e x b + y f : where x , y are any positive integers with H C F ( x , y ) = 1 } .

Since everything is positive, easy algebra shows that

a b < xa+ye xb+yf < e f ,

so every element of S lies between a b and e f .

As long as we choose x, y such that HCF(x,y) = 1, any common factor of xa + ye and xb + yf would also divide both

b(xa+ye)a(xb+yf)=(beaf)y=y,

and

e(xb+yf)f(xa+ye)=(beaf)x=x.

Hence

HCF(xa+ye,xb+yf)=1

so each element of S is in lowest terms (i.e. no further cancelling is possible).

We have shown that “ c d belongs to the family S”, and that all elements of S fit between a b and e f ; which are adjacent fractions in the Farey series of order k. So none of the elements of S can have arisen before the series of order k + 1. But each fraction in S arises at some stage in a Farey series.

And the first to arise (because it has the smallest denominator) is " a+e b+f "

Hence

c d = a+e b+f ,

so r = s = 1, and bcad = 1 as required.          QED

(d) Let

a b < c d < e f

be three successive terms in any Farey sequence. By (c) we know that bc−ad = 1, and that decf = 1. In particular, bc - ad = de - cf, so

c d = a+e b+f .

Note 1: It may not be clear why we are proving this result “again” - since it appeared in the final line of the solution to part (c). However, in part (c) the statement that

c d = a+e b+f

was arrived at within the induction step, and so was subject to other assumptions. In contrast, now that the result in part (c) has been clearly established, we can use it to prove part (d) without any hidden assumptions.

Note 2: If we represent each fraction a b in the Farey series of order n by the point (b, a), then each point lies in the right angled triangle joining (0, 0), (n, 0), and (n,n), and each fraction in the Farey series is equal to the gradient of the line, or vector, joining the origin to the integer lattice point (b,a). The ordering of the fractions in the Farey series corresponds to the sequence of increasing gradients, from 0 1 up to 1 1 . If a b and e f are adjacent fractions in some Farey series, then the result in (d) says that the next fraction to be inserted between them is a+e b+f corresponding to the vector sum of (b, a) and (f, e). And the result in (c) says that the area of the parallelogram with vertices (0, 0), (b, a), (f, e), (b + f, a + e) is equal to 1 (see Problem 57(b)). Hence the result in (c) reduces to the fact that

Theorem Any parallelogram, whose vertices are integer lattice points (i.e. points (b,a) where both coordinates are integers), and with no additional lattice points inside the parallelogram or on the four sides, has area 1.

110.

(a) Suppose that x satisfies x+ 1 x <2 . Then x0 (or 1 x is not defined).

x 2 +1 x <2.

If x > 0, then x2 − 2x + 1 = (x − 1)2 which has no solutions.

x<0 in which case x satisfies x+ 1 x <0<2 , so every x < 0 is a solution of the original inequality

(b) Suppose x satisfies x1+ 2 x . Again x0 (or 1 x is not defined).

(i) If x > 0, then x2x − 2 = (x − 2)(x + 1) < 0.

1x2 (and x > 0); hence 0<x2 , and all such x satisfy the original inequality.

(ii) If x < 0, then x 2 x20 , so (x2)(x+1)0 .
eitherx1 , or x2 (and x < 0); hence x1 , and all such x satisfy the original inequality.

(c) Suppose x satisfies x <x+ 1 4 .

4 ( x ) 2 4 x +1>0

(2 x 1) 2 >0 ,so x 1 4 and all such x satisfy the original inequality.

111.

(a) If a + b = 5 and ab = 7, then a, b are solutions of

(xa)(xb)= x 2 5x+7=0.

But the roots of this quadratic equation are

5± 2528 2 = 5± 3 2 ,

so a and b cannot be “positive reals”.

(b) We abbreviate the “arithmetic mean” by AM, the “geometric mean” by GM, the “harmonic mean” by HM, and the “quadratic mean” by QM.

( a b ) 2 0

so

a+b2 ab

therefore

ab 2ab a+b (GMHM)

and

ab a+b 2 (GMAM).

Also

( ab 2 ) 2 0,

so

a 2 + b 2 4 2ab 4

whence

a 2 + b 2 2 a 2 + b 2 +2ab 4 = ( a+b 2 ) 2 .

Therefore

a 2 + b 2 2 a+b 2 (QMAM).           QED

112. [This delightful problem was devised by Oleksiy Yevdokimov.] We need to find something which remains constant, or which does not increase, when we replace two terms a, b by a+b 2 .

Idea: If the two terms a, b were replaced each time by their sum a + b, then the sum of all the numbers in the list would be unchanged, so we could be sure that the final number after 199 such moves would have to be

1+2+3++200= 200×201 2 .

This doesn’t work here. However, in the spirit of this section on inequalities, one may ask:

What happens to the sum of the squares of the terms in the list aftereach move?

When we move from one list to the next, only two terms are affected, and for these two terms, the previous sum of squares is replaced by ( a+b 2 ) 2 . How does this affect the sum of all squares on the list?

We know that a 2 + b 2 2ab for all a, b. And it is easy to see that this is equivalent to:

a 2 + b 2 ( a+b 2 ) 2

So when we replace two terms a, b by a+b 2 , the sum of the squares of all the terms in the list never increases. Hence the final term is less than or equal to the square

root of the initial sum of squares

1 2 + 2 2 + 3 2 + + 20 0 2 = 200 × 201 × 401 6 ( by Problem 62 ) < 200 × 300 × 400 6 = 4 × 1 0 6 .

thefinaltermis< 4× 10 6 =2000.

113.

(a) (i) 3 = 22 - 1.

(ii) It seems hard to find another.

(b) (i) 2 = 12 + 1.

(ii) 5 = 22 + 1 (or 17 = 42 + 1; or 37 = 62 + 1; or 101 = 102 + 1; or …). In other words, there seem to be lots.

Note: At first sight primes of this form “keep on coming”. Given that we now know (see Problem 76) that the list of all prime numbers “goes on for ever”, it is natural to ask: Are there infinitely many prime numbers “one more than a square”? Or does the list run out?

This is one of the simplest questions one can ask to which the answer is not yet known!

(c) (i) 7 = 23 - 1.

(ii) It seems hard to find another.

(d) (i) 2 = 13 + 1.

(ii) It seems hard to find another.

Note: Parts (a), (c) and (d) should make one suspicious - provided one notices that:

(a) 63 = 7 × 9, 143 = 11 × 13, 323 = 17 × 19;

(c) 511 = 7 × 73, 1727 = 11 × 157;

(d) 217 = 7 × 31, 513 = 9 × 57, 1001 = 7 × 143.

This problem is so instructive that its solution is discussed in the main text following Problem 115.

114.

x 4 +1=( x 2 + 2 x+1 )( x 2 2 x+1 ).

(Suppose

x 4 +1=( x 2 +ax+b )( x 2 +cx+d ).

It is natural to try b = d = 1 in order to make the constant term bd = 1, and then to try c = −a (so that the coefficients of x3 and of x are both 0). It then remains to choose the value of a so that the total coefficient “2 − a2” of all terms in x2 is equal to 0: that is, a= 2 ).

115.

(a)(i)

a 3 b 3 =(ab)( a 2 +ab+ b 2 ).

(ii)

a 4 - b 4 = ( a - b ) ( a 3 + a 2 b + a b 2 + b 3 ) = ( a 2 - b 2 ) ( a 2 + b 2 ) = ( a - b ) ( a + b ) ( a 2 + b 2 ) .

(iii)

a n b n =(ab)( a n1 + a n2 b+ a n3 b 2 ++a b n2 + b n1 ).

Note: The general factorisation

x n 1=(x1)( x n1 + x n2 ++ x 2 +x+1 )

provides a fresh slant on the test for divisibility by 9 in base 10, or in general for divisibility by b − 1 in base b (see Problem 51):

“an integer written in base b is divisible by b − 1 precisely when its digit sum is divisible by b − 1”.

(b) (i)

a 3 + b 3 =(a+b)( a 2 ab+ b 2 ).

(ii)

a 5 + b 5 =(a+b)( a 4 a 3 b+ a 2 b 2 a b 3 + b 4 ).

(iii)

a 2n+1 + b 2n+1 =(a+b)( a 2n a 2n1 b+ a 2n2 b 2 a 2n3 b 3 +a b 2n1 + b 2n ).

116.

(a) Replace a by 1, b by r, and n by n + 1 in the answer to 115(a)(iii), to see that:

1+r+ r 2 + r 3 ++ r n = 1 r n+1 1r .

(b) Multiply the closed formula in (a) by “a” to see that:

a+ar+a r 2 +a r 3 ++a r n =a 1 r n+1 1r .

117. When x = 40,

f(x)= x 2 +(x+40)+1= 40 2 +2×40+1= 41 2

is not prime. So the sequence of prime outputs must stop some time before f (40). But it in fact keeps going as long as it possibly could, so that

f(0),f(1),f(2),,f(39)

are all prime. (This may explain Euler’s delight.)

Note: The links between polynomials with integer coefficients (even lowly quadratics) and prime numbers are still not fully understood. For example, you might like to look up Ulam’s spiral. (Ulam (1909–1984) plotted the positive integers in a square spiral and found the primes arranging themselves in curious patterns that we still do not fully understand.)

Interest in the connections between polynomials and primes was revived in the second half of the 20th century. It was eventually proved that there exists a polynomial in 10 variables, with integer coefficients, which takes both positive and negative values when the variables run through all possible non-negative integer values, but which does so in such a way that it generates all the primes as the set of positive outputs.

118.

(a) (i) For

a n 1=(a1)( a n1 + a n2 ++a+1 )

to be prime, the smaller factor must be = 1, so a = 2.

If n is not prime, we can factorise n = rs, with r, s > 1. Then

2 n 1= 2 rs 1= ( 2 r ) s 1=( 2 r 1 )( 2 r(s1) + 2 r(s2) ++2+1 );

Hence 2n 1 also factorises, so could not be prime. Hence n must be prime.

(ii) 22 − 1 = 3, 23 − 1 = 7, 25 − 1 = 31, 27 − 1 = 127 are all prime; 211 − 1 = 2047 = 23 × 89 is not.

Note: This is a simple example of the need to distinguish carefully between the statement

“if 2n − 1 is prime, then n is prime” (which is true),

and its converse

“if n is prime, then 2n − 1 is prime” (which is false).

(b)(i) Suppose that a > 1. Then an + 1 > 2; so for an + 1 to be prime, it must be odd, so a must be even.

If n has an odd factor m > 1, we can write n = km. Then

a n + 1 = a k m + 1 = ( a k ) m + 1 = ( a k + 1 ) ( a k ( m - 1 ) - a k ( m - 2 ) + - a k + 1 ) .

Since m is odd and > 1, we have m3 . It is then easy to show that

a k +1 a k(m1) a k(m2) + a k +1.

And since a > 1, neither factor = 1, so an + 1 can never be prime. Hence n can have no odd factor > 1, which is the same as saying that n = 2r must be a power of 2.

(ii) 21 + 1 = 3, 22 + 1 = 5, 24 + 1 = 17, 28 + 1 = 257, 216 + 1 = 65 537 are all prime. (The very next such expression

2 32 +1=4294967297=641×6700417

is not prime.)

Note: The sad tale of Fermat’s claim that “all Fermat numbers are prime” shows that mathematicians are not exempt from the obligation to distinguish carefully between a statement and its converse!

119.

(a) x2 − 3x + 2 = (x − 2)(x − 1) = 0 precisely when one of the brackets = 0; that is, x = 2, or x = 1.

(b) x2 − 1 = (x − 1)(x + 1) = 0 precisely when x = 1 or x = −1.

(c) x2 − 2x + 1 = (x − 1)2 = 0 precisely when x = 1 (a repeated root).

(d) x 2 + 2 x1=0 requires us

− to complete the square

x 2 + 2 x1= ( x+ 2 2 ) 2 1 1 2 .

so

x+ 2 2 =± 3 2 ,

− or to use the quadratic formula:

x= 2 ± 2+4 2 .

(e) x 2 +x 2 =0 requires us

− to complete the square

x 2 +x 2 = ( x+ 1 2 ) 2 2 1 4 ,

so

x+ 1 2 =± 2 + 1 4

− or to use the quadratic formula:

x= 1± 1+4 2 2 .

(f) x2 + 1 = 0 yields x=± 1 .

(g) x 2 + 2 x+1=0 yields

x= 2 ± 24 2 = 2 ± 2 2 .

120. q(x)= x 2 2 x+1 . (There is no obvious magic method here. However, it should be natural to try to insert a term 2 in q(x) to “resolve” the term 2 in p(x); and the familiar cancelling of cross terms in (a + b)(ab) should then suggest the possible benefit of trying q(x)= x 2 2 x+1. )

Note: p(x)q(x) = x4 + 1 (see Problem 114).

121.

(a) Let the two unknown numbers be α and β Then s=α+β , and p=αβ . “The square of half the sum" ( s 2 ) 2 = ( α+β 2 ) 2 .

Subtracting p=αβ produces ( αβ 2 ) 2 whose “square root” will be either αβ 2 , or ( αβ 2 ) − whichever is positive

Adding this to “half the sum” gives one root; subtracting gives the other root.

(b) Let the length of one side be x. We are told that x2 + bx = c.

“Take half of b, square it, and add the result to c

translates as:

“Rewrite the equation as: ( x+ b 2 ) 2 =c+ ( b 2 ) 2 .”

That is, we have “completed the square” ( x+ b 2 ) 2 . If we now take the (positive) square root and subtract b 2 , we get a single value for x, which determines the side length of my square as required.

If the same method is applied to the general quadratic equation

a x 2 +bx+c=0.

with the extra initial step of “multiply through by 1 a ”, we produce first

x 2 + b a x+ c a =0,

then

( x+ b 2a ) 2 +( c a ( b 2a ) 2 )=0,

then

x+ b 2a =± ( b 2a ) 2 c a = ± b 2 4ac 2a ,

which leads to the familiar quadratic formula.

(c) See Problem 3(c)(iv). AD : CB = DX : BX, so x : 1 = 1 : (x − 1). Hence x2x − 1 = 0. If we use the quadratic formula derived in the answer to part (b) above, and realise that x > 1, then we obtain x= 1+ 5 2 .

Note: The procedure given in (a) dates back to the ancient Babylonians (~ 1700 BC) and later to the ancient Greeks (~ 300 BC). Both cultures worked without algebra. The Babylonians gave their verbal procedures as recipes in words, in the context of particular examples. The Greeks expressed everything geometrically. In modern language, if we denote the unknown numbers by α and β, then

(xα)(xβ)= x 2 (α+β)x+αβ.

Being told the sum and product is therefore the same as being given the coefficients of a quadratic equation, and being asked to find the two roots.

Our method for factorizing a quadratic involves a mental process of ‘inverse arithmetic’, where we juggle possibilities in search of α and β, when all we know are the coefficients (that is, the sum α + β, and the product αβ).

The procedure in (b) also dates back to the ancient Babylonians, and is essentially our process of completing the square. It was given as a procedure, without our algebraic notation. The Babylonians seem not to have been hampered (as the Greeks were) by the fact that it makes no sense to add a length and an area! They worded things geometrically, but seem to have understood that they were really playing numerical games (an idea which European mathematicians found elusive right up to the time of Descartes (1590–1656)).

Similarly, the modern use of symbols - allowing one to represent either positive or negative quantities - was widely resisted right into the nineteenth century. What we would write as a single family of quadratic equations, ax2 + bx + c = 0, had to be split into separate cases where two positive quantities were equated. For example, the groundbreaking book Ars Magna in which Cardano (1501–1576) explained how to solve cubic and quartic equations begins with quadratics - where his procedure distinguishes four different cases: “squares equal to numbers”, “squares equal to things”, “squares and things equal to numbers”, “squares and numbers equal to things”.

122.

(a)(i) α 2 + β 2 = (α+β) 2 2αβ= b 2 2c .

(ii) α 2 β+ β 2 α=αβ(α+β)=c·(b)=bc .

(iii) We rearrange

α 3 + β 3 - 3 α β = ( α + β ) ( α 2 - α β + β 2 ) - 3 α β = ( - b ) ( b 2 - 3 c ) - 3 c = - b 3 + 3 b c - 3 c .

[Alternatively: α 3 + β 3 = (α+β) 3 3αβ(α+β) , etc.]

(b) (i) [Cf 121(a).] (αβ) 2 = (α+β) 2 4αβ .

Therefore

αβ= b 2 4c ifαβ,

and

αβ= b 2 4c ifα<β.

(ii)

α 2 β β 2 α=αβ(αβ)=c b 2 4c ifαβ,

and

αβ= b 2 4c ifα<β.

(iii) α 3 β 3 =(αβ)( α 2 +αβ+ β 2 ) .

Therefore

α 3 β 3 =[ b 2 4c ]( b 2 c )ifαβ,

and

α 3 β 3 =[ b 2 4c ]( b 2 c )ifα<β.

123.

(a)(i) a + b and a+b+ 4ab are both positive. And it is easy to check that they have the same square:

( a + b ) 2 =a+b+2 ab ,

and

( a+b+ 4ab ) 2 =a+b+ 4ab .

Hence

a + b = a+b+ 4ab .

(ii) 5 = 2 + 3, and 24 = 4 × 2 × 3;

Therefore

2+3+ 4×2×3 = 2 + 3

(which is easy to check).

(b) (i) Claim If ab(0) , then

a b = a+b 4ab .

Proof a b and a+b 4ab are both0 (Why?). And it is easy to check that

( a b ) 2 =a+b2 ab ,

and

( a+b 4ab ) 2 =a+b 4ab .           QED

(ii) Simplify 5 16 and 6 20 .

5 = 4 + 1 and 16 = 4 × 4 × 1, so 5 16 = 4 1 =1 .

Actually, there is a simpler solution

5 16 = 54 = 1 =1.

6 = 5 + 1 and 20 = 4 × 5 × 1, so 6 20 = 5 1 = 5 1 .

124.

(a) Let α=1+ 2 . . Then α 2 =3+2 2 . Hence α 2 2α=1 , so α satisfies the quadratic polynomial equation x2 − 2x − 1 = 0.

Note: Observe that the resulting polynomial is equal to

(x(1+ 2 ))(x(1 2 )).

In other words, to rationalize the coefficients, we need a polynomial which has both α=1+ 2 and its “conjugate” 1 2 as roots.

(b) Let α=1+ 3 . Then α 2 =4+2 3 . Hence α 2 2α=2 , so α satisfies the quadratic polynomial equation x 2 2x2=0 .

Note: Observe that the resulting polynomial is equal to

(x(1+ 3 ))(x(1 3 )).

In other words, to rationalize the coefficients, we need a polynomial which has both α=1+ 3 and its “conjugate” 1 3 as roots.

(c) Let α= 2 + 3 . Then α 2 =5+2 6 , so α 2 5=2 6 , and ( α 2 5 ) 2 =24 . Hence α satisfies the quartic polynomial equation x 4 10 x 2 +1=0 .

Note: Observe that the resulting polynomial is equal to

(x( 2 + 3 ))(x( 2 3 ))(x( 2 + 3 ))(x( 2 3 )).

In other words, the roots are: 2 + 3 (as required), and also 2 3 , 2 3 , and 2 + 3 .

(d) Let α= 2 + 1 3 . Then

α 2 = 7 3 +2 2 3 ,

so

( α 2 7 3 ) 2 = 8 3 ,

and α satisfies the quartic polynomial equation

x 4 14 3 · x 2 + 25 9 =0.

Note:

x 4 14 3 · x 2 + 25 9 =( x[ 2 + 1 3 ] )( x[ 2 1 3 ] ) ·( x+[ 2 + 1 3 ] )( x+[ 2 1 3 ] ),

so the roots are:

x= 2 + 1 3 , 2 1 3 , 2 1 3 , 2 + 1 3 .

125. A direct approach can be made to work in both cases (but see the Notes).

(a) Suppose to the contrary that 2 + 3 = p q , for some integers p, q with HCF(p, q) = 1. Then (5+2 6 ) q 2 = p 2 , so 6 is rational, and we can write 6 = r s with HCF(r,s) = 1. But then 6s2 = r2 ; hence r = 2t must be even; so 3s2 = 2t2, but then s must be even - contradicting HCF(r,s) = 1. Hence 2 + 3 cannot be rational.

Note: It is slightly easier to rewrite the initial equation in the form

3 = p q 2

before squaring to get

( p q ) 2 1= 2p q 2 ,

which would imply that 2 is rational.

(b) Suppose to the contrary that 2 + 3 + 5 = p q , for some integers p, q with HCF(p,q) − 1. Then

10+2( 6 + 10 + 15 )= ( p q ) 2 ,

so 6 + 10 + 15 is rational. Squaring 6 + 10 + 15 then gives that

60 + 90 + 150 =5 6 +3 10 +2 15

is rational. Subtracting 2( 6 + 10 + 15 ) then shows that 3 6 + 10 is rational, and we can proceed as in part (a) to obtain a contradiction. Hence 2 + 3 + 5 cannot be rational.

Note: It is simpler to rewrite the original equation in the form

2 + 3 = p q 5

before squaring to obtain

5+2 6 =( 5+ ( p q ) 2 ) 2p q 5 ,

whence 2 6 + 2p q 5 is rational, and we may proceed as in part (a).

126.

(i) We just have to fill in the missing bits of the partial factorisation

x 10 +1=( x 3 1 )( x 7 + x 4 + )+remainder.

To produce the required term x10 we first insert x7. This then creates an unwanted term “− x 7”, so we add +x4 to cancel this out. This in turn creates an unwanted term “− x 4”, so we add +x to cancel this out. Hence the quotient is x7 + x4 + x, and the remainder is “x + 1”:

x 10 +1=( x 3 1 )( x 7 + x 4 +x )+(x+1).

Note: It is worth noting a short cut. The factorised term of the form (x3 − 1) (x7 + …) is equal to zero when x3 = 1.

So one way to get the remainder is to “treat x3 as if it were equal to 1”. Then

x 10 = ( x 3 ) 3 ·x

is just like 1 · x, and x10 + 1 behaves as if it were equal to remainder.

(ii)

x 2013+1 =( x 2 1 )( x 2011 + x 2009 + x 2007 ++x )+(x+1),

so the remainder = x + 1.

Note: If we treat x2 “as if it were equal to 1”, then

x 2013 +1= ( x 2 ) 1006 ·x+1

behaves as if it were equal to 1 · x + 1

(iii) Apply the Euclidean algorithm to m and n in order to write m = qn + r, where 0r<n :

x m = x qn+r = ( x n ) q · x r .

Then

x m + 1 = x q n + r + 1 = ( x n - 1 ) ( x n ( q - 1 ) + r + x n ( q - 2 ) + r + x n ( q - 3 ) + r + + x r ) + x r + 1.

So the remainder is xr + 1.

Note: If we treat xn - 1 as if were 0 - that is, if we treat xn as if it were equal to 1 − then

x m +1= x qn+r +1= ( x n ) q · x r +1

which behaves like 1 q · x r +1 .

127. Suppose x 2013 +1=( x 2 +x+1 )q(x)+r(x) , where deg(r(x)) < 2. Then

( x 2013 + 1 ) ( x - 1 ) = x 2014 - x 2013 + x - 1 = ( x 3 - 1 ) q ( x ) + ( x - 1 ) r ( x ) .

Now

x 2014 x 2013 +x1=( x 3 1 )( x 2011 x 2008 + x 2008 x 2007 + x 2004 x 2003 ++x +2x2

so the remainder r(x) = 2.

Note: If x satisfies x2 + x + 1 = 0, then x3 − 1 = 0 and x1 .

x 2013 +1= ( x 3 ) 671 +1 behaves just like 1671 + 1 = 2, so r(x) = 2.

128.

(a)

(a+bi) 1 = a a 2 + b 2 [ b a 2 + b 2 ]i.

(b)

p(x)= x 2 2ax+( a 2 + b 2 ).

(Suppose that the quadratic equation

p(x)= x 2 +cx+d=0,

with real coefficients c, d, has x = a + ib as a root. Then take the complex conjugate of the equation p(x) = 0 to see that x = aib is also a rooti of

p(x)= x 2 +cx+d=0.

Therefore

p(x)= x 2 +cx+d =(x(a+ib))(x(aib)),

so c = −2a, and d = a2 + b2.)

129. Let the two unknown numbers be α and β. Then 10 = α + β, and 40 = αβ, so α and β are roots of the quadratic equation x2 − 10x + 40 = 0. Hence

α,β= 10± 100160 2 =5± 15 .

130.

(a) Applying a simple rearrangement:

wz=r(cosθ+isinθ)·s(cosϕ+isinϕ) =rs[(cosθ·cosϕsinθsinϕ)+i(cosθ·sinϕ+sinθ·cosϕ)] =rs[cos(θ+ϕ)+isin(θ+ϕ)]

(by the usual addition formula: Problem 35)

(b) By part (a),

(cosθ+isinθ) 2 =cos(2θ)+isin(2θ).

Hence

( cos θ + i sin θ ) 3 = ( cos θ + i sin θ ) 2 ( cos θ + i sin θ ) = [ cos ( 2 θ ) + i sin ( 2 θ ) ] ( cos θ + i sin θ ) = cos ( 3 θ ) + i sin ( 3 θ ) .

Etc.

Note: This should really be presented as a “proof by mathematical induction”, where (having established the initial cases) we “suppose the result holds for powers n = 1, 2, 3,…, k”, and then conclude that

( cos θ + i sin θ ) k + 1 = ( cos θ + i sin θ ) k ( cos θ + i sin θ ) = [ cos ( k θ ) + i sin ( k θ ) ] ( cos θ + i sin θ ) = cos ( ( k + 1 ) θ ) + i sin ( ( k + 1 ) θ ) .

(c) z n = r n (cos(nθ)+isin(nθ)) . Hence if z n =1 , then | z n |= r n =1 So r=1 (since r0 ).

131.

(a) We factorise: x3 − 1 = (x − 1) (x2 + x + 1, so the roots are x = 1; and

x= 1± 14 2 = 1± 3 2 = 1 2 ± 3 2 i

that is, the other two roots are

x=cos( 2π 3 )+isin( 2π 3 )

and

x=cos( 2π 3 )+isin( 2π 3 ).

(b) We factorise:

x 4 1=( x 2 1 )( x 2 +1 )=(x1)(x+1)( x 2 +1 ),

so the roots are x = 1, x = −1, x = i, x = −i.

(c) We factorise:

x 6 - 1 = [ ( x 2 ) 3 - 1 ] = ( x 2 - 1 ) ( x 4 + x 2 + 1 ) = ( x - 1 ) ( x + 1 ) [ ( x 2 ) 2 + x 2 + 1 ] ,

so the roots are

x = 1, x = − 1, and

− four further values of x satisfying x 2 = 1 2 ± 3 2 i : that is,

x=cos( π 3 )+isin( π 3 )= 1 2 + 3 2 i

and

x=cos( 2π 3 )+isin( 2π 3 )= 1 2 + 3 2 i

and

x=cos( π 3 )+isin( π 3 )= 1 2 3 2 i

and

x=cos( 2π 3 )+isin( 2π 3 )= 1 2 3 2 i.

(d) We factorise:

x 8 - 1 = ( x 4 - 1 ) ( x 4 + 1 ) = ( x 2 - 1 ) ( x 2 + 1 ) ( x 2 + 2 x + 1 ) ( x 2 - 2 x + 1 )

so the roots are

x = 1, x = −1;

x = i, x = − i, and

− the roots of x 2 + 2 ·x+1=0 and x 2 2 ·x+1=0 , which happen to be

x=cos( π 4 )+isin( π 4 )= 2 2 + 2 2 i

and

x=cos( π 4 )+isin( π 4 )= 2 2 2 2 i

and

x=cos( 3π 4 )+isin( 3π 4 )= 2 2 + 2 2 i

and

x=cos( 3π 4 )+isin( 3π 4 )= 2 2 2 2 .

132. [In Problem 114 you were left to work out the required factorisation with your bare hands - and a bit of inspired guesswork. The suggested approach here is more systematic.]

The roots of x4 + 1 = 0 are complex numbers whose fourth power is equal to that is,

x=cos( π 4 )+isin( π 4 )= 2 2 + 2 2 i

and

x=cos( π 4 )+isin( π 4 )= 2 2 2 2 i

and

x=cos( 3π 4 )+isin( 3π 4 )= 2 2 + 2 2 i

and

x=cos( 3π 4 ))+isin( 3π 4 )= 2 2 2 2 i.

The first two are complex conjugates and give rise to two linear factors whose product is x 2 + 2 ·x+1 ; the other two are complex conjugates and give rise to two linear factors whose product is x 2 2 ·x+1 . Hence

x 4 +1=( x 2 + 2 ·x+1 )( x 2 2 ·x+1 ).

133.

(a) The roots of x5 − 1 = 0 are precisely the five complex numbers of the form

cos( 2kπ 5 )+isin( 2kπ 5 ),$for$k=0,1,2,3,4:

that is,

x=1 x=cos( 2π 5 )+isin( 2π 5 ) x=cos( 4π 5 )+isin( 4π 5 ) x=cos( 6π 5 )+isin( 6π 5 ) x=cos( 8π 5 )+isin( 8π 5 ).

From Problem 3(c) we know that

cos( 2π 5 )= 5 1 4 =cos( 8π 5 ) sin( 2π 5 )= 10+2 5 4 =sin( 8π 5 ) cos( 4π 5 )=cos( π 5 )= 5 +1 4 =cos( 6π 5 ) sin( 4π 5 )= 102 5 4 =sin( 6π 5 ).

(b) The linear factor is clearly (x − 1). Each quadratic factor arises as the product of two of two conjugate linear factors. We saw in Problem 128(b) that two linear factors corresponding to roots a + bi and a - bi produce the quadratic factor x 2 2ax+( a 2 + b 2 ) . Hence the two quadratic factors are:

x 2 5 1 2 ·x+1,and x 2 + 5 +1 2 ·x+1

(whose product is equal to x4 + x3 + x2 + x + 1).

134.

(a) Put a = 1, y = x + 1: then x3 + 3x2 − 4 = 0 becomes y3 − 3y = 2.

(b) Divide through by a (which we may assume is non zero, since otherwise it would not be a cubic equation), to obtain a cubic equation

x 3 +p x 2 +qx+r=0.

If we now put y=x+ p 3 , then y3 incorporates both the x3 and the x2 terms, and the equation reduces to:

y 3 +[ q3 ( p 3 ) 2 ]y+[ r+2 ( p 3 ) 3 q( p 3 ) ]=0.

135. Given the equation x3 + 3x2 − 4 = 0. Let y = x + 1.

(i) Then y3 = x3 + 3x2 + 3x + 1, so 0 = x3 + 3x2 − 4 = y3 − 3y − 2.

(ii) Set y = u + v and use the fact that

(u+v) 3 = u 3 +3uv(u+v)+ v 3

is an identity, and so holds for all u and v.

(iii) Solve “3uv = 3”, “u3 + v3 = 2”. Substitute v= 1 u from the first equation into the second to get the quadratic equation in ( u 3 ) 2 2 u 3 +1=0 : that is, ( u 3 1 ) 2 =0 , So u3 = 1.

(iv) Hence u = 1 is certainly a solution. (We know there are also complex cube roots of 1; these lead to the other two solutions of the original cubic, but to “solve the equation” it is enough to find one solution.) Hence v = 1, so y = u + v = 2, and x = 2.

136. The Euclidean algorithm for ordinary integers arises by repeating the division algorithm:

given integers m,n(0) , there exists unique integers q, r such that m = qn + r where 0r<n .

Here q is the quotient (the integer part of the division m÷n ), and r is the remainder. If we then replace the initial pair (m, n) by the new pair (n, r) and repeat until we obtain the remainder 0, then the last non-zero remainder is equal to HCF(m,n) (see Problem 6). The same idea also works for polynomials with integer coefficients (see Problem 126).

We start by clarifying what we mean by divisibility for Gaussian integers. Given two Gaussian integers, m = a + bi and n = c + di, we say that n = c + di divides m = a + bi (exactly) precisely

when there exists some other Gaussian integer q = e + fi such that m = qn: that is, a + bi = (e + fi)(c + di).

For example, 2 + 3i divides −4 + 7i because (1 + 2i)(2 + 3i) = −4 + 7i.

If m = a + bi and n = c + di are any old Gaussian integers, then it will not in general be true that “n divides m”, but we can imitate the division algorithm. The important idea here when carrying out particular calculations is to realize that “divide by c + di” is the same as “multiply by cdi c 2 + d 2

  • first carry out the division m÷n= (a+bi)(cdi) c 2 + d 2 ;
  • then take the “nearest” Gaussian integer q = e + fi, and let the difference mqn = r be the remainder.

As for ordinary integers, any Gaussian integer that is a “common factor of m and n” is then automatically a common factor of n and of r = mqn, and conversely. That is, the common factors of m and n are precisely the same as the common factors of n and r. So we can repeat the process replacing m, n by n, r. Provided the “remainder” r is in some sense “smaller” than n, we can continue until we reach a stage where the remainder r = 0 - at which point, the last non-zero remainder is equal to the HCF(m,n) (that is, the Gaussian integer which is the HCF of the two initial Gaussian integers m, n).

The feature of the remainders that gets progressively smaller is their norm (see Problem 25, and Problem 54). As so often, this becomes clearer when we look at an example.

Let us try to find the HCF of the two Gaussian integers m = 14−42i and n = 4 −7i.

  • First do the division m÷n= (1442i)(4+7i) 4 2 + 7 2 = 350 65 70 65 i.
  • What is meant by the nearest Gaussian integer may require an element of judgment; but it is clear that the answer is fairly close to 5 − i = q, where qn = 13 − 39i, with remainder r = mqn = 1 − 3i.
  • Now repeat the process with n, r: n÷r= (47i)(1+3i) 1 2 + 3 2 = 5 2 + 1 2 i.
  • The nearest Gaussian integer is not well-defined, but the answer is fairly close to 3= q . So q r=39i , with remainder r =n q r=1+2i .
  • Now repeat the step with the pair r = 1 − 3i and r =1+2i , to discover that 13i=(1+i)(1+2i) with remainder 0. Hence 1+2i=HCF(1442i,47i).

Note: One way to picture the process is to learn to “see” the Gaussian integers geometrically. Every Gaussian integer (such as a + bi) can be written as an integer combination of the two basic Gaussian integers “1” and “i” - namely

a+bi=a×1+b×i.

Since 1 and i are both of length 1 and perpendicular to each other, this represents the set of all Gaussian integers as the dots in a “square dot lattice” generated by translations in the x- and y- directions of the basic unit square spanned by 0, 1, i, and 1 + i.

Any other given Gaussian integer, such as n = c + di, then generates a “stretched and rotated” square lattice, which consists of all “Gaussian multiples” of c + di - generated by the basic square which is spanned by

0,(c+di)×1,(c+di)×i,and(c+di)×(1+i).

Every Gaussian integer (or rather the point, or dot, which corresponds to it) lies either on the boundary, or inside, one of these larger “stretched and rotated” squares: if the diagonal of one of these larger squares has length 2k, then any other Gaussian integer m = a + bi lies inside one of these larger squares, and so lies within distance k (that is, half a diagonal) of some (Gaussian) multiple qn of n = c + di. And the difference m − qn is precisely the required remainder r.

Extra: We interpret 8 3 = 8 1 3 =2 . Prove that

1 1 1 23 1 7

(where ≈ denotes “approximately equal to”).