
11. Evaluating some pronunciation rules for vowel graphemes

In this chapter I assess the reliability or otherwise of just five rules which purport to help children and others taking their first steps in reading to generate accurate pronunciations of vowel graphemes. For some rules covering the VC(C) part of CVC(C) monosyllables which could well be useful at a slightly later stage, see section A.7 in Appendix A.

11.1 Some history

There is a long tradition of teachers looking for rules for pronouncing vowel graphemes, and almost as long a tradition of finding most of them unhelpful. For example, McLeod (1961, cited in Carney, 1994: 70-74) reported ‘the result of a survey to which 76 teachers in 28 Scottish schools contributed’. From the 59 rules submitted, McLeod set 32 aside ‘since they merely grouped words according to common suffixes’. Of the other 27 only five are reading (grapheme-phoneme) rules, and only three of those concern vowel graphemes – they correspond to sections 11.2, 11.4 and 11.5 below. (The other two reading rules found by McLeod and discussed by Carney concern consonant graphemes: <wr> pronounced /r/ (see section 9.40), and <ch> allegedly pronounced /ʃ/ after <n>, which I have ignored. Except for a couple which cover very few words, all of McLeod’s spelling (phoneme-grapheme) rules are covered, without this being made explicit, in chapters 4 and 6.)

The most famous article in this tradition is Clymer (1963, reprinted 1996). Of the 45 rules he discussed, five deal with syllabification, which is not relevant to this book, and six with word stress – see section A.10 in Appendix A. The other 34 rules all deal with grapheme-phoneme correspondences, 10 with consonant graphemes, 23 with vowel graphemes, and one with a mixture of the two. Many are trivial, or special cases of more general rules; when all of that and duplications are sorted out, the rules for vowel graphemes reduce to the five discussed in sections 11.2-6, of which four are useful and one (the best known) is not.

Johnston (2001) listed several replications of Clymer’s study between 1967 and 1978; most arrived at similar conclusions. However, Gates (1983, 1986) re-formulated some generalisations to make them more reliable (as I have in some cases below), and Burmeister (1968) focused specifically on the best-known rule – see section 11.2. Johnston (2001) herself re-visited several of Clymer’s rules for vowel graphemes without, in my opinion, adding anything of value.

11.2 ‘When there are two vowels side by side, the long sound of the first one is heard and the second is usually silent.’

Often popularly stated as: ‘When two vowels go walking, the first does the talking.’

This rule has long been popular in North America, despite having been blown to pieces by Clymer (1963/1996). It was meant to tell children which of two adjacent vowel letters indicates the pronunciation of a digraph, but it is unclear, or underspecified, in seven respects:

It does not say (presumably assumes teachers and children know) which letters are ‘vowels’, but it seems clear that <a, e, i, o, u> are the intended vowel letters;
It ignores the consonantal pronunciations of <i, u> when they precede other vowel letters, as in onion, language (see sections 10.22 and 10.36), presumably because these are not relevant to initial instruction;
It does not say (presumably assumes teachers and children know) what the ‘long sounds’ (or ‘talkings’) of these vowel letters are, but again it seems clear that the ‘letter-name’ sounds /eɪ, iː, aɪ, əʊ, juː/ are meant;
It is not clear why it says the second vowel letter is ‘usually’ silent – perhaps to allow for words like dais, zoology with two vowel letters which normally form a digraph but in particular words do not;
It does not say whether sequences of two identical adjacent letters are to count as digraphs for this purpose, but I think the rule is meant to apply only to sequences of two different letters, so in what follows I have not looked at <aa, ee, ii (which never occurs as a digraph anyway), oo, uu (which only occurs in muumuu, vacuum)>, except for word-final <ee>;
It doesn’t say whether <w, y> are to count as vowel letters for this purpose. In her re-evaluation of the rule Johnston (2001) decided to include <aw, ew, ow, ay, ey, oy> (<iw, uw, iy, uy> never occur as digraphs), so I have followed this;
It takes no account of <ye>, the only vowel digraph with <y> as first letter, but since this occurs in only seven words, it can be ignored.

There are two other possible sequences that never occur as digraphs: <iu, uo>. Assuming the 12 exclusions just mentioned (<aa, ee, ii, iu, iw, iy, oo, uo, uu, uw, uy, ye>), there are 23 relevant vowel digraphs consisting of adjacent vowel letters or a vowel letter plus <w, y>.

There is one set of words for which this rule holds true with few exceptions, namely monosyllables ending in <ae, ee, ie, oe, ue>, almost all of which (see Table 10.3) are pronounced with the letter-name sounds /eɪ, iː, aɪ, əʊ, juː/. Unfortunately (as Table 10.3 also shows), the total number of relevant words in the entire language is about 54.

Within the set of 23 relevant vowel digraphs, 12 belong to the main system and 11 are Oddities; they are all shown in Table 11.1 with their predominant pronunciations (except for <ae, ie, oe, ue> in word-final position in monosyllables), and relevant percentages of occurrence of those pronunciations derived or deduced from chapter 10.

From Table 11.1 it is clear that the rule only works for <ay, ea> and possibly <ai, ue> among main-system digraphs, plus four or five of the Oddities, a very poor result.

Because so few digraphs actually conform to the rule, Burmeister (1968) advocated teaching digraphs in groups, of which those that do conform would be one – but her other groups were entirely artificial because they supported no generalisations at all, and therefore failed to set the digraphs which conform to the rule sufficiently apart.

Verdict: This rule should be consigned to oblivion, and digraphs should be taught individually.

Table 11.1: ‘When two vowels go walking, the first does the talking’

	Digraph	Predominant pronunciation(s)	Conforms to the rule?
Main system	ai	/eɪ/ 43% /e/ 46%	No, unless said is excluded
	au	/ɔː/ 46% /ɒ/ 43%	No
	aw	/ɔː/ 100%	No
	ay	/eɪ/ 100%	Yes
	ea	/iː/ 73%	Yes
	ew	/juː/ 84%	No
	ie *	/iː/ 73%	No
	oi	/ɔɪ/ 100%	No
	ou	/aʊ/ 48% /əʊ/ 1%	No
	ow	/aʊ/ 45% /əʊ/ 44%	No
	oy	/ɔɪ/ 100%	No
	ue *	/juː/ 59% /uː/ 41%	?
Oddities	ae *	/iː/ 62%	No
	ao	/eɪ/ 69%	Yes
	ei	/iː/ 69%	Yes
	eo	/ə/ 70% /iː/ only in people **	No
	eu	/uː/ 58%	No
	ey	/iː/ 76%	Yes
	ia	/ə/ 57% /aɪ/ only in diamond	No
	io	/ə/ 100%	No
	oa	/əʊ/ 96%	Yes
	oe *	/iː/ 65%	?
	ua	/ə/ 100%	No
	ui	/uː/ 73%	No

* For monosyllables ending in these digraphs, the rule is largely true.

** and two other very rare words – see section 10.12.
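The test behind the last column of Table 11.1 can be sketched mechanically. The following is only a minimal illustration in Python: the names `LETTER_NAME`, `PREDOMINANT` and `conforms` are hypothetical, only a sample of rows with a clear Yes/No verdict is included, and ‘conforms’ is assumed to mean simply that the predominant pronunciation equals the letter-name sound of the first letter.

```python
# Letter-name ('long') sounds of the five vowel letters, as given in section 11.2.
LETTER_NAME = {"a": "eɪ", "e": "iː", "i": "aɪ", "o": "əʊ", "u": "juː"}

# A sample of rows copied from Table 11.1: digraph -> predominant pronunciation.
# Assumption: rows with a qualified or '?' verdict are left out of this sketch.
PREDOMINANT = {
    "ay": "eɪ", "ea": "iː", "oa": "əʊ",                # table verdict: Yes
    "aw": "ɔː", "ew": "juː", "ie": "iː", "oi": "ɔɪ",   # table verdict: No
}

def conforms(digraph: str) -> bool:
    """True if the predominant sound is the letter-name sound of the first letter."""
    return PREDOMINANT[digraph] == LETTER_NAME[digraph[0]]

conforming = sorted(d for d in PREDOMINANT if conforms(d))
print(conforming)
```

On this sample only <ay, ea, oa> pass, which matches the table's verdicts for those rows.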

11.3 ‘When a written word has only one vowel letter, and that letter is followed by at least one consonant letter other than <r, w, y>, the vowel has its usual short pronunciation.’

A better-known version of the rule is ‘When a word has only one vowel and that vowel is in the middle, it is usually short’, but my formulation (above) is more accurate, partly because <a, o, u> have alternative pronunciations. Even though <r, w, y> in these circumstances after a vowel letter always form a vowel digraph with the vowel letter, for teaching purposes it would clearly be better to treat them here as consonant letters. The rule applies mainly or entirely to closed monosyllables, and regardless of the number of consonant letters following the vowel letter.
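As a rough illustration, the rule as formulated in the heading can be written as a predicate over spellings. This is a minimal sketch under stated assumptions, not a full implementation: the function name is hypothetical, <y> is treated as the vowel letter only when the word contains none of <a, e, i, o, u>, and the short values are the principal short pronunciations for RP.

```python
VOWELS = "aeiou"
BLOCKED = "rwy"  # following letters that the rule excludes
# Principal short pronunciations (RP) of the vowel letters.
SHORT = {"a": "æ", "e": "e", "i": "ɪ", "o": "ɒ", "u": "ʌ", "y": "ɪ"}

def short_vowel_prediction(word):
    """Return the predicted short vowel, or None if the rule does not apply."""
    word = word.lower()
    positions = [i for i, ch in enumerate(word) if ch in VOWELS]
    if not positions:
        # Assumption: with no <a, e, i, o, u>, a non-initial <y> is the vowel letter.
        positions = [i for i, ch in enumerate(word) if ch == "y" and i > 0]
    if len(positions) != 1:
        return None  # the rule needs exactly one vowel letter
    i = positions[0]
    if i + 1 >= len(word):
        return None  # the vowel letter is word-final
    following = word[i + 1]
    if not following.isalpha() or following in VOWELS or following in BLOCKED:
        return None  # no consonant letter follows, or it is <r, w, y>
    return SHORT[word[i]]

for w in ["cat", "stamp", "gym", "car", "day", "cake"]:
    print(w, short_vowel_prediction(w))
```

The sketch returns a prediction, not a guaranteed pronunciation: for an exception such as child it still predicts /ɪ/.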

(English is rich in monosyllables: many years ago three American nerds compiled a list of 9,123 (Moser et al., 1957), and there was once a competition to find or devise the longest one (Gardner, 1979), defined in terms of letters rather than phonemes. The competition was won by an American poet named William Harman, with broughammed (‘travelled by brougham’), which can be pronounced /bruːmd/ in General American but would have two syllables, /ˈbruːwəmd/, in RP.)

Most monosyllabic words in English are phonologically closed (that is, they end in one or more consonant phonemes). Table 11.2 sets out the facts on my version of the rule, at least as far as the RP accent is concerned (it seems clear that <o> has no ‘short’ pronunciation at all in the General American accent – see Cruttenden (2014: 127) and Carney (1994: 59)).

Table 11.2: Pronunciations of vowel letters in words with a single, non-final vowel letter followed by at least one consonant letter other than <r, w, y>.

Vowel letter	Principal short pronunciation	Other short pronunciations	Long pronunciations
a	/æ/	/ɒ/ in 25 words, e.g. squash, was	/ɔː/ in 26 words, e.g. ball, salt, talk; /ɑː/ in 18 words, e.g. calm, half; /eɪ/ only in bass (the musical term)
e	/e/	-	/iː/ only in retch pronounced /riːʧ/
i	/ɪ/	-	/aɪ/ in 18 words, e.g. child, find, pint, sign; /iː/ only in chic
o	/ɒ/	/ʌ/ in 8 words, e.g. son; /ʊ/ only in wolf	/əʊ/ in 34 words, e.g. both, colt, comb, don’t, gross, post, roll, told; /uː/ only in tomb, whom, womb
u	/ʌ/	/ʊ/ in 14 words, e.g. bull, push	-
y	/ɪ/	-	-

Thus the total number of exceptions, even counting both categories, is no more than 150, some of which beginner readers are unlikely to encounter, and there are undoubtedly thousands of words which obey the rule. I therefore consider it to have high reliability, probably over 90%, and well worth teaching.
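The arithmetic behind these figures can be checked directly. The sketch below copies the exception counts from Table 11.2; the variable names are illustrative, and the final step is just the algebra of the 90% threshold, not a count of conforming words (which the table does not give).

```python
# Exception counts copied from Table 11.2, per vowel letter:
# (other short pronunciations, long pronunciations)
EXCEPTIONS = {
    "a": (25, 26 + 18 + 1),   # /ɒ/; /ɔː/, /ɑː/, bass
    "e": (0, 1),              # retch pronounced /riːʧ/
    "i": (0, 18 + 1),         # /aɪ/, chic
    "o": (8 + 1, 34 + 3),     # /ʌ/, wolf; /əʊ/, tomb/whom/womb
    "u": (14, 0),             # /ʊ/
    "y": (0, 0),
}

per_letter = {letter: short + long_ for letter, (short, long_) in EXCEPTIONS.items()}
total_exceptions = sum(per_letter.values())

# reliability = conforming / (conforming + exceptions) > 0.9
# is equivalent to conforming > 9 * exceptions.
min_conforming = 9 * total_exceptions

print(per_letter)
print(total_exceptions, min_conforming)
```

With 150 exceptions, reliability exceeds 90% once more than 1,350 words conform, which is consistent with the estimate of ‘thousands’ above.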

11.4 ‘When a final <e> is preceded by a consonant letter other than <r, w, x, y> and that consonant is preceded by a single vowel letter, the final <e> is silent and the other vowel letter has its letter-name (‘long’) sound.’

This is my attempt to formulate a rule for ‘magic <e>’/split digraphs that is more accurate than some current formulations, e.g.

‘The final <e> in a word is not pronounced’.
‘<e> at the end of a word makes the preceding vowel in the word long’.

Table 11.3 shows the relevant data.

Table 11.3: Reliability of rules for split digraphs or ‘magic <e>’ where the intervening letter is not <r, w, x, y>.

Split digraph	Predominant pronunciation	Alternative long pronunciation	Major exceptions	Words with ‘pronounced’ final <e>
a.e	/eɪ/ 68%	/ɑː/ 32%, e.g. mirage	Lots of words with <-age, -ate> pronounced /ɪʤ, ət/, e.g. village, accurate	agape (‘love feast’), agave, biennale, blase, cafe, canape, curare, finale, glace, kamikaze, karate, macrame, pate (‘paste’), sesame, tamale
e.e	/iː/ 100%	-	-	hebe, machete, meze, naivete, protege, stele, ukulele
i.e	/aɪ/ 97%	/iː/ 3%, e.g. police	bodice, give, live (verb), lots of longer words with <-ive> pronounced /ɪv/, e.g. massive; various words with <-ine, -ite> pronounced /ɪn, ɪt/, e.g. examine, definite	aborigine, anime, facsimile, (bona) fide, recipe, simile
o.e	/əʊ/ 95%	-	compote, gone, scone, shone with /ɒ/, above, become, come, done, dove, glove, love, none, shove, some with /ʌ/, welcome and adjectives in <-some> with /ǝ/	abalone, adobe, anemone, coyote, epitome, extempore, expose (‘report of scandal’), furore pronounced /fjʊəˈrɔːreɪ/ (also pronounced /ˈfjʊərɔː/), guacamole, hyperbole, sylloge
u.e	/juː/ 89%	/uː/ 11%, e.g. rude	-	resume (‘c.v.’)
y.e	/aɪ/ 100%	-	-	-

All the rules for split digraphs are predicated on the word-final <e> being ‘silent’, so the first necessity is to exclude polysyllables in which it is ‘pronounced’. Table 11.3 shows that there are only about 40 words in the language in which a final <e> separated from a single preceding vowel letter by one consonant letter is ‘pronounced’, and three of them (curare, extempore, and furore pronounced /fjʊəˈrɔːreɪ/) have the banned letter <r> intervening. Of these words, only cafe is at all frequent.

The percentages shown were calculated without taking into account either the words with ‘pronounced’ final <e> or the major exception categories. Including them would reduce the strength of the main rules for <a.e, i.e, o.e>, but most of the words in those categories would again not feature in beginner readers’ books, so on the whole the ‘magic <e>’ rules hold good and are worth teaching. Most learners will, I think, acquire the /uː/ pronunciation of <u.e> without even noticing that they have, or that /uː/ is different from /juː/, and also learn without noticing it that <y.e> has the same pronunciation as <i.e> (most words with <y.e> are rare, so this digraph should present no problem for reading when encountered).

As it happens, the inclusion of <x> among the letters banned from the mid position in this rule excludes just three words in the entire language: annexe, axe, deluxe, so the rule could well be taught without <x>, and would then be parallel to the ‘short vowel’ rule in section 11.3. If consonant digraphs were admitted to the dot position for this analysis, other rare words would join the list with ‘pronounced’ final <e>, e.g. antistrophe, oche, strophe, synecdoche.

For more on split digraphs and their definition, see section A.6 in Appendix A.
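The spelling pattern in the heading of this section can be expressed as a regular expression. This is a minimal sketch, assuming Python's standard `re` module; the names `SPLIT_DIGRAPH` and `magic_e_prediction` are hypothetical, and the pattern looks only at letters, so it cannot recognise words in which the final <e> is ‘pronounced’ or in which <u> after <q> is consonantal (as in quite).

```python
import re

# Letter-name ('long') sounds; <y.e> is pronounced like <i.e> (Table 11.3).
LETTER_NAME = {"a": "eɪ", "e": "iː", "i": "aɪ", "o": "əʊ", "u": "juː", "y": "aɪ"}

# A single vowel letter (not preceded by another vowel letter), then one
# consonant letter other than <r, w, x, y>, then word-final <e>.
SPLIT_DIGRAPH = re.compile(r"(?<![aeiou])([aeiouy])[bcdfghjklmnpqstvz]e$")

def magic_e_prediction(word):
    """Return the predicted letter-name sound, or None if the pattern is absent."""
    match = SPLIT_DIGRAPH.search(word.lower())
    return LETTER_NAME[match.group(1)] if match else None

for w in ["cake", "theme", "home", "type", "care", "axe", "cheese"]:
    print(w, magic_e_prediction(w))
```

Dropping <x> from the excluded letters, as suggested above, would only require adding `x` to the consonant class.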

11.5 ‘When <a> follows <qu, w, wh> and is not followed by <r>, or by any consonant letter plus <e>, it is pronounced /ɒ/.’

This rule is usually stated without the clause ‘and is not followed by <r>, or by any consonant letter plus <e>’, but this is essential to rule out the <ar> digraph and cases where ‘magic <e>’ would override (e.g. quake, wade, whale), and my version is therefore more accurate. There are 21 relevant words with <qua>, 42 with <wa>, and only what with <wha>. Of the 64 words, the only exceptions are walk, wall, water, all pronounced with /ɔː/, so this rule is highly reliable (95% if each word is given equal weighting). There are also seven words in which <a> is followed by <r(r)> but those letters do not form a di/trigraph and the <a> is pronounced /ɒ/: quarantine, quarrel, quarry, warrant, warren, warrigal, warrior, but these need to be taught separately because in the great majority of words in which <ar> follows <qu, w, wh> it is pronounced either /ɔː/ (e.g. quart, ward, wharf) or /ə/ (e.g. steward, towards).
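The conditions of this rule, and the reliability figure, can be sketched as follows. This is an illustration under assumptions rather than a definitive implementation: the names are hypothetical, and ‘any consonant letter plus <e>’ is interpreted as a consonant letter plus word-final <e> (‘magic <e>’), since water is counted above among the relevant words.

```python
import re

# <a> after <qu, w, wh>, not followed by <r>, and not followed by a consonant
# letter plus word-final <e> (assumption: the <e> clause means 'magic <e>').
WA_RULE = re.compile(r"(?:qu|wh|w)a(?!r)(?![b-df-hj-np-tv-z]e$)")

def predicts_short_o(word):
    """True if the rule predicts /ɒ/ for the <a> in this spelling."""
    return WA_RULE.search(word.lower()) is not None

for w in ["wash", "what", "squash", "quake", "whale", "ward", "walk"]:
    print(w, predicts_short_o(w))

# Reliability from the counts in the text: 21 + 42 + 1 relevant words,
# with three exceptions (walk, wall, water).
relevant = 21 + 42 + 1
exceptions = 3
reliability_pct = round(100 * (relevant - exceptions) / relevant)
print(relevant, reliability_pct)
```

The predicate reports what the rule predicts, so it returns True for the exceptions walk, wall and water, and False for quarrel and the other <ar(r)> words that have to be taught separately.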

11.6 ‘When <y> is the final letter in a word, it always has a vowel sound, either alone or in combination with a preceding <a, e, o>.’

Given that word-final <y> is never a consonant letter, this rule is 100% reliable. Formulated like this, it may seem entirely obvious to proficient readers, but may be helpful to learners. <ay, ey, oy> are also covered in section 11.2.