
Minimal pairs for English RP: lists by John Higgins


Vowels and diphthongs

i ɪ e æ ɑ ɒ ɔ ʊ u ʌ 3 ə eɪ aɪ ɔɪ əʊ aʊ ɪə eə ʊə null cons
i * 470 338 394 316 362 489 82 381 301 309 66 561 532 98 527 157 133 144 38 170 64
ɪ 4 * 446 639 228 438 326 64 235 492 194 365 368 296 62 380 98 24 29 7 1348 978
e 4 5 * 305 148 249 238 50 134 250 153 36 281 241 59 239 118 33 30 11
æ 2 3 5 * 180 438 202 58 172 436 173 11 284 275 33 269 118 24 33 9
ɑ 3 2 3 4 * 184 219 39 92 177 156 11 209 146 48 201 64 62 73 31 61
ɒ 2 3 3 4 4 * 174 72 150 322 161 3 231 190 24 231 100 27 19 8 46
ɔ 2 1 1 2 4 4 * 66 186 192 235 21 320 272 93 287 131 127 168 36 88
ʊ 1 3 3 3 2 5 4 * 18 20 46 1 66 50 3 29 14 6 8 3
u 2 1 1 1 2 3 4 4 * 134 85 15 280 260 50 268 118 53 66 19
ʌ 2 4 4 5 4 4 3 4 2 * 134 4 234 180 30 205 92 19 24 8
3 4 3 3 4 5 3 4 3 3 4 * 8 214 175 35 179 75 45 54 14
ə 3 5 4 5 3 4 2 4 2 5 5 * 90 22 4 67 3 1 8 4
eɪ 4 4 5 4 3 1 1 1 1 3 4 4 * 405 108 417 187 77 82 22
aɪ 3 2 3 4 4 3 3 2 2 4 4 4 4 * 59 341 192 81 96 19
ɔɪ 3 1 1 2 3 4 5 4 4 3 3 3 3 4 * 92 39 29 18 11
əʊ 2 1 2 2 3 4 4 4 3 3 4 4 3 3 3 * 134 77 96 20
aʊ 1 1 2 3 4 4 4 3 3 4 3 3 2 4 3 5 * 41 33 12
ɪə 4 5 4 3 2 1 1 1 1 2 3 3 4 2 1 2 1 * 100 27
eə 3 4 5 4 3 2 1 1 1 3 4 4 5 3 1 3 2 5 * 26
ʊə 1 1 1 1 3 4 4 5 5 3 3 2 2 1 4 3 3 2 2 *
i ɪ e æ ɑ ɒ ɔ ʊ u ʌ 3 ə eɪ aɪ ɔɪ əʊ aʊ ɪə eə ʊə null cons

In the table of vowels each cell links to a list of minimal pairs involving the phonemes in the relevant column and row. The numbers in the north-eastern half of the table are the actual numbers of pairs identified. The numbers in the south-western half give an indication of the importance or difficulty of the pair calculated as follows: from a maximum of 6, deduct 1 for a difference between vowel and diphthong, 1 for a difference of length within monophthongs, 1 for a difference of direction within diphthongs, 1 for a difference in lip-rounding, and then for the distance apart of the starting tongue position deduct 1 for a distance of up to one cardinal vowel, 2 for up to two cardinal vowels, 3 for any wider distance. Thus a score of 4 or 5 would show two very similar sounds, a contrast likely to be a cause of difficulty for some or all learners, while a score of 1 or 2 would be unlikely to cause problems.
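
The scoring rule just described can be written out as a small function. This is only an illustrative sketch, not the procedure actually used to fill in the table: the name vowel_score, its parameters, the measurement of tongue distance in "cardinal vowel steps", and the decision to deduct nothing when the starting positions coincide are all assumptions, and the caller has to supply the phonetic features.

```python
def vowel_score(diphthong_a, diphthong_b, same_length=True,
                same_direction=True, same_rounding=True, tongue_distance=0.0):
    """Similarity score (maximum 6) for a vowel contrast. Hypothetical helper.

    diphthong_a / diphthong_b: True if the sound is a diphthong.
    tongue_distance: distance between starting tongue positions,
    in cardinal-vowel steps (an assumed unit).
    """
    score = 6
    if diphthong_a != diphthong_b:
        score -= 1                      # monophthong versus diphthong
    elif not diphthong_a and not same_length:
        score -= 1                      # length difference, both monophthongs
    elif diphthong_a and not same_direction:
        score -= 1                      # direction difference, both diphthongs
    if not same_rounding:
        score -= 1                      # lip-rounding difference
    if tongue_distance > 2:
        score -= 3
    elif tongue_distance > 1:
        score -= 2
    elif tongue_distance > 0:           # assumption: zero distance costs nothing
        score -= 1
    return score
```

For example, if /i/ and /ɪ/ are treated as two monophthongs that differ in length and start within one cardinal vowel of each other, vowel_score(False, False, same_length=False, tongue_distance=1) returns 4, the value shown in the table for that pair.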

Consonants

p b t d k g f v Ɵ ð s z ʃ ʒ h m n ŋ l r j w ʧ ʤ null vowel
p * 612 882 524 1009 401 570 227 129 73 614 222 296 3 378 640 563 84 684 374 87 433 296 197 916 139
b 5 * 518 446 571 415 525 144 72 46 449 87 240 2 330 476 321 38 417 387 96 284 226 213 995
t 5 4 * 866 822 396 469 298 128 78 1352 446 276 9 274 559 687 109 575 318 46 216 238 248
d 4 5 5 * 466 250 332 285 126 58 481 2660 242 9 185 414 484 1619 507 440 39 142 206 208
k 4 3 4 4 * 341 464 176 112 42 472 214 213 4 272 413 460 87 470 229 50 193 211 155
g 3 4 4 5 5 * 196 79 52 18 201 54 145 1 125 239 240 61 207 155 26 109 97 108
f 4 2 2 2 2 1 * 130 50 35 371 73 137 2 185 312 236 22 272 218 49 178 156 171
v 3 3 1 3 1 2 5 * 25 30 204 148 49 2 66 187 222 83 233 112 7 52 63 93
Ɵ 3 2 4 3 3 2 5 4 * 9 91 59 41 2 36 60 67 10 65 37 10 42 42 36
ð 2 3 3 4 1 2 4 5 5 * 28 34 18 2 15 63 53 7 45 18 3 19 22 16
s 3 2 4 3 2 1 5 4 5 4 * 232 220 9 217 361 384 51 467 299 42 169 182 184
z 2 3 3 4 1 2 4 5 4 5 5 * 65 11 24 159 317 1135 253 50 8 17 102 94
ʃ 2 1 4 2 2 1 4 3 5 4 5 5 * 5 129 179 148 83 180 155 34 105 115 103
ʒ 1 2 3 4 2 3 3 4 4 5 4 - - * none 9 6 none 6 none none 1 3 2
h 2 1 2 - - - - - - - - - - - * 226 139 none 216 225 70 191 95 101
m 3 4 2 - - - - - - - - - - - - * 359 59 513 259 52 150 172 175
n 2 2 3 - - - - - - - - - - - - - * 78 681 239 35 142 151 147
ŋ 1 2 2 - - - - - - - - - - - - - - * 58 2 none none 21 76
l 2 2 3 - - - - - - - - - - - - - - - * 589 68 204 182 202
r 2 2 2 - - - - - - - - - - - - - - - - * 58 213 120 151
j 1 2 2 - - - - - - - - - - - - - - - - 4 * 48 28 45
w 3 4 2 - - - - - - - - - - - - - - - - 4 4 * 61 93
ʧ 3 2 5 - - - - - - - - - - - - - - - - 2 2 1 * 105
ʤ 2 3 4 5 2 3 3 4 3 4 3 4 4 5 2 3 4 3 4 4 3 2 5 *
p b t d k g f v Ɵ ð s z ʃ ʒ h m n ŋ l r j w ʧ ʤ null vowel

In the table of consonants each cell links to a list of minimal pairs involving the phonemes in the relevant column and row. The numbers in the north-eastern half of the table are the actual numbers of pairs identified. The numbers in the south-western half give an indication of the importance or difficulty of the pair calculated as follows: from a maximum of 6, deduct 1 for a difference of voicing, 1 or 2 for a difference of manner of articulation, 1 or 2 for the distance apart of the contact point. Thus a score of 4 or 5 would show two very similar sounds, a contrast likely to be a cause of difficulty for some or all learners, while a score of 1 or 2 would be unlikely to cause problems.
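
As a rough illustration of the consonant rule, the deductions can be added up as follows. The function name consonant_score and the 0/1/2 step scales are assumptions; the text does not say which pairs of manners or contact points count as one step and which as two, so the caller must decide that.

```python
def consonant_score(same_voicing, manner_steps, place_steps):
    """Similarity score (maximum 6) for a consonant contrast. Hypothetical helper.

    manner_steps, place_steps: 0 (same), 1 or 2, i.e. how far apart the two
    sounds are in manner of articulation and in contact point (assumed scales).
    """
    if manner_steps not in (0, 1, 2) or place_steps not in (0, 1, 2):
        raise ValueError("manner_steps and place_steps must be 0, 1 or 2")
    score = 6
    if not same_voicing:
        score -= 1                      # voicing difference
    return score - manner_steps - place_steps
```

With /p/ and /b/, which differ only in voicing, consonant_score(False, 0, 0) gives 5, matching the figure in the table.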

The phonetic transcription key is given under Keywords below.


What are minimal pairs?

Minimal pairs are pairs of words whose pronunciation differs at only one segment, such as sheep and ship or lice and rice. They are often used in listening tests and pronunciation exercises. Theoretically it is the existence of minimal pairs which enables linguists to build up the phoneme inventory for a language or dialect, though the process is not without difficulty.
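
The definition can be stated as a simple test. The sketch below is an illustration only: the function name is_minimal_pair is made up, and words are represented as lists of symbols so that a diphthong counts as one segment. It does not cover the "null" contrasts in the tables, where a sound is set against its absence.

```python
def is_minimal_pair(a, b):
    """True if two segment sequences have the same length and
    differ in exactly one position."""
    if len(a) != len(b):
        return False
    return sum(x != y for x, y in zip(a, b)) == 1

sheep, ship = ["ʃ", "i", "p"], ["ʃ", "ɪ", "p"]
lice, rice = ["l", "aɪ", "s"], ["r", "aɪ", "s"]

print(is_minimal_pair(sheep, ship))   # True
print(is_minimal_pair(lice, rice))    # True
print(is_minimal_pair(sheep, rice))   # False
```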

Each cell in the tables above is a link to a list of minimal pairs derived from a dictionary. Use the tables of vowels and consonants to retrieve the relevant lists. Some of the consonant lists are still in a rather raw state, while all the vowel lists and some of the consonant lists have been edited and commented on. The first versions of the lists included only one pair for each pronunciation, such as heal/hole. Newly revised versions are being added which include all the pairs which arise when one or both members of the pair have a homophone, so giving a better indication of how much confusion a given pair may cause. In the case of heal/hole the new version of the list would include all of the following:

Please note that, as you move the mouse over a link, the name of the relevant document should appear at the bottom of the browser window and this gives a further indication of which sound contrast is featured in the list.

How minimal is minimal?

Although the normal definition of a minimal pair specifies that the words differ in one segment, it allows that segment to be widely different in terms of articulation. Another tighter definition of a minimal pair might be words which differ by only one feature. In that case the ideal minimal pair might be cheer versus jeer which differ only in voicing. These two words also belong to the same part of speech and so have the same inflections. Moreover they belong in the same domain of discourse, and are therefore highly confusable. If you were to overhear a fragment of conversation which included:

You should have heard them ??eering at the end of the game.
you would have to perceive the voicing in order to know exactly what was meant. Most minimal pairs are considerably more distinct than that one, and in many cases would cause no difficulty to any speaker. However there is a kind of delight in recognising some of the pairs, which I feel may be related to the enjoyment we feel when we come across an outrageous rhyme in a song or piece of verse.

They can also be the source of genuine confusions and disputes. A story which appeared in newspapers in April 1998 suggested that the urn known as The Ashes and presented to the winning team in the England versus Australia cricket series contains not the remains of a bail, as the traditional account stated, but of a veil. Another story, involving not strictly a minimal pair but a highly confusable pair of words, appeared in January 1997. It told how a Japanese tourist with a ticket for Turkey had gone to Paddington station in London and asked for directions. She was put on the train to Torquay (a seaside town in South West England). There are all sorts of confusable sentences which can easily lead to 'slips of the ear' among English speakers, such as "the Dutch are suspicious" being misheard as "the Duchess is vicious". The only significant difference in the sound of those two sentences is /p/ versus /v/ and this is one which is notoriously difficult for foreign learners and can lead to unexpected problems. On a radio discussion on 13 March 2007 concerning the portrait of Adam Smith on the newly issued £20 note, a speaker with a strong Indian accent said what was first understood as "Adam Smith made a stink about society in a new way." What he really said was "Adam Smith made us think about society in a new way." Another slip of the ear I encountered recently was postcard for coastguard; although the initial /p/ versus /k/ distinction is a fairly strong one, the /k/ versus /g/ distinction in the middle of the word is neutralised by the presence of the /s/. (A contributory factor is that coastguards are often located in picturesque seaside towns, from which it would be reasonable to send a postcard.) A similar misunderstanding arose in conversation between raingear and reindeer.
On a recent radio programme a presenter with a noticeable Irish accent was heard to be announcing an interview with "the born doctor Sir Roger Moore", though what he meant was "the Bond actor Sir Roger Moore".

Homophones engender many spelling mistakes; probably the commonest of all is to write there instead of their or vice versa. Sometimes a set of near homophones leads to a spelling error, as in a notice seen recently: RUGBY, STIRLING VERSES LIVERPOOL. In the same way minimal pairs engender many spelling errors in the writing of foreign learners. Among those I have seen recently are "a reach man" (rich man) and "a brought road" (broad road). An Arabic-speaking student once wrote an essay for me about a visit to London during which he had seen "the Pig Pen Watch" (Big Ben). Even national newspapers are not immune. The Times of Tuesday, September 5th, 2000, printed the following:

Apology
Readers will have been surprised yesterday to see the famous Cold War phrase "mutually assured destruction" (MAD) rendered as "neutrally assured destruction" (NAD). What began as a copytaking error somehow survived into this column. To anyone who was confused as well as to those who were not, we offer our apologies.

Source of the lists: Roger Mitton and the Advanced Learner's Dictionary

Hal Gleason (1955, p. 19), writing about minimal pairs before the era of widespread computing, said "Presumably by diligent search through the total vocabulary, minimal pairs might be found for all English consonant phonemes. But there is no guarantee that all will be found, and in any case it is hardly a feasible procedure."

I have not tried to search the total vocabulary, but I have tried to search a vocabulary which includes most of the words available in non-specialist contexts to everyday users of English. In putting together these lists I have used Roger Mitton's machine-readable version of the 1974 edition of the Advanced Learner's Dictionary, incorporating Mitton's 1990 additions to the word list (see Mitton 1996). The minimal pair lists have been prepared from the dictionary by means of a program which sorts the pronunciation field, identifies identical pairs (homophones), substitutes dummy characters for the symbols of the minimal pair, and then flags all the additional homophone pairs created by the process. This generates (fairly) complete lists of minimal pairs, though a certain amount of rather tedious post-editing is needed.
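
The dummy-character idea can be sketched as follows. This is not the program described above, only a minimal illustration of the technique: the function name minimal_pairs, the toy lexicon and the "#" dummy symbol are all hypothetical, and the sketch substitutes the dummy at one position at a time, which is an assumption about how the substitution might be done.

```python
from collections import defaultdict
from itertools import combinations

def minimal_pairs(lexicon, sound_a, sound_b, dummy="#"):
    """Find words that become 'homophones' once sound_a or sound_b
    is replaced by a dummy symbol at a single position.

    lexicon maps a spelling to a tuple of phoneme symbols.
    Returns sorted (word_with_sound_a, word_with_sound_b) tuples.
    """
    groups = defaultdict(list)
    for word, phones in lexicon.items():
        for i, p in enumerate(phones):
            if p in (sound_a, sound_b):
                key = phones[:i] + (dummy,) + phones[i + 1:]
                groups[key].append((word, p))
    pairs = set()
    for members in groups.values():
        for (w1, p1), (w2, p2) in combinations(members, 2):
            if p1 == sound_a and p2 == sound_b:
                pairs.add((w1, w2))
            elif p1 == sound_b and p2 == sound_a:
                pairs.add((w2, w1))
    return sorted(pairs)

# Toy data in the transcription used on this page.
lexicon = {
    "sheep": ("ʃ", "i", "p"),
    "ship": ("ʃ", "ɪ", "p"),
    "heal": ("h", "i", "l"),
    "heel": ("h", "i", "l"),
    "hill": ("h", "ɪ", "l"),
    "pit": ("p", "ɪ", "t"),
}

print(minimal_pairs(lexicon, "i", "ɪ"))
# [('heal', 'hill'), ('heel', 'hill'), ('sheep', 'ship')]
```

Genuine homophones such as heal and heel share a key but carry the same sound, so they are not reported as a pair with each other; each of them is paired separately with hill.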

The dictionary lists just over 70,000 words, corresponding to about 40,000 headwords. This may seem rather short, leaving out words which may enter minimal pairs, making the lists incomplete. That is not necessarily a disadvantage. Sometimes, as with a spelling checker, one does not want obscure words included. There exists, for instance, an English word flong. It is the name of a rubberised cardboard or plastic which used to form an intermediate stage of the printing of newspapers on a rotary press. Perhaps it still does. I doubt whether one native-speaker in a thousand knows the word; the only reason I do is that our next-door neighbour in my childhood worked as a printer for a national newspaper and used to give us discarded sheets of the stuff to insulate our hen-house, and I enjoyed trying to read the news stories on it in mirror-writing. But if the Mitton dictionary included the word flong, there would be new pairs fling/flong, flung/flong, flop/flong, flog/flong and flock/flong, which would be of little relevance to teachers or learners.

Semantic loading and density

When this project (collecting and editing minimal pair lists for all the 510 theoretically possible contrasts) is complete, I hope to be in a position to measure the functional load of a pronunciation error, i.e. how much potential for confusion is created by a particular vowel or consonant error and therefore how important it is. Naturally this is not just a matter of counting the number of pairs, but also depends on other factors. One of these is the part of speech of the words and therefore their potential for appearing in the same contexts. Two nouns, such as beer and pier, are much more confusable than a noun and a preposition, such as frog and from. For this reason the edited lists draw a distinction between the number of pairs and the number of semantic contrasts realised by the pairs, and calculate a "semantic loading" figure. Thus if there were 100 pairs but they belonged to only 70 different pairs of headwords, the semantic loading would be 70%. For the longer lists the semantic loading tends to fall within the range 48% to 60%, but the very short lists involving rare sounds are often higher. Paradoxically, the lower the semantic loading, the more confusable pairs may exist for that contrast, since a smaller number shows there are many inflected forms in the list and signals a large number of words in the open classes: noun, verb or adjective. To some extent the figure is arbitrarily dependent on editorial decisions. I have, for instance, treated agent nouns as separate headwords from their verb roots, since there is often a large shift of meaning, as in wait/waiter.
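
The semantic loading figure is a simple ratio. As a minimal sketch (the function name semantic_loading is hypothetical), the worked example of 100 pairs drawn from 70 pairs of headwords looks like this:

```python
def semantic_loading(pair_count, headword_pair_count):
    """Percentage of the listed pairs that represent distinct
    contrasts between headwords."""
    if pair_count <= 0:
        raise ValueError("pair_count must be positive")
    return 100.0 * headword_pair_count / pair_count

print(semantic_loading(100, 70))   # 70.0
```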

It is also important to take into account the density of the minimal pair, namely how the actual total relates to the theoretically possible number if every word containing one of the sounds were matched by a word containing the other. This would show how the distribution of minimal pairs relates to the overall phoneme frequencies in the same dictionary. A 100% match could only occur if there were exactly the same number of words with each sound in the language, and that is clearly unlikely. But, if the number is unequal, the density depends on which sound you start with. There are 37,729 words in the dictionary containing the vowel /ɪ/ and only 784 containing the diphthong /ɔɪ/. There are 62 minimal pairs. For the diphthong this is a density of 7.9%, but for the monophthong the density is only 0.16%. For an average of diphthong plus monophthong it is 0.32% (calculated using the harmonic mean, of course). What I have decided to do is report the mean density, pointing out where, as in this case, there is a large discrepancy in frequency.
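
The figures in this example can be checked with a few lines of code. The function names density and mean_density are made up for illustration; the counts are the ones quoted above.

```python
def density(pairs, words_with_sound):
    """Minimal pairs as a fraction of the words containing one sound."""
    return pairs / words_with_sound

def mean_density(pairs, words_a, words_b):
    """Harmonic mean of the two one-sided densities."""
    if pairs == 0:
        return 0.0
    d_a = density(pairs, words_a)
    d_b = density(pairs, words_b)
    return 2 * d_a * d_b / (d_a + d_b)

pairs = 62
words_short_i = 37729    # words containing /ɪ/
words_oi = 784           # words containing /ɔɪ/

print(f"{density(pairs, words_oi):.1%}")                        # 7.9%
print(f"{density(pairs, words_short_i):.2%}")                   # 0.16%
print(f"{mean_density(pairs, words_short_i, words_oi):.2%}")    # 0.32%
```

The harmonic mean of the two densities is the same as twice the number of pairs divided by the combined word count, 124 / 38,513, which is why it sits so close to the lower figure.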

The O'Connor conjecture

It is also my ambition to examine the statistical data coming out of these lists and to see if it offers any evidence for or against what I call "the O'Connor conjecture" that language is self-repairing. I don't know if J.D. O'Connor was the first person to express this, but he presents a very simple and clear statement of it in his book Phonetics.
A language can tolerate quite a lot of homophones provided they do not get in each other’s way, that is provided they are not likely to occur in the same contexts. This may be a grammatical matter: if the homophones are different parts of speech they are not likely to turn up in the same place in a sentence … If they are the same part of speech, e.g. site sight; pear, pair they can be tolerated unless they occur in the same area of meaning and in association with a similar set of other words. Site may be ambiguous in It’s a nice site, though a wider context will usually make the choice plain. … If homophones do interfere with each other the language may react by getting rid of one or by modifying one.
What minimal pairs do is increase the potential number of homophones in a learner's speech or the potential for misunderstanding between speakers of different dialects. What we would expect, therefore, is for there to be more minimal pairs between sounds which differ greatly, such as peat/part or shake/wake, and fewer between sounds which are close enough to create problems for learners such as cot/caught or pie/buy. So far the evidence I have collected does not support a strong form of the conjecture.

Problems

There are a number of problems waiting to be resolved:

You will find two related lists derived from the same dictionary source at the following links:

Why did I start?

My personal interest in this topic may be due partly to the fact that I once lived in a flat in the village of Etiler near Istanbul. From the living room one had a view across a green meadow down towards the steep sides of the Bosphorus, where one constantly saw passing freighters, small cruise liners and even submarines. It was one of the few places in the world where one might have said "Look, there's a sheep!" and expected to be misunderstood. Nor do I want to disappoint all those who ask me about my grandfather, Professor Henry Higgins.

References

Gleason, Hal (1955). An Introduction to Descriptive Linguistics, Holt Rinehart Winston.
Mitton, Roger (1996). English spelling and the computer. Longman.
O'Connor, J.D. (1973). Phonetics. Penguin Books.
Torikian, Merwyn (1992). “Watch your language; an account of Soundedit with reference to the validity of phonological rules.” System, 20, 4, p. 471-480.

Keywords:

Vowels Keyword Transcribed
i key ki
ɪ pit pɪt
e pet pet
æ pat pæt
ɑ hard hɑd
ɒ pot pɒt
ɔ raw rɔ
ʊ put pʊt
u coo ku
ʌ hut hʌt
3 cur k3
ə about/mother əbaʊt/mʌðə
eɪ bay beɪ
aɪ buy baɪ
ɔɪ boy bɔɪ
əʊ go gəʊ
aʊ cow kaʊ
ɪə peer pɪə
eə pair peə
ʊə poor pʊə

Consonants Keyword Transcribed
p pea pi
b bee bi
t toe təʊ
d doe dəʊ
k cap kæp
g get get
f fat fæt
v vet vet
Ɵ thin Ɵɪn
ð then ðen
s sack sæk
z zoo zu
ʃ ship ʃɪp
ʒ measure meʒə
h hide haɪd
m man mæn
n no nəʊ
ŋ sing sɪŋ
l lie laɪ
r red red
j year jɪə
w wet wet
ʧ chin ʧɪn
ʤ judge ʤʌʤ


I would be grateful if teachers using this page could send me emails (to minpairsatwordscape.net, replace at with @ when mailing) telling me which lists they are using and what specific problems their learners experience. I am gradually re-editing all the lists, and this information would be useful to incorporate.
Selected as the Speech-Language Pathology Site of the Month, March 2007.

Page maintained by John Higgins. Last updated 6 February, 2010.