Minimal pairs for English RP

by John Higgins


Note that to see the phonetic characters correctly you must have the font Lucida Sans Unicode installed on your system. This is a standard Windows font and is provided on most modern systems. The phonetic transcription key is given under Keywords at the foot of this page.


Minimal pairs are pairs of words whose pronunciation differs at only one segment, such as sheep and ship or lice and rice. They are often used in listening tests and pronunciation exercises. Theoretically it is the existence of minimal pairs which enables linguists to build up the phoneme inventory for a language or dialect, though the process is not without difficulty.

Each cell in the tables below is a link to a list of minimal pairs derived from a dictionary. Use the tables of vowels and consonants below to retrieve the relevant lists. Some are still in a rather raw state, while others have been edited and commented on. The first versions of the lists included only one pair for each pronunciation, such as heal/hole. Newly revised versions are being added which include all the pairs which arise when one or both members of the pair have a homophone, so giving a better indication of how much confusion a given pair may cause. In the case of heal/hole the new version of the list would include all of the following:

Please note that, as you move the mouse over a link, the name of the relevant document should appear at the bottom of the browser window and this gives a further indication of which sound contrast is featured in the list.

How minimal is minimal?

Although the normal definition of a minimal pair specifies that the words differ in one segment, it allows that segment to be widely different in terms of articulation. Another tighter definition of a minimal pair might be words which differ by only one feature. In that case the ideal minimal pair might be cheer versus jeer which differ only in voicing. These two words also belong to the same part of speech and so have the same inflections. Moreover they belong in the same domain of discourse, and are therefore highly confusable. If you were to overhear a fragment of conversation which included:

You should have heard them ??eering at the end of the game.
you would have to perceive the voicing in order to know exactly what was meant. Most minimal pairs are considerably more distinct than that one, and in many cases would cause no difficulty to any speaker. However there is a kind of delight in recognising some of the pairs, which I feel may be related to the enjoyment we feel when we come across an outrageous rhyme in a song or piece of verse.
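The tighter, one-feature definition discussed above can be illustrated with a short Python sketch. The `FEATURES` table and the function name `feature_distance` are assumptions made for this illustration; the table uses simplified voicing/place/manner labels for a handful of consonants and is not a full feature system.

```python
# Simplified voicing / place / manner labels for a few English consonants.
# This small table is an illustrative assumption, not a full feature system.
FEATURES = {
    "p": ("voiceless", "bilabial", "plosive"),
    "b": ("voiced", "bilabial", "plosive"),
    "f": ("voiceless", "labiodental", "fricative"),
    "v": ("voiced", "labiodental", "fricative"),
    "ʧ": ("voiceless", "postalveolar", "affricate"),
    "ʤ": ("voiced", "postalveolar", "affricate"),
}


def feature_distance(sound_a, sound_b):
    """Count how many of the three labels differ between two consonants."""
    return sum(x != y for x, y in zip(FEATURES[sound_a], FEATURES[sound_b]))


print(feature_distance("ʧ", "ʤ"))  # cheer / jeer: differ in voicing only
print(feature_distance("ʧ", "f"))  # cheer / fear: differ in place and manner
```

Under this sketch a pair counts as minimal in the tighter sense only when the distance between the contrasting segments is exactly one.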

They can also be the source of genuine confusions and disputes. A story which appeared in newspapers in April 1998 suggested that the urn known as The Ashes and presented to the winning team in the England versus Australia cricket series contains not the remains of a bail, as the traditional account stated, but of a veil. Another story, involving not strictly a minimal pair but a highly confusable pair of words, appeared in January 1997. It told how a Japanese tourist with a ticket for Turkey had gone to Paddington station in London and asked for directions. She was put on the train to Torquay (a seaside town in South West England).

There are all sorts of confusable sentences which can easily lead to 'slips of the ear' among English speakers, such as "the Dutch are suspicious" being misheard as "the Duchess is vicious". The only significant difference in the sound of those two sentences is /p/ versus /v/ and this is one which is notoriously difficult for foreign learners and can lead to unexpected problems. On a radio discussion on 13 March 2007 concerning the portrait of Adam Smith on the newly issued £20 note, a speaker with a strong Indian accent said what was first understood as "Adam Smith made a stink about society in a new way." What he really said was "Adam Smith made us think about society in a new way." On a recent trip to Spain I heard a Spanish guide leading a party of British tourists asking them to rendezvous at what they thought was "St Martin's Village" when he meant to say "St Martin's Bridge". The /b/ versus /v/ contrast is not made in Spanish, and his strong articulation of the /r/ made it easily confused with an /l/.

Another slip of the ear I encountered recently was postcard for coastguard; although the initial /p/ versus /k/ distinction is a fairly strong one, the /k/ versus /g/ distinction in the middle of the word is neutralised by the presence of the /s/. (A contributory factor is that coastguards are often located in picturesque seaside towns, from which it would be reasonable to send a postcard.) A similar misunderstanding arose in conversation between raingear and reindeer. On a recent radio programme a presenter with a noticeable Irish accent was heard to be announcing an interview with "the born doctor Sir Roger Moore", though what he meant was "the Bond actor Sir Roger Moore".

Homophones engender many spelling mistakes; probably the commonest of all is to write there instead of their or vice versa. Sometimes a set of near homophones leads to a spelling error, as in a notice seen recently: RUGBY, STIRLING VERSES LIVERPOOL. In the same way minimal pairs engender many spelling errors in the writing of foreign learners. Among those I have seen recently are "a reach man" (rich man) and "a brought road" (broad road). An Arabic-speaking student once wrote an essay for me about a visit to London during which he had seen "the Pig Pen Watch" (Big Ben). Even national newspapers are not immune. The Times of Tuesday, September 5th, 2000, printed the following:

Apology
Readers will have been surprised yesterday to see the famous Cold War phrase "mutually assured destruction" (MAD) rendered as "neutrally assured destruction" (NAD). What began as a copytaking error somehow survived into this column. To anyone who was confused as well as to those who were not, we offer our apologies.

Source of the lists: Roger Mitton and the Advanced Learner's Dictionary

Hal Gleason (An Introduction to Descriptive Linguistics, Holt Rinehart Winston, 1955, p. 19), writing about minimal pairs before the era of widespread computing, said "Presumably by diligent search through the total vocabulary, minimal pairs might be found for all English consonant phonemes. But there is no guarantee that all will be found, and in any case it is hardly a feasible procedure."

I have not tried to search the total vocabulary, but I have tried to search a vocabulary which includes most of the words available in non-specialist contexts to everyday users of English. In putting together these lists I have used Roger Mitton's machine-readable version of the 1974 edition of the Advanced Learner's Dictionary, incorporating Mitton's 1990 additions to the word list. The minimal pair lists below have been prepared from the dictionary by means of a program which sorts the pronunciation field, identifies identical pairs (homophones), substitutes dummy characters for the symbols of the minimal pair, and then flags all the additional homophone pairs created by the process. This generates (fairly) complete lists of minimal pairs, though a certain amount of rather tedious post-editing is needed.
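The dummy-character idea described above can be sketched in Python. This is only an illustration of the technique, not the original program: the function name `minimal_pairs`, the dummy character `*` and the small sample lexicon are all assumptions, and the sketch assumes that every phoneme is written as a single character (so diphthongs such as əʊ would first need single-character codes).

```python
from collections import defaultdict
from itertools import product


def minimal_pairs(lexicon, sound_a, sound_b, dummy="*"):
    """Return (word_a, word_b) pairs that differ only in sound_a vs sound_b.

    lexicon maps spelling -> transcription, one character per phoneme
    (an assumption of this sketch).
    """
    # Mask one occurrence of either sound at a time and group the spellings
    # by the resulting pattern, remembering which sound was masked.
    masked = defaultdict(lambda: {sound_a: set(), sound_b: set()})
    for word, pron in lexicon.items():
        for i, symbol in enumerate(pron):
            if symbol in (sound_a, sound_b):
                pattern = pron[:i] + dummy + pron[i + 1:]
                masked[pattern][symbol].add(word)

    # Words that share a masked pattern but started from different sounds
    # are minimal pairs for this contrast; homophones multiply the pairs.
    pairs = set()
    for group in masked.values():
        pairs.update(product(group[sound_a], group[sound_b]))
    return sorted(pairs)


# Hypothetical sample entries, transcribed with the key used on this page.
sample = {
    "ship": "ʃɪp", "sheep": "ʃip", "rich": "rɪʧ", "reach": "riʧ",
    "heal": "hil", "heel": "hil", "hill": "hɪl",
    "pit": "pɪt", "pet": "pet",
}
print(minimal_pairs(sample, "i", "ɪ"))
```

Because heal and heel share a transcription, both are paired with hill, which is the homophone expansion that the revised lists aim to capture.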

The dictionary lists just over 70,000 words, corresponding to about 40,000 headwords. This may seem rather short, since it leaves out words which could enter into minimal pairs and so makes the lists incomplete. That is not necessarily a disadvantage. Sometimes, as with a spelling checker, one does not want obscure words included. There exists, for instance, an English word flong. It is the name of a rubberised cardboard which used to form an intermediate stage of the printing of newspapers on a rotary press. I doubt whether one native speaker in a thousand knows the word; the only reason I do is that our next-door neighbour in my childhood worked as a printer for a national newspaper and used to give us discarded sheets of the stuff to insulate our hen-house, and I enjoyed trying to read the news stories on it in mirror-writing. But if the dictionary included the word flong, there would be new pairs fling/flong, flung/flong, flop/flong, flog/flong and flock/flong, which would be of little relevance to teachers or learners.

Semantic loading and density

When this project (collecting and editing minimal pair lists for all the 510 theoretically possible contrasts) is complete, I hope to be in a position to measure the functional load of a pronunciation error, ie how much potential for confusion is created by a particular vowel or consonant error and therefore how important it is. Naturally this is not just a matter of counting the number of pairs, but also depends on other factors. One of these is the part of speech of the words and therefore their potential for appearing in the same contexts. Two verbs, such as cheer and fear, are much more confusable than a noun and a preposition, such as frog and from. For this reason the edited lists draw a distinction between the number of pairs and number of semantic contrasts realised by the pairs, and calculate a "semantic loading" figure. Thus if there were 100 pairs but they belonged to only 70 different pairs of headwords, the semantic loading would be 70%. As a rough rule of thumb, the lower the semantic loading, the more confusable pairs exist for that contrast, since a smaller number shows there are many inflected forms and signals a large number of words in the open classes: noun, verb or adjective.
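The semantic loading figure reduces to a simple ratio. A minimal sketch, assuming each listed word has already been mapped to its headword; the function name `semantic_loading` and the sample data are hypothetical:

```python
def semantic_loading(pairs, headword_of):
    """Percentage of distinct headword pairs among the listed word pairs."""
    if not pairs:
        raise ValueError("no pairs to measure")
    # Inflected pairs such as cheers/jeers and cheered/jeered collapse to
    # the single headword pair cheer/jeer.
    headword_pairs = {
        frozenset((headword_of[a], headword_of[b])) for a, b in pairs
    }
    return 100.0 * len(headword_pairs) / len(pairs)


# Hypothetical data: four listed pairs that realise two semantic contrasts.
headword_of = {
    "cheer": "cheer", "cheers": "cheer", "cheered": "cheer",
    "jeer": "jeer", "jeers": "jeer", "jeered": "jeer",
    "chin": "chin", "gin": "gin",
}
pairs = [("cheer", "jeer"), ("cheers", "jeers"),
         ("cheered", "jeered"), ("chin", "gin")]
print(f"{semantic_loading(pairs, headword_of):.0f}%")
```

In the same way, 100 listed pairs belonging to 70 different headword pairs would give 70%.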

It is also important to take into account the density of the minimal pair, namely how the actual total relates to the theoretically possible number if every word containing one of the sounds were matched by a word containing the other. This would show how the distribution of minimal pairs relates to the overall phoneme frequencies in the same dictionary. A 100% match could only occur if there were exactly the same number of words with each sound in the language, and that is clearly unlikely. But, if the number is unequal, the density depends on which sound you start with. There are 37,729 words in the dictionary containing the vowel /ɪ/ and only 784 containing the diphthong /ɔɪ/. There are 62 minimal pairs. For the diphthong this is a density of 7.9%, but for the monophthong the density is only 0.16%. For an average of diphthong plus monophthong it is 0.32%. What I have decided to do is report the density using the total for the rarer sound as the base but pointing out where, as in this case, there is a large discrepancy in frequency.
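The three density figures quoted above can be reproduced with a few lines of Python. The function names `density` and `reported_density` are assumptions made for this sketch; the counts are the ones given in the text.

```python
def density(pair_count, word_count):
    """Minimal pairs as a percentage of the words containing one sound."""
    return 100.0 * pair_count / word_count


def reported_density(pair_count, count_a, count_b):
    """Density using the rarer of the two sounds as the base."""
    return density(pair_count, min(count_a, count_b))


# Figures quoted above for /ɪ/ versus /ɔɪ/.
pair_count = 62
words_i = 37_729   # words containing /ɪ/
words_oi = 784     # words containing /ɔɪ/

print(f"diphthong as base:   {density(pair_count, words_oi):.1f}%")
print(f"monophthong as base: {density(pair_count, words_i):.2f}%")
print(f"average as base:     {density(pair_count, (words_i + words_oi) / 2):.2f}%")
print(f"reported (rarer):    {reported_density(pair_count, words_i, words_oi):.1f}%")
```

Taking the rarer sound as the base gives the largest of the three figures, which is why a large discrepancy in frequency between the two sounds needs to be pointed out alongside it.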

For a further discussion of densities and semantic loading, as well as affectionate memories of Tim Johns, see my paper "Don't ask the admiral to show you his pinnace" on this link (though the tables in that paper reflect a different way of measuring density).

There are a number of problems waiting to be resolved:

You will find two related lists derived from the same dictionary source at the following links:

Why did I start?

My personal interest in this topic may be due partly to the fact that I once lived in a flat in the village of Etiler near Istanbul. From the living room one had a view across a green meadow down towards the steep sides of the Bosphorus, where one constantly saw passing freighters, small cruise liners and even submarines. It was one of the few places in the world where one might have said "Look, there's a sheep!" and expected to be misunderstood. Nor do I want to disappoint all those who ask me about my grandfather, Professor Henry Higgins.


Vowels and diphthongs

    ɪ e æ ɑ ɒ ɔ ʊ u ʌ 3 ə eɪ aɪ ɔɪ əʊ aʊ ɪə eə ʊə null cons
i   471 338 394 316 362 489 82 381 301 309 66 561 532 98 527 157 133 144 38 170 64
ɪ   446 639 228 438 326 64 235 492 194 365 368 296 62 380 98 24 29 7 1348 978
e   305 148 249 219 50 134 250 153 36 281 241 59 239 118 33 30 11
æ   180 438 202 58 172 436 173 11 284 275 33 269 118 24 33 9
ɑ   184 156 34 75 172 127 11 184 125 37 169 51 46 48 22 61
ɒ   157 73 141 300 153 1 218 172 22 203 96 26 19 8 46
ɔ   56 142 168 180 21 251 207 71 243 106 82 92 23 88
ʊ   18 19 41 1615232815683
u   1197492342004520897263311
ʌ   1264211148291818518207
3   81821413314963354114
ə   822034831 7 2
eɪ  353 90 336154414715
aɪ  56269166433313
ɔɪ  75331476
əʊ  115424413
aʊ  22186
ɪə  6722
eə  19


Consonants

    b t d k g f v Ɵ ð s z ʃ ʒ h m n ŋ l r j w ʧ ʤ null vowel
p   61288252410094015702271296661322221633776205618468337487433296197916139
b   43140045835041112963343427918622283852703734628964196225179995
t   682731319405232117571258379247823145351710957531846216238248
d   4662503322851265848126602427185414484161950744039142206208
k   3414641761124247221421342724134608747022950193211155
g   196795218201541451125239240612071552610997108
f   13050353717313721853122362227221849178156171
v   253020414849266187222832331127526393
Ɵ   9915941236606710653710424236
ð   2834182156353745183192216
s   23222092173613845146729942169182184
z   65112415931711352535081710294
ʃ   51291791488318015534105115103
ʒ   none96none6nonenone132
h   226139none2162257019195101
m   3595951325952150172175
n   7868123935142151147
ŋ   582nonenone2176
l   58968204182202
r   58213120151
j   482845
w   6193
ʧ   92


Keywords:

Vowels  Keyword       Transcribed
i       key           ki
ɪ       pit           pɪt
e       pet           pet
æ       pat           pæt
ɑ       hard          hɑd
ɒ       pot           pɒt
ɔ       raw
ʊ       put           pʊt
u       coo           ku
ʌ       hut           hʌt
3       cur           k3
ə       about/mother  əbaʊt/mʌðə
eɪ      bay           beɪ
aɪ      buy           baɪ
ɔɪ      boy           bɔɪ
əʊ      go            gəʊ
aʊ      cow           kaʊ
ɪə      peer          pɪə
eə      pair          peə
ʊə      poor          pʊə

Consonants  Keyword  Transcribed
p           pea      pi
b           bee      bi
t           toe      təʊ
d           doe      dəʊ
k           cap      kæp
g           get      get
f           fat      fæt
v           vet      vet
Ɵ           thin     Ɵɪn
ð           then     ðen
s           sack     sæk
z           zoo      zu
ʃ           ship     ʃɪp
ʒ           measure  meʒə
h           hide     haɪd
m           man      mæn
n           no       nəʊ
ŋ           sing     sɪŋ
l           lie      laɪ
r           red      red
j           year     jɪə
w           wet      wet
ʧ           chin     ʧɪn
ʤ           judge    ʤʌʤ

Comments and corrections should be sent to minpairs-at-wordscape.net (replace -at- with @ when mailing).


Selected as the Speech-Language Pathology Site of the Month, March 2007.

Page maintained by John Higgins. Last updated 20 November 2009.