
Minimal pairs for English RP

by John Higgins

Updated March 2008. Note that to see the phonetic characters correctly you must have the font Lucida Sans Unicode installed on your system. This is a standard Windows font and is provided on most modern systems.

Minimal pairs are pairs of words whose pronunciation differs at only one segment, such as sheep and ship or lice and rice. They are often used in listening tests and pronunciation exercises. Theoretically it is the existence of minimal pairs which enables linguists to build up the phoneme inventory for a language or dialect, though the process is not without difficulty.
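The one-segment definition can be sketched in a few lines of Python. This is only an illustration, not the program used to build the lists: it assumes each word is already transcribed as a sequence of segments, the function name and the example tuples are invented here, and it ignores contrasts between a segment and nothing (the "null" columns of the tables).

```python
def is_minimal_pair(a, b):
    """Return True if two segment sequences differ at exactly one position.

    Assumption: both words are given as tuples of segments, so that a
    diphthong such as "aɪ" counts as a single segment.
    """
    if len(a) != len(b):
        return False  # null contrasts are not handled in this sketch
    return sum(x != y for x, y in zip(a, b)) == 1


# Hypothetical transcriptions, following the keyword scheme of this page.
sheep = ("ʃ", "i", "p")
ship = ("ʃ", "ɪ", "p")
lice = ("l", "aɪ", "s")
rice = ("r", "aɪ", "s")

print(is_minimal_pair(sheep, ship))  # differ only in the vowel
print(is_minimal_pair(lice, rice))   # differ only in the first consonant
print(is_minimal_pair(sheep, rice))  # differ in every segment
```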

Each cell in the tables below is a link to a list of minimal pairs derived from a dictionary. Use the tables of vowels and consonants below to retrieve the relevant lists, all of which are plain ASCII text files. Some are still in a rather raw state, while others have been edited and commented on. The first versions of the lists included only one pair for each pronunciation, such as heal/hole. Newly revised versions are being added which include all the pairs which arise when one or both members of the pair have a homophone, so giving a better indication of how much confusion a given pair may cause. In the case of heal/hole the new version of the list would include all of the following:

Please note that, as you move the mouse over a link, the name of the relevant document should appear at the bottom of the browser window and this gives a further indication of which sound contrast is in the list.

How minimal is minimal?

Although the normal definition of a minimal pair specifies that the words differ in one segment, it allows that segment to be widely different in terms of articulation. A tighter definition of a minimal pair might be words which differ by only one feature. In that case the ideal minimal pair might be cheer versus jeer, which differ only in voicing. These two words also belong to the same part of speech and so have the same inflections. Moreover they belong in the same domain of discourse, and are therefore highly confusable. If you were to overhear a fragment of conversation which included:

You should have heard them ??eering at the end of the game.
you would have to perceive the voicing in order to know exactly what was meant. Most minimal pairs are considerably more distinct than that one, and in many cases would cause no difficulty to any speaker. However there is a kind of delight in recognising some of the pairs, which I feel may be related to the enjoyment we feel when we come across an outrageous rhyme in a song or piece of verse.
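The tighter, feature-based definition can be illustrated with a rough sketch. The three-way feature description below (voicing, place, manner) is a simplification chosen for this example, and the table and function name are hypothetical; a fuller feature system would give finer distinctions.

```python
# Simplified articulatory descriptions of three initial consonants.
# Assumption: three features are enough for the purpose of illustration.
FEATURES = {
    "ʧ": {"voiced": False, "place": "postalveolar", "manner": "affricate"},
    "ʤ": {"voiced": True, "place": "postalveolar", "manner": "affricate"},
    "h": {"voiced": False, "place": "glottal", "manner": "fricative"},
}


def feature_distance(seg_a, seg_b):
    """Count the features on which two segments disagree."""
    fa, fb = FEATURES[seg_a], FEATURES[seg_b]
    return sum(fa[name] != fb[name] for name in fa)


print(feature_distance("ʧ", "ʤ"))  # cheer/jeer: voicing only
print(feature_distance("ʧ", "h"))  # cheer/hear: place and manner
```

On this count cheer/jeer is a one-feature pair, while cheer/hear, though still a minimal pair in the usual sense, is separated by two features.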

They can also be the source of genuine confusions and disputes. A story which appeared in newspapers in April 1998 suggested that the urn known as The Ashes and presented to the winning team in the England versus Australia cricket series contains not the remains of a bail, as the traditional account stated, but of a veil. Another story, involving not strictly a minimal pair but a highly confusable pair of words, appeared in January 1997. It told how a Japanese tourist with a ticket for Turkey had gone to Paddington station in London and asked for directions. She was put on the train to Torquay (a seaside town in South West England).

There are all sorts of confusable sentences which can easily lead to 'slips of the ear' among English speakers, such as "the Dutch are suspicious" being misheard as "the Duchess is vicious". The only significant difference in the sound of those two sentences is /p/ versus /v/ and this is one which is notoriously difficult for foreign learners and can lead to unexpected problems. On a radio discussion on 13 March 2007 concerning the portrait of Adam Smith on the newly issued £20 note, a speaker with a strong Indian accent said what was first understood as "Adam Smith made a stink about society in a new way." What he really said was "Adam Smith made us think about society in a new way." On a recent trip to Spain I heard a Spanish guide leading a party of British tourists asking them to rendezvous at what they thought was "St Martin's Village" when he meant to say "St Martin's Bridge". The /b/ versus /v/ contrast is not made in Spanish, and his strong articulation of the /r/ made it easily confused with an /l/.

Another slip of the ear I encountered recently was postcard for coastguard; although the initial /p/ versus /k/ distinction is a fairly strong one, the /k/ versus /g/ distinction in the middle of the word is neutralised by the presence of the /s/. (A contributory factor is that coastguards are often located in picturesque seaside towns, from which it would be reasonable to send a postcard.) In a recent conversation a similar misunderstanding arose between raingear and reindeer.

Homophones engender many spelling mistakes; probably the commonest of all is to write there instead of their or vice versa. Sometimes a set of near homophones leads to a spelling error, as in a notice seen recently: RUGBY, STIRLING VERSES LIVERPOOL. In the same way minimal pairs engender many spelling errors in the writing of foreign learners. Among those I have seen recently are "a reach man" (rich man) and "a brought road" (broad road). An Arabic-speaking student once wrote an essay for me about a visit to London during which he had seen "the Pig Pen Watch" (Big Ben). Even national newspapers are not immune. The Times of Tuesday, September 5th, 2000, printed the following:

Apology
Readers will have been surprised yesterday to see the famous Cold War phrase "mutually assured destruction" (MAD) rendered as "neutrally assured destruction" (NAD). What began as a copytaking error somehow survived into this column. To anyone who was confused as well as to those who were not, we offer our apologies.

Source of the lists

Hal Gleason (An Introduction to Descriptive Linguistics, Holt Rinehart Winston, 1955, p. 19), writing about minimal pairs before the era of widespread computing, said "Presumably by diligent search through the total vocabulary, minimal pairs might be found for all English consonant phonemes. But there is no guarantee that all will be found, and in any case it is hardly a feasible procedure."

I have not tried to search the total vocabulary, but I have tried to search a vocabulary which includes most of the words available in non-specialist contexts to everyday users of English. In putting together these lists I have used Roger Mitton's machine-readable version of the 1974 edition of the Oxford Advanced Learner's Dictionary, incorporating Mitton's 1990 additions to the word list. The minimal pair lists below have been prepared from the dictionary by means of a program which sorts the pronunciation field, identifies identical pairs (homophones), substitutes dummy characters for the symbols of the minimal pair, and then flags all the additional homophone pairs created by the process. This generates (fairly) complete lists of minimal pairs, though a certain amount of rather tedious post-editing is needed.
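The dummy-character idea described above can be sketched roughly as follows. This is a simplified reconstruction for illustration, not the original program: it groups words by a masked transcription instead of sorting the pronunciation field, and the toy lexicon, the function name and the dummy symbol are all assumptions. The transcriptions follow the keyword scheme used on this page.

```python
from collections import defaultdict
from itertools import product

# Toy lexicon (hypothetical): word -> tuple of segments.
LEXICON = {
    "heal": ("h", "i", "l"),
    "heel": ("h", "i", "l"),
    "he'll": ("h", "i", "l"),
    "hole": ("h", "əʊ", "l"),
    "whole": ("h", "əʊ", "l"),
    "hill": ("h", "ɪ", "l"),
}


def minimal_pairs(lexicon, first, second, dummy="*"):
    """List every (word, word) pair that differs only in first vs second."""
    # Mask one occurrence of either symbol at a time, so that two words
    # share a key only if they agree everywhere except the masked position.
    groups = defaultdict(lambda: {first: [], second: []})
    for word, segs in lexicon.items():
        for i, seg in enumerate(segs):
            if seg in (first, second):
                key = segs[:i] + (dummy,) + segs[i + 1:]
                groups[key][seg].append(word)
    pairs = set()
    for group in groups.values():
        # Homophones land in the same group, so every combination is kept.
        pairs.update(product(group[first], group[second]))
    return sorted(pairs)


for pair in minimal_pairs(LEXICON, "i", "əʊ"):
    print("/".join(pair))
```

With this toy lexicon the three homophones heal, heel and he'll each pair with both hole and whole, giving six pairs for the one pronunciation contrast.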

The dictionary lists just over 70,000 words, corresponding to about 40,000 headwords. This may seem rather short, leaving out words which may enter minimal pairs, making the lists incomplete. That is not necessarily a disadvantage. Sometimes, as with a spelling checker, one does not want obscure words included. There exists, for instance, an English word flong. It is the name of a rubberised cardboard which used to form an intermediate stage of the printing of newspapers on a rotary press. I doubt whether one native-speaker in ten thousand knows the word; the only reason I do is that our next-door neighbour in my childhood worked as a printer for a national newspaper and used to give us discarded sheets of the stuff to insulate our hen-house, and I enjoyed trying to read the news stories on it in mirror-writing. But if the dictionary included the word flong, there would be new pairs fling/flong, flung/flong, flop/flong, flog/flong and flock/flong, which would be of little relevance to teachers or learners.

Semantic loading

When this project (collecting and editing minimal pair lists for all the 510 theoretically possible contrasts) is complete, I hope to be in a position to measure the functional load of a pronunciation error, i.e. how much potential for confusion is created by a particular vowel or consonant error and therefore how important it is. Naturally this is not just a matter of counting the number of pairs, but also depends on the part of speech of the words and therefore their potential for appearing in the same contexts. Two verbs, such as cheer and hear, are much more confusable than an adjective and a preposition, such as mere and near. For this reason the edited lists draw a distinction between the number of pairs and the number of semantic contrasts realised by the pairs, and calculate a "semantic loading" figure. Thus if there were 100 pairs but they belonged to only 70 different pairs of headwords, the semantic loading would be 0.7. As a rough rule of thumb, the lower the semantic loading, the more confusable pairs exist for that contrast, since a large number of inflected forms signals a large number of words in the open classes: noun, verb or adjective.
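The semantic loading figure is the number of distinct headword pairs divided by the number of word pairs. A minimal sketch of that arithmetic, assuming a mapping from each inflected form to its headword is available (the mapping, the sample pairs and the function name are invented for the example):

```python
def semantic_loading(pairs, headword_of):
    """Distinct headword pairs divided by the total number of word pairs."""
    headword_pairs = {(headword_of[a], headword_of[b]) for a, b in pairs}
    return len(headword_pairs) / len(pairs)


# Hypothetical mapping from inflected forms to headwords.
HEADWORD_OF = {
    "cheer": "cheer", "cheers": "cheer", "cheered": "cheer", "cheering": "cheer",
    "jeer": "jeer", "jeers": "jeer", "jeered": "jeer", "jeering": "jeer",
    "chin": "chin", "gin": "gin",
}

# Five word pairs that realise only two headword contrasts.
PAIRS = [
    ("cheer", "jeer"),
    ("cheers", "jeers"),
    ("cheered", "jeered"),
    ("cheering", "jeering"),
    ("chin", "gin"),
]

print(semantic_loading(PAIRS, HEADWORD_OF))  # 2 headword pairs / 5 pairs
```

In the example from the text, 100 pairs belonging to 70 headword pairs would give 70 / 100 = 0.7 by the same calculation.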

There are a number of problems waiting to be resolved:

You will find two related lists derived from the same dictionary source at the following links:

One extension of the project is to see how the distribution of minimal pairs relates to the overall phoneme frequencies in the same dictionary. For each contrast I am in the process of calculating a figure to indicate the density of minimal pairs in the vocabulary as a whole, i.e. what the proportion of actual minimal pairs is to the number there might have been if every possible word in the dictionary was matched. This figure is normally no greater than 5%, and is often less than 2%.

Why did I start?

My personal interest in this topic may be due partly to the fact that I once lived in a flat in the village of Etiler near Istanbul. From the living room one had a view across a green meadow down towards the steep sides of the Bosphorus, where one constantly saw passing freighters, small cruise liners and even submarines. It was one of the few places in the world where one might have said "Look, there's a sheep!" and expected to be misunderstood.


Vowels and diphthongs

      ɪ  e  æ  ɑ  ɒ  ɔ  ʊ  u  ʌ  3  ə  eɪ  aɪ  ɔɪ  əʊ  aʊ  ɪə  eə  ʊə  null  cons
i     4713383943163624898238130130966561532985271571331443817064
ɪ     446635228438271612224561783623342575935888482871348978
e     302142227212431302331473625022257213118323011
æ     17940917956159425160112562372924010323319
ɑ     172156347517212711184125371695146482261
ɒ     15773141300153121817222203962619846
ɔ     56142168180212512077124310682922388
ʊ     1819411615232815683
u     1197492342004520897263311
ʌ     1264211148291818518207
3     81821413314963354114
ə     8220348317none
eɪ    35390336154414715
aɪ    56269166433313
ɔɪ    75331476
əʊ    115424413
aʊ    22186
ɪə    6722
eə    19


Consonants

      b  t  d  k  g  f  v  Ɵ  ð  s  z  ʃ  ʒ  h  m  n  ŋ  l  r  j  w  ʧ  ʤ  null  vowel
p     61288252410094015702271296661322221633776205618468337487433296197916139
b     43140045835041112963343427918622283852703734628964196225179995
t     682731319405232117571258379247823145351710957531846216238248
d     4662503322851265848126602427185414484161950744039142206208
k     3414641761124247221421342724134608747022950193211155
g     196795218201541451125239240612071552610997108
f     13050353717313721853122362227221849178156171
v     253020414849266187222832331127526393
Ɵ     9915941236606710653710424236
ð     2834182156353745183192216
s     23222092173613845146729942169182184
z     65112415931711352535081710294
ʃ     51291791488318015534105115103
ʒ     none96none6nonenone132
h     226139none2162257019195101
m     3595951325952150172175
n     7868123935142151147
ŋ     582nonenone2176
l     58968204182202
r     58213120151
j     482845
w     6193
ʧ     92


Keywords:

Vowels   Keyword         Transcribed
i        key             ki
ɪ        pit             pɪt
e        pet             pet
æ        pat             pæt
ɑ        hard            hɑd
ɒ        pot             pɒt
ɔ        raw             rɔ
ʊ        put             pʊt
u        coo             ku
ʌ        hut             hʌt
3        cur             k3
ə        about/mother    əbaʊt/mʌðə
eɪ       bay             beɪ
aɪ       buy             baɪ
ɔɪ       boy             bɔɪ
əʊ       go              gəʊ
aʊ       cow             kaʊ
ɪə       peer            pɪə
eə       pair            peə
ʊə       poor            pʊə

Consonants   Keyword     Transcribed
p            pea         pi
b            bee         bi
t            toe         təʊ
d            doe         dəʊ
k            cap         kæp
g            get         get
f            fat         fæt
v            vet         vet
Ɵ            thin        Ɵɪn
ð            then        ðen
s            sack        sæk
z            zoo         zu
ʃ            ship        ʃɪp
ʒ            measure     meʒə
h            hide        haɪd
m            man         mæn
n            no          nəʊ
ŋ            sing        sɪŋ
l            lie         laɪ
r            red         red
j            year        jɪə
w            wet         wet
ʧ            chin        ʧɪn
ʤ            judge       ʤʌʤ

Comments and corrections should be sent to minpairs-at-wordscape.net (replace -at- with @ when mailing).


Selected as the Speech-Language Pathology Site of the Month, March 2007.
Page maintained by John Higgins.