Minimal pairs are pairs of words whose pronunciation differs at only one segment, such as sheep and ship or lice and rice. They are often used in listening tests and pronunciation exercises. Theoretically it is the existence of minimal pairs which enables linguists to build up the phoneme inventory for a language or dialect, though the process is not without difficulty.
Each cell in the tables above is a link to a list of minimal pairs derived from a dictionary. Use the tables of vowels and consonants to retrieve the relevant lists. All the vowel and consonant lists have now been edited and commented on. Earlier versions of the lists included only one pair for each pronunciation, such as heal/hole. Newly revised versions have been added which include all the pairs which arise when one or both members of the pair have a homophone, so giving a better indication of how much confusion a given pair may cause. In the case of heal/hole, for instance, the new version of the list would include all of the following:
Please note that, as you move the mouse over a link, the name of the relevant document should appear at the bottom of the browser window and this gives a further indication of which sound contrast is featured in the list.
Hal Gleason (1955, p. 19), writing about minimal pairs before the era of widespread computing, said "Presumably by diligent search through the total vocabulary, minimal pairs might be found for all English consonant phonemes. But there is no guarantee that all will be found, and in any case it is hardly a feasible procedure."
I have not tried to search the total vocabulary, but I have tried to search a vocabulary which includes most of the words available in non-specialist contexts to everyday users of English. In putting together these lists I have used Roger Mitton's machine-readable version of the 1974 edition of the Advanced Learner's Dictionary, incorporating Mitton's 1990 additions to the word list (see Mitton 1996). The minimal pair lists below have been prepared from the dictionary by means of a program which sorts the pronunciation field, identifies identical pairs (homophones), substitutes dummy characters for the symbols of the minimal pair, and then flags all the additional homophone pairs created by the process. This generates (fairly) complete lists of minimal pairs, though a certain amount of rather tedious post-editing is needed.
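The dummy-character idea can be illustrated with a short Python sketch. This is not the program actually used to build the lists. It assumes a hypothetical lexicon format (spelling mapped to a tuple of phoneme symbols), a made-up toy word list, and a dummy symbol `*` that never occurs in a transcription. It also substitutes the dummy at one position at a time, an assumption made here so that words differing in two segments are never paired.

```python
from collections import defaultdict
from itertools import product

DUMMY = "*"  # placeholder symbol; assumed never to occur in a transcription

# Toy lexicon for illustration only: spelling -> tuple of phoneme symbols.
TOY_LEXICON = {
    "heal": ("h", "iː", "l"),
    "heel": ("h", "iː", "l"),
    "hole": ("h", "əʊ", "l"),
    "whole": ("h", "əʊ", "l"),
    "hill": ("h", "ɪ", "l"),
}


def minimal_pairs(lexicon, sound_a, sound_b):
    """Return sorted (word_with_a, word_with_b) pairs for one sound contrast.

    Each occurrence of either sound is replaced, one position at a time,
    by the dummy symbol. Words whose masked transcriptions coincide are
    "homophones created by the process", i.e. minimal pairs.
    """
    # masked transcription -> (words that had sound_a, words that had sound_b)
    buckets = defaultdict(lambda: (set(), set()))
    for word, phones in lexicon.items():
        for i, phone in enumerate(phones):
            if phone == sound_a or phone == sound_b:
                key = phones[:i] + (DUMMY,) + phones[i + 1:]
                side = 0 if phone == sound_a else 1
                buckets[key][side].add(word)

    pairs = set()
    for with_a, with_b in buckets.values():
        # Every word on one side pairs with every word on the other,
        # so existing homophones multiply the number of pairs.
        pairs.update(product(with_a, with_b))
    return sorted(pairs)


if __name__ == "__main__":
    for pair in minimal_pairs(TOY_LEXICON, "iː", "əʊ"):
        print("/".join(pair))
```

Because heal and hole each have a homophone in the toy lexicon, the single contrast yields four pairs rather than one, which is the effect the revised lists are meant to capture.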
I have added to the lists some notes on which nationalities would potentially have problems with each contrast. For this I have used Swan and Smith's invaluable Learner English, as well as drawing on my own experience of teaching in the Far East, East and North Africa and Europe. I have added tracking code to the pages, and will gradually incorporate information about which countries figure most prominently in the visitor statistics to see if significant patterns emerge.
When this project (collecting and editing minimal pair lists for all the 510 theoretically possible contrasts) is complete, I hope to be in a position to measure the functional load of a pronunciation error, i.e. how much potential for confusion is created by a particular vowel or consonant error and therefore how important it is. Naturally this is not just a matter of counting the number of pairs, but also depends on other factors. One of these is the part of speech of the words and therefore their potential for appearing in the same contexts. Two nouns, such as beer and pier, are much more confusable than a noun and a preposition, such as frog and from. For this reason the edited lists draw a distinction between the number of pairs and the number of semantic contrasts realised by the pairs, and calculate a "semantic loading" figure. Thus if there were 100 pairs but they belonged to only 70 different pairs of headwords, the semantic loading would be 70%. For the longer lists the semantic loading tends to fall within the range 48% to 60%, but the very short lists involving rare sounds are often higher. Paradoxically, the lower the semantic loading, the more confusable pairs may exist for that contrast, since a smaller number shows there are many inflected forms in the list and signals a large number of words in the open classes: noun, verb or adjective. To some extent the figure is arbitrarily dependent on editorial decisions. I have, for instance, treated agent nouns as separate headwords from their verb roots, since there is often a large shift of meaning, as in wait/waiter.
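The semantic loading figure can be written as a small calculation. The Python sketch below is only an illustration: the function name, the `headword_of` mapping and the example words are assumptions introduced here, and the mapping is where editorial decisions (such as keeping waiter separate from wait) would be recorded.

```python
def semantic_loading(pairs, headword_of):
    """Percentage of distinct headword pairs among all minimal pairs.

    pairs       -- list of (word_a, word_b) minimal pairs
    headword_of -- maps an inflected form to its headword; words missing
                   from the mapping are treated as their own headword
    """
    if not pairs:
        raise ValueError("semantic loading is undefined for an empty list")
    contrasts = {
        (headword_of.get(a, a), headword_of.get(b, b)) for a, b in pairs
    }
    return 100.0 * len(contrasts) / len(pairs)


# Hypothetical fragment of a /b/ vs /p/ list: four pairs, two headword pairs.
example_pairs = [
    ("beer", "pier"),
    ("beers", "piers"),
    ("bee", "pea"),
    ("bees", "peas"),
]
example_headwords = {"beers": "beer", "piers": "pier", "bees": "bee", "peas": "pea"}

if __name__ == "__main__":
    print(semantic_loading(example_pairs, example_headwords))  # 50.0
```

With 100 pairs belonging to 70 headword pairs, the same function gives the 70% of the example in the text.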
It is also important to take into account the density of the minimal pair, namely how the actual total relates to the theoretically possible number if every word containing one of the sounds were matched by a word containing the other. This would show how the distribution of minimal pairs relates to the overall phoneme frequencies in the same dictionary. A 100% match could only occur if there were exactly the same number of words with each sound in the language, and that is clearly unlikely. But, if the number is unequal, the density depends on which sound you start with. There are 37,729 words in the dictionary containing the vowel /ɪ/ and only 784 containing the diphthong /ɔɪ/. There are 62 minimal pairs. For the diphthong this is a density of 7.9%, but for the monophthong the density is only 0.16%. For an average of diphthong plus monophthong it is 0.32% (calculated using the harmonic mean, of course). What I have decided to do is report the mean density, pointing out where, as in this case, there is a large discrepancy in frequency.
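The density figures quoted above can be checked with a few lines of Python. The function names are invented for this sketch; the counts are the ones given in the text for /ɪ/ and /ɔɪ/.

```python
def density(pair_count, word_count):
    """One-sided density: minimal pairs per word containing the sound."""
    return pair_count / word_count


def mean_density(pair_count, count_a, count_b):
    """Harmonic mean of the two one-sided densities."""
    if pair_count == 0:
        return 0.0
    d_a = density(pair_count, count_a)
    d_b = density(pair_count, count_b)
    return 2 * d_a * d_b / (d_a + d_b)


# Counts taken from the text above.
PAIRS = 62
WORDS_WITH_I = 37729   # words containing the vowel /ɪ/
WORDS_WITH_OI = 784    # words containing the diphthong /ɔɪ/

if __name__ == "__main__":
    print(f"diphthong:   {100 * density(PAIRS, WORDS_WITH_OI):.1f}%")
    print(f"monophthong: {100 * density(PAIRS, WORDS_WITH_I):.2f}%")
    print(f"mean:        {100 * mean_density(PAIRS, WORDS_WITH_I, WORDS_WITH_OI):.2f}%")
```

The harmonic mean simplifies to twice the number of pairs divided by the combined word count, here 124/38,513. This is why the result sits so close to the lower of the two densities when the frequencies are very unequal.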
A language can tolerate quite a lot of homophones provided they do not get in each other’s way, that is provided they are not likely to occur in the same contexts. This may be a grammatical matter: if the homophones are different parts of speech they are not likely to turn up in the same place in a sentence … If they are the same part of speech, e.g. site sight; pear, pair they can be tolerated unless they occur in the same area of meaning and in association with a similar set of other words. Site may be ambiguous in It’s a nice site, though a wider context will usually make the choice plain. … If homophones do interfere with each other the language may react by getting rid of one or by modifying one.
What minimal pairs do is increase the potential number of homophones in a learner's speech or the potential for misunderstanding between speakers of different dialects. What we would expect, therefore, is for there to be more minimal pairs between sounds which differ greatly, such as peat/part or shake/wake, and fewer between sounds which are close enough to create problems for learners, such as cot/caught or pie/buy. So far the evidence I have collected does not support a strong form of the conjecture.
There are a number of problems waiting to be resolved:
You will find two related lists derived from the same dictionary source at the following links:
I would be grateful if teachers using this page could send me emails (to minpairsatvictorcanning.com, replace at with @ when mailing) telling me which lists they are using and what specific problems their learners experience. I am gradually re-editing all the lists, and this information would be useful to incorporate.
Page maintained by John Higgins. Last updated 22 July, 2014.