How to represent pronunciation in ASCII

by Mark Israel
 
     [This is a fast-access FAQ excerpt.]
Beware of using ad hoc methods to indicate pronunciation.  The
problem with ad hoc methods is that they often wrongly assume your
dialect to have certain features in common with the readers'
dialect.  You may pronounce "bother" to rhyme with "father"; some of
the readers here don't.  You may pronounce "cot" and "caught" alike;
some of the readers here don't.  You may pronounce "caught" and
"court" alike; some of the readers here don't.
   The standard way to represent pronunciation (used in the latest
British dictionaries and by linguists worldwide) is the
International Phonetic Alphabet (IPA).  For a complete guide to
the IPA, see Phonetic Symbol Guide by Geoffrey K. Pullum and
William A. Ladusaw (University of Chicago Press, 1986, ISBN
0-226-68532-2).  IPA uses many special symbols; on the Net, where
we're restricted to ASCII symbols, we must find a way to make do.
   The following scheme is due to Evan Kirshenbaum
(kirshenbaum@hpl.hp.com).  The complete scheme can be accessed on
the WWW at:
    <http://www.hpl.hp.com/personal/Evan_Kirshenbaum/IPA/>
     [There's a PDF version of Evan Kirshenbaum's 
     definition of ASCII IPA at 
     <http://www.hpl.hp.com/personal/Evan_Kirshenbaum/IPA/ascii-ipa.pdf>.
     It's also at <http://www.kirshenbaum.net/IPA/ascii-ipa.pdf>.
     It has graphical illustrations of the IPA symbols.
     The plain HTML version is also at <http://www.kirshenbaum.net/IPA/index.html>.
     Evan has stated "The PDF file is the one that should be
     treated as authoritative (to the extent that any of this is)."]
I show here only examples for the sounds most often referred to in
this newsgroup.  Where there are two columns, the left column shows
British Received Pronunciation (RP), and the right column shows a
rhotic pronunciation used by at least some U.S. speakers.  (There's
a WWW page that shows what the IPA symbols look like [...].)
     [You can see what the IPA symbols look like in the IPA guide at
     the AUE Web site.]
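As a rough illustration of the two-column layout and of the dialect problem described at the top of this FAQ, here is a minimal Python sketch. It stores a few word entries as (RP, rhotic U.S.) pairs, copied from the [A] and [O] entries further down, and groups words whose transcriptions coincide in one column. The names TRANSCRIPTIONS and homophone_groups are made up for this example and are not part of Kirshenbaum's scheme.

```python
from collections import defaultdict

# (RP, rhotic U.S.) transcriptions copied from the two-column
# entries for [A] and [O] in this FAQ.
TRANSCRIPTIONS = {
    "father": ("'fA:D@", "'fA:D@r"),
    "farther": ("'fA:D@", "'fArD@r"),
    "caught": ("kO:t", "kO:t"),
    "court": ("kO:t", "kOrt"),
}

RP, RHOTIC_US = 0, 1  # column indexes into each pair


def homophone_groups(column):
    """Return sorted groups of words transcribed identically in one column."""
    groups = defaultdict(list)
    for word, forms in TRANSCRIPTIONS.items():
        groups[forms[column]].append(word)
    # Keep only transcriptions shared by two or more words.
    return sorted(sorted(words) for words in groups.values() if len(words) > 1)


print(homophone_groups(RP))         # words that sound alike in the RP column
print(homophone_groups(RHOTIC_US))  # words that sound alike in the rhotic column
```

With these four entries, the RP column pairs "caught" with "court" and "farther" with "father", while the rhotic column yields no pairs at all. That is exactly why "rhymes with" descriptions mislead readers of another dialect.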
The IPA itself has a home page:
<http://www.arts.gla.ac.uk/IPA/ipa.html>.
The consonant symbols [b], [d], [f], [h], [k], [l], [m], [n], [p],
[r], [s], [t], [v], [w], and [z] have their usual English values.
[A] = [<script a>] as in:
    "ah"        /A:/        /A:/
    "cart"      /kA:t/      /kArt/
    "father"    /'fA:D@/    /'fA:D@r/
    "farther"   /'fA:D@/    /'fArD@r/
    and French bas /bA/.  This sound requires opening your
    mouth wide and feeling resonance at the back of your mouth.
[A.] = [<turned script a>] as in British:
    "bother"    /'bA.D@/
    "cot"       /kA.t/
    "hot"       /hA.t/
    "sorry"     /'sA.rI/
    This symbol (for the sound traditionally called "short o")
    is not much used to transcribe U.S. pronunciation.  [A] or
    [O] is used instead, according to which vowels the speaker
    merges; but the sound *used* by *many* such speakers will
    certainly be *heard* by Britons as [A.].  The sound is
    intermediate between [A] and [O], but typically of shorter
    duration than either.  Imagine Patrick Stewart saying "Tea,
    Earl Grey, hot."
[a] as in French ami /a'mi/, German Mann /man/, Italian pasta
    /'pasta/, Chicago "pop" /pap/, Boston "park" /pa:k/.  Also
    in diphthongs: "dive" /daIv/ (yes, folks, the sound
    traditionally called "long i" is actually a diphthong!),
    "out" /aUt/.  Typically, [a] is not distinguished
    phonemically from [A]; but if you use in "ask" a vowel
    distinct both from the one in "cat" and the one in "father",
    then [a] is what it is.
[C] = [<c cedilla>] as in German (Hochdeutsch) ich /IC/
[D] = [<edh>] as in "this" /DIs/
[E] = [<epsilon>] as in:
    "end"       /End/       /End/
    "get"       /gEt/       /gEt/
    "Mary"      /'mE@rI/    /'mE@ri/
    "merry"     /'mErI/     /'mEri/
    Some U.S. speakers do not distinguish between "Mary",
    "merry", and "marry".
[e] as in:
    "eight"     /eIt/       /eIt/
    "chaos"     /'keA.s/    /'keAs/
[g] as in "get" /gEt/
[I] = [<iota>] as in "it" /It/
[I.] = [<small capital y>] as in German Gl"uck /glI.k/.
    Round your lips for [U] and try to say [I].
[i] as in "eat" /i:t/
[j] as in "yes" /jEs/
[N] = [<eng>] as in "hang" /h&N/
[O] = [<open o>] as in:
    "all"       /O:l/       /O:l/
    "caught"    /kO:t/      /kO:t/
    "court"     /kO:t/      /kOrt/
    "oil"       /OIl/       /OIl/
    The [O] sound requires rounded lips, but lips making a
    bigger circle than for [o].  If you do not use the
    same vowel sound in "caught" as in "court", then you are
    one of the North American speakers who use [O] only
    before [r]:  you do not round your lips for "all" and
    "caught", and you should use some other symbol, such as
    [A] or [a], to transcribe the vowel.
[o] as in U.S.:
    "no"                /noU/
    "old"               /oUld/
    "omit"              /oU'mIt/
    The pure sound is heard in French beau /bo/.  British
    Received Pronunciation does not use this sound,
    substituting the diphthong /@U/ (/n@U/, /@Uld/, /@U'mIt/).
    If you are one of the few speakers who distinguish such
    pairs as "aural" and "oral", "for" and "four", "for" and
    "fore", "horse" and "hoarse", "or" and "oar", "or" and
    "ore", then you use [O] for the first and [o] for the
    second word in each pair; otherwise, you use [O] for both.
[R] = [<right-hook schwa>], equivalent to /@r/, /r-/, or even /V"r/
[S] = [<esh>] as in "ship" /SIp/
[T] = [<theta>] as in "thin" /TIn/
[t!] = [<turned t>] as in "tsk-tsk" or "tut-tut" /t! t!/
[U] = [<upsilon>] as in "pull" /pUl/
[u] as in "ooze" /u:z/
[V] = [<turned v>] as in British RP:
    "hurry"     /'hVrI/
    "shun"      /SVn/
    "up"        /Vp/
    U.S. speakers tend not to use [V] in words (such as "hurry")
    where the following sound is [r]:  they would say /'h@ri/.
    And some U.S. speakers, especially in the eastern U.S.,
    substitute [@] for [V] in all contexts.  If you do not
    distinguish "mention" /'mEn S@n/ from "men shun" /'mEn SVn/,
    then you should use [@] and not [V] to transcribe your
    speech.
[V"] = [<reversed epsilon>] as in:
    "fern"      /fV":n/     /fV"rn/
    "hurl"      /hV":l/     /hV"rl/
    Many U.S. speakers substitute [@] for [V"], so they would
    say /f@rn/, /h@rl/.  Many other U.S. speakers pronounce these
    words with no vowel at all:  /fr:n/, /hr:l/.  If you are one of the
    few speakers who distinguish such pairs as "pearl" and "purl"
    (using a lower, more retracted vowel in "purl"), then you can
    transcribe "pearl" /p@rl/ and "purl" /pV"rl/.
[W] = [<o-e ligature>] as in French heure /Wr/, German K"opfe
    /'kWpf@/.  Round your lips for [O] and try to say [E].
[x] as in Scots "loch" /lA.x/, German Bach /bax/
[Y] = [<slashed o>] as in French peu /pY/, German sch"on /SYn/,
    Scots "guidwillie" /gYd'wIli/.  Round your lips for [o] and
    try to say [e].
[y] as in French lune /lyn/, German m"ude /'myd@/.  Round your
    lips for [u] and try to say [i].
[Z] = [<yogh>] as in "beige" /beIZ/
[&] = [<ash>] as in:
    "ash"       /&S/        /&S/
    "cat"       /k&t/       /k&t/
    "marry"     /'m&rI/     /'m&ri/
[@] = [<schwa>] as in "lemon" /'lEm@n/
[?] = [<glottal>] as in "uh-oh" /V?oU/
[*] = [<fish-hook r>], a short tap of the tongue used by some U.S.
    speakers in "pedal", "petal", and by Scots speakers in
    "pearl":  all /pE*@l/.  If you are a U.S. speaker but
    distinguish "pedal" from "petal", then you do not use this
    sound.
- previous consonant syllabic as in "bundle" /'bVnd@l/ or /'bVndl-/,
    "button" /'bVt@n/ or /'bVtn-/
~ previous sound nasalized
: previous sound lengthened
; previous sound palatalized
<h> previous sound aspirated
' following syllable has primary stress
, following syllable has secondary stress
     [There is an ASCII IPA guide and tutorial here, with audio 
     pronunciation examples by both dilettantes and experts.]
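Because some ASCII symbols are more than one character long ([A.], [I.], [V"], <h>), a program reading these transcriptions has to try the longest symbol first. The following Python sketch converts the symbols listed above to Unicode IPA characters on that basis. It is only an illustration: it covers just the symbols in this FAQ (and omits [t!]), the table and function names are invented here, and the glyph choices are assumptions (for example, [I] and [U] are mapped to the current IPA letters rather than to the iota and upsilon shapes named above). The PDF cited earlier remains the authority on the scheme.

```python
# ASCII IPA -> Unicode IPA, for the symbols listed in this FAQ only.
# Glyph choices are assumptions made for this sketch, not taken from
# the authoritative definition.
ASCII_TO_UNICODE = {
    # multi-character symbols
    "A.": "ɒ", "I.": "ʏ", 'V"': "ɜ", "<h>": "ʰ",
    # single-character symbols that differ from their ASCII form
    "A": "ɑ", "C": "ç", "D": "ð", "E": "ɛ", "I": "ɪ", "N": "ŋ",
    "O": "ɔ", "R": "ɚ", "S": "ʃ", "T": "θ", "U": "ʊ", "V": "ʌ",
    "W": "œ", "Y": "ø", "Z": "ʒ", "&": "æ", "@": "ə", "?": "ʔ",
    "*": "ɾ",
    # diacritics and stress marks
    "-": "\u0329",  # combining syllabicity mark
    "~": "\u0303",  # combining tilde (nasalized)
    ":": "ː", ";": "ʲ", "'": "ˈ", ",": "ˌ",
}
MAX_LEN = max(len(key) for key in ASCII_TO_UNICODE)


def to_unicode_ipa(text):
    """Convert a bare ASCII IPA transcription, longest symbol first."""
    out = []
    i = 0
    while i < len(text):
        for length in range(MAX_LEN, 0, -1):
            chunk = text[i:i + length]
            if chunk in ASCII_TO_UNICODE:
                out.append(ASCII_TO_UNICODE[chunk])
                i += len(chunk)
                break
        else:
            # Letters such as b, d, f, k keep their usual form.
            out.append(text[i])
            i += 1
    return "".join(out)


print(to_unicode_ipa("'fA:D@"))  # "father" (RP column)
print(to_unicode_ipa("kA.t"))    # "cot" (British)
```

Pass in only the transcription itself, not surrounding prose: an ordinary comma or apostrophe would otherwise be converted to a stress mark.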
Here is the scheme compared with the transcriptions in four U.S.
dictionaries.  (Most British dictionaries now use IPA for their
transcriptions.)
     [While it's true that some British dictionaries use IPA for
     pronunciation transcriptions, an important exception is The
     Chambers Dictionary -- both the 1993 and the 1998 editions.
     They do not use IPA.]
       Merriam-Webster    American Heritage Random House      Webster's New World
[A]    a umlaut           a umlaut          a umlaut          a umlaut
[A.]   (merged with [A])  o breve           o                 (merged with [A])
[a]    a overdot          (merged with [A]) A                 a overdot
/aI/   i macron           i macron          i macron          i macron
/aU/   a u overdot        ou                ou                ou
[C]    (merged with [x])  (merged with [x]) (merged with [x]) H
[D]    th underlined      th in italics     th slashed        th in italics
/dZ/   j                  j                 j                 j
[E]    e                  e breve           e                 e
/E@/   e schwa            a circumflex      a circumflex      (merged with [e])
/eI/   a macron           a macron          a macron          a macron
[g]    g                  g                 g                 g
[I]    i                  i breve           i                 i
[I.]   ue ligature        (merged with [y]) (merged with [y]) (merged with [y])
[i]    e macron           e macron          e macron          e macron
[j]    y                  y                 y                 y
[N]    <eng>              ng                ng                <eng>
[O]    o overdot          o circumflex      o circumflex      o circumflex
/OI/   o overdot i        oi                oi                oi ligature
/oU/   o macron           o macron          o macron          o macron
[S]    sh                 sh                sh                sh ligature
[T]    th                 th                th                th ligature
/tS/   ch                 ch                ch                ch ligature
[U]    u overdot          oo breve          oo breve          oo
[u]    u umlaut           oo macron         oo macron         oo macron
[V]    (merged with [@])  u breve           u                 u
[V"]   (merged with [@])  u circumflex      u circumflex      u circumflex
[W]    oe ligature        oe ligature       OE ligature       o umlaut
[x]    k underlined       KH                KH                kh ligature
[Y]    oe ligature macron (merged with [W]) (merged with [W]) (merged with [W])
[y]    ue ligature macron u umlaut          Y                 u umlaut
[Z]    zh                 zh                zh                zh ligature
[&]    a                  a breve           a                 a
[@]    schwa              schwa             schwa             schwa
-      superscript schwa  syllabicity mark  unmarked          '
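Entries of the form "(merged with [X])" mean that you have to look at another row to find what a dictionary actually prints. As a small sketch of that lookup, the Python below copies six rows from the table and follows the merges until it reaches a real notation. The names DICTIONARIES, ROWS, and notation are hypothetical, chosen only for this example.

```python
import re

DICTIONARIES = (
    "Merriam-Webster",
    "American Heritage",
    "Random House",
    "Webster's New World",
)

# A few rows copied from the comparison table, in column order.
ROWS = {
    "[A]": ("a umlaut", "a umlaut", "a umlaut", "a umlaut"),
    "[A.]": ("(merged with [A])", "o breve", "o", "(merged with [A])"),
    "[C]": ("(merged with [x])", "(merged with [x])", "(merged with [x])", "H"),
    "[V]": ("(merged with [@])", "u breve", "u", "u"),
    "[x]": ("k underlined", "KH", "KH", "kh ligature"),
    "[@]": ("schwa", "schwa", "schwa", "schwa"),
}

MERGED = re.compile(r"\(merged with (\[.+\])\)")


def notation(symbol, dictionary):
    """Return the notation a dictionary uses for an ASCII IPA symbol."""
    column = DICTIONARIES.index(dictionary)
    seen = {symbol}
    entry = ROWS[symbol][column]
    match = MERGED.fullmatch(entry)
    while match:
        symbol = match.group(1)  # the row this entry points to
        if symbol in seen:
            raise ValueError("circular merge at " + symbol)
        seen.add(symbol)
        entry = ROWS[symbol][column]
        match = MERGED.fullmatch(entry)
    return entry


print(notation("[V]", "Merriam-Webster"))      # follows the merge to the [@] row
print(notation("[C]", "Webster's New World"))  # has its own notation
```

For instance, Merriam-Webster's entry for [V] resolves to "schwa", while Random House's entry for [C] resolves to "KH" via the [x] row.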
   Auditory files demonstrating speech sounds can be obtained by
anonymous ftp from ftp.cs.cmu.edu (or on the World Wide Web at [...]).
Look in "/user/ai/areas/nlp/corpora/pron" and
"/user/ai/areas/speech/database/britpron".
     [That anonymous ftp source no longer seems to be available. The web addresses are now 
      http://www-2.cs.cmu.edu/afs/cs/project/ai-repository/ai/areas/nlp/corpora/pron/0.html and
      http://www-2.cs.cmu.edu/afs/cs/project/ai-repository/ai/areas/speech/database/britpron/0.html.]
     [There are audio files made by speakers in various parts
     of the world in the AUE Audio Archive.
     On the Links page there's a list of links to several
     places where audio files are available.]