I recently encountered the following party game: Mumbo Jumbo, based on trying to pronounce a word while wearing a mouth-stretcher. Basically, you wear a mouth-stretcher, and then you are asked to read out a phrase. Your teammate has to figure out the phrase you're trying to say.
The mouth-stretcher means that you cannot make labials. Of course, the words are selected to take advantage of this limitation as much as possible, asking you to make phrases like 'the moose makes molasses.'
But if you know about phonetics, you should be able to figure out a way to get around this limitation. For example, you could substitute [n] for [m], since both are nasals: 'the noose nakes nolasses.' 'nakes' and 'nolasses' aren't words, so hopefully your interlocutor will take advantage of error correction to figure out that the words are 'makes' and 'molasses.' 'noose' is a real word, so they may need to try twice to get the phrase right.
Similarly, what about something like 'feel the breeze'? [f] and [b] are not available to you. You know [f] is a fricative, so maybe you think to go with 'seel' as a substitution. But 'seal' is a real word and that might confuse your interlocutor. You could try a voiceless 'th' instead, which is more similar to 'f' anyway (as f-substitution shows). This produces the nonsense word 'theel', which hopefully your interlocutor will correct to 'feel'.
How about for communicating 'breeze'? You have a few voiced stops you can use like [g] and [d]. If the plastic mouth extender isn't totally rigid, you could also take advantage of the McGurk effect to move your mouth down as if making a [b] and then producing a [g], hoping that your interlocutor will take the hint that the labial movement matters.
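If you wanted to sketch this substitution trick in code, it might look something like the following. This is only a rough illustration: the table name and function name are made up for this example, the phonemes are passed in as a pre-split list, and the [p] and [v] entries are my own extrapolation from the same "keep the manner, change the place" logic rather than anything tested in the game.

```python
# Hypothetical fallback table: each labial maps to a non-labial sound
# with the same manner of articulation.
LABIAL_FALLBACKS = {
    "m": "n",  # nasal -> alveolar nasal ('moose' -> 'noose')
    "f": "θ",  # voiceless fricative -> voiceless 'th' ('feel' -> 'theel')
    "b": "d",  # voiced stop -> voiced stop ([g] would also work)
    "p": "t",  # assumption: same logic applied to the voiceless stop
    "v": "ð",  # assumption: same logic applied to the voiced fricative
}


def dodge_labials(phonemes):
    """Replace every labial in a list of phonemes with its fallback."""
    return [LABIAL_FALLBACKS.get(p, p) for p in phonemes]


print(dodge_labials(["m", "uː", "s"]))  # moose
print(dodge_labials(["f", "iː", "l"]))  # feel
```

A lookup table like this ignores the real-word problem ('noose', 'seal'), which is exactly where a human player still has to use judgment.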
While this game is based on using a plastic prop to force your mouth into not producing labials, I feel like you could extend the concept for linguists as a party game. How about a game where you cannot use stops? Or no alveolar consonants?
I've written about another game where knowledge of phonetics can give you an edge - karuta, a Japanese card game.
In the anime “Chihayafuru,” a team of high schoolers plays a card game called ‘karuta’. It’s based around hearing a reader read a poem aloud, and then finding the card corresponding to that poem on the floor. You can see an example in the following illustrative video:
There are 100 poems in the version of karuta played in the anime. The phonological qualities of the poems matter - for example, 7 of the poems begin with unique syllables (fu, ho, me, mu, sa, se, su), so if you hear one of those syllables, you don’t have to wait to hear the rest of the card. Others share the starting syllable, so you have to wait longer, such as with the ‘chi’ cards.
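The "how long do I have to wait?" question can be sketched as a shortest-unique-prefix calculation. The snippet below is a toy illustration, not a full deck: the five openings are my own rough romanizations, split into syllables by hand, and the function name is made up for this example.

```python
def decisive_lengths(poems):
    """For each poem (a list of syllables), return how many syllables
    must be heard before no other poem in the list still matches."""
    result = {}
    for i, poem in enumerate(poems):
        # Fall back to the full length if the poem never becomes unique.
        needed = len(poem)
        for n in range(1, len(poem) + 1):
            prefix = poem[:n]
            clash = any(
                other[:n] == prefix
                for j, other in enumerate(poems)
                if j != i
            )
            if not clash:
                needed = n
                break
        result["".join(poem)] = needed
    return result


# Toy data (assumed romanizations of five poem openings).
openings = [
    ["chi", "ha", "ya", "bu", "ru"],
    ["chi", "gi", "ri", "ki", "na"],
    ["chi", "gi", "ri", "o", "ki", "shi"],
    ["mu", "ra", "sa", "me", "no"],
    ["se", "o", "ha", "ya", "mi"],
]

for poem, n in decisive_lengths(openings).items():
    print(poem, n)
```

In this toy set the 'mu' and 'se' cards are decided on the first syllable, while the 'chi' cards need two or four syllables before a player can safely move.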
One of the conceits in the game is that the main character, Chihaya, has amazing hearing. Other high-ranked characters also have great hearing. Some of the examples of this hearing:
Chihaya can hear the sound before the f.
The meijin can hear the difference between “su” and “se”
The ability to be able to anticipate a sound before it is completed grants you a competitive advantage. Is this actually linguistically possible?
Because of anticipatory coarticulation, it may be possible that the ’s’ in ‘su’ and the ’s’ in ‘se’ sound different. There is lip compression involved in the Japanese ‘u’ sound that is not present in the ‘e’, which could theoretically affect the ’s’ in ‘su’. Someone with very good hearing may be able to notice it.
What about ‘the sound before the f’? This must be interpreted as some sort of artistic license, as there is no sound before the production of the Japanese bilabial fricative. Perhaps what is meant by this is that Chihaya can tell, based on a very short sample and no vowel, that the sound is going to be a bilabial fricative and not a glottal fricative or devoiced vowel.
Is there linguistic evidence that coarticulation affects Japanese consonants this way? I couldn’t find any articles on anticipatory coarticulation affecting consonants in Japanese, so I can’t tell you. Looking at the linked video, you can see that the participants do react rather quickly. If you'd like to see a high-level match commentated with English subtitles, you can look here also:
It would be interesting to ‘port’ this game to different languages with a different set of poems, and see how high-level players react. I also wonder if you could make a game like this that's cross-lingual - IPA recognition? Karuta isn't just about listening, but also about card layout and reflexes, so you could use that as a base to avoid making the game just about who has the better listening recognition.
The IPA is a pretty good invention, allowing us to transcribe languages with as much precision as we feel is necessary. It's especially useful for language learners, as each IPA character's name is basically an instruction on how to pronounce it. However, you should be careful when it comes to the IPA, because a transcription is not always reality!
To start with a trivial example, consider that the General American English rhotic sound is represented as /r/ in most transcriptions, even though it is not an alveolar trill in any variant. This is for reasons of tradition, convenience ('r' is easier to type than 'ɻ'), and generalization ([ɻ] is not the only realization of the American rhotic, and 'r' is a suitable enough symbol for the rhotic). I doubt anyone has been misled to believe that American English uses an alveolar trill, but it serves as an example of this disconnect between map and territory.
All this to say, if you are trying to learn a new language, it is not enough to just read IPA transcriptions. Practice listening closely to the language.
For example, most IPA transcriptions of Russian /o/ are [o]. If you try to learn Russian pronunciation by referencing a table like this (as I know many autodidacts on the internet do), you may think that Russian has a monophthongal [o] sound, much like Spanish.
It is true that Russian [o] may be monophthongal, but Russian [o] is also often diphthongized! I had noticed this for years in listening to Russian, but it was only by looking up 'Russian o diphthong' and talking to Russian-speaking linguists that I found any sources on it. You'll definitely not find anything on it in beginner textbooks on Russian.
The /o/ vowel is a diphthongoid, with a closer lip rounding at the beginning of the vowel that gets progressively weaker [ᶷo] or even [ᶷɔᶺ], particularly when occurring word-initially or word-finally under the stress, e.g. očen' [ˈᶷoˑt͡ʃjɪn̠ʲ] ‘very’, okna [ˈᶷɔᶺkn̪ə] ‘windows’, moloko [məɫ̪ʌˈkᶷɔᶺ] ‘milk’.
(2015) "Illustrations of the IPA", Yanushevskaya and Bunčić. h/t to prikaz_da
The Swedish sj-sound is a drastic case of IPA misleading. If you have ever heard that the sj-sound is a 'coarticulation of [ʃ] and [x]', you may be entitled to financial compensation! Or at least, linguistic compensation, because it is not actually a coarticulation of [ʃ] and [x]. It's often something much simpler - a voiceless 'wh' [ʍ] (like in Southern American English), a labialized [xʷ], a [ʃ] or [ɕ]. Lindblad even offers up a velarized and labialized labiodental [fˠʷ] as a more likely pronunciation. The video below demonstrates:
I was never able to 'coarticulate' [ʃ] and [x], but a Swedish native speaker once told me that I had a pretty good approximation of 'sj'-sound. My approximation was based not off reading IPA descriptions but off listening to Swedish. I ended up doing something like [çʷ] or [xʷ], because that was what I heard.
This does not mean you shouldn't ever use the IPA as a guide. Don't take broad IPA transcriptions as the final word, especially if you are just seeing a table of phonemes. If you notice that you hear something that does not seem to be in the standard IPA transcription, trust your ears! Look for articles on the phonetics of that sound if you can. Many descriptions of languages are old, or put together by someone on Wikipedia, and while there are many great Wikipedia editors, they may make mistakes or omit information for brevity's sake.
I'll finish with a quote from one of my phonetics professors - "write what you heard, not what you think you heard." What are some examples of pronunciations you've noticed that don't match common descriptions of the language?
Although Europeans dominate metal music, the lingua franca of the genre is still English. And not only is it English, but many subgenres of metal rely on specific vocabulary (fantastical and Tolkien-esque, or macabre and deathly). This leads to a lot of interesting little quirks of pronunciation and grammar. I've already documented some examples in the non-metal pop band, ABBA. I don't want to make fun of these singers or lyricists, as writing songs in a different language is difficult, and these mistakes are harmless - I just find them interesting and want to share. Here are some I've noticed from a handful of bands:
Nightwish, a symphonic metal band from Finland: their earliest records, understandably, had more L2 errors than later ones.
"And the [p]ath under my bare feet... the [e]lven [p]ath" - Pronouncing the 'p' sound without aspiration makes it sound like the "elven bath." Finnish does not have aspirated consonants, so it sounds like singer Tarja is transferring Finnish rules to English.
"Songs as a SED-uction of sirens" - Writer Tuomas appears to have thought that 'seduction' has the stress on the first syllable, and Tarja sings it with an unexpected 'eh' vowel.
"The unc[e]rven path" - A spelling pronunciation from Tarja, perhaps by analogy with words like 'care' [ker].
"The moonwitch took me TO a ride on a broomstick" - The expression in English is either "took me on a ride" or "took me for a ride." There is no expression "take to a ride."
"You stand a[k]used of robbery" - A lack of aspiration and no 'y' sound here (a spelling pronunciation?) makes this sound like "You stand a goose of robbery."
Burning Witches, a power metal band from Switzerland:
"Just stories on tape-stries" - a spelling pronunciation dividing 'tapestry' up not as 'ta - pes - try' but as 'tape - stry.'
Sonata Arctica, a power metal band from Finland, has relatively good pronunciation, but the writer struggles with stylistically appropriate English.
"Find a barn which to sleep in, but can he hide anymore?" - The use of 'anymore' without a negative sounds odd to me, especially in a question, but some people do use the word like this. If you're a 'positive anymore' user, does this sound grammatical to you? The 'barn which to sleep in' is clumsy. Stylistically you would prefer 'Find a barn to sleep in' (no linking word necessary).
"Knock on the door and scream that is soon ending" - lack of article on both 'knock' and 'scream'.
Share your favorite moments of L2 errors in metal or other genres in the comments!
This is such a micro music linguistic trend that I have a hard time justifying writing about it, but I've now heard it from several different artists and feel compelled to include it here.
It's simple: take a word with a short i [ɪ] sound, like 'bit'. Now replace that vowel with a long 'ee' sound [i], so it sounds like 'beat'. You don't have to go all the way to the [i] sound - you can be somewhere in between - but the end result sounds more like the long 'ee' than the short 'ih'.
The scant handful of examples:
Florence and the Machine. What kind of man loves like th[i]s?
Adele. Chasing Pavements. "Th[i]s is love."
Natalia Kills. Television. "We'll never go to heaven but who needs to when you l[i]ve this good."
I had originally included these in my article on Indie Voice, but after some feedback, I decided to remove them since my examples leaned more towards pop and pop-indie, and I couldn't really say I'd heard them enough to justify calling it part of the Indie Voice cluster. But if it's not indie voice, what is it?
I have so few examples that coming up with any serious explanation isn't likely, but we can speculate. My immediate thought is that lengthening and raising the vowel in 'BIT' is something that many speakers of foreign languages do, because they don't have a short 'ih' vowel and the closest available one to them is 'ee'. It's common for Spanish speakers to confuse 'ship' and 'sheep'.
Singers can be influenced by singers with other accents, including singers who speak English as a second language. Marina Diamandis is one such example who sounds like she's imitating Greek or Spanish speakers. I also hypothesized that influence from jazz and bossa nova was part of the Indie Voice sound. If you want to use the fancy terms, these pronunciations are "linguistic resources" that singers can draw from when singing. Imagine listening to someone who speaks differently from you, hearing a pronunciation you find cool, and going "I'm gonna save that for later..." and putting it in a sort of phonetic palette. (hat tip to Lisa Jansen for introducing me to the "linguistic resource" concept in her book on pop and rock pronunciation.)
Now, here's the thing that's bugging me... the ship-sheep confusion isn't considered cool by most people. It's one of the most obvious signs that you have a foreign accent, and it's one of the things second-language speakers of English focus on when trying to reduce their accent. On the flipside, it's one of the features that comedians like to exaggerate when mimicking a foreign accent. A quick example comes from the song "Illegal Alien" by Genesis, where Phil Collins attempts to imitate a Mexican accent:
"With a bottle of Tequila, and a new pack of c[i]garettes." - Illegal Alien, Genesis
One day, I'll do a little mini-dissection on "Illegal Alien," because wow does this song have some interesting ideas on what Spanish-accented English sounds like. But I hope this gets the point across that this feature is ripe for mimicry and caricature.
Listening to Florence and Adele and Natalia, I don't really think they sound or want to sound foreign. In the words with the 'ih'-lengthening, the actual musical duration is also longer than the surrounding words. "What kind of man loves like thiiiiis?" "Thiiiiiis is love." "We'll never go to heaven but who needs to when you liiiive this good." Could it be easier to sing it by raising the vowel? This would be in contrast to the pop pronunciations of HAPPY with a short 'ih' at the end and HAPPY with an 'ey' sound at the end - singers claim that 'ih' and 'ey', which are lower in the mouth, are easier to sing than the 'ee' they replace.
Could be a slip of the tongue, but Florence repeats this pronunciation in every instance of her chorus, so that's a whole lot of slips of the tongue that made it to the final cut.
If imitation is out, and there's no clear phonetic or musical motivation, then we're left with the fuzziest reason of all - aesthetics. Is there some kind of aesthetic linked to pronouncing 'this' as 'thees' that has nothing to do with foreign speakers of English? The fact that these pronunciations only show up at certain parts of the song, instead of replacing every 'ih' throughout, suggests that they're sort of special. It could be a type of marking, bringing attention to these syllables by breaking our expectations of which vowel 'belongs' there.
This is still all speculation, but sometimes that's the fun stuff. Do you have any other examples like this, from other genres? What do you think is motivating this pronunciation?
I've recently been delving deeper into the human voice and the actual anatomy behind it. Most of the audio I look at is sung speech, so I've been wanting to further categorize aspects of sung speech that may be relevant. Vocal science and voice descriptions are quite opaque, and descriptions of voices can vary wildly from one source to another. A big part of this is also because it is still not entirely clear how the vocal mechanism functions, and what anatomical configuration results in certain vocal colors.
Even with this lack of consensus, one can find useful ideas. One recent idea I've found that explains a fair amount is that there are two types of resonance that singers naturally find when singing. One is based around the 'oo' sound: imagine a loud 'woo-hoo' call. The other is based around the 'eh' sound: imagine calling out 'hey'.
The 'eh' type of resonance results in a voice that sounds brassy and even shrill. It is prominent in pop music, which I cover a lot. I suspect that the 'eh' type of resonance is also easily found with the 'oh' sound and the 'uh' sound: sounds that are mid-way between the top and the bottom of the mouth.
When I use 'eh' resonance, I can feel a vibration above the back of my tongue. Achieving this vibration is very easy if I use mid-height vowels like [e], [o], and [ə]. The farther away you go from these vowels, the harder it is, but lower is even harder. [ɪ] is harder than [e], but not by much, while [ɛ] requires some effort to get that brassy resonance. [a] is very hard to achieve, and requires a weird tongue posture where the front is low in the jaw while the back is raised.
I was never taught how to achieve this resonance; I simply noticed that when I sang on [e] and [o] vowels, my voice seemed to carry farther without having to use more air. From there, I tried modifying vowels to get them closer to mid-height where I could get away with it:
[ɛ] got shifted up to [e]
[ɪ] got lowered to [e] as well.
[ʌ] could sound nigh-on Southern as [ɜ]
[u] didn't decrease in height, but I did push it forward to [ʉ], which let me keep the back of the tongue high.
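To make these modifications concrete, here is a small sketch that applies the four shifts listed above to a broad IPA string. The dictionary and function names are hypothetical, and the character-by-character replacement is a simplification that only works because each of these vowels is a single symbol.

```python
# The four shifts from the list above, as a lookup table.
POP_VOWEL_SHIFTS = {
    "ɛ": "e",  # raised toward mid height
    "ɪ": "e",  # lowered toward mid height
    "ʌ": "ɜ",
    "u": "ʉ",  # same height, pushed forward
}


def apply_pop_vowels(ipa):
    """Swap each listed vowel for its modified version, one symbol at a time."""
    return "".join(POP_VOWEL_SHIFTS.get(ch, ch) for ch in ipa)


print(apply_pop_vowels("spɛʃəl"))  # a broad transcription of 'special'
```

Real singers apply these shifts selectively ("where I could get away with it"), so a blanket replacement like this overstates how regular the pattern is.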
What is the impact of this? Well, it means that when analyzing a piece of sung text, we have to keep these vowel modifications in mind. Here's an example from a past Dialect Dissection: Britney Spears raising the vowel in BED:
"You used to say that I was sp[e]cial" - What You See Is What You Get
In the Britney Spears article, I listed this as a possible example of Southern influence. And it could be! But it could also be an example of a Pop Vowel Modification (time to name things :)). Considering that Britney is actually from the South and has a Southern accent in real life, I would give points to the "is using her Southern accent" theory. She's just also using features that work well within the aesthetic constraints of pop music.
This also plays a role in the popularization of ME-breaking and HAPPY-breaking (and to a lesser extent, HAPPY-laxing): we move from vowels that are harder to get 'eh' type resonance on ([i] and [ɪ]) to vowels that are easier to get resonance on ([i] moving to [ɪ], and [ɪ] moving to [ej]). This feature began as part of actual, real life accents: Southern American English (white and Black) and working class London English. It just so happens that they fit the Pop Vowel Modification.
I believe that music can be a great vehicle for understanding language and linguistic change. But in order to make use of it, we need to understand how sung speech differs from spoken speech, and how the pressures on sung speech are different from the pressures on spoken speech. To that end, I'll continue studying this area and writing about it.
Language in music does not just reflect spoken speech, but actually creates new forms that exist mostly in music. One of the most well-known examples of a music-locked language variety is "Indie Voice," also called "Cursive Singing," "Indie (Girl) Voice" or "hip singing." The style has disparate roots, but began to crystallize and gain media attention in the 2010s. It has proven to have staying power, both as a musical tool for singers to draw on, and as a distinct entity that lay people can point out and imitate.
Cursive singing is a huge topic I have attempted to cover before. I want to revisit it because cursive singing still has a lot of staying power in music fans' imaginations, and the history of cursive singing demonstrates sociolinguistic concepts far better than a dry lecture would.
Cursive singing is a name for a style of singing that has also been called "indie girl/boy voice," "indie pop voice," and "hip singing." It is associated with a breathy voice, vocal fry, distinct vowel choices, and a thin, delicate style of singing. We are going to be focusing on the linguistic aspects of cursive singing. We won't be approaching cursive singing from a vocal pedagogy or music theory standpoint because each one of those could probably take up their own article!
The earliest reference I can find online to any sort of 'indie voice' is this 2009 (archived) review:
The record's spare production helps keep it from dating, but what really works today is Bunyan's soft, fragile Peter Pan voice. I imagine her understated whisper sounded out of step in its own time but now it sounds like a founding document of a certain school of indie singing. [NOTE - the album they're reviewing came out in 2005]
After 2009, the topic of 'indie voice' lies dormant until 2013, when we start getting attempts to definitively name this vocal style. One of the earliest to gain traction is "hip singing." "Hip singing" was coined by YouTuber Madeline Roberts in her video "How to Hip-Sing." This term is not as popular anymore, but you can still hear some people refer to cursive singing as a "hip style" of singing.
"Indie girl voice" was an explicitly gendered way to refer to this style of singing. It alternated with "indie girl singing" and "indie girl style," all centering the idea that this style is particular to "indie girls." There are gender neutral and male-coded versions as well, like "indie voice," "indie boy voice," and (rarely) "indie guy voice." There is no one originator for this term, but an early online reference comes from a Straight Dope thread (archive), showing it was familiar to users by 2014. The association with "indie girls" was strengthened by a famous 2015 vine of Chrish imitating an "indie girl" welcoming the viewer to her kitchen. MTV journalist Molly Beauchemin gave the term a boost in 2016 when she compared it to "emo boy voice."
"Indie pop voice" appears to have been coined by Buzzfeed journalist Reggie Ugwu, referring to a particular version of "indie voice" that affected not just underground singers, but mainstream pop stars. His term was adopted by others looking into this phenomenon afterwards, like musician and blogger Kelly Hoppenjam.
"Cursive singing" is the newest to the game. The origin of this dates to fans typing out lyrics in "indie voice" using italics, an overabundance of diacritics, or a cursive font.
bæheby yöu should go and fõåucck yôhuorsęælf (Source)
This typographical convention was reinforced by the fact that cursive can be extremely ornate, delicate, and hard to understand - characteristics that people associated with "indie voice." We can see how the typography began to evolve from merely denoting indie girl voice to becoming the very name of it in the following examples (documented on Know Your Meme):
On July 8th, 2018, Redditor barihakiim posted "The SZA jokes where people say she sings in hieroglyphics and italics will forever be funny to me. No matter how much of a fan I am.😂" to r/sza. Redditor FKAnugs91 responded by saying, "When TDE first released that she lost her voice someone commented 'well if she stopped singing in cursive maybe she’d still have her voice' I died." On September 13th, Aries672 asked a LipstickAlley forum "Why Are Singers Singing In Cursive Now? What is this new style that Jorja Smith, SZA, FKA Twigs and etc sing in and what is the purpose?"
"Singing in cursive" overtook "indie voice" as a search term in 2019, when "singing in cursive" became a popular challenge on the app TikTok. I'll be using both "indie voice" and "cursive singing" to discuss this style of singing, since they are both very popular and neither is explicitly gendered.
Why do people sound different when they sing?
We don't speak exactly the same way in every situation. Imagine you're giving a very important presentation to some very important people. You will likely use different vocabulary, grammar, and even tone of voice compared to when you're hanging out with friends. These two different ways of speaking can be called registers. A single person will use many registers throughout their lives as they encounter different situations.
Registers can also exist in music. You use very different vowels for classical operatic singing than you do when singing a bluesy rock song. Think of how unusual it would sound if you used your operatic vowels to sing the blues, and vice versa. We can say that musical genre and register reinforce each other.
Cursive singing is a register of sung speech used in a variety of musical genres, such as folk, rock, and pop. The overall goal is a feeling of intimacy and vulnerability, which is why it's often paired with breathy voice and glottal fry. There are other registers that make use of these vocal techniques: r&b and pop singers use breathy voice and glottal fry to create an intimate scene with the listener (think of Mariah Carey on 'Touch My Body') or to sound vulnerable (Britney Spears in 'Oops! I Did It Again').
But the r&b and pop registers draw heavily from African American Vernacular English and Southern American English. They also, musically, tend towards pomp, bombast, and virtuosity. Cursive singing does not take from the Mississippi Delta in such a literal way, and it's much more conservative in its vocal range.
Now that we know what registers are, and how indie voice is no different from 'blues voice' or 'opera voice', let's talk about the qualities that actually make up indie voice.
The sounds
There are many different features that make up indie voice, and not every 'indie voice'-using singer uses all of them! Some of these features group together, so that you can talk about using a certain bundle of sounds.
We're going to talk about these bundles of sounds, where they came from, and who has them. And a note - this list covers a lot of features of cursive singing, but it is not exhaustive. Don't be surprised if future cursive-ologists find even more features and make even more fine-tuned distinctions!
The English/Aussie Bundle
This first bundle is going to sound very familiar to any readers from Southern England or Australia. Indeed, a lot of these are just straight-up features of Australian and Southern English English:
/aɪ/ → [ɑɪ]
The first element of the diphthong in words like RIDE is pronounced further back, sounding like 'royd'. This feature appears in London English (Wells 1982:308), as well as Australian, New Zealand, and some New York accents.
"Caught up in m[ɑɪ] job." - Love Yourself (live), Halsey (2016)
"If love is a l[ɑɪ]." - I Don't Wanna Grow Up, Bebe Rexha (2015)
"So I heard you are m[ɑɪ] sister's friend." – I Don't Know My Name, Grace Vanderwaal (2016)
/ʌ/ → [a]
The vowel in words like STRUT is pronounced lower and more central, like 'a' in Spanish. This is a feature of older varieties of Received Pronunciation (Wells 1982:291-292) and Australian English.
"This is l[a]ve b[a]t" – Chasing Pavements, Adele (2008)
"Acting [a]p" – Single Ladies, PomplamooseMusic (2009)
"L[a]cky, lucky me" – Lucky, Kat Edmonson (2012)
"You are not ab[a]ve me" – The Room Song, Allie Goertz (2013)
"You still hit my phone [a]p" - Love Yourself, Halsey (2016)
/eɪ/ → [æɪ]
The vowel in FACE is lowered to sound more like FAEICE, or even "FICE." This Australian feature (source) is also found in New Zealand English, Cockney English (Wells 1982:307) and some Southern American accents.
"They were infl[æɪ]med" – Lemonade, CocoRosie (live) (2010)
"That boy's got my heart in a silver c[æɪ]ge" – Crave you, Flight Facilities ft. Giselle (2010)
"The rules of the g[æɪ]me" – I Don't Know My Name, Grace Vanderwaal (2016)
How did these features become associated with indie voice? We can trace part of it back to England in the 70s and 80s. While there had been successful English acts before then, many of them totally Americanized their accents, like the Rolling Stones, or only kept some of their features, like the Beatles. Starting in the 70s, English punk rock bands took a different approach by embracing working class Southern English accents completely. Peter Trudgill famously wrote about this in "Acts of Conflicting Identity:"
'Punk-rock' singers, like their antecedents, modify their pronunciation when singing. Analysis of their pronunciation, however, shows that there has been a reduction in the use of the 'American' features discussed above [flapping words with 't', using a flat 'æ' in words like dance, pronouncing 'r's after vowels], although they are still used, and an introduction of features associated with low-prestige south of England accents.
Many of these punk rock and new wave bands went on to be successful internationally, with recognizable hit songs. The Sex Pistols and the Human League, for instance, both sang with unapologetically English accents. The American music machine was ultimately able to reclaim its crown, but the result was that there were now three available models for global pop music: white standard American English, Black American English, and 'low-prestige south of England accents.' These English accents were then imitated by people from outside of England, and other people began to imitate those imitators. One such imitator is Billie Joe from Green Day, who undoubtedly became imitated by his own fans in the future:
Delivered in a halting Joe Strummer-like baritone (“I’m an American guy faking an English accent faking an American accent,” Billie Joe jokes) and framed by lean guitar parts and melodic bass lines, Green Day’s songs are the polar opposite of the fuzz-toned Seattle sound.
(Source 1, Source 2)
You don't have to be directly influenced by punk rockers to be influenced by what they ultimately stood for. They believed that you didn't need to sand down the corners of your accent to fit into a corporate, cosmopolitan box, and if you're gonna speak the truth, everyone better be prepared to listen regardless of what accent it's delivered in. Doubtless these bands inspired future English rock and pop acts to feel more comfortable using the more English parts of their speech in song. Meanwhile, there were non-English folks listening who were inspired by these sounds, and borrowed features.
For example, some commenters informed me that there were a number of Australian indie bands in the 90s that found some worldwide popularity. The most mentioned one was Frente!, a band which found popularity in the early 1990s. The frontwoman of the band, Angie Hart, was influenced by English New Wave band, the Cure. While it probably wasn't a straightforward thing, listening to a rock frontman sing in an English accent probably didn't hurt her choice to sing in her Australian accent. You can hear her and a bandmate using some of these features here.
"And all the t[ɑɪ]mes you've been alone" - No Time, Frente (1992)
"Somebody's ch[æɪ]nged the deal" - Dangerous, Frente (1992)
"Don't sm[ɑɪ]le... don't tr[ɑɪ]... One-nine-oh then m[a]ch too low" - 1.9.0, Frente (1992)
If cursive singing is supposedly influenced by English or even Australian bands, then why does it only have some features occasionally, and not just sound like a straight imitation of them?
Let's pretend you're a singer, and you're a huge fan of Frente! and you're trying to imitate them. Most people can't reproduce another accent perfectly, so you, hypothetical Frente! fan, only copy some of the more salient features, like PRICE-backing, STRUT-centering, and FACE-lowering. Now let's say you got real popular, and you have fans that try to copy the way you sing. Those fans are also imperfect hearers and producers. Some of those fans might only pick up on PRICE-backing and FACE-lowering, and totally ignore STRUT-centering, while other fans may only pick up on FACE-lowering. Some fans may faithfully reproduce all of these features, but some may miss out on them entirely! And so the original full vowel set of Australian English becomes spread across musicians imitating musicians imperfectly, over and over.
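The thought experiment above can be played out as a tiny simulation. This is a deliberately crude sketch: the function names are invented, it follows a single chain of imitators rather than a whole fanbase, and the 70% chance of picking up each feature is an arbitrary number chosen for illustration, not something measured.

```python
import random


def imitate(features, keep_probability, rng):
    """An imperfect imitator picks up each feature independently."""
    # Sorting keeps the run reproducible regardless of set ordering.
    return {f for f in sorted(features) if rng.random() < keep_probability}


def simulate(generations, keep_probability=0.7, seed=0):
    """Follow one chain of singers, each copying the one before."""
    rng = random.Random(seed)
    current = {"PRICE-backing", "STRUT-centering", "FACE-lowering"}
    history = [current]
    for _ in range(generations):
        current = imitate(current, keep_probability, rng)
        history.append(current)
    return history


for generation, features in enumerate(simulate(5)):
    print(generation, sorted(features))
```

Because features can only be dropped in this model, never reinvented, each generation's set is a subset of the one before it. Running it with different seeds gives different survivors, which is the "some fans pick up on this, other fans pick up on that" effect in miniature.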
My tentative explanation is there was a sort of diffusion happening here where English, Australian, and New Zealand-accented bands became popular, and their listeners, in trying to fit in with them, copied different parts of their accents.
The features that were most copied seem to be ones that can be found in other accents. The low FACE vowel, the backed PRICE vowel, and raised STRUT vowel can be found in multiple different and well-known varieties of English. All three of these can be found in Southern English, Australian, and New Zealand accents. Low FACE vowel is also part of Southern American English, and backed PRICE vowel can be heard in New York English. (It was much harder for me to find examples of the KIT-tensing change. The 'ih' to 'ee' change is also significantly rarer. Australian English is the only major English variety I can think of with that change - and even there, it's not that common.)
We can explain this with a concept from language contact studies: convergence. "Language convergence often results in the increased frequency of preexisting patterns in a language; if one feature is present in two languages in contact, convergence results in increased use and cross-linguistic similarity of the parallel feature" (Hickey, 2010, and the University of Manchester).
This bundle is very common in indie voice, but it's not the only one. There's another bundle, and this one is much more controversial.
The Diphthongized Bundle
This bundle is the one where a single vowel sounds like it's had an 'ih' added afterwards. 'Good' becomes 'guid,' 'touch' becomes 'tuitch', and 'breath' becomes 'breyth'. One vowel turning into two is called diphthongization. Diphthongization can be very obvious or extremely subtle, but either way the vowels no longer sound pure. These diphthongs are closing diphthongs – they go from a low vowel to a high vowel. The one exception is /ʊ/ → [ʊɪ], where the tongue stays at the same height as it moves forward. Let's take a listen to some of these stretched-out vowels:
/ɛ/ → [ɛɪ]
The vowel in words like "dress" has a short 'ih' added to it at the end.
"Nearly put to d[ɛɪ]th" – Lemonade (live), CocoRosie (2010)
"I must conf[ɛɪ]ss, when I wear this dr[ɛɪ]ss" – Stuck On You, Meiko (2013)
"I don't ever think about d[ɛɪ]th" – Glory and Gore, Lorde (2013)
"Carves into my hollow ch[ɛɪ]st" – Drive, Halsey (2015)
/ʌ/ → [ʌɪ]
The vowel in words like "just" has a short 'ih' added to it at the end.
"B[ʌɪ]t ships are fallible, I say." - Bridges and Balloons, Joanna Newsom (2004)
"I cannot r[ʌɪ]n now." - Wake Up Alone, Amy Winehouse (2006)
"She's up all night for good f[ʌɪ]n." - Get Lucky, Daughter (2013)
"I’ll be the [wʌɪ]n." - Timber, Pitbull ft. Kesha (2013)
"J[ʌɪ]st let me be." - I Don’t Wanna Grow Up, Bebe Rexha (2015)
"...cold to the t[ʌɪ]ch." - Stitches, Shawn Mendes (2015)
"...you look that m[ʌɪ]ch." - Love Yourself (live), Halsey (2016)
Other: /ʊ/ → [ʊɪ], /ɑ/ → [ɑɪ], /ɔ/ → [ɔɪ]
The vowels in "book," "spa," and "caught" respectively have a short 'ih' added on to them at the end. Note that "on" appears here with two different representations because the singers have different pronunciations.
"I just wanna look g[ʊɪ]d for you." - Good For You, Selena Gomez (2015)
"then you swore [ɑɪ]n." - Our Own House, MisterWives (2015)
"If you think that I'm still holding [ɔɪ]n [...] and baby I be moving [ɔɪ]n." - Love Yourself (live), Halsey (2016)
R-Vocalization. Here, the r sound is replaced with an 'ee' [i] or 'ih' [ɪ] sound.
"Even if it leads nowhe[i]." - Chasing Pavements, Adele (2008)
"Witness to the arc tow[ɔɪ]ds the sun." - Don't Carry It All, The Decemberists (2011)
"Never saw you befo[ɪ] [...] let me show you the do[ɪ]." – The Room Song, Allie Goertz (2013)
"I've never seen anybody do the things you do befo[ɪ]." - Dance Monkey, Tones and I (2019)
The diphthongized bundle has been a hot topic of discussion among people interested in indie voice. I've heard a lot of different theories of where it comes from, with one of the most popular being that it was just made up by singers to sing easier, or to stand out from the crowd in an effort at personal branding. But it turns out that there are real-world people who have this type of diphthongization. Dialect expert John Wells noticed something like this happening as far back as 1980:
The vowels /ɪ, ɛ, ʊ, ʌ/, while normally monophthongal, tend to have centring-diphthong allophones when prosodically salient and when in the environment of a following final voiced consonant, thus He's wearing a 'bib' [bɪəb]! In the east, this variant is not very widespread except in the South Midland area; in the United States as a whole it seems to grow commoner as one moves further towards the West and South. Illustrations involving /ɛ, ʊ, ʌ/ are 'bed' [bɛəd], 'good' [gʊəd], 'rub' [rʌəb]. Under the same environmental and prosodic constraints, /æ/ may be found with diphthongal realizations of the [æə] and [æɨ] types [...] Before /ʃ/ and /ʒ/, certain vowels have variants involving an assimilatory off-glide to the [ɪ] area. This particularly affects /ɛ, æ, ʊ, ɔ/ and is associated with the south midland region (and the south); it is common in the mid and far west. Examples include measure [mɛɪʒɚ], splash [splæɪʃ], push [pʊɪʃ], wash [wɔɪʃ].
Translation: some American accents add an 'uh' vowel to words with short vowels, so 'bib' becomes 'bi-uhb', and others add a short 'ih' vowel before the 'sh' and 'zh' sounds, so you get 'puish' instead of 'push' and 'may-zher' instead of 'measure.' It's not exactly what's happening with our indie singers, but it's in the same ballpark, and shows that diphthongization of these vowels has been happening for several decades in some types of American English.
This may have been happening as far back as the 1940s. Famed jazz singer Billie Holiday has this feature. Plenty of critics have noticed that indie singers sound like they may be imitating her, and this feature - while not exactly the same - may be why:
"Take my li[ə]ps, I want to lose the[ə]m. Take my arms, I'll ne[ə]ver use the[ə]m." - All of Me, Billie Holiday (1941)
I have found spoken examples from American English speakers, including a white Southern American man. This counters the idea that this diphthongization is limited to women. It is all the more notable because the speaker in question is a man over 30, a group considered less likely to use novel linguistic forms (Coates 1993, Chambers 1994).
"I don't think you think it's goo[ɪ]d." - Emily Procter playing Ainsley Hayes on the West Wing (2000s). Her accent on the show is her real accent too.
"It has cuts and nasty bits, bu[ɪ]t I used this Angelus product..." - American Duchess (2013)
"And I don't think anyone would wear a velvet and vinyl helmet and expect it to keep them alive in outer space... Bu[ɪ]t it was kind of closely mimicking these astronauts' pressure suits." - Sarah Jean Culbreth (2019)
"It's no[ɪ]t." - Contestants on 'Something Borrowed, Something New' (2014-2015)
"I'm hoping it's yellow pvc... bu[ɪ]t." - Roger Wakefield (2020).
Cursive diphthongization is therefore not disconnected from everyday speech: it is based on a process that occurs in the speech of people who are not indie singers! It is a relatively subtle sound shift, such that I would not have noticed it had I not been listening for examples of it. Perhaps that is how 'real life' examples of indie voice have gone unnoticed for so long.
There is also an interesting articulatory explanation for Indie Voice diphthongization. This paper puts forward the explanation that Indie Voice singers are using pharyngealization when singing, and that this forces them to move the tongue more when singing. This results in an 'ih' sound being produced, because the 'ih' is made close to the front of the mouth. It also explains the R-vocalization mentioned above.
Explanation of the paper (quoted passages, each followed by my commentary):
First, we found that front-rising diphthongs can occur after any vowel except /i/, /ɪ/, or /u/, and most occur before coronal consonants. We took this to suggest that the diphthongs are prolonged audible transitions between the tongue’s vocalic position and the articulatory target for the following consonant.
As mentioned above, the diphthongization mostly happens before a consonant that's made in the front of the mouth, but not the lips (n, l, t, d). The authors are suggesting that the diphthong is a result of the tongue taking a longer time to move from the vowel to the consonant, resulting in an "audible transition."
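The conditioning described in that passage can be written out as a toy rule. This is only a sketch under loud assumptions: the function name is invented, the coronal set is limited to the four consonants listed above, and the rule is treated as categorical even though the authors only say that most of the diphthongs occur before coronals.

```python
# Vowels the quoted passage says do not take the front-rising off-glide.
EXCLUDED_VOWELS = {"i", "ɪ", "u"}
# Only the coronal consonants named in the commentary above (an incomplete set).
CORONALS = {"n", "l", "t", "d"}

def diphthongize(vowel, following):
    """Toy rule: add an [ɪ] off-glide to an eligible vowel before a coronal."""
    if vowel not in EXCLUDED_VOWELS and following in CORONALS:
        return vowel + "ɪ"
    return vowel

print(diphthongize("ʊ", "d"))  # as in "good": eligible vowel before a coronal
print(diphthongize("ɪ", "t"))  # excluded vowel, left unchanged
print(diphthongize("ʌ", "b"))  # no coronal follows, left unchanged
```

Real singers are far less tidy than this, but the sketch shows the shape of the claim: which vowel it is and what consonant comes next both matter.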
Second, we observed a pervasive pharyngeal sound. Reinforcing this, we observed that, in some artists, /r/ sounds were realized postvocalically as high-front vowels, which could indicate that the pharyngeal component of /r/ was not distinctive in that environment.
The authors heard that the singers had a sound that might be the result of constricting the pharynx (the part of the throat behind the mouth and nasal cavity). As evidence, they offer that some singers pronounced 'r' as 'i', which could be a result of pharyngeal constriction making 'r's sound less distinctive.
Pharyngeal constriction can be achieved via various articulations, but we interpreted these initial observations as consistent with retraction of the tongue body. In addition to reducing pharyngeal volume, this could also prolong the transitions between vowels and coronal consonants, simply by increasing the physical distance the tongue must travel.
There is more than one way to constrict the pharynx, but one way is pulling the tongue back. Pulling the tongue back has the effect of making the transition between vowels and consonants longer because the tongue has to move farther than normal. Any vocal coaches out there will also note that pulling the tongue back has an audible effect on the coloring of the voice.
The pharyngealization explanation is quite intriguing, and I'm especially interested in the way that it seems to account for the very unusual pronunciation of 'before' as 'befoy.' If pharyngealization is the root cause of these diphthongs, then maybe we should rename this bundle the pharyngealized bundle.
In models of sound change, the initial stage might be more physiologically driven, but the resulting shift is later adopted as phonology. Similarly, while indie-pop’s distinctive diphthongs may originally have been a by-product of an articulatory setting, they have since been adopted as part of a musical style. For example, one online tutorial instructs the viewer to “add the letter i... after vowels”.
The authors note that what may have originally been a by-product of retracting the tongue has become noticed and adopted on purpose.
That being said, this is only one paper, and the authors suggest that further research is required before we can say with any certainty that pharyngealization is behind the distinctive Indie Voice sound. They only examined 5 artists, so clearly we would need to see if these results are reproducible in larger samples, and among different styles of cursive indie singing. Nevertheless, it is a promising line of inquiry, and hopefully one that future linguists will look into.
Diphthongization Bundle in Music History
Though this assimilatory 'ih' sound may be as much as forty years old, its musical lineage appears to be more recent. One of the earliest examples of this diphthongization that I've found is Elizabeth Fraser, the Scottish singer of the Cocteau Twins.
(Fraser also has some of the other features of cursive singing, such as an [a] vowel for TRAP words, but that is not surprising since it is typical of Scottish English).
She is known for highly stylized, difficult-to-understand singing. I found two examples of her using diphthongization:
"Fearless on my bre[ɪ]th" - Teardrop
"My dreams are erotic(?) sick and must be addre[ɪ]ssed" - Fotzepolitic
One singer who is a common example of 'indie voice' or 'cursive singing' is Australian songwriter Sia, and she does display some diphthongization. Sia had both a solo career and a well-known career as a songwriter and demo singer for other pop stars, so she is a major influence on modern pop music:
“You will be loved by someone goo[ɪ]d” - Sia, You Have Been Loved (2007)
Another major artist to use this diphthongization is Adele, in her 2011 album "21" and 2012 song "Skyfall."
“So overdue I own the[ɪ]m, swept away I’m stole[ɪ]n” - Skyfall, Adele
Lorde is another one who used it circa 2012. Lorde's debut "Pure Heroine" was culturally influential, so it's possible that she was a major vector for the spread of this pronunciation.
“I've never seen a diamond in the fle[ɪ]sh” - Royals
"I don't ever think about de[ɪ]th" - Glory & Gore
At some point, "indie voice" starts spreading from the indie pop/rock/folk scene into mainstream pop music. One notable example of this that has rarely been mentioned before is Kesha, who displays both diphthongization and some of the other features of indie voice:
"I feel it in my bloo[ɪ]d [...] baby when we tou[ɪ]tch" - Supernatural (Deconstructed) [Slowed Down]
"Let's make a n[ɑɪ]t... I'll be the wu[ɪ]n you won't forget" - Timber (noted by Nashville Pop)
"We're t[æɪ]king names" - Blow (Deconstructed)
By 2015, this phenomenon was reaching critical mass. Massively popular pop artists like Selena Gomez and Shawn Mendes were using it in their music, and up-and-coming pop artists like Halsey were using it too. It was in 2015 that the Buzzfeed article about indie voice diphthongization was written, showing that it was attracting a lot of attention.
Although there are examples of the diphthongization bundle from 2017 and beyond, the style begins to attract a lot less attention. This makes sense, as in 2016 the zeitgeist began to move away from the indie-pop influenced vibe that began in 2012 and towards tropical house, Latin music, and most importantly trap. None of these styles easily accommodate the vulnerability of indie voice, and trap has its own linguistic register.
Nevertheless, "indie voice" continues to be used in plenty of indie pop, alt-r&b, and folk music. If you go on Spotify and look at any playlist that comes up for keywords like "indie relaxing chill music," you'll notice that cursive singing is alive and well. "Chill pop" artists like Khalid (a rare male example of cursive singing!) continue to use its phonetic features.
"You know I wish I c[ʊɪ]d [...] do it all in the name of f[ʌɪ]n, f[ʌɪ]n" - Young, Dumb and Broke, Khalid (2017)
The Grab-bag Bundle
This bundle is made of features that aren't easy to trace back. They're not necessarily from one accent or other, and they can combine with the other bundles.
One example is DRESS-lowering, where 'dress' and similar words sound like 'drass': /ɛ/ to [æ]. This shift comes from California English. It is surprisingly common, considering that other aspects of the California shift aren't well represented in indie voice and that it's not a common feature elsewhere, either. It may be copped from "emo boy" or "pop punk voice," which is a topic for a future article.
/ɛ/ → [æ].
"Accidently k[æ]lly street" - Accidently Kelly Street, Frente! (1992)
"I find it's b[æ]tter to be somebody else" - So Much To Say, Dave Matthews Band (1996)
"Summer has come and passed, the innocent can n[æ]ver last" - Wake Me Up When September Ends (live), Green Day (2005)
"Your little brother n[æ]ver tells you but he loves you so" - Colors, Halsey (2015)
"Cuz there's a m[æ]nace in my bed" - Trouble (Stripped), Halsey (2014)
“H[æ]llo from the outside” - Hello, Adele (2015)
Another common feature is words like 'down' being produced with a lowered vowel. In both American English and Southern English English, words with an 'ow' vowel tend to start with a sharper vowel, so you get [æʊ], or even [eʊ] in Australian and New Zealand English. But some indie singers use a softer vowel that is made with a lowered tongue, so you get [aʊ]. This pronunciation is found in some American dialects and in conservative varieties of Received Pronunciation. This may be what a Cracked article referred to as "a really weird pseudo-Estonian affectation among female pop vocalists where they kinda slur together multiple vowel sounds and needlessly add '-ow' phonemes." (They also link to an old version of this article. Hi!)
[æʊ] → [aʊ]
"When I'm d[aʊ]n, I get real d[aʊ]n. When I'm high, I don't come d[aʊ]n." - Issues, Julia Michaels (2017)
"Thinking you could live with[aʊ]t me, thinking you could live with[aʊ]t me" - Without Me, Halsey (2018)
The third common feature is lack of aspiration. This was noted in the "How to Hip Sing" video, where Madeline Roberts explains that "hip singing" involves "soft consonants" - in linguistic terms, they are pronouncing sounds like 'p', 't', and 'k' without a puff of air afterwards. This makes them sound like 'b', 'd', and 'g'.
One of the most well-known examples I can think of is Marina and the Diamonds, who uses this feature frequently. In my earlier article, I attributed it to Marina being influenced by Greek. While I still think that is a passable interpretation, I think the wider phenomenon of unaspirated consonants in Cursive Singing is also due to influence from speakers of English as a second language (also known as L2 speakers).
Specifically, I am reminded of Astrud Gilberto, the Brazilian bossa nova singer who became famous for singing "The Girl From Ipanema" in a hushed, flat, and breathy style reminiscent of modern Indie Voice. Astrud has a noticeable Brazilian accent, and her consonants are unaspirated. I would not be surprised if bossa nova were an influence on Indie Voice, since a number of indie artists seem to be influenced by jazz.
We will note an unaspirated consonant by adding a dash - after it. So "t-all" is the word "tall" but with no aspiration in the 't'.
Lack of aspiration or weakened aspiration in stop consonants [p], [k], and [t].
"T-all and t-an [...] and when she p-asses, each one she p-asses goes ah." - The Girl From Ipanema, Astrud Gilberto
"Here c-omes the sun" - Here Comes The Sun, Nina Simone (1971)
"They go along to t-ake your honey" - Breezeblocks, Alt-J (2012)
"C-an't you see" - Salvatore, Lana Del Rey (2015)
"It's a p-ower, it's a p-ower, it's a p-ower move" - Better Than That, Marina and the Diamonds (2015)
"Somebody get the t-acos" - Drew Barrymore, SZA (2017)
"Now I beg to see you dance just one more t-ime" - Dance Monkey, Tones & I (2019)
Why does indie voice exist?
Let's do a summary:
Indie voice is a group of related registers for singing 'indie' music
Indie voice features come from bundles of English English, pharyngealization, and miscellaneous features that have disseminated
People who use indie voice use features from these bundles, but they don't use every single feature at once
These features all have antecedents in a variety of English or a phonological process - they were likely not independently derived
One of the most popular explanations for why cursive singing exists is that it's a way for singers to 'distinguish themselves.' But this explanation doesn't line up with the aforementioned facts. If you want to distinguish yourself, why try to sound like every other indie singer? Why use the same features they use? Why stick to sounds we're already somewhat more familiar with instead of coming up with something actually unexpected? If instead of adding an 'ih' after vowels, they turned every vowel into an 'er' sound, I would probably remember that more just because it's so unprecedented.
Instead of an attention-based explanation of indie-voice, I propose that indie voice behaves as other registers do - as a way to communicate something to the audience and to signal group membership. Contrary to sticking out, adopting indie voice means a singer is attempting to fit in to the existing crop of singers. This is neither bad nor good - it is simply the way registers work.
In my experience as a singer, singers aren't actually aware of the register they're singing with. They adopt and switch registers unconsciously, the way children pick up the rules of language without needing them explained. In my (anecdotal) experience, getting singers to even realize that they are using a linguistic register is a challenge - they just view it as 'singing in a particular style.'
This is interesting, because it suggests that the spread of indie voice may have been subconscious. It wasn't someone purposefully studying their favorite singer's vowels and then dutifully practicing. It was hours of immersing themselves in a particular register, singing along and imitating, and then continuing with that style afterwards. It's quite a fluid process, and perhaps some people are more open to picking up different linguistic registers than others. The point is that it's not really a put-on or a conscious decision.
Retroactively finding out where you picked up a linguistic style from is also a challenge. One of the artists mentioned in this article, Allie Goertz, actually responded to this article, saying her accent was a result of "copious Beatles, Sundays, and Kinks + having a So Cal dialect." I have no doubt that these artists influenced her music (and the Beatles are an interesting linguistic case all their own), but they are not direct ancestors of indie voice (perhaps there's a topic here for folks to study in the future: how far back can we trace it?). The job of a musician is to create music, not to be a linguistic anthropologist of music tracing the history of every vowel and consonant they've ever uttered. It's not surprising that after subconsciously picking something up, it's not clear where it came from.
So what is the effect of cursive singing on an audience? What sorts of songs get the cursive treatment? If you've listened to every sound sample in this article, first of all good job, and secondly, you'll notice that the general feel is low-key. There's not a lot of upbeat dance songs or hard rock or mumbly trap. Emotional intimacy and vulnerability are recurring themes, even in the cases where the arrangement is dramatic (Adele).
Cursive singing is ornate, but it is not about swagger or braggadocio. It's an encouraging whisper, or an introvert's take on individuality. (The song 'Dance Monkey' is a curious aberration in that sense - it uses indie voice, but in a noticeably aggressive growling style. It borrows the 'phrase-final yodel' from the Daya school of singing, edging it towards the pop direction. The end result is eccentric - delightfully so, in my opinion.)
Cursive singing is supposedly hated - my last article had multiple commenters posting to inform everyone about how much they disliked the style of singing. One of the early sources was a thread about how much everyone hated it. But despite how stigmatized it seems to be, cursive singing hasn't stopped. It appears to have struck a chord with an audience that doesn't care that others find it 'affected' or 'annoying.' Perhaps this is why cursive singing, which I had thought was a dead meme when I wrote my first article, continues to get so much attention - it is meeting an unmet need in audiences.
Conclusion
Linguistic experimentation in spoken speech usually takes place in the realm of words - they're easy to grasp, spell, use - and turn off. But the sounds themselves usually aren't played with so obviously. Real life language is heavily policed, and which accent you speak affects how the world treats you. Sung speech gives us an opportunity to witness experimentation and playfulness in the low-stakes world of music and art.
Indie voice is valuable, not least because it allows us to witness an example of a register developing and spreading. It shows how complex language change is - we used concepts from language contact, articulatory acoustics, and sociolinguistics to explain the origin of indie voice.
There is no way to tell how long indie voice will last. Perhaps some features of it will end up assimilated into the general pop lexicon - the backed PRICE vowel has already solidified itself as a gentler (or brattier) alternative to the monophthongized [a]. The percussive potential of de-aspiration (not to mention its appeal to English L2 speakers) might keep it around for longer, too. These aren't any weirder than pronouncing 'I' as 'ah' and 'me' as 'may', and we've been letting that into our lives for years. Who knows what new possibilities new singers will bring?
Today I present to you a very simple example of stops becoming affricates: /t/ becoming /tʃ/ before a high vowel. This happens commonly across languages. Those of you studying Japanese may know that historically, the sequence /ti/ became [tɕi] and /tu/ became [tsu].
I don't know exactly how common this is in English, but I've found /t/ becoming /tʃ/ before /i/. Example from Phil Collins:
In learning you will [tʃ]eech (teach), and in [tʃ]eeching (teaching) you will learn
Example number two is from the Backstreet Boys. Notice that although there is a /tʃ/ in 'reach' before the 'to', the singer clearly stops and produces a second /tʃ/ sound for "to." I would imagine that this one is influenced by the nearby /tʃ/ in the environment (he doesn't affricate the 't' in 'two worlds' one line before), but it's still neat.
In honor of the spookiest month of the year, I've been watching the Halloween-themed video "Everybody (Backstreet's Back)" by the famous 90s band the Backstreet Boys.
Despite having listened to it for several years, I noticed for the first time this week that the chorus was not "everybody (yeah) rock your body (yeah)", but in fact had a phantom 'yeah' added in: "Everybody yeah (yeah) rock your body yeah (yeah)." I checked several lyrics sites to confirm this and yes, these are the official lyrics. How did I miss this yeah for so long?
Well, the introduction of the song has a very clear extra yeah.
Everybody, yeah. Rock your body, yeah.
ɛvrɪbadɪ jɛə. rɑk jo bɑdɪ jɛə
You can clearly hear that there is a 'y' in that 'yeah'. Compare how they say "body yeah" with the next section, with no 'yeah':
Rock your body right.
rɑk jo bɑdɪ rait
It turns out that this pattern of a lyric plus a 'yeah' actually repeats in every chorus. But I had never noticed because, well, there is some pretty wild smoothing going on:
Everybodye (yeah) rock your bodye (yeah)
ɛvrɪbadɪɛ (jɛə) rɑk jo bɑdɪɛ (jɛə)
Did you catch it? [ɛvrɪbadɪ jɛə] has become [ɛvrɪbadɪɛ] or [ɛvrɪbadɪə]. The [j] has mysteriously disappeared!
It doesn't help that the Backstreet Boys use a lax-HAPPY vowel in 'everybody' and 'body,' which means there is less distance between the 'ih' in 'everybody' and the 'eh' of the swallowed yeah. I think I would have noticed it if they had used a tense HAPPY-vowel - it would have been harder to ignore the difference in vowel quality.
This particular conversion of a falling diphthong to a monophthong is called smoothing, or at least it is in the study of English. Indeed, the only other example of smoothing I can think of happens in RP, where a word like 'fire' /faɪə/ can become [faə].
Is this part of some larger linguistic trend? Not that I can tell. The lax-HAPPY is definitely very typical of the period (as I shall write about soon), but I don't think smoothing of this sort was widespread. This smoothing seems to have been motivated by the melody, which was melismatic in the first chorus ("Everybody, ye-e-ah") and then became syllabic in the following choruses ("everybody-e"). It was easier to reduce the 'yeah' to a monophthong than to try and produce the full form, especially since the full form was repeated by the backing vocals anyway.