Paul Linse | Getty Images
If a simple ha doesn't do the job on Twitter, there's always hahahaaaa, haaaahaaaa, or even hahahahahahahahahahahahaha, indicating that you've just read the funniest thing you've ever seen. (Or that you are a sarcastic talking raccoon.) These are called stretchable or elongated words. Now researchers from the University of Vermont have worked out how widespread they are on Twitter, and have uncovered fascinating patterns in their use.
Stretching is a powerful linguistic device that visually punches up a written word and conveys a wide range of emotions. That goes for a soccer announcer's gooooooaaaaaaal, an exasperated teenager's drawn-out complaint, and an awestruck surfer's elongated exclamation. And booooy are they popular on Twitter. Writing today in PLOS One, the researchers describe how they combed through 100 billion tweets to map how often these words are stretched and how far they are extended: haha versus hahahahaaaa, for example.
Consider dude and its many forms. "It can convey basically anything, like 'Duuuuude, that's terrible,'" says applied mathematician Peter Sheridan Dodds of the University of Vermont, one of the study's co-authors. "Dude!", on the other hand, is different. "It could be excitement; it could be joy," says Dodds.
But not everyone can get behind exclamation marks for emphasis or emotion, yours truly included. "I hate using exclamation marks because they just don't fit my personality," I tell Dodds and his co-author Chris Danforth, also a mathematician at the University of Vermont. But I do stretch words: "I recently found myself, in texts to friends or messages to colleagues, writing Thaaanks with three As to signal a kind of excitement and appreciation without having to use a stupid exclamation mark."
"Only three?" Danforth asks. "That's restraint. Because two wouldn't work. Two is like, this person doesn't know how to spell. They made a mistake."
All right, sooooo, we use stretchy words all the time to convey additional meaning: sadness, anger, excitement. And that can be particularly powerful on a platform like Twitter, whose inherent brevity doesn't exactly encourage nuanced communication. The extra letters add punch to a short message and make it more eye-catching. "They take what we would think of as dictionary text and turn it into something visual," says Danforth. "It can't be ignored if you see 20 in a row."
To quantify this, Dodds, Danforth, and the paper's lead author, computational linguist Tyler Gray of the University of Vermont, took a random sample of 10 percent of all tweets sent between 2008 and 2016, around 100 billion in total. (They have an agreement with Twitter to get this data.) Gray wrote a program that combed the data for stretched words, looking specifically for repeated letters.
First, they wanted to quantify which letters were repeated and how often. Take gooooaaaaal, for example. The program "sees a G and then an O," says Dodds. It also counts the As and Ls. Even if it counts only one G, it will see that the rest of the letters repeat heavily; maybe there are 20 Os and 20 As. "So that seems like a candidate for a stretchy word," Dodds continues.
The system then represents these stretchy candidates in a simple notation. If the G and L are not repeated, as in gooooaaaal, the formula looks like this: g(o)(a)l. Gggooooaaaallll, on the other hand, would look like (g)(o)(a)(l), because every letter is repeated.
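The notation can be sketched in a few lines of Python. This is a minimal illustration, not the program Gray wrote: the function name `kernel` is made up here, and it assumes any run of two or more identical letters counts as a stretch. That means ordinary double letters (as in "hello") would be marked too, and repeated two-letter units such as hahaha are not collapsed.

```python
from itertools import groupby


def kernel(word: str) -> str:
    """Wrap every repeated letter run in parentheses (hypothetical helper)."""
    parts = []
    for letter, run in groupby(word.lower()):
        run_length = len(list(run))
        # Assumption: a run of 2+ identical letters counts as "stretched".
        parts.append(f"({letter})" if run_length > 1 else letter)
    return "".join(parts)


print(kernel("gooooaaaal"))       # g(o)(a)l
print(kernel("Gggooooaaaallll"))  # (g)(o)(a)(l)
```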
This lets the researchers quantify what they call the "balance" of a stretched word. Goooooaaaal is not very balanced because its four letters repeat at different rates. Hahahahaha, on the other hand, is very balanced because H and A repeat at the same frequency. Haaaaa, however, is unbalanced.
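One rough way to put a number on that idea is the normalized entropy of the letter counts. This is only a stand-in chosen for illustration, and the article does not say it is the measure the researchers use; the function name `balance` and the handling of one-letter words are assumptions.

```python
from collections import Counter
from math import log2


def balance(word: str) -> float:
    """Normalized entropy of letter counts: 1.0 means every letter appears equally often."""
    counts = Counter(word.lower())
    if len(counts) < 2:
        return 1.0  # Assumption: a one-letter word is treated as trivially balanced.
    total = sum(counts.values())
    entropy = -sum((n / total) * log2(n / total) for n in counts.values())
    return entropy / log2(len(counts))


for w in ("hahahahaha", "haaaaa", "goooooaaaal"):
    print(w, round(balance(w), 2))
```

Under this stand-in, hahahahaha scores 1.0 because H and A appear equally often, while haaaaa and goooooaaaal both score below 1.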
Tyler J. Gray, Christopher M. Danforth and Peter Sheridan Dodds
The researchers were then able to visualize the average number of repetitions per character, as in the graphic above. Across the different stretched spellings of the word goal on Twitter, the G may repeat once or twice. (Imagine a soccer announcer shouting guh-guh-guh-guh-guh-oal, and how quickly they would be fired.) In this graphic, the vertical axis shows the number of characters in the word and the horizontal axis shows how often particular characters repeat. As you move down from the top of the chart, the word gets longer. Look at G, though, and its frequency doesn't increase much as the word grows. O, A and L, by contrast, repeat more as the word stretches.
This is because the G sound is a plosive, a consonant spoken by stopping the flow of air in your mouth. You can't draw it out like an aaaaah or ooooh. So in the case of the word goal, it is the other letters that do the lengthening, and they tend to lengthen in step with each other. "What we didn't know before was that these lines are pretty linear," says Dodds. "So if you make 140 or 80 characters, the balance between O, A and L is pretty much the same." That is in line with the classic soccer announcer's scream, "Gooooooaaaaaaaaalllllll": light on the Gs and heavy on the rest of the word.
Tyler J. Gray, Christopher M. Danforth and Peter Sheridan Dodds
Now let's consider ha. Boring, plain, but stretchable into a galaxy of different shapes, as shown in the picture above. Call it the tree of laughter. Every tweeted "ha" begins at the H at the top. A branch to the left is what happens when the tweeter, for some reason, adds another H instead of an A. Some of those tweeters eventually add an A to make hha and branch back to the right, but on the far left you can see what happens if you keep adding Hs at the beginning.
Back to the top of the picture: Moving to the right from the starting H, tweeters add an A on the way to hahahaha instead of hhhhaaaa. This is the most popular path, so the bars connecting the letters here are thicker. For example, going from ha to hah is more popular than going from ha to haa. The predominant path, as you would expect, is a beautiful, clean, very balanced hahahahahaha. The deviant haaha or hahhah is probably just a typo.
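The branch thicknesses in such a tree amount to counts of which letter comes next after a given prefix. A minimal sketch of that counting follows; the function name `next_letter_counts` is hypothetical and the word list is toy data invented for illustration, not figures from the study.

```python
from collections import Counter


def next_letter_counts(words, prefix):
    """Count which letter follows `prefix` across all words that extend it."""
    return Counter(
        w[len(prefix)] for w in words if w.startswith(prefix) and len(w) > len(prefix)
    )


# Toy data invented for illustration, not taken from the study.
tweets = ["hahaha", "hahaha", "hah", "haa", "hhha", "ha"]
print(next_letter_counts(tweets, "ha"))  # Counter({'h': 3, 'a': 1})
print(next_letter_counts(tweets, "h"))   # Counter({'a': 5, 'h': 1})
```

In this toy sample, the step from ha to hah is taken three times and the step from ha to haa once, which is the kind of difference a thicker or thinner bar would encode.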
In general, these two-letter words are stretched further than regular words, such as finallyyyyy. The words in the trees above also play out as we might expect. Fuuuuuu is a popular expression of a particular kind of linguistic anger. "People start with F, and then they're on to the Us," says Danforth. The same applies to awwwwwww.
Because stretched words can carry so much additional meaning beyond the words themselves, understanding them is critical for artificial intelligence that analyzes text, like chatbots. At the moment, a stretched word can be so confusing to an AI that the program simply skips it. We don't want to have to bold or italicize words so that a chatbot can pick up the emphasis, and even then, such formatting cannot reproduce the range of emotions conveyed by stretched words.
"If we are ever to get to a point where an AI can understand the bulk of the communication people use every day, this will be one of the things it has to handle," says Sam Brody, who published his own research on word lengthening on Twitter in 2011 before joining the Bloomberg AI group as a senior scientist. This new research, in which Brody was not involved, is a step towards quantifying stretched words and translating them into subtle linguistic rules that machines can understand.
After all, who will help save Justin Bieber from attention-seeking fans? One quirk that struck the researchers was that when Twitter users tried to be extra emphatic and attract a celebrity's attention, they lengthened everything. "There was a second kind of word," says Dodds, "which is like, 'fffffooooolllllllloooooowwwwww mmmmmmeeeeee, Justin Bieber.' Because there was a feeling that it would be striking to Justin."
It probably doesn't work. But there's no harm in tttttrrrrrryyyyyiiiiiinnnnggggg.
This story originally appeared on wired.com.