Speech pauses in speakers with and without aphasia: A usage-based approach.
In a conversation, we may pause for emphasis, to give the listener time to reflect, or for other pragmatic reasons. But most pauses in speech, whether filled (“um …”) or silent, are disruptions. We may struggle with finding a word, with planning a sentence, or with putting together a thought. Most pauses happen when our cognition can’t keep pace with our speech rate.
For that reason, we investigate speech pauses to learn more about language as we produce it. The presence and duration of pauses reveal when we face additional processing demands, and can tell us which properties of language matter. For example, we are more likely to pause before a complex sentence, or before a verb rather than a noun. Pauses also provide insights into language impairment, for example following stroke. This paper identifies a new variable for understanding pauses in people with aphasia (a language disorder following brain damage) and healthy controls.
This is work by Sebastián Bello-Lepe under my supervision. Putting the final touches on it after he died was extremely difficult for me. It also made me think more about the “bus factor”, about which I am going to write later in this post.
This paper addresses questions that first arose for me about ten years ago, after I had begun researching collocation strength in language. When I realized what kind of data (and how much) my methods would generate, I thought they would be good for looking not just at the properties of word combinations in language production, but also at the hesitations. However, I had my hands full even without considering pauses, and before reading the work by Angelopoulou and colleagues I didn’t even understand how best to handle them.
But this is essentially Sebastián’s work. When Covid thwarted his plans for an EEG study, I asked him if he was interested in analysing speech pauses, and he was. He added much of his own insight, solved theoretical and methodological problems, and introduced new avenues for analysis. He made the project his own, and I am happy that this work is central to his posthumous award of a PhD. I am also very glad he had opportunities to present the work in person, as he cannot be here to experience any of its impact now that it’s published.
The idea behind collocation strength is pretty straightforward. Let’s consider a phrase like “it’s lovely”. Taking into account how often the individual words appear in everyday language use, how usual is it for them to appear together? If you do the math (using values from large language corpora), you find out that “it’s lovely” is a strong collocation. When “lovely” is spoken in English, very often it is within this combination. Some suggest we process strong collocations like big words, with the entire chunk as one lexical unit. Speech pauses serve as a way to test this assumption, because we usually don’t pause in the middle of a word. We pause between words. If strong collocations are processed in a word-like manner, we shouldn’t observe many pauses within them. Our results showed just that.
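To make “doing the math” concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: the corpus size and word counts are invented, and pointwise mutual information and the t-score are simply two widely used association measures, not necessarily what FLAT computes. Both compare how often a word pair actually occurs with how often it would occur if the two words were combined by chance.

```python
import math


def expected_count(count_w1, count_w2, corpus_size):
    """Pair count expected if the two words co-occurred only by chance."""
    return count_w1 * count_w2 / corpus_size


def pmi(count_pair, count_w1, count_w2, corpus_size):
    """Pointwise mutual information: log2 of observed over expected pair count."""
    if count_pair <= 0:
        raise ValueError("PMI is undefined for pairs that never occur")
    return math.log2(count_pair / expected_count(count_w1, count_w2, corpus_size))


def t_score(count_pair, count_w1, count_w2, corpus_size):
    """t-score: observed minus expected pair count, scaled by sqrt(observed)."""
    if count_pair <= 0:
        raise ValueError("t-score is undefined for pairs that never occur")
    expected = expected_count(count_w1, count_w2, corpus_size)
    return (count_pair - expected) / math.sqrt(count_pair)


# Hypothetical counts, invented for this example (not taken from any real corpus).
CORPUS_SIZE = 1_000_000
WORD_COUNTS = {"it's": 10_000, "lovely": 200, "green": 400}
PAIR_COUNTS = {("it's", "lovely"): 80, ("it's", "green"): 4}

for (w1, w2), n_pair in PAIR_COUNTS.items():
    n1, n2 = WORD_COUNTS[w1], WORD_COUNTS[w2]
    print(
        f"{w1} {w2}: "
        f"PMI = {pmi(n_pair, n1, n2, CORPUS_SIZE):.2f}, "
        f"t = {t_score(n_pair, n1, n2, CORPUS_SIZE):.2f}"
    )
```

With these invented counts, “it’s lovely” occurs far more often than chance predicts, so both scores are clearly positive, while “it’s green” occurs exactly as often as chance predicts and scores zero on both.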
We used spontaneous descriptions of a comic strip by people with aphasia and healthy controls. We measured the pause duration before every word as well as the collocation strength between each grammatical word combination, along with the lexical frequency and grammatical function (content vs. function word) of every word. We determined frequency and collocation strength values with FLAT, the tool that I conceived and used in a number of previous studies.
In essence, pauses were fewer within stronger collocations. They were also fewer before more common words, but importantly, this lexical frequency effect (which had been investigated in other studies) was not as strong and statistically robust as the effect of collocation strength.
Speakers with aphasia made more and longer pauses (as expected), and, strikingly, the effect of collocation strength of their word combinations on pauses was stronger in speakers with aphasia than in controls. This makes a lot of sense to me. I believe strong collocations are lexicalized in every speaker, but in healthy brains, there is a working lexical-grammatical network to handle the less common, or even entirely novel, word combinations. In aphasia, this apparatus is impaired, and therefore the difference in effort between common and uncommon is greater.
These are very nice results, and I hope they will contribute to people paying attention to collocation strength (there are still many in the field who do not know what it is). Most of our assessment tools are designed according to “words and rules” approaches, but once we allow collocation strength (and other ideas from usage-based linguistics) into the field, they may transform assessments in very positive ways. Some interventions, like UTILISE and Melodic Intonation Therapy, already apply usage-based principles.
However, it is striking that, despite some strongly significant effects in our study, the amount of variance in the pause data explained by our models is very small, meaning that the models were fairly weak. It also looks like our models were better at explaining the presence of pauses than their duration. I see two likely reasons for this. First, there is a lot more to consider than the variables we managed to include. Second, pause duration may simply be very noisy data, with some amount of chaos or randomness.
I have written about Seb in an earlier post. Here, I want to talk about how his death affected the work. Seb was murdered in Madrid after the paper had been accepted by Cortex, but before final changes were implemented. I picked up this work, which may sound romantic in some way, but to me was emotionally draining. In addition, I noticed a mistake in the analysis (nothing that would have changed the main conclusions), and without Seb to consult it took a lot of work to find out what had gone wrong. Rosemary Varley asked Seb’s colleagues in Chile to break into his work laptop to make sure we had all his research data. I ended up re-running all the models, digging deep into raw data and different stages of the analysis, just to be safe.
I want to thank the team at the journal Cortex, and in particular Dan Mirman and Cheryl Phillips, who were very kind, patient, and supportive during that time.
Which leads us to the “bus factor”. I had not heard this term before. The idea is that you need to consider how your work can be continued should you get hit by a bus next time you leave the house. We talk about data management and documentation in collaborations, or in the context of open science, but I have never heard anyone raise this issue. Research is bigger than you, and your particular approach should not die with you. Since my experience with this paper I have started raising the bus factor with current students, perhaps too often. However, because of ignorance or egoism, my own record on this is fairly poor. All my relevant data are on encrypted devices at UCL, to which some others have the password, or on UCL servers, but my filing system is still a mess, which is a complete disservice to collaborators who’d want to continue from some more recent work should I die. I will have to do better.