Formulaic language in probable Alzheimer's disease: a replication (in prep.).

In 2016, we made our first contribution to the hunt for language variables which may help identify and track dementia by the way an individual speaks.

We published our first paper using the FLAT: “Frequency in Language Analysis Tool”, a programme I designed with Mark Wibrow and Michael Coleman. I wanted to use collocation strength (a measure of how strongly the words in a combination are associated with one another) as a measure of the “creativity” of an individual’s language system. Combinations of strongly associated words (whether rare, such as plate tectonics, or more common, such as I don’t know) are easier to recognize and to produce. Strong collocations may be stored as one holistic unit, or “formula”. This makes their production lazy in a completely legitimate way. Why go through the whole effort of producing a new combination when “pre-packaged” ones are available? However, because the effort of retrieving and combining words becomes much greater in individuals with lexical and/or grammatical impairment, we predicted that they would produce more strongly collocated word combinations as a result of lost capacity.

This is exactly what we found. We used the FLAT to analyze each two-word combination in a language sample. Speakers with Alzheimer’s disease produced stronger collocations, likely the result of progressing language impairment. We also found that collocation strength was higher in people who had likely lived with the disease for longer.
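For readers who have not met collocation measures before, here is a minimal sketch of one common measure, the bigram t-score, which compares how often two words occur together in a reference corpus with how often they would co-occur by chance. This is not the FLAT's code: the function name, the tiny toy reference corpus, and the decision to skip bigrams that never occur in the reference corpus are all assumptions made for illustration.

```python
import math
from collections import Counter


def bigram_t_scores(reference_tokens, sample_tokens):
    """Score each bigram in sample_tokens using counts from reference_tokens.

    t = (observed - expected) / sqrt(observed), where
    expected = count(w1) * count(w2) / corpus_size.
    Hypothetical helper for illustration only, not the FLAT's implementation.
    """
    corpus_size = len(reference_tokens)
    unigram_counts = Counter(reference_tokens)
    bigram_counts = Counter(zip(reference_tokens, reference_tokens[1:]))

    scores = {}
    for w1, w2 in zip(sample_tokens, sample_tokens[1:]):
        observed = bigram_counts[(w1, w2)]
        if observed == 0:
            # Assumption: bigrams unattested in the reference corpus are skipped.
            continue
        expected = unigram_counts[w1] * unigram_counts[w2] / corpus_size
        scores[(w1, w2)] = (observed - expected) / math.sqrt(observed)
    return scores


# Toy reference corpus (made up; a real analysis would use a large corpus).
reference = "i don't know i don't know the cookie jar".split()
sample = "i don't know".split()

scores = bigram_t_scores(reference, sample)
mean_t = sum(scores.values()) / len(scores)
print(scores, mean_t)
```

Averaging the scores over all bigrams in a speaker's sample gives one number per speaker; a higher mean indicates more strongly collocated combinations.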

Since then, we have found the same effect in rare dementias, four times in focal aphasia (twice recently and in prep.), and in children with Williams Syndrome (also in prep.). We have also found that a group of children with acquired language disorders (“childhood aphasia”) produced more weakly collocated combinations than controls (which was, to be frank, unexpected, and I don’t trust our explanation). Clinical language scientists need to pay more attention to collocation strength.

This week, we managed to replicate the initial study on Alzheimer’s disease.

“The Boston Cookie Theft”.

“We”, that is, me with the wonderful help of Lin Wang, a visiting undergraduate student from Waseda University, Tokyo, using data from an Alzheimer’s Society funded project run by Rosemary Varley and me. We had a new sample of 18 speakers with Alzheimer’s disease (two with rare variants: primary progressive aphasia and posterior cortical atrophy), and 21 controls. This is a small but well-tested sample: Montreal Cognitive Assessment (MoCA), Boston Naming Test, Test of Reception of Grammar, digit span, Pyramids and Palm Trees, Brixton’s. We analyzed a “Cookie Theft” picture description (probably the last time I will use it, given how old-fashioned it is and how tailored to the white experience).

What I love about this replication is that I have become more confident with the FLAT. When it comes to usage-frequency and related variables such as collocation strength, there is a great number of measures from which to choose carefully. Over time, I have developed my own, reasoned (I think) protocol, and it is with these developed methods that we replicated the original findings, which makes for a leaner report.

The other thing I love is how there is a nice correlation between language measures and MoCA scores (see figures below), suggesting that FLAT variables are responsive to the degree of cognitive impairment. We didn’t get these correlations in 2016, when Mini Mental State Exam scores were available.

This time, we (thanks to great work from another student, Chui Thing Kiew) also have other language measures of grammatical complexity. Our next job is to determine how these different effects relate to one another and to the other standardized measures I used when collecting the data. This means that the final report will be more than just a replication, as we have widened our scope.

Let’s get to the goods (disclaimer - this is work in progress):

Relationship between MoCA scores (cognitive capacity; x-axis) and bigram t-scores (collocation strength of two-word combinations; y-axis). We find controls clustered on the higher end of the MoCA scale. Lower cognitive function is associated with production of more strongly collocated combinations (one aspect of “formulaic language”).

Frequency of content words in Cookie Theft samples against MoCA results. Lower cognitive function = more frequent words.

Function word frequency, same contrast. It’s interesting that there are trends, although they are not significant. Frequency effects for function words are a little under-researched.