Automated profiling of spontaneous speech in primary progressive aphasia and behavioral-variant frontotemporal dementia: An approach based on usage-frequency.

Radar plot depicting the language profile of individuals with logopenic variant primary progressive aphasia using five variables. The outer line represents the healthy control mean; each step away from it represents one standard deviation from the control mean. People with semantic variant frontotemporal dementia produced fewer word combinations (combination ratio), which were also more strongly collocated, and more frequent content words.

We looked at language in rare dementias: primary progressive aphasia, which mostly affects an individual’s ability to use language (we include the three major types, logopenic variant, semantic variant, and non-fluent variant), and behavioral-variant frontotemporal dementia, which primarily causes changes in behaviour and mood. Data were provided by colleagues at UCL’s Dementia Research Centre.

What I find fascinating about dementia is that each type has been associated with some language symptoms. In some dementias, the easiest to detect are speech motor coordination difficulties, as is the case with Parkinson’s disease and Huntington’s disease. In others, word retrieval is primarily impaired, as we see in Alzheimer’s disease. But if one looks closer, one finds so much more. Individuals with Parkinson’s have specific difficulties with motion words. In people with Huntington’s disease, a reduction of sentence complexity has been discovered even before onset of motor symptoms. I am sorry, Broca and Wernicke, but language takes deep roots in our brains, and interacts with many other aspects of cognition. It seems that degeneration, almost no matter where, affects language in some way, if one knows where to look. More and more colleagues are now seeing the relevance of this conclusion for dementia diagnosis and tracking.

But where should we look? I have talked before about my approach of paying more attention to usage-frequency, which quantifies how often speakers encounter a word or other language unit. I am not merely talking about word frequency, since its usefulness has been repeatedly demonstrated: all other things being equal, more common words are easier to process, and we see that many people with a neurological disorder are biased towards using those. But I would like more researchers to also pay attention to word combinations (either via n-gram frequency or collocation strength). Combinations of words that appear together often are also processed more quickly and accurately. In the language production of most people with language difficulties, we therefore predict an increase of strongly collocated combinations, especially fixed expressions (i.e. formulaic language). In this publication, we show how collocation strength makes an interesting contribution to the profiling of primary progressive aphasia and behavioural-variant frontotemporal dementia.
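To make "collocation strength" a little more concrete, here is a minimal sketch in Python. It uses the t-score, one association measure that is common in corpus linguistics; this is only an illustration and I am not claiming it reproduces the exact measure or tooling used in the paper. The function name `bigram_t_scores` and the toy corpus are invented for this example, and real work would draw its counts from a large reference corpus.

```python
import math
from collections import Counter

def bigram_t_scores(tokens):
    """Return {(w1, w2): t-score} for every adjacent word pair in tokens.

    t = (O - E) / sqrt(O), where O is the observed count of the pair and
    E = f(w1) * f(w2) / N is the count expected if the two words were
    independent (N = number of tokens in the corpus).
    """
    n = len(tokens)
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    scores = {}
    for (w1, w2), observed in bigrams.items():
        expected = unigrams[w1] * unigrams[w2] / n
        scores[(w1, w2)] = (observed - expected) / math.sqrt(observed)
    return scores

# Invented toy "reference corpus" (assumption, for illustration only).
corpus = "you know what i mean you know the old red door you know".split()
scores = bigram_t_scores(corpus)

# Rank word pairs from most to least strongly collocated.
ranked = sorted(scores, key=scores.get, reverse=True)
strongest = ranked[0]
```

In this toy corpus the recurring pair "you know" comes out on top, which is the kind of fixed expression the prediction above is about. A speaker-level variable could then be something like the average score of all combinations in a speech sample, though that aggregation step is again an assumption of this sketch.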

With the help of Leo Varnet, with whom I shared an office for a while, we added some machine learning categorization to the usual group comparisons. Can a computer make a good guess about the condition of a speaker based on the variables we extracted? The answer is yes, at least to promising levels, though some difficulties were striking (see figure below). First, too high a proportion of healthy controls were categorized as having semantic variant primary progressive aphasia. Second, individuals with behavioural-variant frontotemporal dementia were more likely to be categorized as members of another dementia group than of their own group. These may be problems which can be solved with larger groups and more variables, be they additional language variables, demographic variables, variables from other tests, or a combination of them. Or maybe we find out that the language profile of behavioural-variant frontotemporal dementia is simply too diffuse.

[Figure: classification matrix for the dementia groups (dementia matrix.jpg)]
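For readers who have not worked with this kind of matrix: it simply counts, for each true group, how often the classifier assigned each possible label. The sketch below shows how such a table can be built in plain Python. The labels and predictions are invented for illustration and are not our data; "svPPA" and "bvFTD" are shorthand for the semantic variant and the behavioural variant, and the helper names are hypothetical.

```python
from collections import Counter

def confusion_matrix(true_labels, predicted_labels, classes):
    """Rows are true groups, columns are predicted groups (same order)."""
    counts = Counter(zip(true_labels, predicted_labels))
    return [[counts[(t, p)] for p in classes] for t in classes]

def row_proportions(matrix):
    """Convert each row to proportions of that true group's speakers."""
    result = []
    for row in matrix:
        total = sum(row)
        result.append([c / total if total else 0.0 for c in row])
    return result

# Invented example labels (assumption, not the study's results).
classes = ["control", "svPPA", "bvFTD"]
true_labels = ["control", "control", "control",
               "svPPA", "svPPA",
               "bvFTD", "bvFTD", "bvFTD"]
predicted = ["control", "control", "svPPA",
             "svPPA", "svPPA",
             "svPPA", "control", "bvFTD"]

matrix = confusion_matrix(true_labels, predicted, classes)
proportions = row_proportions(matrix)

for label, row in zip(classes, matrix):
    print(label, row)
```

Reading along a row shows where the members of one group ended up. A row whose largest values lie off the diagonal is the pattern described above for behavioural-variant frontotemporal dementia.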

So, why did it take so long for this paper to get published? I wanted this work to appear in a neurology journal, but alas, I spent a long time unsuccessfully pitching it to several of them. Responses were mostly similar: neurology journals are, at least when it comes to language, rather interested in more developed, ready-to-go solutions to diagnosis, not in something which is ultimately still basic (albeit applied) research. After we gave up, I submitted to Cortex, where the work got accepted with no problems. Cortex has a section titled “Behavioural Neurology” for work which fits neurology in its nature, but is still too experimental. Stefano Cappa, in his introduction of this section, writes: “[I]t is not always easy to find an adequate outlet to publish papers in this area. I myself had the experience of a paper that was considered ‘more appropriate for a clinical audience’ by a cognitive neuroscience journal and ‘perhaps more suitable for a specialized neuropsychology audience’ by a neurological one”.

After a year of rejections, these words felt like a hug. ❤