In defense of ‘naming’ as an outcome measure in aphasia therapy studies

“Driving into work this morning I set a new personal record: I was able to name 324 items that I saw on my way…” said no one, ever. Naming things out loud seems like a bizarre activity and is not something we engage in on a regular basis. Yet almost all comprehensive tests of aphasia include a section on picture naming, even though the assessment of naming probably has limited ecological validity. So, what gives? Below, I will discuss why assessing naming ability among persons with aphasia makes sense and why it is a useful outcome measure in therapy studies, either by itself or in combination with other measures targeting, for example, communication effectiveness.

At its most fundamental level, aphasia is a language impairment. Of course, for most of us who study aphasia and anyone afflicted by it, it is much more than that. The typical rundown goes something like: Language impairment caused by brain damage > Results in communication difficulties > Affects the individual and communication partners > Induces social handicap > Is life altering. Nevertheless, without the first level, “language impairment caused by brain damage,” the rest of the cascade doesn’t happen. The specifics of the chronic language impairment are related to multiple factors, probably most importantly the size and location of the brain damage, as well as other things like the brain’s capacity to adapt and support recovery, which, in turn, is probably determined by genetics and overall brain health.

In her recent book with Roberta Elman, “Neurogenic Communication Disorders and the Life Participation Approach” (Plural Publishing, 2020), Audrey Holland suggests that early intervention for aphasia should focus mostly on the impairment level. This is the time when most persons with aphasia show the greatest improvements and when they are most likely to receive aphasia therapy that is funded by a third party, in the USA either by private insurance or government insurance. When those initial funds for rehabilitation dry up, a new reality of having chronic aphasia calls for a different emphasis focused on acceptance, maximizing social participation, and finding joy and purpose in spite of great challenges. To that I would add that several studies, including one of Audrey’s (Holland et al., 2017, Aphasiology), one of ours (Johnson et al., 2019, Am J Speech Lang Pathol), as well as Hope and colleagues (2017, Brain), suggest that at least half of all persons with chronic aphasia continue to experience recovery from the language impairment. Just as importantly, numerous studies show unequivocally that aphasia therapy in the chronic phase can improve language processing (e.g. Breitenstein et al., 2017, Lancet; Brady et al., 2016, Cochrane Database Syst Rev). Accordingly, impairment-based aphasia therapy might be a worthwhile endeavor for some individuals with chronic aphasia, assuming that the cost/benefit ratio is reasonable (I would not recommend anyone go into major financial debt to obtain aphasia therapy in the chronic phase).

If we can agree that impairment-based therapy may be a reasonable way to maximize recovery from aphasia, at least for some people, then we need to also agree on how best to measure language improvements. Fortunately, a major undertaking by a large group of researchers in the field of aphasia therapy has already provided guidance in this area (Wallace et al., 2019, Int J Stroke). Comprised of 27 experts in the field, this group provided recommendations on what should be core outcome measures in aphasia therapy trials: The ROMA (Research Outcome Measurement in Aphasia) Statement. One of their core recommendations was to use the Western Aphasia Battery, Revised (WAB-R; Kertesz, A., Harcourt Assessment, 2006) to assess improvements in language. The WAB-R, and its predecessor, the WAB (Kertesz, A., Grune & Stratton, 1982), have certainly been amply criticized in the past and are not without flaws. Even so, in the absence of a better measure, the WAB-R seems like a sensible test to gauge the level of overall aphasia severity and the patterns of language impairment across modalities. One of the four sub-sections of the WAB-R focuses on naming; therefore, it would seem reasonable that the overall severity measure on the WAB-R, the Aphasia Quotient (AQ), would correlate with an independent measure of naming. Below, I have included a scatter plot showing the relation between WAB AQ and the proportion of items named correctly on the Philadelphia Naming Test (PNT; Roach et al., 1996, Clin Aphasiol). Not only are the scores highly correlated (r(148)=.90, p<.00001), but correct naming on the PNT also explains almost 81% (R²=.807) of the variance in WAB AQ (Figure 1). A correlation of this magnitude allows one to infer aphasia severity based on how accurate someone is on a naming test. I believe the most important point here is that naming accuracy can be considered a proxy for the intactness of the language system.
To make this point even more clearly, in a smaller sample of participants with aphasia (N=81), the proportion of correct naming on two consecutive PNTs administered on separate days was highly predictive of WAB AQ (Figure 2). A leave-one-out cross-validation (LOOCV) analysis that relied on general linear regression, with correct naming on two separate PNTs as predictors, revealed a very high correlation between actual WAB AQ scores and predicted WAB AQ scores (r=.92, p=4.068×10⁻³⁴, RMSE=8.74, R²=.85). Based on these data, we can safely say that knowing how well someone performs on the PNT provides robust information about their overall aphasia severity. Not only does naming accuracy reflect aphasia severity, but it is also related to discourse ability in aphasia (Richardson et al., 2018, Am J Speech Lang Pathol). Assuming that overall aphasia severity is also related to discourse abilities, this finding is certainly not surprising.
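
For readers unfamiliar with leave-one-out cross-validation, the following is a minimal sketch of the general technique in Python. It is not the analysis code behind Figure 2: the data are synthetic, the noise levels are arbitrary assumptions, and the function names (`loocv_predictions`, `summarize`) are hypothetical. Each participant's WAB AQ is predicted from a regression fitted to everyone else, and the predictions are then compared with the actual scores.

```python
import numpy as np


def loocv_predictions(X, y):
    """Predict each y[i] from an ordinary least-squares fit on all other rows."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(y)
    design = np.column_stack([np.ones(n), X])  # intercept + predictors
    preds = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i  # hold out participant i
        coef, *_ = np.linalg.lstsq(design[keep], y[keep], rcond=None)
        preds[i] = design[i] @ coef  # predict the held-out participant
    return preds


def summarize(y, preds):
    """Correlation and root-mean-square error between actual and predicted."""
    r = np.corrcoef(y, preds)[0, 1]
    rmse = np.sqrt(np.mean((np.asarray(y) - preds) ** 2))
    return r, rmse


# Synthetic, illustrative data only (NOT the study data).
rng = np.random.default_rng(0)
n = 81
severity = rng.uniform(0.05, 0.95, n)  # assumed latent language ability
pnt_day1 = np.clip(severity + rng.normal(0, 0.05, n), 0, 1)  # proportion correct
pnt_day2 = np.clip(severity + rng.normal(0, 0.05, n), 0, 1)
wab_aq = np.clip(20 + 75 * severity + rng.normal(0, 8, n), 0, 100)

preds = loocv_predictions(np.column_stack([pnt_day1, pnt_day2]), wab_aq)
r, rmse = summarize(wab_aq, preds)
print(f"r = {r:.2f}, RMSE = {rmse:.2f}, r squared = {r**2:.2f}")
```

Because every prediction comes from a model that never saw the participant being predicted, the resulting correlation and RMSE estimate how well naming scores would predict severity in a new person, rather than how well the regression fits the sample it was trained on.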

Figure 1. WAB AQ (x-axis), a 100-point scale reflecting aphasia severity, is highly correlated with the proportion of pictures named correctly on the Philadelphia Naming Test (y-axis).

Figure 2. Scatterplot showing the relation between actual WAB AQ scores (x-axis) and predicted WAB AQ scores (y-axis) based on naming accuracy on two separate PNT tests.

One argument against using naming assessment in aphasia therapy studies could be that naming does not reflect real life communication abilities. We already know that overall language severity is highly related to communication effectiveness (Bakheit et al., 2005, Disabil Rehabil), suggesting that this is not a valid argument since naming scores correlate so highly with overall aphasia severity. Still, you might not be convinced that language impairment is that important for communication abilities since some people with severe aphasia can still communicate fairly effectively. I do not disagree that some persons with relatively severe aphasia are nevertheless effective communicators. However, I think it would be difficult to argue that, at the group level, having more severe aphasia is better than having milder aphasia. If you are still not convinced, I would invite you to consider the following scenario: In a group of 100 individuals who are about to have a stroke and, subsequently, aphasia, you have supreme power to determine which individuals end up with mild or severe aphasia. If you are familiar with aphasia of various severity levels and its effects on the individual, you would never choose ‘severe’ for any of the 100 individuals, not even your worst enemy. Why? Because you would know that the likelihood that someone with severe aphasia will have an easier time getting along in the world than someone with milder aphasia is probably very low. Bringing things back to naming, you would always wish that someone’s naming impairment was less, rather than more, severe because you would infer that less severe anomia (naming impairment) tends to be associated with less severe aphasia and, by extension, more effective communication abilities.

Before moving on to the issue of naming recovery and what it means for overall aphasia recovery, I want to make one more point: At the level of the impairment, I believe aphasia is essentially a lexical-semantic problem. You could add to that ‘syntactic impairment’ but I am not convinced that syntactic rules, at least at the level of the brain, can be clearly separated from the lexicon. Some recent work from Ev Fedorenko’s group (MIT) and my colleagues, Greg Hickok and William Matchin, is very enlightening in this context (e.g. Fedorenko et al., 2020, Cognition; Matchin & Hickok, 2020, Cereb Cortex). Pierre Marie, a French neurologist perhaps best known for his challenges to Broca’s work on speech localization in the third frontal convolution of the left hemisphere, believed that aphasia was only caused by damage to Wernicke’s area (1906, Sem Méd). He suggested that all other speech problems (e.g. impaired speech articulation) were ancillary to aphasia and caused by damage to other non-language related regions of the brain (for further discussion see Fridriksson et al., 2015, Cereb Cortex). I don’t agree with this very strong view but, again, believe that the lexical-semantic impairment primarily caused by posterior damage is the most salient feature of aphasia. Computational modeling work led by Grant Walker and Greg Hickok (UCI) provides strong evidence that the integrity of the lexical-semantic system in persons with aphasia can be gauged using naming data (2018, Psychol Assess). Thus, although naming may seem like an odd task, it does provide insight into the core language impairment in aphasia.

So far, I have focused mostly on the association between anomia severity and the overall language impairment in aphasia. How about the association between those two factors in relation to recovery? In order to use naming assessment to chart therapy related recovery from aphasia, changes in naming would preferably correlate with changes in aphasia severity. More specifically, changes in naming would have to correlate with therapy related changes in aphasia severity. New data from our group suggest this is indeed the case. A longitudinal study led by Alexandra Basilakos (UofSC) retested 39 participants who had completed all aspects of our POLAR (Predicting Outcomes of Language Rehabilitation in Aphasia) study. The average time between baseline testing and follow-up testing was 22 months (range=12-34 months), which is, as far as I can tell, unusual among aphasia therapy studies. All participants were tested at both timepoints with the PNT and the WAB-R, which allowed us to calculate changes on both tests and then examine the correlation between the two. Figure 3 shows the relationship between change scores on the PNT and change scores on the WAB-R. The correlation between the two factors was r(39)=.50, p<.001, R²=.25, suggesting that changes in naming accuracy and changes in overall aphasia severity before and after therapy are linked. In relation to aphasia severity, the pattern of improvements was a mixed bag but, perhaps not surprisingly, the most severe participants tended to decline.
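
The change-score comparison described above can be sketched in a few lines of Python. This is only an illustration of the general approach, under stated assumptions: the scores are synthetic, the effect sizes are arbitrary, and the function name `change_score_correlation` is hypothetical rather than anything taken from the POLAR analysis. For each test, the baseline score is subtracted from the follow-up score, and the two sets of change scores are then correlated.

```python
import numpy as np


def change_score_correlation(base_a, follow_a, base_b, follow_b):
    """Correlate raw change scores (follow-up minus baseline) on two tests."""
    delta_a = np.asarray(follow_a, dtype=float) - np.asarray(base_a, dtype=float)
    delta_b = np.asarray(follow_b, dtype=float) - np.asarray(base_b, dtype=float)
    r = np.corrcoef(delta_a, delta_b)[0, 1]
    return r, r ** 2  # r squared = proportion of shared variance


# Synthetic, illustrative data only (NOT the study data).
rng = np.random.default_rng(1)
n = 39
true_change = rng.normal(0, 1, n)  # assumed shared recovery or decline
pnt_base = rng.uniform(10, 160, n)  # items correct at baseline
pnt_follow = pnt_base + 10 * true_change + rng.normal(0, 10, n)
aq_base = rng.uniform(20, 95, n)  # WAB-R AQ at baseline
aq_follow = aq_base + 4 * true_change + rng.normal(0, 4, n)

r, r2 = change_score_correlation(pnt_base, pnt_follow, aq_base, aq_follow)
print(f"r = {r:.2f}, shared variance = {r2:.2f}")
```

Squaring the correlation gives the proportion of variance the two change scores share, which is how a correlation of .50 corresponds to an R² of .25.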

Figure 3. Scatterplot showing the relation between raw changes in correct naming on the PNT (x-axis) and changes in the WAB-R AQ (y-axis). Aphasia severity based on the WAB-R severity classification is color coded.

Based on these data, it seems that changes in overall aphasia severity are reflected in changes in naming accuracy. Although the correlation between the two factors is not terribly high, I think it is important to keep in mind that many different factors probably influence recovery and decline in chronic aphasia, which adds error to the correlation between any two or more outcome measures. The amount of error is probably directly related to the length of time between the baseline assessment and the follow-up assessment. Unfortunately, we do not have data for both PNT and WAB-R immediately before and after therapy, which I suspect would show a considerably higher correlation than our relatively long follow-up data. In any case, our data suggest that changes in naming are robustly related to changes in overall language impairment, which lends credibility to previous studies that have relied on naming assessment to chart therapy related changes in language processing in aphasia.

One issue I have not addressed is whether anomia severity is related to health-related quality of life in aphasia. There is considerable evidence to suggest quality of life is related to overall aphasia severity (for review see Hilari et al., 2012, Arch Phys Med Rehabil). To the best of my knowledge, no studies have directly addressed the relation between quality of life and anomia severity. Nevertheless, I would be surprised if more severe anomia were not related to worse quality of life in aphasia, even if such a relationship turns out not to be very strong. As a side note, based on my experience, there are many factors that can influence quality of life in chronic aphasia, including some that I have never seen addressed in formal studies. Unfortunately, here in the USA, I suspect family income/wealth is strongly related to quality of life among persons with aphasia. Wealthier individuals simply have more access to the resources that may ameliorate the difficulty of living with aphasia: Aphasia groups, aphasia centers, residential aphasia treatment programs, and, perhaps most importantly, access to higher quality healthcare, which may be crucial for long-term recovery from aphasia.

In any case, the next time you drive home from work and try to set a new personal record for naming, I hope you will reflect on the fact that this is an activity that taxes your language system to the max! In all seriousness, I hope I have convinced you that assessing naming may be worthwhile for at least some aphasia therapy studies. Ideally, I think the assessment of anomia could serve as an important part of a larger outcome set that also includes estimates of communication effectiveness and, perhaps, quality of life.