J Am Acad Audiol 2018; 29(01): 073-082
DOI: 10.3766/jaaa.16168
Thieme Medical Publishers 333 Seventh Avenue, New York, NY 10001, USA.

Listener Factors Associated with Individual Susceptibility to Reverberation

Paul N. Reinhart*
Pamela E. Souza*†

* Department of Communication Sciences and Disorders, Northwestern University, Evanston, IL
† Knowles Hearing Center, Evanston, IL

Corresponding author

Paul N. Reinhart
Department of Communication Sciences and Disorders, Northwestern University
Evanston, IL 60208

Publication History

Publication Date:
29 May 2020 (online)


Abstract

Background:

Reverberation is a source of acoustic degradation, present to varying extents in many everyday listening environments. The presence of reverberation decreases speech intelligibility, especially for listeners with hearing impairment. There is substantial variability in how susceptible individuals with hearing impairment are to the effects of reverberation (i.e., how intelligible reverberant speech is to a listener). Relatively little is known about the listener factors which drive that susceptibility.

Purpose:

To identify listener factors associated with an individual’s susceptibility to reverberation. A second purpose was to investigate how these associations vary with the amount of reverberation. The listener factors investigated were degree of hearing loss, age, temporal envelope sensitivity, and working memory capacity.

Research Design:

This study used a correlational design to investigate the association between different listener factors and speech intelligibility with varying amounts of reverberation.

Study Sample:

Thirty-three older adults with sensorineural hearing loss participated in the study.

Data Collection and Analysis:

Listener temporal envelope sensitivity was measured using a gap detection threshold task. Listener working memory capacity was measured using the Reading Span Test. Intelligibility of reverberant speech was measured using a set of low-context sentence materials presented at 70 dB SPL without individual frequency shaping. Sentences were presented at a range of realistic reverberation times: no reverberation (0.0 sec), moderate reverberation (1.0 sec), and severe reverberation (4.0 sec). Stepwise linear regression analyses modeled speech intelligibility from individual degree of hearing loss, age, temporal envelope sensitivity, and working memory capacity, with a separate model conducted at each of the three levels of reverberation.

Results:

As the amount of reverberation increased, listener speech intelligibility decreased and variability in scores among individuals increased. Temporal envelope sensitivity was most closely associated with speech intelligibility in the no reverberation condition. Both listener age and degree of hearing loss were significantly associated with speech intelligibility in the moderate reverberation condition. Both listener working memory capacity and age were significantly associated with speech intelligibility in the severe reverberation condition.

Conclusions:

The results suggest that suprathreshold listener factors best predict speech intelligibility across a range of reverberant conditions. However, which listener factor(s) to consider when predicting a listener’s susceptibility to reverberation depends on the amount of reverberation in an environment. Clinicians may be able to use different listener factors to identify individuals who are more susceptible to reverberation and therefore more likely to have difficulty communicating in reverberant environments.



INTRODUCTION

Reverberation is a common source of acoustic degradation present in the real world. Reverberation refers to the sound that persists in a space due to continued reflections off environmental features, even after the source of the sound has stopped. These continued reflections degrade the transmission of speech information by smearing the spectral and temporal information of the speech across phoneme and word boundaries ([Nábělek et al, 1989]). As a result, reverberant degradation can hinder speech perception. Generally, the greater the amount of reverberation, the larger the effect reverberation will have on speech perception.

Individuals with hearing impairment have significantly poorer recognition of reverberant speech than individuals with normal hearing ([Gordon-Salant and Fitzgibbons, 1993]). That is, individuals with hearing impairment are substantially more susceptible to the effects of reverberant degradation than normal-hearing individuals. While hearing impairment is generally associated with poorer speech intelligibility in reverberation, there is also substantial variability in the hearing-impaired population with respect to how susceptible they are to reverberant degradation ([Nábělek and Mason, 1981]; [Reinhart et al, 2016]). Nábělek and Mason investigated reverberant speech perception in individuals with hearing impairment. They found some individuals who were highly susceptible to reverberant degradation and performed near chance under mild reverberant degradation. Other listeners in that study were less susceptible to reverberation and were able to maintain relatively good performance even under relatively high amounts of reverberant degradation. It is not clear what listener factors determine how susceptible an individual is to the effects of reverberant degradation.

One listener factor that has been associated with reverberation susceptibility is a listener’s degree of hearing impairment ([Nábělek and Mason, 1981]; [Gordon-Salant and Fitzgibbons, 1993]; [Marrone et al, 2008]). Hearing loss reduces audibility of speech but also degrades the processing of auditory signals ([Tremblay et al, 2003]). Thus, even when signal audibility is sufficient, individuals with hearing impairment experience increased susceptibility to reverberation compared to their normal-hearing peers ([Gordon-Salant and Fitzgibbons, 1993]). It is believed that this is because of the auditory processing deficits associated with hearing impairment. Because greater hearing loss is associated with poorer overall processing of auditory signals ([Oates et al, 2002]), it has been proposed that there may be a direct relationship between the degree of hearing loss of a listener and their susceptibility to reverberant degradation.

A second factor that has been associated with reverberation susceptibility is a listener’s age ([Gordon-Salant and Fitzgibbons, 1995]; [1999]; [Marrone et al, 2008]; [Reinhart et al, 2016]). These studies have suggested the effect of age on reverberation susceptibility to be independent of hearing loss. [Gordon-Salant and Fitzgibbons (1995)] proposed that age may predict reverberation susceptibility due to the overall slowing of processing, deterioration of central timing mechanisms, and declines in auditory processing associated with the aging process (e.g., [Pichora-Fuller and Singh, 2006]). Due to these global age-related processing decrements, age may substantially affect an individual’s susceptibility to reverberant degradation.

A third factor proposed to be associated with a listener’s ability to cope with reverberant degradation is his or her sensitivity to temporal envelope cues ([Gordon-Salant and Fitzgibbons, 1993]). Temporal envelope cues refer to the low-frequency modulations of a signal that transmit important linguistic information ([Rosen, 1992]). Reverberation predominantly attenuates the transmission of high-frequency modulation cues in a signal, while primarily leaving the temporal envelope of a signal intact ([Houtgast and Steeneken, 1985]). Because information encoded in the temporal envelope is predominantly what remains, the ability of a listener to access information within the temporal envelope may predict his or her reverberant speech perception.

A fourth factor that may influence how susceptible a listener is to the effects of reverberation is the listener’s working memory capacity ([Kjellberg, 2004]; [Reinhart and Souza, 2016]). Working memory is a limited-capacity system engaged when simultaneously processing and storing incoming information. When listening to speech that has been acoustically degraded, a listener will often recruit additional working memory resources as a compensatory mechanism to facilitate speech understanding ([Rönnberg et al, 2008]; [2013]). Working memory resources are believed to be selectively deployed to reconstruct the intended message of a signal, using contextual and lexical cues to fill in information that the acoustic degradation has made inaccessible to the listener. Because working memory capacity is limited, high storage and processing demands may surpass a listener’s available resources, and when task demands exceed available cognitive resources, speech intelligibility begins to decline. For this reason, the working memory capacity an individual has available to repair degraded speech may predict how susceptible that listener is to the effects of reverberation.

While degree of hearing loss, age, temporal envelope sensitivity, and working memory capacity have all been implicated as listener factors underlying susceptibility to reverberation, there are several limitations in the extant literature. One limitation is that these listener factors have not been investigated collectively in relation to reverberation susceptibility. This is important because of the confounding intercorrelations among the various listener factors: the association between a given listener factor and susceptibility to reverberation observed in a previous study may have been significant only because that effect was mediated by another listener factor that was not examined. For example, age has been associated with reverberation susceptibility; however, age is also associated with declines in temporal envelope processing. It is therefore impossible to determine whether age directly affects reverberation susceptibility or whether that effect is an indirect influence of temporal envelope sensitivity. To clarify the relationships between listener factors and reverberation susceptibility, there is a need for a study that considers degree of hearing loss, age, temporal envelope sensitivity, and working memory capacity simultaneously.

A second limitation in the literature is the heterogeneity of reverberation conditions examined. Previous studies have all used different reverberation times and often only a single level of reverberant degradation. Because the amount of degradation increases with increasing reverberation, it is likely that the processing demands required for speech perception will vary based on the amount of reverberation. For example, more reverberant degradation will decrease the amount of usable acoustic information made available to the listener. As a result, listeners will have to deploy greater working memory resources to compensate for the degradation. Thus, working memory capacity may be associated with reverberation susceptibility at higher levels of reverberant degradation but not lower levels of degradation. To further clarify the relationships between listener factors and reverberation susceptibility, there is a need to consider these relationships at a range of levels of reverberant degradation.

The purpose of the current study was to investigate which listener factor(s) are most closely associated with listener intelligibility of reverberant speech across a range of levels of reverberant degradation encountered in the real world. Based on previous studies, the factors we examined were degree of hearing loss, age, temporal envelope processing, and working memory capacity. We predicted that reverberation susceptibility could best be modeled using multiple listener factors. Furthermore, we expected that the best set of predictors would change based on the amount of reverberant degradation.



MATERIALS AND METHODS

Participants

Participants included 33 adults (20 females and 13 males) with symmetrical, sloping hearing loss. None of the participants were hearing aid wearers at the time of testing. Air-conduction thresholds were measured at octave frequencies from 0.25 to 8 kHz and at the interoctave frequencies 3 and 6 kHz. Bone-conduction testing was performed at octave frequencies from 0.5 to 4 kHz. All participants had a symmetrical sensorineural hearing loss, defined as no more than a single air-bone gap >10 dB in each ear in conjunction with normal Type A tympanometry in both ears using a standard 226-Hz probe tone ([Jerger, 1970]). A participant was considered to have a symmetrical loss if there were no more than two inter-ear air-conduction threshold differences >10 dB. Because degree of hearing loss was a predictor of interest in the study, we sought to test a sample with a wide range of audiometric thresholds; the range and mean of participant audiograms for both ears can be seen in [Figure 1]. Additionally, because age was also a factor of interest, we sought to sample a wide range of ages among listeners with acquired hearing loss (mean age = 76.2 yr; range = 59–88 yr). Younger individuals with hearing loss were excluded because the etiology and underlying pathology of their hearing loss differ from those of older listeners with presbycusis-type hearing impairment, who make up the majority of the hearing-impaired population.

Figure 1 Mean air-conduction thresholds of participants. Shaded area represents the range. Error bars represent ±1 standard deviation.

All participants had normal or corrected vision by self-report so that they could reliably complete visual tasks on the testing monitor. Participants received a passing score of ≥26 (mean = 28.9) on the Mini-Mental State Examination ([Folstein et al, 1975]) to screen for mild cognitive impairment. All participants spoke English as their sole or primary language. Additionally, participants reported good general health at the time of testing and no history of neurologic impairment, dyslexia, or reading disability.



Temporal Envelope Sensitivity Task

Sensitivity to the temporal envelope was measured using a gap detection threshold task. This measure was selected because gap detection was the measure of temporal envelope sensitivity previously associated with reverberation susceptibility ([Gordon-Salant and Fitzgibbons, 1993]). The current procedure measured gap detection thresholds using a three-alternative, forced-choice procedure in which one of the alternatives contained a temporal gap and the others did not. The carrier signal was a broadband noise (100–8000 Hz). Total stimulus duration was 250 msec with a 10-msec raised-cosine ramp, and gaps were gated with 0.5-msec cosine-squared ramps. The gap duration started at 20 msec and initially varied by a factor of 1.4. After four reversals, the gap duration varied by a factor of 1.2, using a two-up, one-down procedure to converge on discrimination performance of 70.7% correct ([Levitt, 1971]). Participants were tested monaurally in the right ear. The presentation level was set to 35 dB SL for each participant, based on his or her four-frequency pure-tone average (average of audiometric thresholds at 500, 1000, 2000, and 4000 Hz). Participants completed a single practice block, and data were then collected for an additional three blocks. The gap detection threshold for each test block was taken as the geometric mean gap duration of the last ten reversals. The final gap detection threshold was taken as the geometric mean of the individual run thresholds across the three blocks.
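As an illustration of the adaptive track described above, the following Python sketch simulates one block of the procedure: the gap shrinks after two consecutive correct responses and grows after one incorrect response, step sizes change from a factor of 1.4 to 1.2 after four reversals, and the block threshold is the geometric mean of the last ten reversals. The simulated observer (its underlying threshold and psychometric slope) is a hypothetical stand-in, not part of the study.

```python
import math
import random

def simulated_listener(gap_ms, threshold_ms=6.0, slope=4.0):
    """Hypothetical observer: probability correct rises with gap duration.
    The task is 3AFC, so chance performance is 1/3."""
    p = 1 / (1 + (threshold_ms / gap_ms) ** slope)  # smooth psychometric function
    return random.random() < (1 / 3 + (2 / 3) * p)

def gap_detection_run(start_ms=20.0, n_reversals=14):
    """One adaptive block: gap decreases after two consecutive correct
    responses and increases after one incorrect response, converging on
    70.7% correct (Levitt, 1971). Steps are a factor of 1.4 for the first
    four reversals, then 1.2. Returns the geometric mean of the last ten
    reversal gap durations (msec)."""
    gap, correct_streak, direction = start_ms, 0, 0
    reversals = []
    while len(reversals) < n_reversals:
        if simulated_listener(gap):
            correct_streak += 1
            if correct_streak == 2:          # two-correct rule: make task harder
                correct_streak = 0
                if direction == +1:          # direction change = reversal
                    reversals.append(gap)
                direction = -1
                gap /= 1.4 if len(reversals) < 4 else 1.2
        else:                                # one-incorrect rule: make task easier
            correct_streak = 0
            if direction == -1:
                reversals.append(gap)
            direction = +1
            gap *= 1.4 if len(reversals) < 4 else 1.2
    last10 = reversals[-10:]
    return math.exp(sum(math.log(r) for r in last10) / len(last10))
```

Following the protocol above, a participant’s final threshold would then be the geometric mean of three such run thresholds.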



Working Memory Capacity Task

Individual working memory was assessed using a computerized, English-language version of the Reading Span Test (RST) ([Daneman and Carpenter, 1980]; [Rönnberg et al, 1989]). In the RST, individuals rapidly process incoming information by making semantic judgments of sentences while also attempting to store information for later recall. The test materials consisted of 54 five-word sentences presented on a 26-inch computer monitor. Half the sentences made semantic sense (e.g., “the ball bounced away”), and half were semantically absurd (e.g., “the pear drove the bus”). Sentences were displayed at a rate of one word (e.g., “home”) or word cluster (e.g., “the spider”) every 800 msec. Sentences were presented in sets of three, four, five, and six sentences, in order of increasing set length. Within each set, a 1,750-msec interval separated the end of one sentence from the beginning of the next. Participants were required to read each sentence aloud as the words flashed across the screen and, at the end of the sentence, to judge whether it made sense. At the end of each sentence set, participants were asked to recall either the first or the last word of each sentence within that set; participants did not know before seeing a set which of the two would be prompted. Whether the experimenter asked for the first or last words was pseudo-randomized across the RST, such that first-word and last-word recall conditions occurred an equal number of times, and each participant received a different randomization. Before data collection, participants completed a three-sentence practice set, repeated until they were comfortable beginning the test phase. The RST score was the percentage of first or last words, out of the 54 sentences, correctly recalled in any order.



Speech Task

Susceptibility to reverberation was quantified as a listener’s sentence intelligibility score and was assessed under various levels of reverberation. Target sentences were taken from the main corpus of IEEE sentences. To control for regional dialect, the sentences were recorded locally by two female and two male talkers from the Greater Chicagoland area ([McCloy et al, 2015]). These sentences were low context and contained five key words per sentence (e.g., “Take the winding path to reach the lake”).

The entire set of sentence stimuli was processed using a binaural virtual reverberation simulator. Sentences were processed and presented at three possible reverberation times: no reverberation (0.0 sec), moderate reverberation (1.0 sec), or severe reverberation (4.0 sec). These three reverberation conditions were selected because they represent the range of reverberant degradation found in the real world ([Hodgson et al, 2007]; [Kociński and Ozimek, 2015]). See [Figure 2] for an example of the same sentence waveform in each of the three reverberation conditions. The simulation method computed the directions, delays, and attenuations of the early reverberant reflections and spatially rendered those early reflections together with the direct path, using nonindividualized head-related transfer functions. The no reverberation condition (0.0-sec reverberation time) consisted of only these early reflections. For the moderate and severe reverberation conditions, late reverberant energy was simulated statistically, using exponentially decaying Gaussian noise. The final output reverberation time was varied by manipulating the simulated absorptive properties of the reflective surfaces in ⅓-octave bands from 125 to 4000 Hz. In the present simulation, room size was fixed at 5.7 m × 4.3 m × 2.6 m to represent a typical room size, and source-listener distance was fixed at 1.4 m to represent a typical conversational distance. Room size and source-listener distance were fixed so that only reverberation time varied across the three speech conditions. Overall, this method has been found to produce binaural reverberation simulations that are reasonable physical and perceptual approximations of naturally occurring reverberation ([Zahorik, 2009]).

Figure 2 Example sentence waveform processed at each of the three reverberation conditions.
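The statistical late-reverberation component described above can be sketched in a few lines. The Python (NumPy) example below is an illustrative monaural simplification: Gaussian noise shaped by an exponential decay whose rate follows from the target reverberation time (RT60, the time for a 60 dB energy decay). The binaural rendering, early-reflection modeling, and ⅓-octave-band absorption of the actual simulator are omitted, and the direct/reverberant mixing scheme here is a hypothetical placeholder.

```python
import numpy as np

def late_reverb_ir(rt60_s, fs=48828, duration_s=None):
    """Synthetic late-reverberation tail: Gaussian noise under an
    exponential decay. A 60 dB amplitude decay is a factor of 1000,
    so the envelope is exp(-ln(1000) * t / RT60), ln(1000) ~ 6.91."""
    duration_s = duration_s or 1.5 * rt60_s
    t = np.arange(int(duration_s * fs)) / fs
    decay = np.exp(-6.91 * t / rt60_s)
    rng = np.random.default_rng(0)
    return rng.standard_normal(t.size) * decay

def add_reverb(dry, rt60_s, fs=48828, mix=0.5):
    """Convolve a dry signal with the synthetic tail (normalized to unit
    energy) and mix it with the direct path. Output is truncated to the
    dry signal's length for simplicity."""
    tail = late_reverb_ir(rt60_s, fs)
    wet = np.convolve(dry, tail / np.sqrt(np.sum(tail ** 2)))
    wet = wet[:dry.size]
    return (1 - mix) * dry + mix * wet
```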

Stimuli were counterbalanced such that each of the four talkers was equally represented across the three reverberation conditions (no, moderate, and severe). Sentence stimuli were not repeated within participants. Sentences were presented binaurally at 70 dB SPL to represent a conversational speech level. The presentation level was not frequency shaped for individuals’ hearing losses, in order to simulate how a listener would perceive reverberant speech unaided in the real world. The binaural presentation was dichotic, providing listeners with the head-related transfer functions and spatial cues that would typically be available under real-world conditions. The task was to repeat aloud as many words as possible from each target sentence. Intelligibility scoring was performed by a single scorer to ensure consistency, and participant response time was unlimited. The final speech intelligibility score for each reverberation condition was the number of key words reproduced correctly across all sentences in that condition.



Procedure

All testing took place in a large, double-walled sound-attenuating booth. For both gap detection and the speech task, sound presentation and scoring were controlled via custom-written Matlab code. Stimuli were delivered at a sampling frequency of 48828 Hz through Tucker-Davis Technologies (Alachua, FL) equipment and presented through Etymotic ER-2 insert earphones (Elk Grove Village, IL). For the temporal envelope sensitivity task, participants logged their responses manually using a computer interface. For the other tasks, participants responded orally to the experimenter, who was seated either in the booth with them (during the working memory capacity task) or outside the booth (during the speech task). Testing took place over two sessions, each lasting ∼2 hours, separated by at least one week but no more than three weeks. Breaks were given frequently to mitigate participant fatigue. All procedures were approved by the Northwestern University Institutional Review Board, and each participant underwent an informed consent process. Participants were compensated for their time.



RESULTS

Reverberation susceptibility for each level of reverberant degradation was quantified as speech intelligibility in that reverberant condition. Individual speech intelligibility scores were transformed into rationalized arcsine units for all subsequent statistical analyses ([Studebaker, 1985]). Individual speech intelligibility data are plotted in [Figure 3], panel A, in which each line represents a single participant. Consistent with previous findings ([Gordon-Salant and Fitzgibbons, 1993]), even in the undistorted condition not all listeners with hearing impairment performed at ceiling. In general, individual speech intelligibility decreased as a function of increasing reverberant degradation; however, note the dispersion of the lines, indicating increasing variability in speech intelligibility across the study sample. This suggests that reverberation susceptibility varied with the degree of reverberant degradation, with greater variability under more degraded conditions. [Figure 3], panel B, depicts group speech intelligibility data across the three reverberation conditions; the increase in variability is evident in the widening error bars. As a result, the data violated the assumption of sphericity required for standard parametric testing, as indicated by a significant Mauchly’s test of sphericity (p = 0.048).

Figure 3 Results of speech intelligibility task in the three reverberation conditions. (A) Individual scores where each line is a different participant; (B) group score. Error bars represent ±1 standard deviation.
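The rationalized arcsine transform ([Studebaker, 1985]) applied to the intelligibility scores can be written compactly. The sketch below assumes the standard form of the transform, which maps a raw proportion correct onto an approximately interval scale running from about −23 to +123 rationalized arcsine units, stabilizing variance near 0% and 100%.

```python
import math

def rau(correct, n):
    """Rationalized arcsine units (Studebaker, 1985).
    `correct` = number of key words scored correct, `n` = number scored.
    theta is the sum of two arcsine terms (radians, range 0 to pi);
    the linear rescaling maps that range onto roughly -23 to +123."""
    theta = math.asin(math.sqrt(correct / (n + 1))) + \
            math.asin(math.sqrt((correct + 1) / (n + 1)))
    return (146 / math.pi) * theta - 23
```

A convenient property of this form is that a score of exactly half correct maps to 50 RAU, since the two arcsine arguments are then complementary and theta equals π/2.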

To analyze speech intelligibility scores, a one-way repeated-measures analysis of variance (ANOVA) was conducted with reverberant condition as the single within-participants factor. A Greenhouse–Geisser adjustment was applied to correct for the nonsphericity of the data. There was a significant main effect of reverberation time [F(1.698, 54.338) = 56.99, p < 0.001, η²p = 0.722]. Post hoc analyses conducted using a Bonferroni correction indicated that all pairwise comparisons of speech intelligibility among the three reverberation conditions were significant (p < 0.001).

Regression Analyses

To examine the relative contributions of different listener factors to speech intelligibility at the different reverberation conditions, stepwise linear regression analyses were conducted. This statistical approach was selected to remove redundant predictors and identify only the listener factors that most accurately predicted speech intelligibility. A separate stepwise linear regression model was conducted to model listener speech intelligibility at each of the three levels of reverberation for a total of three models.

The listener variables entered into the models were degree of hearing loss, age, gap detection threshold, and working memory capacity. Degree of hearing loss was quantified as the binaural, four-frequency pure-tone average (average of audiometric thresholds for both ears at 500, 1000, 2000, and 4000 Hz). All four predictor variables were approximately normally distributed in the current sample, as indicated by nonsignificant Shapiro–Wilk tests for normality (p > 0.050). Variance inflation factors were <5.0 in each of the three regression models, indicating that multicollinearity among the predictor variables was relatively low.
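The model-building and collinearity checks described above can be sketched as follows. This NumPy-only Python example is illustrative: it computes variance inflation factors exactly as defined (1/(1 − R²) from regressing each predictor on the others), but it substitutes a simple adjusted-R² entry criterion for the significance-based stepwise entry/removal actually used in the study, and the predictor names are placeholders.

```python
import numpy as np

def ols_r2(X, y):
    """R-squared of a least-squares fit with an intercept."""
    A = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    return 1 - (resid @ resid) / ((y - y.mean()) @ (y - y.mean()))

def vif(X):
    """Variance inflation factor for each predictor: 1 / (1 - R^2) from
    regressing that predictor on all the others. Values < 5 were taken
    in the study to indicate relatively low multicollinearity."""
    out = []
    for j in range(X.shape[1]):
        others = np.delete(X, j, axis=1)
        out.append(1.0 / (1.0 - ols_r2(others, X[:, j])))
    return out

def forward_stepwise(X, y, names, min_gain=0.01):
    """Greedy forward selection: at each step, add the predictor that most
    improves adjusted R^2; stop when no candidate improves it by at least
    `min_gain`. (A stand-in for the p-value-based stepwise criterion.)"""
    def adj_r2(cols):
        if not cols:
            return 0.0
        r2 = ols_r2(X[:, cols], y)
        n, k = len(y), len(cols)
        return 1 - (1 - r2) * (n - 1) / (n - k - 1)
    selected = []
    while True:
        best_gain, best_j = min_gain, None
        for j in range(X.shape[1]):
            if j in selected:
                continue
            gain = adj_r2(selected + [j]) - adj_r2(selected)
            if gain > best_gain:
                best_gain, best_j = gain, j
        if best_j is None:
            return [names[j] for j in selected]
        selected.append(best_j)
```

Stepwise selection with correlated predictors is sensitive to the entry criterion, which is why the approximate normality and collinearity of the predictors were checked before model fitting.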

Results of the three stepwise linear regression analyses can be seen in [Table 1]. For the first model, predicting listener speech intelligibility in the no reverberation condition, the only significant predictor was gap detection (p = 0.002), which explained 27.7% of the variance. Additionally, pure-tone average trended toward significance (p = 0.065). The second model examined which listener factors were most closely associated with speech intelligibility in the moderate reverberation condition. In this model, there were two significant predictors: age (p = 0.008) and pure-tone average (p = 0.014). These factors combined to explain 55.0% of the variance in listener speech intelligibility. The final model predicted listener speech intelligibility in the severe reverberation condition. Both working memory (p = 0.001) and listener age (p = 0.004) were significant and combined to explain 58.7% of the variance.

Table 1. Results of Stepwise Linear Regression Analyses across the Different Reverberation Conditions

                                  No Reverberation               Moderate Reverberation         Severe Reverberation
Listener factor 1                 Gap detection (p = 0.002)      Age (p = 0.008)                Working memory (p = 0.001)
Listener factor 2                 Pure-tone average (p = 0.065)  Pure-tone average (p = 0.014)  Age (p = 0.004)
Listener factor 3                 Working memory (p = 0.119)     Working memory (p = 0.223)     Pure-tone average (p = 0.119)
Listener factor 4                 Age (p = 0.455)                Gap detection (p = 0.252)      Gap detection (p = 0.388)
Total r² of significant factors   0.277                          0.550                          0.587

Power analyses were conducted on the previous models to examine whether the sample size was sufficient to detect additional significant effects of the predictor variables. Because we were interested in listener factors with the potential to serve as clinical predictors of reverberant speech intelligibility, we focused on factors that would account for an additional r² increase of at least 0.10; factors explaining <10% of additional variance were considered unlikely to be clinically meaningful. For the first model (no reverberation), in which gap detection had an r² of 0.277, power with 33 participants was ∼0.61. For both the second model (moderate reverberation) and the third model (severe reverberation), in which factors already explained over 50% of the variance, power exceeded 0.80 with 33 participants.



DISCUSSION

Reverberation is a common source of acoustic degradation in the real world, and intelligibility of reverberant speech is highly variable in individuals with hearing impairment. The present study examined which listener factors are most closely associated with behavioral intelligibility of reverberant speech. The factors investigated were degree of hearing loss, age, temporal envelope sensitivity, and working memory capacity. These factors were investigated in relation to listeners’ speech intelligibility at three levels of reverberation times: no reverberation (0.0 sec), moderate reverberation (1.0 sec), and severe reverberation (4.0 sec).

In the no reverberation condition, temporal envelope sensitivity was the only significant listener factor associated with speech intelligibility. When speech was degraded by moderate reverberation, both age and degree of hearing loss were significantly associated with speech intelligibility. Lastly, listener working memory capacity and age were both significantly associated with speech intelligibility in the severe reverberation condition. These findings are further discussed below.

In the present study, temporal envelope sensitivity was significantly associated with speech intelligibility in the absence of reverberation; however, it was not significantly associated with speech intelligibility under either moderate or severe reverberation. This contrasts with a previous study by [Gordon-Salant and Fitzgibbons (1993)] in which listeners’ gap detection thresholds were associated with speech intelligibility in reverberation. A potential reason for these discrepant findings is the difference in reverberation times examined across the studies. The reverberation times examined in the present study were 1.0 and 4.0 sec, which are considerably more severe than the 0.6-sec reverberation time tested by Gordon-Salant and Fitzgibbons. In the previous study, the authors hypothesized that a listener’s ability to detect small temporal gaps may be important for listening in reverberation because reverberation reduces the gaps between word boundaries ([Gordon-Salant and Fitzgibbons, 1993]). With the reverberation times used in the current study, the gaps between words were mostly eliminated altogether rather than merely reduced ([Figure 2]), as may have been the case in the Gordon-Salant and Fitzgibbons study with a reverberation time of only 0.6 sec. Thus, gap detection may be associated with speech intelligibility at mild reverberation times, as it was here for nonreverberant speech. However, our findings suggest that once the reverberation in the signal is sufficient to eliminate the gaps between words, whatever component of the auditory system is responsible for accurately encoding reverberant speech may not be assessed by traditional gap detection.

Instead, there may be a separate bottom-up mechanism primarily responsible for encoding reverberant speech. A neurophysiology study by [Slama and Delgutte (2015)] measured how the sound envelope was encoded by individual neurons in the rabbit inferior colliculus under varying amounts of reverberation. In their study, reverberation decreased the amplitude modulation depth of the presented signal; however, they observed a subset of inferior colliculus neurons in which the temporal envelope was encoded better for reverberant signals than for anechoic signals with the same modulation depth. Based on these results, they hypothesized that there may be a subsystem of neurons in the inferior colliculus that is relatively resistant to the acoustic effects of reverberation and able to recover envelope modulations for encoding reverberant speech. It is possible that the gap detection task used in the present study did not assess the integrity of this subsystem of inferior colliculus neurons; thus, listener gap detection was not associated with behavioral performance under the moderate and severe reverberation conditions that relied on this reverberation-specialized neural network.

While gap detection was not a significant predictor, age was associated with speech intelligibility in both the moderate and severe reverberation conditions. Advancing age is associated with declines in neural encoding throughout the auditory system ([Tremblay et al, 2003]; [Anderson et al, 2012]). Thus, aging may also be related to declines in the reverberation-specialized neural network primarily responsible for encoding reverberant speech. If so, age may have emerged as a significant predictor in both moderate and severe reverberation because, among the current set of predictors, it best indexed the integrity of this reverberation-specialized system.

In the severe reverberation condition, working memory capacity was most closely associated with speech intelligibility. This finding replicates that of [Reinhart and Souza (2016)], who used a group-split, categorical design and found that individuals with low working memory showed steeper declines in speech intelligibility across a range of reverberant conditions than did individuals with high working memory. The current study expands on that finding by using a regression approach that treats working memory capacity as a continuous variable and additionally controls for bottom-up factors such as degree of hearing loss and temporal envelope sensitivity. These findings further support cognitive models of speech perception in which explicit top-down processing facilitates the perception of degraded speech for older listeners with hearing impairment ([Rönnberg et al, 2008]; [2013]).

This finding for older listeners contrasts with results for young listeners with normal hearing. Recent evidence suggests that working memory capacity plays only a minimal role for young individuals with normal hearing, even when they perceive degraded speech ([Füllgrabe and Rosen, 2016]), although this has not been tested specifically for reverberant speech. Presumably, the perceptual load placed on young listeners with normal hearing, even by a degraded signal, is not sufficiently taxing for cognition to be engaged and limit performance. Working memory capacity does matter, however, for older listeners with hearing impairment, who experience internal degradation associated with age and hearing loss in addition to external degradation of the signal itself.

Lastly, degree of hearing loss was never the predominant predictor of listener intelligibility in any of the three reverberant conditions examined. This is an interesting result, because participants were listening to speech without frequency shaping, which means that portions of the speech may at times have fallen below threshold. Nevertheless, our findings suggest that suprathreshold measures (i.e., psychoacoustic and cognitive measures) are better predictors of speech perception than is degree of hearing loss, although which measures matter most depends on the amount of reverberation. This is consistent with other findings suggesting that speech-in-noise performance cannot be accounted for by the audiogram (e.g., [Vermiglio et al, 2012]).

Clinical Implications

These results suggest that clinicians could rely on suprathreshold factors to predict a listener’s susceptibility to reverberation. However, the optimal set of predictive factors will vary with the amount of reverberation in the environments in which listeners report attempting to communicate. For example, restaurants and classrooms typically have moderate reverberation times, ranging from 0.45 to 1.80 sec ([Hodgson et al, 1999]; [2007]). Therefore, for listeners who are predominantly concerned with communication in these environments, listener age and, to a lesser extent, degree of hearing loss may be the best factors from which to infer susceptibility to reverberation. On the other hand, a large church may have a reverberation time as high as 5 sec ([Kociński and Ozimek, 2015]). For listeners most concerned with communication in an environment with such severe reverberation, it would be better to consider the listener’s working memory capacity (measured using the RST) and age as clinical predictors. Previous papers have similarly advocated evaluating working memory with the RST as a clinical prefitting measure (e.g., [Sirow and Souza, 2013]); see [Souza et al (2015)] for a review of working memory and its clinical applications. Administering the RST may take only 10–20 min (depending on whether the long or short form is used) and can provide additional information regarding a listener’s speech perception capabilities and optimal signal processing. Overall, if these listener variables indicate that a listener’s speech intelligibility may be substantially affected by reverberation, a clinician may consider a rehabilitation approach that includes technology options specifically designed to improve communication in reverberant environments.

There are several technology options designed to mitigate communication difficulties in reverberation. The most effective solution is a remote microphone device, used either alone or in conjunction with a hearing aid (depending on degree of loss). A remote microphone can be placed close to the talker so that the microphone picks up more direct acoustic energy relative to reflected energy, improving the direct-to-reverberant ratio of the amplified signal delivered to the listener’s ears. This approach has been demonstrated to improve speech intelligibility under reverberant conditions (e.g., [Lewis, 1994]). Another option is selecting a hearing aid with a dereverberation algorithm ([Fabry and Tchorz, 2005]). This option is attractive because it does not involve an additional device; however, to the authors’ knowledge, the efficacy of this type of digital signal processing has not been independently investigated in wearable hearing aids. Overall, further research is needed to inform evidence-based treatment options for reverberant listening conditions.
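As a rough illustration of why moving the microphone closer to the talker helps: in an idealized diffuse field, direct energy falls off with the square of distance while reverberant energy is approximately uniform throughout the room, so the direct-to-reverberant ratio (DRR) improves by 20·log10 of the distance reduction. The sketch below assumes a hypothetical room with a 1-m critical distance (the distance at which direct and reverberant energy are equal); it is a textbook approximation, not a model of any specific device.

```python
import math


def drr_db(distance_m, critical_distance_m=1.0):
    """Direct-to-reverberant ratio (dB) under an idealized diffuse-field model.

    Direct energy falls as 1/d^2; reverberant energy is roughly constant,
    so DRR = 20*log10(dc/d), where dc is the room's critical distance
    (by definition, DRR = 0 dB at d = dc).
    """
    return 20.0 * math.log10(critical_distance_m / distance_m)


# Hypothetical comparison: hearing-aid mic at the listener (4 m from the
# talker) versus a remote mic placed 0.2 m from the talker's mouth.
for d in (4.0, 2.0, 0.2):
    print(f"mic at {d} m -> DRR = {drr_db(d):+.1f} dB")
```

Under these assumptions, moving the microphone from 4 m to 0.2 m improves the DRR by roughly 26 dB, which is why a remote microphone is such an effective countermeasure against reverberation.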



CONCLUSIONS

To summarize, perception of reverberant speech involves an interplay of listener factors, including bottom-up (e.g., hearing loss and temporal envelope sensitivity) and top-down (e.g., working memory capacity) variables, and the contribution of these factors may vary with the amount of signal degradation. The purpose of the current study was to investigate which listener factors, among degree of hearing loss, age, temporal envelope processing, and working memory capacity, were most closely associated with speech intelligibility across a range of levels of reverberant degradation. The results support the use of primarily suprathreshold measures for predicting speech perception across a range of reverberant conditions.



Abbreviation

RST: Reading Span Test



No conflict of interest has been declared by the author(s).

Acknowledgments

The authors thank Frederick Gallun for helpful comments during the planning phase of the project, Pavel Zahorik for providing the reverberation simulation, Alfred Rademaker for his assistance with power analyses, and Laura Mathews for assistance with data collection.

This research was partially supported by a Student Investigator Research Grant from the American Academy of Audiology/American Academy of Audiology Foundation and the National Institutes of Health Grants R01 DC0060014 and R01 DC012289.


Portions of this work were presented at the Association for Research in Otolaryngology 39th MidWinter Meeting 2016, San Diego, CA, February 2016.


  • REFERENCES

  • Anderson S, Parbery-Clark A, White-Schwoch T, Kraus N. 2012; Aging affects neural precision of speech encoding. J Neurosci 32 (41) 14156-14164
  • Daneman M, Carpenter PA. 1980; Individual differences in working memory and reading. J Verbal Learn Verbal Behav 19 (04) 450-466
  • Fabry D, Tchorz J. 2005; A hearing system that can bounce back from reverberation: a new hearing aid function has been created that reduces the influence of reverberation in both directional and omnidirectional microphone modes. Hear Rev 12 (10) 48
  • Folstein MF, Folstein SE, McHugh PR. 1975; “Mini-mental state”. A practical method for grading the cognitive state of patients for the clinician. J Psychiatr Res 12 (03) 189-198
  • Füllgrabe C, Rosen S. 2016. Investigating the role of working memory in speech-in-noise identification for listeners with normal hearing. In: van Dijk P, Başkent D, Gaudrain E, de Kleine E, Wagner A, Lanting C. Physiology, Psychoacoustics and Cognition in Normal and Impaired Hearing. Cham, Switzerland: Springer International Publishing; 29-36
  • Gordon-Salant S, Fitzgibbons PJ. 1993; Temporal factors and speech recognition performance in young and elderly listeners. J Speech Hear Res 36 (06) 1276-1285
  • Gordon-Salant S, Fitzgibbons PJ. 1995; Recognition of multiply degraded speech by young and elderly listeners. J Speech Hear Res 38 (05) 1150-1156
  • Gordon-Salant S, Fitzgibbons PJ. 1999; Profile of auditory temporal processing in older listeners. J Speech Lang Hear Res 42 (02) 300-311
  • Hodgson M, Rempel R, Kennedy S. 1999; Measurement and prediction of typical speech and background-noise levels in university classrooms during lectures. J Acoust Soc Am 105 (01) 226-233
  • Hodgson M, Steininger G, Razavi Z. 2007; Measurement and prediction of speech and noise levels and the Lombard effect in eating establishments. J Acoust Soc Am 121 (04) 2023-2033
  • Houtgast T, Steeneken HJ. 1985; A review of the MTF concept in room acoustics and its use for estimating speech intelligibility in auditoria. J Acoust Soc Am 77 (03) 1069-1077
  • Jerger J. 1970; Clinical experience with impedance audiometry. Arch Otolaryngol 92 (04) 311-324
  • Kjellberg A. 2004; Effects of reverberation time on the cognitive load in speech communication: theoretical considerations. Noise Health 7 (25) 11-21
  • Kociński J, Ozimek E. 2015; Speech intelligibility in rooms with and without an induction loop for hearing aid users. Arch Acoust 40 (01) 51-58
  • Levitt H. 1971; Transformed up-down methods in psychoacoustics. J Acoust Soc Am 49 (02) (Suppl) 467-477
  • Lewis DE. 1994; Assistive devices for classroom listening. Am J Audiol 3 (01) 58-69
  • Marrone N, Mason CR, Kidd Jr G. 2008; The effects of hearing loss and age on the benefit of spatial separation between multiple talkers in reverberant rooms. J Acoust Soc Am 124 (05) 3064-3075
  • McCloy DR, Wright RA, Souza PE. 2015; Talker versus dialect effects on speech intelligibility: a symmetrical study. Lang Speech 58 (03) 371-386
  • Nábĕlek AK, Letowski TR, Tucker FM. 1989; Reverberant overlap- and self-masking in consonant identification. J Acoust Soc Am 86 (04) 1259-1265
  • Nábělek AK, Mason D. 1981; Effect of noise and reverberation on binaural and monaural word identification by subjects with various audiograms. J Speech Hear Res 24 (03) 375-383
  • Oates PA, Kurtzberg D, Stapells DR. 2002; Effects of sensorineural hearing loss on cortical event-related potential and behavioral measures of speech-sound processing. Ear Hear 23 (05) 399-415
  • Pichora-Fuller MK, Singh G. 2006; Effects of age on auditory and cognitive processing: implications for hearing aid fitting and audiologic rehabilitation. Trends Amplif 10 (01) 29-59
  • Reinhart PN, Souza PE. 2016; Intelligibility and clarity of reverberant speech: effects of wide dynamic range compression release time and working memory. J Speech Lang Hear Res 59 (06) 1543-1554
  • Reinhart PN, Souza PE, Srinivasan NK, Gallun FJ. 2016; Effects of reverberation and compression on consonant identification in individuals with hearing impairment. Ear Hear 37 (02) 144-152
  • Rönnberg J, Arlinger S, Lyxell B, Kinnefors C. 1989; Visual evoked potentials: relation to adult speechreading and cognitive function. J Speech Hear Res 32 (04) 725-735
  • Rönnberg J, Lunner T, Zekveld A, Sörqvist P, Danielsson H, Lyxell B, Dahlström O, Signoret C, Stenfelt S, Pichora-Fuller MK, Rudner M. 2013; The Ease of Language Understanding (ELU) model: theoretical, empirical, and clinical advances. Front Syst Neurosci 7: 31
  • Rönnberg J, Rudner M, Foo C, Lunner T. 2008; Cognition counts: a working memory system for ease of language understanding (ELU). Int J Audiol 47 (02) (Suppl) S99-S105
  • Rosen S. 1992; Temporal information in speech: acoustic, auditory and linguistic aspects. Philos Trans R Soc Lond B Biol Sci 336 (1278) 367-373
  • Sirow L, Souza P. 2013; Selecting the optimal signal processing for your patient. Audiology Practices 5: 25-29
  • Souza P, Arehart K, Neher T. 2015; Working memory and hearing aid processing: literature findings, future directions, and clinical applications. Front Psychol 6: 1894
  • Slama MC, Delgutte B. 2015; Neural coding of sound envelope in reverberant environments. J Neurosci 35 (10) 4452-4468
  • Studebaker GA. 1985; A “rationalized” arcsine transform. J Speech Hear Res 28 (03) 455-462
  • Tremblay KL, Piskosz M, Souza P. 2003; Effects of age and age-related hearing loss on the neural representation of speech cues. Clin Neurophysiol 114 (07) 1332-1343
  • Vermiglio AJ, Soli SD, Freed DJ, Fisher LM. 2012; The relationship between high-frequency pure-tone hearing loss, hearing in noise test (HINT) thresholds, and the articulation index. J Am Acad Audiol 23 (10) 779-788
  • Zahorik P. 2009; Perceptually relevant parameters for virtual listening simulation of small room acoustics. J Acoust Soc Am 126 (02) 776-791

Corresponding author

Paul N. Reinhart
Department of Communication Sciences and Disorders, Northwestern University
Evanston, IL 60208


Figure 1 Mean air-conduction thresholds of participants. Shaded area represents the range. Error bars represent ±1 standard deviation.
Figure 2 Example sentence waveform processed at each of the three reverberation conditions.
Figure 3 Results of speech intelligibility task in the three reverberation conditions. (A) Individual scores where each line is a different participant; (B) group score. Error bars represent ±1 standard deviation.