Keywords otoacoustic emissions - auditory brainstem responses - efferent pathways - speech
Introduction
The human auditory system consists of afferent and efferent pathways that interact
with each other while processing the auditory information.[1] The efferent pathways of the auditory system are believed to aid in selective attention,[2][3]
protect the inner ear from auditory fatigability and acoustic trauma,[4][5][6]
and improve the coding of signals embedded in noise,[7][8][9] speech perception in noise,[10][11][12][13]
and sound localization in noise.[14] The role of efferent auditory pathways on speech perception in noise has received
greater attention among researchers,[10][11][12][13][15][16][17][18][19][20][21][22]
but their functional role in speech perception is not understood.
Several studies have investigated the role of efferent auditory pathways on the perception
of speech in noise. Earlier investigations were performed among individuals who had
undergone transection of the olivocochlear bundle during vestibular neurectomy.[11][17]
These investigations elicited efferent activity of the auditory system by presenting
noise to the contralateral ear of participants (referred to as contralateral noise).
Findings of the above investigations showed a significant improvement in speech identification
score (5–10%) among normally hearing individuals when noise was presented to the contralateral
ear. In contrast, no such improvement in speech identification score was observed
among individuals who had undergone vestibular neurectomy. These findings revealed
the significance of efferent pathways for the perception of speech in noise. Thus,
the efferent pathways are assumed to improve speech perception in noise.
On the other hand, most studies have investigated the role of efferent pathways non-invasively
by assessing the relationship between contralateral suppression of otoacoustic emissions
(CSOAEs) and perception of speech in noise. Several investigations have reported a
significant correlation between speech perception in noise and the magnitude of CSOAE.[10][11][12][13][18][19]
In contrast, studies have also reported no relationship between speech perception
in noise and the CSOAEs.[20][21][22][23][24]
The discrepancy in findings across investigations could be due to differences in
tasks used to measure the CSOAE and speech perception in noise. Though no conclusive
evidence is available, the efferent pathways are believed to improve the encoding
of speech in the presence of noise. However, the CSOAE, which is commonly used for
the assessment of the efferent system, measures the magnitude of efferent activity
and does not provide information about the encoding of speech in the auditory system.
Speech evoked auditory brainstem response (speech ABR) is a useful tool to investigate
the encoding of speech in the auditory system, and it provides reliable information
about the neural coding of speech sounds.[25] It has been widely used to investigate the encoding of speech among elderly adults
and children with auditory processing deficits as well as to explain their speech
perception difficulties.[26][27]
By studying the effect of contralateral noise on the speech ABR, we could understand
the role of efferent pathways on the encoding of speech in the presence of noise at
the neural level. Recently, an investigation showed no significant effect of contralateral
noise on speech ABR in quiet and in noise. In addition, no significant correlation
was found between CSOAE and unmasking of speech ABR.[28] The findings of the above-mentioned study suggest that efferent activity may not
have any effect on the neural encoding of speech in the presence of noise. However,
other similar studies are required before generalizing the findings of the above-mentioned
investigation. The present study aimed to investigate the relationship between
contralateral suppression of speech ABR and contralateral suppression of transient
evoked otoacoustic emission (CSTEOAE).
Material and Methods
Participants
A total of 23 adults aged between 18 and 40 years (mean age: 29 years) participated
in the study. All the participants had hearing sensitivity within normal limits in
both ears. The immittance evaluation showed ‘A’ type tympanogram with ipsilateral
and contralateral acoustic reflex present for pure tones at normal levels. In addition,
the contralateral acoustic reflex threshold for white noise was greater than 70 dB
SPL. None of the participants had difficulty understanding speech in quiet or noise.
In addition, none of the participants had a history of otological or neurological
dysfunction, metabolic disorders (diabetes and hypertension), or exposure to hazardous
noise or ototoxic drugs. Finally, all the participants had TEOAEs present in both
ears for non-linear clicks at 80 dB SPL. The institutional ethics committee approved
the study, and informed consent was obtained from all the participants before
they joined the study.
Stimuli
The consonant-vowel syllable /da/, spoken by a female native speaker of Kannada, was
used to elicit speech ABR. The waveform and spectrum of the stimulus used in the present
study are shown in [Fig. 1]. It included a stop burst in the beginning, followed by a harmonically rich and
spectrally dynamic formant transition. The stimulus duration was 232.5 milliseconds,
the fundamental frequency was 162 Hz, and the first formant frequency (F1) was 820 Hz.
White noise was used to elicit the efferent activity, and it was generated using the
Praat software (Institute of Phonetic Sciences, University of Amsterdam, The Netherlands).[29 ]
Fig. 1 Waveform (A) and spectrum (B) of stimulus /da/ used for recording speech auditory
brainstem response.
Procedure
Recording of TEOAEs
During the recording of TEOAEs, participants were made to sit comfortably on a reclining
chair. The Echoport ILO 292 OAE analyser (Otodynamics Ltd., Hatfield, UK) and the
ILO V6 computer software (Otodynamics Ltd., Hatfield, UK) were used for the recording
of TEOAEs. Initially, the TEOAE was recorded using non-linear clicks at 80 dB peak
sound pressure level (SPL) to confirm the presence of TEOAE. The TEOAEs were considered
as present when the global signal-to-noise ratio (SNR) and reproducibility were greater
than 6 dB and 80%, respectively. Following this, the TEOAEs were recorded using
‘linear clicks’ at 60 dB peak-equivalent SPL (pe SPL) to measure the contralateral suppression of TEOAE.
First, a ‘baseline’ TEOAE was recorded without presenting noise to the contralateral
ear of participants. Following this, a second TEOAE was recorded by presenting white
noise to the contralateral ear of participants. Finally, an additional baseline TEOAE
was obtained at the end of the recording of TEOAEs. The white noise was delivered
to the contralateral ear of participants using ER-5A insert phones (Etymotic Research,
Elk Grove Village, IL) at 60 dB SPL. The TEOAEs for linear clicks were considered
as present when the SNR was greater than 3 dB. The global amplitude of TEOAE and
noise floor were obtained from the ILO V6 software. The magnitude of CSTEOAE was computed
by subtracting the amplitude of TEOAE with contralateral noise from the amplitude
of baseline TEOAE.
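The computation described above can be sketched in a few lines. This is a minimal illustration only: the function names are hypothetical and are not part of the ILO V6 software, and the example values are the group mean amplitudes reported in the Results, used purely to show the arithmetic.

```python
def teoae_present(snr_db: float, criterion_db: float = 3.0) -> bool:
    """Presence criterion for linear-click TEOAEs: SNR greater than 3 dB."""
    return snr_db > criterion_db


def csteoae_magnitude(baseline_db: float, with_noise_db: float) -> float:
    """Contralateral suppression: baseline amplitude minus amplitude with noise."""
    return baseline_db - with_noise_db


# Illustration with the group mean amplitudes (dB SPL) reported in the Results.
suppression = csteoae_magnitude(8.2, 6.9)
print(f"CSTEOAE = {suppression:.1f} dB")
```

A positive value indicates that the TEOAE amplitude was reduced by the contralateral noise.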
Recording of Speech Evoked ABR
The speech ABR was recorded using the IHS Smart EP evoked potential system version
3.92 (Intelligent Hearing Systems, Miami, FL, USA). During the recording of speech
ABR, the participants were made to sit comfortably on a reclining chair. They were
instructed to relax and minimize extraneous body movements to reduce unwanted artifacts.
The electroencephalogram (EEG) was differentially recorded from the scalp using gold-plated
disc electrodes. The non-inverting electrode was placed on the vertex (Cz), inverting
electrode on the test ear mastoid (A2), and the ground electrode was placed on the
mastoid of the non-test ear (A1). To elicit the speech ABR, stimuli of single polarity
were presented to the right ear of participants using Etymotic ER-3A (Etymotic Research,
Elk Grove Village, IL) insert earphones. A total of 2,000 artifact-free responses
were collected and averaged to obtain the averaged waveform in each recording. Initially,
the speech ABR was recorded in the quiet condition. Following this, the speech ABR was
recorded in two noise conditions, with white noise presented either to the test ear alone
(ipsilateral noise) or to both ears (binaural noise); the order of the noise conditions
was randomized. The white noise was delivered
to the contralateral ear of participants using ER-5A insert phones at 60 dB SPL. A
short break of 5 to 10 minutes was provided between the recordings as required by
the participants. Two recordings were obtained in each condition, one each for the rarefaction
and condensation polarities. The recording parameters used in the present study for
the recording of speech ABR are shown in [Table 1].
Table 1
Stimulus and acquisition parameters used to record the speech auditory brainstem response
Stimulus parameters
Stimuli: natural speech /da/
Intensity: 70 dB SPL
Repetition rate: 3.234/sec
Polarity: single polarity (rarefaction and condensation)
Number of stimuli: 2,000
Broadband noise: 60 dB SPL
Acquisition parameters
Electrode montage: inverting, test ear mastoid; non-inverting, vertex; ground, non-test ear mastoid
Filter: 50 to 1,500 Hz
Analysis window: −30 to 250 milliseconds
Amplification: 100,000 times
Using the speech ABR waveforms for rarefaction and condensation polarities, the sum
and the difference waveforms were obtained. Adding and subtracting the waveforms of
rarefaction and condensation polarities selectively enhances the amplitude of components
of speech ABR.[30] The sum and difference waveforms were subjected to fast Fourier transform (FFT) analysis
to measure the amplitude of the F0 and F1 components of speech ABR, respectively. For
this purpose, a 50-millisecond segment of the averaged waveform was extracted in the
post-stimulus response waveform (88–138 milliseconds). To increase frequency resolution
in the FFT spectrum, the length of the extracted waveform was increased to 1,024 points
by zero-padding. From the FFT spectrum, the peak amplitude at frequencies between
150 and 174 Hz and 740 and 900 Hz was noted, referred to as peak F0 and peak F1 amplitudes,
respectively. In addition, the average F0 and F1 amplitude were computed by averaging
the amplitude at frequencies between 150 and 174 Hz and 740 and 900 Hz, respectively.
The difference in amplitude of F0 between ipsilateral noise and binaural noise conditions
was considered contralateral suppression (unmasking) of speech ABR.
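The spectral measurement described above can be sketched as follows. This is an illustration under stated assumptions rather than the analysis code used in the study: the sampling rate (10 kHz), the amplitude scaling, the halving of the sum and difference waveforms, and the synthetic input signals are all assumptions, and `band_amplitudes` is a hypothetical helper. The synthetic input assumes that the F0 (envelope) component keeps its sign across polarities while the F1 (fine-structure) component inverts.

```python
import cmath
import math


def band_amplitudes(segment, fs, f_lo, f_hi, n_fft=1024):
    """Return (peak, average) spectral amplitude between f_lo and f_hi in Hz.

    The segment is treated as zero-padded to n_fft points. Only the bins
    inside the band are evaluated with a direct DFT, so this sketch needs
    no third-party FFT library.
    """
    if len(segment) > n_fft:
        raise ValueError("segment is longer than n_fft")
    amplitudes = []
    for k in range(n_fft // 2 + 1):
        freq = k * fs / n_fft
        if f_lo <= freq <= f_hi:
            # Zero-padded samples contribute nothing, so sum over the segment only.
            coeff = sum(x * cmath.exp(-2j * math.pi * k * n / n_fft)
                        for n, x in enumerate(segment))
            # Assumed scaling: single-sided amplitude of a full-window sinusoid.
            amplitudes.append(2 * abs(coeff) / len(segment))
    if not amplitudes:
        raise ValueError("no DFT bin falls inside the requested band")
    return max(amplitudes), sum(amplitudes) / len(amplitudes)


FS = 10_000              # hypothetical sampling rate (Hz)
N = round(0.050 * FS)    # 50-millisecond segment
t = [i / FS for i in range(N)]

# Synthetic stand-ins for the averaged responses to each polarity.
envelope = [math.sin(2 * math.pi * 162 * ti) for ti in t]       # near F0
fine = [0.5 * math.sin(2 * math.pi * 820 * ti) for ti in t]     # near F1
rarefaction = [e + f for e, f in zip(envelope, fine)]
condensation = [e - f for e, f in zip(envelope, fine)]

# Sum waveform emphasizes F0; difference waveform emphasizes F1.
sum_wave = [(r + c) / 2 for r, c in zip(rarefaction, condensation)]
diff_wave = [(r - c) / 2 for r, c in zip(rarefaction, condensation)]

peak_f0, avg_f0 = band_amplitudes(sum_wave, FS, 150, 174)
peak_f1, avg_f1 = band_amplitudes(diff_wave, FS, 740, 900)
print(f"F0: peak {peak_f0:.2f}, average {avg_f0:.2f}")
print(f"F1: peak {peak_f1:.2f}, average {avg_f1:.2f}")
```

With a 50-millisecond segment at the assumed sampling rate, zero-padding from 500 to 1,024 points narrows the bin spacing from 20 Hz to about 9.8 Hz. This interpolates the spectrum but does not add information beyond what the 50-millisecond window contains.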
Statistical Analysis
Initially, the amplitude of TEOAE in baseline and contralateral noise conditions,
the magnitude of CSTEOAE, the amplitude of F0 and F1 of speech ABR across conditions,
and the magnitude of unmasking of the amplitude of F0 were subjected to the Shapiro-Wilk
test to check for normal distribution. The paired samples ‘t’ test was performed to
investigate the effect of conditions on the amplitude of TEOAE. A repeated-measures
analysis of variance (ANOVA) was performed to investigate the effect of conditions
(quiet, ipsilateral noise, and binaural noise) on the F0 amplitude of speech ABR.
A Pearson correlation analysis was performed to investigate the relationship between
CSTEOAE and the unmasking of the amplitude of speech ABR.
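As an illustration of the repeated-measures design, the F statistic for a single within-subject factor can be computed from a participants-by-conditions table as sketched below. The data are toy values, not the study's measurements, and `rm_anova_f` is a hypothetical helper. The sketch does not compute a p-value or apply a sphericity correction, both of which a statistics package would normally provide.

```python
def rm_anova_f(data):
    """One-way repeated-measures ANOVA.

    data: list of rows, one row per participant, one value per condition.
    Returns (F, df_conditions, df_error).
    """
    n = len(data)        # participants
    k = len(data[0])     # conditions
    grand = sum(sum(row) for row in data) / (n * k)
    cond_means = [sum(row[j] for row in data) / n for j in range(k)]
    subj_means = [sum(row) / k for row in data]

    ss_cond = n * sum((m - grand) ** 2 for m in cond_means)
    ss_subj = k * sum((m - grand) ** 2 for m in subj_means)
    ss_total = sum((x - grand) ** 2 for row in data for x in row)
    # Error term: variability left after removing condition and subject effects.
    ss_error = ss_total - ss_cond - ss_subj

    df_cond = k - 1
    df_error = (k - 1) * (n - 1)
    f_value = (ss_cond / df_cond) / (ss_error / df_error)
    return f_value, df_cond, df_error


# Toy data: 3 participants x 3 conditions (quiet, ipsilateral, binaural).
toy = [[1.0, 2.0, 3.0],
       [2.0, 3.0, 5.0],
       [3.0, 4.0, 4.0]]
f_value, df1, df2 = rm_anova_f(toy)
print(f"F [{df1}, {df2}] = {f_value:.3f}")
```

With three conditions and 14 participants, the same function yields 2 and 26 degrees of freedom.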
Results
The TEOAE data of 19 participants were available for statistical analysis; in the
remaining four participants, the TEOAE was judged to be absent due to high noise floor.
The mean amplitude of TEOAE in the baseline condition was 8.2 dB SPL (standard deviation
[SD] = 6.3), and in the contralateral noise condition, it was 6.9 dB SPL (SD = 6.5).
The mean contralateral suppression of TEOAE was 1.3 dB (SD = 0.8). The amplitude of
TEOAE in the baseline condition was higher than the amplitude of TEOAE in the contralateral
noise condition. The Shapiro-Wilk test showed that the TEOAE amplitude difference
between baseline and contralateral noise conditions was normally distributed (W = 0.97,
p = 0.768). To investigate if the mean amplitudes are significantly different between
conditions (baseline and contralateral noise), a paired samples ‘t’ test was performed.
It showed a significant effect of conditions on the amplitude of TEOAEs (t [13] = 6.184,
p < 0.001). Thus, the reduction in the amplitude of TEOAE in contralateral noise condition
was significant.
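The paired t statistic can be approximated from the summary statistics alone, as sketched below. Two assumptions are made here that the text does not state explicitly: that the SD of 0.8 dB is the SD of the paired differences, and that the number of pairs is 14 (inferred from the 13 degrees of freedom). The function name is hypothetical.

```python
import math


def paired_t_from_summary(mean_diff: float, sd_diff: float, n: int):
    """Paired t statistic and degrees of freedom from summary statistics."""
    standard_error = sd_diff / math.sqrt(n)
    return mean_diff / standard_error, n - 1


# Assumed inputs: mean suppression 1.3 dB, SD of differences 0.8 dB, 14 pairs.
t_value, df = paired_t_from_summary(1.3, 0.8, 14)
print(f"t [{df}] = {t_value:.2f}")
```

Because the inputs are rounded to one decimal place, the result is only expected to be close to the reported value, not identical to it.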
The speech ABR was present in 15 individuals in quiet and ipsilateral noise conditions
and in 14 individuals in binaural noise condition. In the remaining eight participants,
the speech ABR was found to be absent. The amplitude of F0 was above the noise floor
for all participants across conditions. In contrast, the amplitude of F1 in quiet
and noise conditions was above the noise floor in 7 and 8 participants, respectively.
Since the amplitude of F1 was measurable only in 50% of the participants, it was not
considered for further statistical analysis. [Fig. 2] shows grand averaged waveforms of speech ABR in quiet and noise conditions. [Fig. 3] shows the spectrum of pre-stimulus activity and response waveform in quiet and noise
conditions. [Table 2] shows the mean peak F0 amplitude and average F0 amplitude for the transition
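and sustained portions of speech ABR across quiet and noise conditions. The mean peak
F0 and average F0 amplitudes of the sustained portion were highest in the quiet condition
and lower in the two noise conditions, whereas those of the transition portion were highest
in the binaural noise condition. For the sustained portion, the mean amplitude of F0 was similar in both ipsilateral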
noise and binaural noise conditions. Furthermore, among 8 (57.1%) participants, the
amplitude of F0 in binaural noise condition was greater than in ipsilateral noise
condition.
Fig. 2 Grand averaged waveforms of the speech auditory brainstem response in quiet, ipsilateral
noise, and binaural noise conditions.
Fig. 3 The spectrum of response waveform of the speech auditory brainstem response (solid
line) and pre-stimulus activity (dotted line) in quiet, ipsilateral noise, and binaural
noise conditions (panels A and C). Grand averaged waveforms of the speech auditory
brainstem response in quiet, ipsilateral noise, and binaural noise conditions (panels
B and D). Panels A and B represent the transition portion of speech auditory brainstem
response, and panels C and D represent the sustained portion of speech auditory brainstem
response.
Table 2
Mean and standard deviation (in parentheses) of the amplitude of F0 for transition
and sustained portion of speech auditory brainstem response across quiet, ipsilateral
noise, and binaural noise conditions
Condition            Transition portion             Sustained portion
                     Average F0     Peak F0         Average F0       Peak F0
Quiet                2.64 (0.83)    3.54 (0.84)     3.484 (1.757)    4.314 (1.882)
Ipsilateral noise    2.32 (1.2)     3.11 (1.53)     3.200 (1.163)    3.951 (1.338)
Binaural noise       2.86 (1.48)    3.75 (1.44)     3.162 (1.229)    3.952 (1.467)
The Shapiro-Wilk test showed that the peak F0 amplitude of the transition portion
and the peak F0 amplitude and average F0 amplitude of the sustained portion of speech ABR across
conditions were normally distributed (p > 0.05). However, the average F0 amplitude of the transition portion in the quiet condition
was not normally distributed (W = 0.819, p = 0.009). To investigate whether the F0 amplitudes differed significantly across
conditions, a repeated-measures ANOVA was performed separately for the peak F0 amplitude
of the transition and sustained portions and the average F0 amplitude of the sustained
portion of speech ABR, with conditions (quiet, ipsilateral noise, and binaural noise)
as repeated measures. It showed no significant effect of conditions on the peak F0
amplitude of the transition portion (F [2, 26] = 0.906, p = 0.417), or on the peak F0 amplitude (F [2, 26] = 0.42, p = 0.662) and average F0 amplitude (F [1.434, 18.644] = 0.44, p = 0.586) of the sustained portion. Because the average
F0 amplitude of the transition portion was not normally distributed, the Friedman test was
performed for it, with conditions (quiet, ipsilateral
noise, and binaural noise) as repeated measures. It showed no significant effect of
conditions on the average F0 amplitude of the transition portion (χ2 [2] = 0.571, p = 0.751).
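For three repeated conditions, the Friedman statistic is referred to a chi-square distribution with 2 degrees of freedom, for which the upper-tail probability has the closed form exp(−χ2/2). The sketch below shows the statistic on toy data and checks that the reported χ2 and p-value are mutually consistent. The function names and the toy data are hypothetical, and no correction for tied ranks is applied to the statistic.

```python
import math


def rank_row(values):
    """Ranks 1..k within one participant; tied values share the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for m in range(i, j + 1):
            ranks[order[m]] = average_rank
        i = j + 1
    return ranks


def friedman_chi2(data):
    """Friedman chi-square (no tie correction) and its degrees of freedom."""
    n = len(data)
    k = len(data[0])
    rank_sums = [0.0] * k
    for row in data:
        for j, r in enumerate(rank_row(row)):
            rank_sums[j] += r
    chi2 = 12 / (n * k * (k + 1)) * sum(r * r for r in rank_sums) - 3 * n * (k + 1)
    return chi2, k - 1


def chi2_upper_tail_df2(chi2):
    """Upper-tail probability of a chi-square variable with exactly 2 df."""
    return math.exp(-chi2 / 2)


# Toy data: every participant shows the same ordering across three conditions.
toy = [[1.0, 2.0, 3.0], [0.5, 0.9, 1.4], [2.0, 2.5, 2.6]]
chi2, df = friedman_chi2(toy)
print(f"toy data: chi2 [{df}] = {chi2:.3f}")

# Consistency check of the reported statistic against its p-value.
print(f"p for chi2 = 0.571 with 2 df: {chi2_upper_tail_df2(0.571):.3f}")
```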
The data was subjected to correlation analysis to investigate the relationship between
CSTEOAE and unmasking of the amplitude of F0 of speech ABR. The Shapiro-Wilk test
showed that the magnitude of unmasking of the amplitude of peak F0 and average F0
of speech ABR was normally distributed for the sustained portion (peak F0 [W = 0.9053,
p = 0.158]; average F0 [W = 0.9388, p = 0.403]), but was not normally distributed for the transition portion (peak F0 [W = 0.8231,
p = 0.01]; average F0 [W = 0.7049, p < 0.001]). The Pearson correlation analysis was performed to investigate the relationship
between CSTEOAE and unmasking of the F0 amplitude of speech ABR for the sustained
portion. The Spearman correlation analysis was performed to investigate the relationship
between CSTEOAE and unmasking of the F0 amplitude of speech ABR for the transition
portion. The results of the correlation analysis are shown in [Table 3]. They showed a significant positive correlation between the magnitude of
CSTEOAE and the magnitude of unmasking of the peak F0 amplitude (both sustained and
transition portions) and of the average F0 amplitude of the sustained portion of speech ABR.
The correlation with the average F0 amplitude of the transition portion was not significant.
[Fig. 4] shows the scatter plots with trend lines for the relationship between CSTEOAE
and the magnitude of unmasking of the amplitude of speech ABR.
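The two correlation coefficients used above differ only in whether they operate on the raw values or on their ranks, as the following sketch shows. The data are toy values chosen for illustration, not the study's measurements, and the function names are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation coefficient."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def average_ranks(values):
    """Ranks starting at 1; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for m in range(i, j + 1):
            ranks[order[m]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))


# Toy values: suppression (dB) and unmasking (arbitrary units).
suppression = [0.5, 1.0, 1.5, 2.0, 2.5]
unmasking = [0.1, 0.3, 0.2, 0.6, 0.9]
print(f"Pearson r = {pearson_r(suppression, unmasking):.3f}")
print(f"Spearman rho = {spearman_rho(suppression, unmasking):.3f}")
```

Because the rank-based coefficient does not assume normally distributed data, it is the natural choice for the transition portion, where the Shapiro-Wilk test indicated non-normality.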
Table 3
Findings of the correlation analysis
Contralateral suppression of TEOAE    Transition portion         Sustained portion
                                      Average F0    Peak F0      Average F0    Peak F0
Pearson's                             –             –            0.832         0.841
p-value                               –             –            0.003**       0.004**
Spearman's                            0.34          0.663        –             –
p-value                               0.336         0.037*       –             –
Abbreviation: TEOAE, transient evoked otoacoustic emission.
Note: * p < 0.05, ** p < 0.01, *** p < 0.001.
Fig. 4 Scatter plot with trend line showing the relationship between the contralateral suppression
of transient evoked otoacoustic emission and the magnitude of unmasking of the amplitude
of auditory brainstem response. Panels A and B show the relationship between the contralateral
suppression of transient evoked otoacoustic emission and the magnitude of unmasking
of the amplitude for sustained portion of the auditory brainstem response. Panels
C and D show the relationship between the contralateral suppression of transient evoked
otoacoustic emission and the magnitude of unmasking of the amplitude for transition
portion of the auditory brainstem response.
Discussion
The findings of the present study showed a significant reduction in the amplitude
of TEOAE when white noise was presented to the contralateral ear compared with the
baseline condition. These results are consistent with the findings of several investigations.[12][18][20][31]
Further, the mean CSTEOAE obtained in the present study was comparable to the results
of earlier investigations.[31] On the other hand, the mean F0 amplitude of speech ABR was higher in the quiet condition
than in the noise conditions. Several investigations have reported a similar finding.[28][32][33]
The reduction of the F0 amplitude in noise conditions could be attributed to the
masking effects of noise on the speech ABR. Further, the mean amplitude of F0 was
similar in both binaural and ipsilateral noise conditions. These findings are comparable
to the findings of an earlier investigation.[28] In addition, the amplitude of F0 was higher in the binaural noise condition compared
with the ipsilateral noise condition among 57% of the participants. The higher amplitude
of F0 in binaural noise compared with ipsilateral noise condition may reflect the unmasking
effects of efferent activity in the auditory system.
The present study showed a significant positive correlation between contralateral
suppression of the F0 amplitude of speech ABR and of TEOAE; that is, participants with
stronger efferent activity showed greater unmasking of speech ABR. This finding of
the present study contrasts with the results of an earlier investigation,[28] which showed no relationship between the two measures. The positive relationship
found in the present study suggests that the efferent pathways are involved in speech-in-noise
processing. This finding of the present study is consistent with several investigations
that have demonstrated a relationship between the magnitude of efferent activity and
speech recognition in noise.[11][12][13][18][19]
However, further investigations are required before generalizing the findings and
also to evaluate the reproducibility of the findings of the present study. One limitation
of the present study is that perception of speech-in-noise was not measured in the
presence and absence of efferent activity. Assessing the relationship between the
difference in speech perception in noise scores (between presence and absence of efferent
activity) and unmasking of speech ABR could reveal the involvement of efferent pathways
in speech-in-noise processing. Furthermore, in the present study, the TEOAEs obtained
in linear mode were considered to be present when the SNR was greater than 3 dB. This
could be a limitation, as a few studies have recommended a much higher SNR for the detection
of small changes in OAE.[34][35]
Finally, another limitation of the present study was the small sample size. Although
a total of 23 adults participated in the study, the speech ABR was present only in
14 participants, which could be attributed to individual variability of speech ABR.
Conclusion
The findings of the present study suggest that the efferent pathways are involved in
speech-in-noise processing. However, further research is required before generalizing
the findings of the study.