Open Access
CC BY-NC-ND 4.0 · Semin Hear 2022; 43(03): 162-176
DOI: 10.1055/s-0042-1756162
Review Article

Neonatal Frequency-Following Responses: A Methodological Framework for Clinical Applications

Natàlia Gorina-Careta
1   Brainlab - Cognitive Neuroscience Research Group, Department of Clinical Psychology and Psychobiology, University of Barcelona, Catalonia, Spain
2   Institute of Neurosciences, University of Barcelona, Catalonia, Spain
3   Institut de Recerca Sant Joan de Déu (IRSJD), Barcelona, Catalonia, Spain
4   BCNatal - Barcelona Center for Maternal Fetal and Neonatal Medicine (Hospital Sant Joan de Déu and Hospital Clínic), University of Barcelona, Barcelona, Catalonia, Spain.
Teresa Ribas-Prats
1   Brainlab - Cognitive Neuroscience Research Group, Department of Clinical Psychology and Psychobiology, University of Barcelona, Catalonia, Spain
2   Institute of Neurosciences, University of Barcelona, Catalonia, Spain
3   Institut de Recerca Sant Joan de Déu (IRSJD), Barcelona, Catalonia, Spain
Sonia Arenillas-Alcón
1   Brainlab - Cognitive Neuroscience Research Group, Department of Clinical Psychology and Psychobiology, University of Barcelona, Catalonia, Spain
2   Institute of Neurosciences, University of Barcelona, Catalonia, Spain
3   Institut de Recerca Sant Joan de Déu (IRSJD), Barcelona, Catalonia, Spain
Marta Puertollano
1   Brainlab - Cognitive Neuroscience Research Group, Department of Clinical Psychology and Psychobiology, University of Barcelona, Catalonia, Spain
2   Institute of Neurosciences, University of Barcelona, Catalonia, Spain
3   Institut de Recerca Sant Joan de Déu (IRSJD), Barcelona, Catalonia, Spain
M Dolores Gómez-Roig
3   Institut de Recerca Sant Joan de Déu (IRSJD), Barcelona, Catalonia, Spain
4   BCNatal - Barcelona Center for Maternal Fetal and Neonatal Medicine (Hospital Sant Joan de Déu and Hospital Clínic), University of Barcelona, Barcelona, Catalonia, Spain.
Carles Escera
1   Brainlab - Cognitive Neuroscience Research Group, Department of Clinical Psychology and Psychobiology, University of Barcelona, Catalonia, Spain
2   Institute of Neurosciences, University of Barcelona, Catalonia, Spain
3   Institut de Recerca Sant Joan de Déu (IRSJD), Barcelona, Catalonia, Spain

FUNDING This work was supported by the Grant PGC2018–094765-B-I00 project funded by MCIN/AEI/10.13039/501100011033 and “ERDF A way of making Europe,” the Grant MDM-2017–0729–18–2 funded by MCIN/AEI/10.13039/501100011033, the 2017SGR-974 Excellence Research Group of the Generalitat de Catalunya, the Fundación Alicia Koplowitz (Madrid, Spain), a grant from the “Convocatòria d'ajuts a la recerca IRSJD – Carmen de Torres 2022” (2022AR-IRSJDCdTorres), and the ICREA Acadèmia Distinguished Professorship awarded to Carles Escera.
 

Abstract

The frequency-following response (FFR) to periodic complex sounds is a noninvasive scalp-recorded auditory evoked potential that reflects synchronous phase-locked neural activity to the spectrotemporal components of the acoustic signal along the ascending auditory hierarchy. The FFR has gained recent interest in the fields of audiology and auditory cognitive neuroscience, as it has great potential to answer both basic and applied questions about processes involved in sound encoding, language development, and communication. Specifically, it has become a promising tool in neonates, as its study may allow both early identification of future language disorders and the opportunity to leverage brain plasticity during the first 2 years of life, as well as enable early interventions to prevent and/or ameliorate sound and language encoding disorders. Throughout the present review, we summarize the state of the art of the neonatal FFR and, based on our own extensive experience, present methodological approaches to record it in a clinical environment. Overall, the present review is the first to focus comprehensively on the applications of the neonatal FFR, supporting the feasibility of recording the FFR during the first days of life and the potential of the neonatal FFR to predict short- and long-term language abilities and disruptions.


Universal neonatal hearing screening (UNHS), also known as early hearing detection and intervention program (EHDI), is a clinical strategy aimed at the identification, intervention, and follow-up of newborns with congenital deafness and hearing loss.[1] Performed in the hospital right after birth, its implementation has had a strong impact on public health worldwide, as it has become an essential tool for the early detection of hearing impairments in every hospital birth. Failure to pass the auditory screening generally leads to a referral for a thorough audiological examination. Thus, the UNHS provides an opportunity for an early intervention to address deafness and avoid the consequences of congenital hearing loss in neurodevelopment.

Currently, the UNHS is performed worldwide via otoacoustic emissions (OAE) and automated auditory brainstem responses (AABR). These are two noninvasive and automated screening tests that can be applied separately or sequentially and which are usually performed at the bedside in term and preterm infants.[2] However, despite being born healthy and passing the UNHS, a significant number of children suffer neurodevelopmental delays, which could lead to alterations or deficits in language acquisition and reading comprehension, with the consequent impact on their cognitive function and emotional regulation, and the corresponding socioeconomic negative impact. Among these newborns are those with some objective and known risk factors for early childhood hearing loss,[3] as well as those who come from a high-risk pregnancy, such as fetal growth restriction (FGR), fetal macrosomia, congenital syphilis, and those at risk for the development of disorders related to language and reading acquisition (dyslexia, specific language disorders, auditory processing disorders, etc.) or neurodevelopmental disorders (autism, among others).

To date, there is a lack of objective procedures that can aid in the early detection of newborns at risk for language processing difficulties during the first moments of life, beyond the implicit developmental compromises associated with high-risk pregnancies. In fact, most language-disorder diagnoses are not made until early childhood, when the infant does not present the expected typical behavior or even displays an altered or deficient behavioral pattern for its stage of development. Therefore, any chance to detect a potential language impairment at birth in a similar manner as congenital deafness is detected with the UNHS would be of remarkable impact, as preventive measures could be implemented during the first months of life when the nervous system is at its peak state of plasticity.

To address this important issue, we suggest the frequency-following response (FFR) as a powerful and novel tool with high potential for the early detection of language disorders that will appear during childhood. The FFR is an auditory evoked potential that can be obtained with equipment similar to that used in the AABR test of the UNHS and provides a window to explore the integrity of the auditory pathway beyond the mere transmission of sound, thus assessing the fidelity with which the rich acoustic information that characterizes language or music is encoded in the central auditory system.

THE FREQUENCY-FOLLOWING RESPONSE

The FFR is a sustained and periodic auditory evoked potential that reflects synchronous neural phase locking to the spectrotemporal components of the acoustic signal in the ascending auditory system. FFRs can be recorded noninvasively from the scalp with electroencephalography (EEG) and magnetoencephalography, and emerge between 7 and 15 milliseconds from sound onset to auditory frequencies in the range of 100 to 1,500 Hz.[4] [5] [6] By reflecting phase-locked neural activity to the incoming sounds, FFRs faithfully mimic and are as complex as the eliciting stimulus as it unfolds in time, to an extent that they can be recognized as such when played through a speaker.[7] [8]

While the term “frequency-following response” is the most commonly used and probably the most comprehensive, this neural response has been termed throughout the literature with other names that have been used interchangeably or which highlight a specific aspect or variant of the response. These include complex auditory brainstem response,[4] speech auditory brainstem response,[9] envelope-following response,[10] and amplitude-modulation following response.[11] In some contexts, it has even been included under the term 80-Hz ASSR[12] [13] or subcortical steady-state response.[13] [14] It is important to note that some of these variants include the term “brainstem.” This is due to the fact that it has long been considered a measure of sound encoding originating exclusively in the subcortical auditory system. Yet, it is currently widely accepted that FFR is better understood as an aggregate response reflecting the synchronized neural activity of multiple generators throughout the entire auditory system including subcortical and cortical levels. Specifically, seminal studies located the generator origins of the FFR in the inferior colliculi (for review see the article by Chandrasekaran and Kraus[15]). However, recent studies demonstrated that the FFR represents an integrated response of the entire auditory system, with contributions from both subcortical and cortical structures (for review see the article by Coffey et al[16]). Furthermore, it has been shown that the involvement and degree of contribution of the different structures of the ascending auditory pathway depend on the frequency of the incoming stimulus.[17] Thus, the scientific community has agreed that the term “FFR” is the most accurate one, as it refers exclusively to what the component is: a response that follows the frequencies of the incoming stimulus.[18]

The FFR has great potential to answer both basic and applied questions about the processes involved in sound encoding, language development, and communication. It can be obtained through passive and active listening paradigms and, by decomposing the recorded signal into temporal and spectral domains, this electrophysiological response provides an objective indicator of the fundamental acoustic features intrinsic to speech sounds, including time (onset and end latency), pitch (fundamental frequency, F0), and timbre (the harmonics).[4] [6] [18] [19] Specifically, the FFR allows studying the latency and amplitude of the neural response elicited to incoming sounds in the time domain. By analyzing the frequency components of the neural response in the spectral domain, the magnitude with which the fundamental frequency and harmonics have been encoded can also be studied using the FFR[18] ([Fig. 1]). With these analytical approaches, it is possible to read neural traces from the scalp as sounds are transcribed into neuronal aggregates. The analysis of the FFR provides a window to understand how these neural sound traces are shaped by experience, context, and challenging conditions, such as listening in noise, age, and speech and language disorders.
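As a rough illustration of the spectral-domain analysis described above, the following sketch estimates how strongly a given frequency is represented in a response by evaluating a single-bin discrete Fourier transform. It is a toy example under stated assumptions, not the analysis pipeline of the cited studies: the function name `spectral_amplitude`, the synthetic "response," and the round 100-Hz fundamental (chosen so that a whole number of cycles fits in the analysis window) are all hypothetical.

```python
import cmath
import math


def spectral_amplitude(signal, fs, freq):
    """Amplitude of the sinusoidal component of `signal` at `freq` Hz.

    Single-bin DFT; exact only when the window holds a whole number
    of cycles of `freq` (otherwise spectral leakage occurs).
    """
    n = len(signal)
    acc = sum(x * cmath.exp(-2j * math.pi * freq * k / fs)
              for k, x in enumerate(signal))
    return 2.0 * abs(acc) / n


fs = 10000      # sampling rate in Hz
n = 2000        # 200-ms analysis window
f0 = 100.0      # hypothetical fundamental frequency (not a recorded value)

# Synthetic stand-in for an averaged FFR: F0 plus a weaker third harmonic.
response = [0.1 * math.sin(2 * math.pi * f0 * k / fs)
            + 0.02 * math.sin(2 * math.pi * 3 * f0 * k / fs)
            for k in range(n)]

amp_f0 = spectral_amplitude(response, fs, f0)        # energy at F0
amp_h3 = spectral_amplitude(response, fs, 3 * f0)    # energy at the harmonic
amp_off = spectral_amplitude(response, fs, 150.0)    # a frequency absent from the signal
```

In practice, a full spectrum would typically be computed with a fast Fourier transform over a windowed segment of the averaged response; the single-bin form is used here only to keep the example dependency-free.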

Figure 1 Morphology and characteristics of the frequency-following response (FFR). The FFR is a periodic auditory evoked potential that can be recorded in response to both simple stimuli (i.e., pure tones) and complex stimuli. In the top panel, a consonant–vowel syllable /da/ is represented in gray and the corresponding FFR recorded at the Fpz electrode in a newborn is represented in red. As can be observed, the FFR mimics the incoming stimulus by synchronizing with its temporal features, thus capturing with high fidelity and accuracy the periodic characteristics of sound in the ascending auditory system. Additionally, the FFR also encodes the spectral features of the incoming stimulus, as demonstrated in the frequency spectrum, pitch tracking, and spectrogram in the bottom panels. The frequency spectrum illustrates the amplitude of the spectral decomposition of the FFR, which reveals a clear peak corresponding to the fundamental frequency of the stimulus (113 Hz in this recording). In addition, the pitch tracking provides a measure of the precision with which the FFR encodes changes in the fundamental frequency over the duration of the stimulus (stimulus frequency in black; FFR pitch tracking in red). The spectrogram provides combined information on both the frequency and amplitude at which the FFR synchronizes with the different components of the incoming stimulus. Overall, the figure illustrates how the FFR synchronizes with the stimulus that elicits it even in a single individual, providing a very useful tool in the fields of audiology and auditory cognitive neuroscience and in the study of auditory abilities at the individual level. For interpretation of the references to color in this figure legend, the reader is referred to the Web version of this article. (Modified with permission from Ribas-Prats T, Almeida L, Costa-Faidella J, et al. The frequency-following response (FFR) to speech stimuli: a normative dataset in healthy newborns. Hear Res 2019;371:28–39.[19])

In particular, the FFR is modulated by context-dependent contingencies[20] [21] and by the real-time statistical properties of the incoming stimuli.[22] [23] Furthermore, the FFR is sensitive to short-term auditory training and to the different auditory experiences that occur throughout life, such as musical training[24] and language exposure,[25] thus revealing the intrinsic neural plasticity of the auditory system. For instance, by means of the FFR, it has been shown that the auditory subcortical processing is not static but can be modulated with short-term auditory training.[26] This suggests that some sensory deficits caused by degraded sound processing could be improved by auditory training procedures. In fact, it has been shown that specific auditory training can alter the neural encoding of complex sounds by improving synchronization at a subcortical level in children with learning difficulties.[27]

Going a step further, the FFR has also become an important tool to assess the neural encoding of sounds under adverse conditions, such as speech-in-noise,[28] age-related changes in sound encoding,[29] and language acquisition disorders.[18] In particular, disruptions of the FFR have been observed in children with reading or language disorders, specifically with deficits in reading and phonological awareness: compared with typically developing children, they show significantly slower neural response timing, weaker neural encoding of formant-related stimulus harmonics, and less robust tracking of frequency contours.[30] [31] [32] [33] [34] [35] [36] Furthermore, neurodevelopmental disorders that are characterized by impaired communicative and literacy skills, such as dyslexia or autism spectrum disorder (ASD), have been associated with an abnormal subcortical representation of speech sounds. The evidence is that children with dyslexia and ASD show deficits in timing and frequency encoding of speech sounds,[32] [34] [37] [38] [39] as well as reduced stability of the FFR,[13] [40] [41] relative to their typically developing peers.

It is important to note that the FFR has been shown not only to assess how speech sounds are encoded but also to predict, in children who have not yet learned to read, their literacy one year later, thus becoming a putative tool for the early diagnosis of school-age learning disabilities.[13] [42]

Given its sensitivity and modulation by different auditory experiences, the FFR is becoming a promising tool for the assessment of the neural coding of speech sounds in both healthy and clinical populations.[18] Thus, the study of this electrophysiological response could allow early identification of possible language disorders and enable early interventions to prevent or ameliorate sound and language encoding disorders.


TECHNICAL ASPECTS FOR OBTAINING A NEONATAL FFR: STIMULATION, RECORDING, AND ANALYSIS PROCEDURES

The growing interest in the use of the FFR resides not only in its relation to the neural encoding of language but also in the fact that it is an inexpensive, noninvasive, and easy-to-use technique. In practical terms, FFR recording procedures are very similar to those used in any typical EEG recording to obtain a variety of electrophysiological responses in adults and children: they require thorough cleaning of the scalp area where the electrodes will be placed, precise location and placement of the electrodes by specialized personnel, and adequate recording equipment, among other specifications. Nonetheless, it is important to highlight that FFR recording in newborns entails additional characteristics that must be taken into account (for more information on the recording procedure, see Ribas-Prats et al[19]). Based on our experience of over 5 years collecting neonatal FFRs, we summarize and present the best methodological approaches to record the neonatal FFR in a clinical environment (for a comparison with parameters used to record an adult FFR, refer to [Table 1]).

Table 1 Recommended Preparation and Recording Parameters to Record Neonatal and Adult FFRs

Each entry gives the recommendation followed by its rationale.

Preparation: Adults

Recommendation: Recording performed in an electrically and acoustically shielded room
Rationale: Recording carried out away from any type of electrical and acoustical interference

Recommendation: Participant instructed to relax and not move during the recording. Instructed to blink at a normal pace and eye activity recorded with a vertical and horizontal electrooculogram
Rationale: Avoid possible muscle movements that could contaminate the recording

Recommendation: Clean the facial area where electrodes will be placed with (1) alcohol, (2) abrasive gel, and (3) alcohol again
Rationale: Important to remove sweat, makeup, and other residuals that might be on the skin. The second cleaning with alcohol is important to remove any remaining abrasive gel and avoid causing irritation when placing an electrode on top of it

Recommendation: After the recording, remove the electrodes using a gauze or a cotton pad soaked with alcoholic solution or using an adhesive remover such as Niltac
Rationale: Remove the facial electrodes with minimum harm

Preparation: Newborns

Recommendation: Recording performed in the hospital room at the maternity ward, with the baby in its crib or in the mother's bed without skin-to-skin contact
Rationale: Recording carried out away from any type of electrical interference, including the electricity from the adult's own skin. If the bed is electrical, it is recommended that it be disconnected

Recommendation: Participant sleeping throughout the whole FFR recording. It is recommended that the recording is performed ∼15 minutes after feeding to ensure a deep sleep
Rationale: Avoid possible muscle movements that could contaminate the recording

Recommendation: Clean the facial skin area where electrodes will be placed with (1) abrasive gel and (2) alcohol or saline solution
Rationale: Remove residual substances from birth. Especially important as newborns can still have vernix caseosa on the skin

Recommendation: After the recording, remove the electrodes using a gauze or a cotton pad soaked with alcoholic solution or using an adhesive remover such as Niltac
Rationale: Remove the electrodes with minimum harm to the newborn

Recording: Adults

Recommendation: Vertical montage (active: Cz/FCz; reference: earlobe(s); ground: forehead)
Rationale: Usually recorded with a 32-channel EEG cap for better placement of Cz/FCz electrodes. Reference positioned on the earlobe as it is a noncephalic site that causes fewer artifacts from bone vibration

Recommendation: Type of earphone: insert earphone
Rationale: Enhanced interaural attenuation with the transducers far from the reference electrodes

Recommendation: Recording after an audiometry
Rationale: To ensure a proper transmission of the sounds through the inner ear

Recommendation: Recording with a recommended maximum duration of 2 hours
Rationale: Participants need to be relaxed and stay still, so if the recording lasts more than 2 hours, pauses are recommended to allow the participant to move and go to the bathroom

Recording: Newborns

Recommendation: Vertical montage (active: forehead; reference: ipsilateral mastoid; ground: forehead)
Rationale: Reference positioned on the mastoid due to the small size of the earlobes. Although possible, usually not recorded with EEG caps as newborns can still have vernix caseosa or blood residuals from birth on the head

Recommendation: Type of earphone: circumaural earphone
Rationale: Avoid canal collapse and attenuate ambient noise, allowing the screening in any environment and decreasing the discomfort caused by standard earphones

Recommendation: Recording after the UNHS has been passed. Before the FFR recording, an ABR recording to a click stimulus is also advisable
Rationale: The passing of the UNHS ensures the integrity of the auditory pathway and the external auditory canal. An ABR recording ensures the proper neural transmission of sound through the auditory brainstem

Recommendation: Recording with a recommended maximum duration of 35–40 minutes, ideally between 30 and 35 minutes
Rationale: Newborns need to be fed regularly, with a maximum space between feedings of 2 hours. If the recording lasts more than 40 minutes, the chances increase that they wake up and the recording cannot be finished

Abbreviations: ABR, auditory brainstem response; EEG, electroencephalography; FFRs, frequency-following responses; UNHS, universal neonatal hearing screening.


Recommendations and Parameters for Recording the Neonatal FFR

First of all, an important emphasis should be made on cleaning the sensor placement area on the scalp using an appropriate abrasive gel, since the skin presents the highest cutaneous electrical impedances during the first week of life, decreasing to reach adult-like levels around 1 to 4 months of age.[43] Additionally, while it is common to use the earlobes as a reference point in adults to avoid the postauricular muscle response of the muscle located behind the ear over the mastoid bone, in newborns the mastoids are the most frequently used reference point, due to the small size of the earlobes.

Second, before starting any EEG recording in newborns, it is important to ensure the integrity of the auditory pathway and the external auditory canal. Secretions, mucus, or amniotic fluid remaining from the uterine period and childbirth may partially obstruct the ear canal, interfering with the transmission of the sound wave through the external auditory pathway and thereby degrading the quality of the physical sound input. In this regard, it is recommended that the FFR recording be performed after the UNHS has been passed, thus ensuring that sounds are reaching the newborn's cochlea properly. An ABR recording to a click stimulus is also advisable, to ensure the proper neural transmission of sound through the auditory brainstem and a good positioning of the electrodes and earphones. Likewise, it is advisable for the study to be performed while the mother and baby are in the maternity ward during the hours or days following birth, to avoid the inconvenience of a second appointment. In addition, it is recommended that the newborn be asleep throughout the whole FFR recording to avoid possible muscle movements that could contaminate the recording ([Fig. 2]). It is also advisable to conduct the test in the baby's crib, away from any type of electrical interference, including the electricity from an adult's own skin (if the baby is being cuddled or breast-fed) and from electronic devices that monitor the vital signs of the newborn or the mother.

Figure 2 Recording setup. The recommended recording setup to obtain a neonatal FFR uses a minimum of three disposable snap Ag/AgCl electrodes placed in a vertical montage: the active electrode is located at Fpz (white electrode), the reference at the mastoid behind the ear, and the ground electrode at the forehead (black electrode). In this example, two reference electrodes are positioned on the mastoid bones behind the right and left ears (red and blue electrodes, respectively). Although possible, neonatal FFR is not typically recorded with EEG caps as newborns can still have vernix caseosa or blood residuals from birth on the head. The participant's picture is reproduced with the written consent of her mother.

Overall, several important details need to be taken into account to record a neonatal FFR (see [Table 1] for a summary of the recommended parameters; [Fig. 2] for the recommended setup). In this regard, the use of amplifiers designed by IHS (Intelligent Hearing Systems, Miami, FL), which currently provide the only portable amplifiers on the market with the capability to record FFRs to long-duration speech, has become widespread. Their system is small, has integrated electrical and magnetic isolation, and consists of a single device that can present auditory stimuli and record the FFR at the same time in a flexible way. However, it is also possible to record the FFR in newborns using the classic experimental setup that comprises two separate units, one for stimulation and another one for acquisition. When using the classic EEG experimental setup, it is recommended to have a dedicated room for the recordings, to ensure a quiet environment and minimum electrical interference.

In the end, each amplifier option has its own advantages, and the actual needs of the recording design have to be taken into account. In particular, the portable IHS system makes recording in the hospital room in the maternity ward easier, as it is smaller and both stimulation and acquisition can be achieved with a single piece of hardware. Its main disadvantage is that it allows FFR recording with only a few active channels, and the EEG signal is saved in blocks of a minimum of 50 trials. A setup with two units has other advantages, such as broader options for FFR analysis: for example, the analysis of the FFR signal in single trials, the analysis of phase-locking elements of the response, and even other aspects concerning the stability of the response, for which a longer stimulus would be required, an option that is not possible with the portable IHS equipment. Such two-unit systems are also needed if the stimulation plan involves several stimulus types, complex sequences, or masking by specific types of noise (e.g., babble).

Independently of the setup and amplifiers used, recording a newborn's FFR requires the same technical considerations as recording an FFR in adults or a click-ABR in newborns. Because the FFR is a low-voltage neural response, speech stimuli of short duration, such as phonemes or syllables, have to be presented repetitively to obtain thousands of neural responses elicited by the same stimulus. In later stages of data analysis, these thousands of repetitions are averaged to obtain a clear and visible FFR. Certain signal acquisition and preprocessing parameters have been established as the most appropriate for obtaining the neonatal FFR (for a systematic review of the acquisition parameters of FFR in neonates and infants, see Lemos et al[44]). The best time window to record the FFR is when the newborn is between 1 and 3 days old. The stimulus should be presented with a minimum of 2,000 repetitions, ideally 4,000, to ensure a strong FFR, with a presentation rate between three and four times per second. Sounds should be presented monaurally in the right ear in alternating polarities with an intensity set between 50 and 70 dB SPL.
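The repetition counts and presentation rates above translate directly into stimulation time, and the benefit of averaging can be approximated with the standard rule that averaging N sweeps improves the signal-to-noise ratio by a factor of √N when the noise is uncorrelated across sweeps. The sketch below works through that arithmetic; the function names are hypothetical and the √N rule is an idealizing assumption, not a property reported in the text.

```python
import math


def recording_minutes(n_sweeps, sweeps_per_second):
    """Pure stimulation time, in minutes, for a fixed presentation rate."""
    return n_sweeps / sweeps_per_second / 60.0


def averaging_gain_db(n_sweeps):
    """Idealized SNR gain in dB from averaging n_sweeps responses.

    Assumes the noise is uncorrelated from sweep to sweep, so the
    amplitude SNR grows as sqrt(n_sweeps), i.e. 10*log10(n_sweeps) dB.
    """
    return 10.0 * math.log10(n_sweeps)


for n in (2000, 4000):          # minimum and ideal repetition counts
    for rate in (3, 4):         # presentations per second
        print(f"{n} sweeps at {rate}/s: "
              f"{recording_minutes(n, rate):.1f} min, "
              f"ideal averaging gain {averaging_gain_db(n):.1f} dB")
```

Under these assumptions, 4,000 sweeps of a single stimulus take roughly 17 to 22 minutes of stimulation, which leaves some margin within the 35 to 40 minute maximum session duration recommended for newborns in Table 1.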

Concerning the acquisition parameters, neural activity should be recorded with a minimum sampling rate of 5,000 Hz, so that the sampling rate is at least twice the maximum frequency to be analyzed, as required by the Nyquist-Shannon sampling theorem. We recommend a sampling rate between 5,000 and 20,000 Hz, ideally 10,000 Hz, which provides at least three times the minimum rate required by the theorem and thus a more precise representation of the response. Regarding the filtering parameters, neural responses should be recorded with an online bandpass filter that is as open as possible; we recommend setting it between 30 and 3,000 Hz. To obtain a clear FFR, an offline bandpass filter between 80 and 1,500 Hz should be applied to the recorded neural activity in later steps of analysis. Neural responses with voltage exceeding ±30 microvolts (µV) should be rejected.
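Two of the parameters above lend themselves to a short worked example: the sampling-rate check and the amplitude-based rejection of contaminated sweeps. The sketch below is illustrative only. It assumes that the highest frequency of interest is the 1,500-Hz upper edge of the offline filter, and the names `nyquist_rate` and `reject_artifacts`, as well as the toy sweep values, are hypothetical. Bandpass filtering is deliberately omitted, since it would normally rely on a signal-processing library.

```python
def nyquist_rate(max_freq_hz):
    """Minimum sampling rate (Hz) required to represent max_freq_hz."""
    return 2.0 * max_freq_hz


def reject_artifacts(sweeps, threshold_uv=30.0):
    """Keep only sweeps whose absolute voltage never exceeds threshold_uv."""
    return [s for s in sweeps if max(abs(v) for v in s) <= threshold_uv]


# Assumption: the analysis band ends at 1,500 Hz (offline filter edge).
upper_edge_hz = 1500.0
for fs in (5000, 10000, 20000):
    ratio = fs / nyquist_rate(upper_edge_hz)
    print(f"fs = {fs} Hz is {ratio:.2f} times the minimum required rate")

# Toy sweeps in microvolts; the second and third contain out-of-range samples.
sweeps = [
    [0.2, -0.5, 0.1],
    [0.3, 45.0, -0.2],
    [-31.0, 0.0, 0.4],
    [0.1, 0.2, -0.3],
]
clean = reject_artifacts(sweeps)
print(f"kept {len(clean)} of {len(sweeps)} sweeps")
```

Rejection is applied here to whole sweeps before averaging, so a single large deflection (for example, from a movement) removes that sweep rather than distorting the average.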


Choosing the Best Stimulus

The auditory stimulus most used in the study of the FFR is the syllable /da/,[4] [44] [45] especially in infant and adult populations. Consonant–vowel syllables are considered acoustically complex sounds and, in particular, the syllable /da/ is relatively universal, since it is included in the phonetic inventories of most languages. The encoding of the initial stop (plosive) consonant /d/ poses perceptual challenges even in the healthy adult population, especially in noisy environments. The consonant /d/ is followed by a formant transition and a sustained segment /a/, for which the response properties of the auditory system have been widely characterized.[4] Other stimuli that have been used are the syllables /ba/ and /ga/ and the Mandarin pitch contours of the syllables /yi/ and /mi/.

Nevertheless, the same properties that justify the use of the aforementioned consonant–vowel stimuli in the study of the FFR in infant and adult populations make them unsuitable for a broad characterization of the neonatal FFR. This is due to technical aspects of the stimulus, such as the short duration of the transition between the consonant (i.e., /d/) and the vowel (i.e., /a/) and the high spectral content of the formants (over 500 Hz) that compose the stimulus, among others.

In this sense, it is important to highlight that after birth, the newborn's auditory system is still immature and is not yet able to encode high spectral components of speech.[46] [47] Therefore, the typical stimuli commonly used for the study of the neural encoding of the characteristics of speech sounds in adults, when used in newborns, only permit the optimal study of neural encoding of the fundamental frequency of the stimulus and its low-frequency harmonics. This provides a measure to study whether the newborn brain is capable of encoding inflections in the pitch contour of the voice, which are one of the main speech features in tonal languages such as Mandarin. However, the /da/ stimulus is not the most appropriate to study one of the most relevant aspects for language acquisition in nontonal languages such as English or Spanish: the precision with which the neonatal brain encodes the spectrotemporal fine structure of the incoming sounds.

The existence of such limitations has led to the recent development of a new stimulus that would allow the study of not only the encoding of its fundamental frequency, associated with the pitch of the voice, but also the fine structure of speech sounds. In this framework, the diphthong /oa/ has been proposed.[48] This is a stimulus of 250-millisecond duration and its internal structure (two different vowels, /o/ and /a/; together with an ascending change in the voice pitch, from a lower to a higher frequency; see [Fig. 3]) allows for the rapid and accurate evaluation of the neural encoding of these two sound features simultaneously in a suitable period of time, given the limitations imposed by recording in hospital environments.

Figure 3 FFR fundamental frequency and temporal fine structure to the /oa/ diphthong. (A) Temporal (top) and spectral (bottom) representations of the /oa/ syllable, with a schematic overlay of its formant structure trajectory. The fundamental frequency is stable at 113 Hz from 0 to 160 milliseconds, with a linear increase to 154 Hz from 160 to 250 milliseconds. The section of the vowel /o/ (F1 = 452 Hz, F2 = 791 Hz) spans from 0 to 80 milliseconds and the section of vowel /a/ (F1 = 678 Hz, F2 = 1,017 Hz) from 90 to 250 milliseconds (F0 and F1 in solid lines; F2 in dotted line). (B) and (C) Grand averaged time-domain FFR and its spectral decomposition, recorded to the /oa/ stimulus in a sample of 34 newborns. As demonstrated in the spectral decomposition, when averaging the responses to alternating polarities (B) only the fundamental frequency of the stimulus (113 Hz) can be observed. On the other hand, when subtracting the neural responses to the two stimulus polarities (C), the stimulus fine structure (the first formant for the /o/ and the /a/ region) is observable. Note that the amplitude of the scale in (B) is twice that of (C). (Modified with permission from Arenillas-Alcón S, Costa-Faidella J, Ribas-Prats T, Gómez-Roig MD, Escera C. Neural encoding of voice pitch and formant structure at birth as revealed by frequency-following responses. Sci Rep 2021;11(1):1–16.[48])

To obtain both the fundamental frequency and the fine structure of the sound separately from a single recording, the neural signal acquired to the stimulus presented in alternating polarities—condensation and rarefaction—has to be analyzed by means of two very simple operations: average and subtraction of the two response polarities.[10] In other words, to accentuate the components of the FFR that correspond to the encoding of the stimulus fundamental frequency, and to minimize the artifacts generated by the movement of the basilar membrane itself (i.e., cochlear microphonics), an averaging of the sum of the two opposite signal polarities has to be performed (averaging = [rarefaction + condensation]/2). On the other hand, to emphasize the encoding of the stimulus spectrotemporal fine structure and, at the same time, minimize the contribution of the activity related to the fundamental frequency, an average of the subtraction of both polarities must be obtained (subtraction = [rarefaction − condensation]/2; [Fig. 3B, C]). In addition to the frequency domain analysis described earlier, the FFR can be analyzed in the time domain, obtaining several parameters related to neural phase locking.[6] [19] [48]
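The two operations described above can be illustrated with a minimal Python sketch. It is not the analysis pipeline used in the cited studies: the function names, the sampling rate, and the synthetic signals are assumptions introduced only for illustration. The toy model assumes that the component following the fundamental frequency keeps its sign across stimulus polarities, whereas the component following the fine structure inverts with the stimulus.

```python
import numpy as np


def combine_polarities(rarefaction, condensation):
    """Return the (added, subtracted) combinations of the two polarity responses."""
    rarefaction = np.asarray(rarefaction, dtype=float)
    condensation = np.asarray(condensation, dtype=float)
    added = (rarefaction + condensation) / 2.0       # accentuates the F0 component
    subtracted = (rarefaction - condensation) / 2.0  # accentuates the fine structure
    return added, subtracted


def amplitude_at(signal, fs, freq):
    """Single-sided FFT amplitude at the frequency bin closest to `freq` (Hz)."""
    spectrum = np.abs(np.fft.rfft(signal)) * 2.0 / len(signal)
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / fs)
    return spectrum[np.argmin(np.abs(freqs - freq))]


# Synthetic toy signals (assumption: not recorded data).
fs = 4000                      # hypothetical sampling rate in Hz
t = np.arange(fs) / fs         # 1 s of samples, giving 1-Hz frequency resolution
f0_part = 1.0 * np.sin(2 * np.pi * 113 * t)    # same sign in both polarities
fine_part = 0.5 * np.sin(2 * np.pi * 452 * t)  # inverts with stimulus polarity

rarefaction = f0_part + fine_part
condensation = f0_part - fine_part

added, subtracted = combine_polarities(rarefaction, condensation)
print(amplitude_at(added, fs, 113), amplitude_at(added, fs, 452))
print(amplitude_at(subtracted, fs, 113), amplitude_at(subtracted, fs, 452))
```

In this toy case, the added response retains only the 113-Hz component and the subtracted response retains only the 452-Hz component. Real recordings would first require averaging across many sweeps of each polarity, and the separation would be less clean.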



THE NEONATAL FREQUENCY-FOLLOWING RESPONSE IN CLINICAL POPULATIONS

In contrast to the multitude of studies characterizing the FFR in adults and school-age children, very few studies have focused on the neonatal period. The first FFR recording in human neonates was performed by Gardi et al in 1979 by presenting pure sinusoidal tones of 10-millisecond duration to 22 three-day-old babies.[49] Their study demonstrated not only that it is feasible to record the FFR during the first days of life but also that the neonatal FFR has important similarities in both amplitude and morphology with the FFR recorded in adults.

Interestingly, despite this fundamental electrophysiological finding, it was not until three decades later that the field of neonatal FFR research expanded and the neonates' adult-like capacity for neural phase-locking to the fundamental frequency of the incoming stimulus was fully characterized. In the early 2010s, Jeng et al confirmed the feasibility of recording the FFR in infants by leading a series of studies featuring more complex Mandarin-derived tones. Notable among the reported findings were the similarities between the neural responses recorded in American and Chinese newborns to the same speech sound,[50] [51] which confirmed the universal, innate nature of the encoding of voice pitch. Likewise, the effect of linguistic experience was corroborated by demonstrating a more robust encoding of voice pitch in Chinese adults than in Chinese newborns.[51]

Once the possibility of recording the FFR during the first days of life in the maternity hospital room was confirmed and this electrophysiological response was conceptualized as a biological snapshot of the integrity of complex sound encoding along the auditory pathway, there was a growing interest in the standardization of the recording procedures for the neonatal FFR, in both clinical populations and clinical routines. As discussed earlier, an ideal scenario for recording the neonatal FFR in the maternity ward has been implemented by our research group, in which, after obtaining the transient-evoked auditory brainstem responses to click stimuli, the FFR can be recorded by simply replacing the click stimulus with a speech sound. However, the first important milestone before introducing the neonatal FFR recording into clinical routines was the establishment of a normative database depicting the standard variability of the newborn FFR. This comprehensive normative database was created by recording the neonatal FFR to a /da/ stimulus in 50 healthy newborns born at term, between 37 and 42 gestational weeks. In this characterization, several FFR parameters were retrieved in the time and frequency domains.[19] A normative database for the /oa/ stimulus with about one hundred neonates will be offered soon by our research group.[52]

It is important to highlight that, in full-term neonates, phase-locking and spectral representation of the fundamental frequency are developed in the early days of life[19] and do not increase significantly at older ages.[53] On the other hand, spectral representation of higher harmonics is still developing in term infants and increases significantly with age[53] through at least 10 months of age.[54] The mandatory delay of exposure to high frequencies until birth, caused by the low-pass filtering of the mother's womb, which attenuates high-frequency sounds,[55] [56] along with the increasing myelination of the auditory system during the first year of life,[47] may explain the lower sensitivity to the frequencies located at the upper end of the spectrum during the first years of life. Thus, the high frequencies of the first formants of the /da/ stimulus (i.e., 688 and 1,214 Hz for the first and second formants, respectively) could impede the encoding of the sound fine structure in neonates.

To explore whether newborns could encode high-frequency information, a recent study used the /oa/ diphthong to record, for the first time, the FFR to temporal fine structure in neonates. The /oa/ is a syllable created specifically to record the encoding of the temporal fine structure in neonates, as the trajectory of its first formant is located in low frequencies (i.e., 452 Hz in the /o/ section and 678 Hz in the /a/ section). In addition to confirming previous findings regarding the similarity of the FFR between newborns and adults in terms of neural encoding of the voice pitch, the results revealed that the ability of the newborn brain to encode spectrotemporal fine structure is weaker than that of adults, especially for higher sound frequencies.[48] In line with the previous literature, these results suggest that the neural encoding of the temporal fine structure is not yet fully mature at birth, but would require sound and language exposure, as well as time,[47] [57] [58] to develop to the level of maturity observed in adults. Overall, newborn studies indicate that the FFR is a brain response whose maturity is affected not only by the maturation of brain tissues but also by the postnatal stimulation received by the infant. These studies also suggest that the FFR could be a very useful tool to detect deficiencies in speech and sound neural encoding during the first years of life. It could even be hypothesized that the maturation of the neural encoding of the fine structure of speech sounds could be accelerated through early interventions, thus taking advantage of the extraordinary brain plasticity of the first 2 years of life.

Following this first normative study and having in mind the potential of the FFR to detect difficulties in language development, several studies aimed at exploring the FFR in neonatal clinical populations. Taking into account the sensitivity of the auditory system to the state of the fetal environment and the innate and essential role of pitch encoding in language acquisition, we have been studying whether abnormal intrauterine growth could negatively affect the functionality of the auditory system. To this end, the encoding of speech sounds during the first days of life was analyzed with the FFR to the /da/ stimulus in two phenotypes representing the two ends of the birth weight continuum: babies affected by FGR[59] and babies classified as large for gestational age (LGA).[60] Altered neural encoding of the fundamental frequency was observed in both clinical groups. In particular, in the FGR newborns, a reduced encoding was observed in the vowel region of the presented stimulus, while in infants born LGA, the attenuated encoding was observed in both the vowel and consonant regions, indicating that both clinical groups present a deficit in the neural pitch tracking of sounds that is already present at birth. It was hypothesized that the deficient neural encoding observed in the infants born after FGR could derive from white matter alterations associated with perinatal complications, such as anoxia during birth. The attenuated encoding of the fundamental frequency observed exclusively in the vowel region of the stimulus could be explained by the stability of its spectral components, which facilitates the encoding of the fundamental frequency. Conversely, the results observed in infants born LGA were hypothesized to be due to the high accumulation of adipose tissue, which could alter, through proinflammatory agents, the microstructure of the auditory system necessary for the rapid processing of speech sounds, becoming noticeable even in the encoding of F0 in the consonant region. Other studies demonstrated that the FFR is a precise indicator of the levels of neurotoxicity in newborns affected by hyperbilirubinemia.[61] In particular, it was observed that infants with elevated bilirubinemia encoded the frequency of the first formant of the presented sounds less robustly than healthy infants. Interestingly, a significant improvement in the FFR was observed after phototherapy, paralleling the reduction in bilirubin levels.

Recently, another study addressed the neonatal FFR in preterm infants.[62] These authors recorded the FFR to the syllable /da/ in infants who were born before 33 weeks of gestation. The preterm FFR was found to have a remarkable similarity in waveform to that recorded in term infants. Furthermore, a decrease in the latency of the FFR with age was described, in line with previous literature on neonatal, infant, and adult FFR.[48] [53] [54] [63] This finding was interpreted as a reflection of the myelination pattern experienced by the auditory pathway.

In conclusion, there is an emerging body of studies focused on exploring the neural encoding of complex speech sounds in both healthy newborns and clinical populations at birth. However, longitudinal studies are still needed to support the predictive potential of the neonatal FFR on short- and long-term language skills and disorders.


FINAL CONSIDERATIONS AND CONCLUSIONS

In this article, the FFR has been described as an auditory evoked potential that provides a window into the neural mechanisms of complex sound encoding in the auditory system. By using the appropriate stimulus and sophisticated analysis tools, both in the time and frequency domains, the FFR provides accurate and detailed information on the ability of the individual's auditory system to analyze and encode (through specific patterns of neural activity) auditory information. By doing so, it becomes a useful tool to investigate the ability to distinguish between the pitch of different speakers' voices and, moreover, the ability to encode the fine spectrotemporal details that distinguish between different speech sounds (phonemes). Thus, FFR measures that deviate from those of the normative population would indicate that the patient has difficulties in the detection and encoding of speech sound cues essential for language acquisition (voice pitch, formants, rhythm, response stability), and would open the door to the early detection of possible language and speech disorders and the subsequent implementation of preventive or diagnostic procedures.

Given the putative predictive value of the FFR for language development and the diagnosis of possible language disorders, together with the feasibility of its recording in a clinical setting, it seems of utmost importance to continue this promising line of research to validate its possible application as a diagnostic tool. Neonatal and infant FFRs have been previously studied in response to speech sounds such as /da/ or /ga/,[54] [63] and a normative database for the neonatal FFR elicited by speech sounds has even been established.[19] Recently, the FFR elicited by the same speech sounds has been characterized in preterm infants.[62] Current studies performed by our research group are making significant progress, allowing the exploration of innate abilities in the neural encoding of the fine temporal structure of speech sounds (high-frequency formants, up to 800 Hz) through the use of a stimulus designed specifically for this purpose (stimulus /oa/[48]). Likewise, several studies are being performed in our research group combining neonatal FFR recordings with other approaches of widespread clinical use, such as magnetic resonance imaging or fetal neurosonography, to further establish the normative maturation pattern of the encoding capacity of the spectrotemporal fine structure of language sounds during the first years of life. Overall, there is an expanding set of studies centered on exploring the neural encoding of complex speech sounds in both healthy newborns and clinical populations, which, together with longitudinal studies, aim to support the predictive potential of the neonatal FFR for short- and long-term language abilities and disruptions.

And yet, until this ideal scenario becomes possible, further studies in normative, clinical, and at-risk populations are needed to establish normative values and to identify the parameters in FFR analysis that yield the greatest sensitivity and precision for specific disorders and impairments. In any case, these studies would benefit from international and multicenter collaborations and the involvement of broad multidisciplinary consortia in the fields of audiology, speech therapy, communication and language sciences, and neurosciences. The FFR represents a significant advance over the classic UNHS, given that, beyond simply indicating whether the child can hear or not, it informs about the “quality” with which they do so, a capacity that will undoubtedly determine their future competence in the language they begin to acquire from birth. Furthermore, although the UNHS is applied worldwide, each country might implement it slightly differently, as it is composed of two independent tests that can be applied individually or sequentially. By contrast, the principles and uses of neonatal FFR recordings are universally applicable, which emphasizes their value as a predictive tool for short- and long-term language skills and disorders.



CONFLICT OF INTEREST

The authors have no conflict of interest to declare.


Address for correspondence

Carles Escera, Ph.D.
Brainlab - Cognitive Neuroscience Research Group, Department of Clinical Psychology and Psychobiology, University of Barcelona
Passeig Vall d'Hebron 171, 08035 Barcelona
Spain   

Publication History

Article published online:
26 October 2022

© 2022. The Author(s). This is an open access article published by Thieme under the terms of the Creative Commons Attribution-NonDerivative-NonCommercial License, permitting copying and reproduction so long as the original work is given appropriate credit. Contents may not be used for commercial purposes, or adapted, remixed, transformed or built upon. (https://creativecommons.org/licenses/by-nc-nd/4.0/)

Thieme Medical Publishers, Inc.
333 Seventh Avenue, 18th Floor, New York, NY 10001, USA


Figure 1 Morphology and characteristics of the frequency-following response (FFR). The FFR is a periodic auditory evoked potential that can be recorded in response to both simple stimuli (i.e., pure tones) and complex stimuli. In the top panel, a consonant–vowel syllable /da/ is represented in gray and the corresponding FFR recorded at the Fpz electrode in a newborn is represented in red. As can be observed, the FFR mimics the incoming stimulus by synchronizing with its temporal features, thus capturing with high fidelity and accuracy the periodic characteristics of sound in the ascending auditory system. Additionally, the FFR also encodes the spectral features of the incoming stimulus, as demonstrated in the frequency spectrum, pitch tracking, and spectrogram in the bottom panels. The frequency spectrum illustrates the amplitude of the spectral decomposition of the FFR, which reveals a clear peak corresponding to the fundamental frequency of the stimulus (113 Hz in this recording). In addition, the pitch tracking provides a measure of the precision with which the FFR encodes changes in the fundamental frequency over the duration of the stimulus (stimulus frequency in black; FFR pitch tracking in red). The spectrogram provides combined information on both the frequency and amplitude at which the FFR synchronizes with the different components of the incoming stimulus. Overall, the figure illustrates how the FFR synchronizes with the stimulus that elicits it even in a single individual, providing a very useful tool in the fields of audiology and auditory cognitive neuroscience and in the study of auditory abilities at the individual level. For interpretation of the references to color in this figure legend, the reader is referred to the Web version of this article. (Modified with permission from Ribas-Prats T, Almeida L, Costa-Faidella J, et al. The frequency-following response (FFR) to speech stimuli: a normative dataset in healthy newborns. Hear Res 2019;371:28–39.[19])
Figure 2 Recording setup. The recommended recording setup to obtain a neonatal FFR uses a minimum of three disposable snap Ag/AgCl electrodes placed in a vertical montage: the active electrode is located at Fpz (white electrode), the reference at the mastoid behind the ear, and the ground electrode at the forehead (black electrode). In this example, two reference electrodes are positioned on the mastoid bones behind the right and left ears (red and blue electrodes, respectively). Although possible, neonatal FFR is not typically recorded with EEG caps, as newborns can still have vernix caseosa or blood residuals from birth on the head. The participant's picture is reproduced with the written consent of her mother.