CC BY-NC-ND 4.0 · Int Arch Otorhinolaryngol 2019; 23(03): e256-e261
DOI: 10.1055/s-0038-1668127
Original Research
Thieme Revinter Publicações Ltda Rio de Janeiro, Brazil

Comparing Individuals through the Speech Recognition Test Applied to Regional Live Voice and Recorded Speeches from Paraná State in Five Brazilian Counties

Nicoli Valverde Mafra
1   Department of Speech Therapy, Hospital Universitário da Universidade Federal de Santa Catarina, Florianópolis, Santa Catarina, Brazil
,
Angela Ribas
2   Communication Disorders Postgraduate Program, Universidade Tuiuti do Paraná, Curitiba, Paraná, Brazil
,
Claudia Moretti
3   Department of Speech Therapy, Universidade Tuiuti do Paraná, Curitiba, Paraná, Brazil
,
Bianca Simone Zeigelboim
2   Communication Disorders Postgraduate Program, Universidade Tuiuti do Paraná, Curitiba, Paraná, Brazil
,
Vinicius Ribas Fonseca
4   Department of Otorhinolaryngology, Otorrinos - Hospital da Cruz Vermelha, Curitiba, Paraná, Brazil
5   Discipline of Otorhinolaryngology, Universidade Tuiuti do Paraná, Curitiba, Paraná, Brazil
,
Rodrigo Marques Borburema
4   Department of Otorhinolaryngology, Otorrinos - Hospital da Cruz Vermelha, Curitiba, Paraná, Brazil
6   Department of Otorhinolaryngology, Centro de Estudos Otorrinolaringológicos Lauro Grein Filho, Curitiba, Paraná, Brazil

Address for correspondence

Rodrigo Marques Borburema, MD
Department of Otorhinolaryngology, Otorrinos - Hospital da Cruz Vermelha
R. Vicente Machado, 1239
Batel UEM - Sala 11, Curitiba, PR 80420-011
Brazil   

Publication History

31 March 2018

26 May 2018

Publication Date:
24 October 2018 (online)

 

Abstract

Introduction Speech tests such as logoaudiometry measure the ability to perceive and recognize speech sounds. The Speech Recognition Index (SRI) is one of the speech tests adopted in clinical routine; it can be applied with live voice or with standardized recorded speech. The live voice method can be influenced by intra- and interspeaker variability, as well as by regional variation, whereas recorded tests are presented consistently.

Objective To analyze the results of the SRI test applied with regional live voice and with recorded speech from Paraná State in different Brazilian counties.

Method The sample comprised 125 individuals of both sexes, aged 20 to 70 years, 25 from each county (Rio de Janeiro, Florianópolis, Porto Alegre, Salvador and Curitiba); the SRI was applied using both techniques.

Results The recorded speech method improved the hit rate most often in Rio de Janeiro (40% of individuals), followed by Salvador, Porto Alegre and Florianópolis (28% each). Individuals from Salvador and Florianópolis showed better results in the left ear with the recorded speech method. Individuals from Rio de Janeiro and Porto Alegre showed satisfactory results in both ears, whereas those from Curitiba did not show a statistically significant difference between the left and the right ear.

Conclusion The recorded CD method improved the hit rate (%) in SRI responses in comparison with the live voice technique in most of the studied counties. Among the investigated counties, Rio de Janeiro showed the best hit rate with the recorded speech method.



Introduction

Logoaudiometry depends on the tested individuals' understanding of the words presented to them; these individuals must have lexical resources that form the functional vocabulary they use to access partially perceived words more easily. This is consistent with the positive correlation between vocabulary explanation and tests involving linguistic aspects.[1]

The speech perception assessed by the logoaudiometric tests used in clinical audiology is related to a recognition process in which the listener perceives certain acoustic cues and assigns them to a given category. Thus, upon hearing a particular word, the tested individual relies not only on acoustic and phonetic factors, but also on syntax, semantics and the general context. The speaker can be a factor of foremost importance in the performance of individuals subjected to tests investigating the Speech Recognition Index (SRI).[2]

Among the aspects of speech, accent used to be considered noise in the communication process, since it was believed to disrupt the flow of information and to reduce communicative efficiency. Hence the perceived need to eliminate regional features of speech in favor of a uniform national pronunciation pattern. Speech and accent perception studies performed over the last ten years sought to understand how lay listeners process and interpret word variations. They concluded that people seem to use their dialect perception to categorize and attribute values to speakers.[3]

Tests using the live voice method can be influenced by intra- and interspeaker variability and by linguistic regionalisms such as accent, speech rate, articulation pattern, intonation and sociocultural aspects of the speaker. On the other hand, the major advantage of recorded speech tests lies in the consistency of speech presentation.[4]

Logoaudiometry using the live voice method has been adopted as a routine test in most audiology and outpatient clinics in Brazil. However, according to the international literature, recorded and standardized material makes data collection more reliable, since it rules out examiner interference in the results.[5]

Based on these assumptions, a group of speech therapists prepared logoaudiometry-specific material. They recorded speech material on a compact disc (CD) to standardize the application of the tests. The CD was recorded in a sound and image laboratory, according to the standards set by ISO 8253.[6]

The project generated interest in studying the application of this previously recorded and standardized material to different populations with distinct sociocultural characteristics, such as the regionalism embedded in individuals' speech.

The object of the current study is the use of the recorded instrument by speech-therapy professionals from different Brazilian regions. If the recorded material carries the regional intonation and accent of the team that recorded it, it may interfere with logoaudiometric interpretation, for example by worsening responses when the test is applied to populations from other Brazilian counties.

Thus, the main aim of the present study was to analyze the results of the SRI test applied through two methods (regional live voice and recorded speech from Paraná State) in different Brazilian counties.

The specific aims were to measure the hit rate in the different counties, to investigate the influence of regional aspects on the results of the recorded speech test, to identify the county showing the greatest difference when the Paraná State material was used, and to compare the results of the live voice and recorded speech methods to identify differences between them.



Method

The sample comprised 25 subjects from each evaluated county (Porto Alegre, Rio de Janeiro, Salvador and Florianópolis), as well as 25 people from Curitiba (control group); the total sample comprised 125 individuals (73 women and 52 men) aged 20 to 70 years. Only individuals with hearing within the normal range were included in the study. Both the speech-therapy professional who applied the live-voice SRI test and the examined participant had to be natives of the region where the data were collected.

The present study was approved by the Research Ethics Committee of the involved institution, under registration number 00095/2008. The research participants signed the Free and Informed Consent Form to authorize the use of their data. This procedure complied with the Guidelines and Norms for Research Involving Humans, Resolution N. 196/96.

The current research is an exploratory, descriptive and quantitative study. Data were collected from October 2014 to February 2015. Participants were randomly selected by each speech-therapy professional in each county (Florianópolis, Curitiba, Salvador, Rio de Janeiro and Porto Alegre). The participants selected were individuals linked to educational institutions (mostly employees and students from universities in each region) and/or patients from clinics linked to these professionals. Data were collected by speech therapists who worked in these counties, which is why the counties were selected for the study.

The evaluation protocol was applied after a brief anamnesis performed to collect identifying data and to check the study criteria. The protocol consisted of otoscopy, pure-tone air-conduction threshold audiometry (from 250 Hz to 8 kHz), and the SRI applied with regional live voice and with recorded speech material from Paraná State, using two lists of 25 monosyllabic words (the same words were presented to all individuals).

The basic audiological evaluation followed the standards set by the Federal Council of Speech Therapy (CFFA, 2012); all speech therapists were instructed to follow the guidelines for test application. Pure-tone audiometry started in the right ear (intensity: 50 dB HL; frequency: 1,000 Hz), and the descending technique was adopted to determine the hearing threshold. For the SRI, the three-tone average of the thresholds at 500 Hz, 1,000 Hz and 2,000 Hz was calculated separately for each ear, and 40 dB HL was added to that value. The examination started in the right ear, which was first subjected to the live voice method and, subsequently, to the recorded CD method. The same word lists were used in both methods: one list for the right ear and the other for the left ear. The introductory sentence was used in both techniques. Speech therapists were instructed to apply the SRI uniformly in both ways. There was no interval between applications. All the professionals collecting data had already used this material (Ribas, 2009) in their clinical routine; thus, they were trained in and familiar with the handling and application of the instrument.
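The presentation level described above can be illustrated with a short calculation. This is a minimal sketch: the function name and the example thresholds are hypothetical, and it assumes that the 40 dB value is simply added to the three-tone average of each ear.

```python
from statistics import mean

SPEECH_FREQUENCIES_HZ = (500, 1000, 2000)  # frequencies averaged in the protocol
OFFSET_DB = 40  # value added to the three-tone average, as stated in the text


def sri_presentation_level(thresholds_db):
    """Return the SRI presentation level (dB HL) for one ear.

    thresholds_db maps a frequency in Hz to the pure-tone threshold in dB HL.
    """
    three_tone_average = mean(thresholds_db[f] for f in SPEECH_FREQUENCIES_HZ)
    return three_tone_average + OFFSET_DB


# Hypothetical thresholds for a normal-hearing right ear (not study data).
right_ear = {250: 10, 500: 10, 1000: 5, 2000: 15, 4000: 10, 8000: 15}
level = sri_presentation_level(right_ear)
print(level)  # 50
```

Each ear gets its own level, so the function would be called once per ear with that ear's thresholds.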

All the examiners who performed the live voice test were women; the recorded material also featured a female voice.

The test was performed in a sound booth (AD229-E and BETA6000 audiometer equipped with TDH 39 headphones), according to the standards set by the Federal Council of Speech Therapy. A CD containing standardized speech material developed for auditory perception evaluation was used in the SRI test conducted with recorded material. The lists of 25 monosyllabic words adopted here were based on Lists 1 and 2 by Russo and Santos (1993), apud Ribas (2009).[7]
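With lists of 25 words, each correctly repeated word is worth 4 percentage points, which is consistent with the individual scores of 88, 92, 96 and 100 reported in Table 1. The sketch below assumes that the SRI is the simple percentage of hits; the function name and the example responses are hypothetical.

```python
WORDS_PER_LIST = 25  # list length stated in the protocol


def sri_percent(responses):
    """Return the percentage of correctly repeated words.

    responses is a sequence of booleans, one per presented word.
    """
    if len(responses) != WORDS_PER_LIST:
        raise ValueError("expected one response per word in the list")
    return 100 * sum(responses) / WORDS_PER_LIST


# Hypothetical listener who misses 2 of the 25 words (not study data).
responses = [True] * 23 + [False] * 2
score = sri_percent(responses)
print(score)  # 92.0
```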

The statistical analysis used frequency tables, as well as descriptive and inferential statistics (Wilcoxon and Kruskal-Wallis ANOVA tests). The significance level adopted for the tests was 0.05 (5%).
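To make the paired comparison concrete, the sketch below implements one common variant of the Wilcoxon signed-rank test using only the Python standard library. The article does not state which software or variant was used, so the choices here are assumptions: zero differences are dropped, tied absolute differences share an average rank, and the p-value comes from a normal approximation without tie or continuity correction. The example scores are hypothetical.

```python
import math


def wilcoxon_signed_rank(x, y):
    """Two-sided Wilcoxon signed-rank test via the normal approximation.

    Assumptions of this sketch: zero differences are dropped, tied absolute
    differences share their average rank, and no tie or continuity
    correction is applied. Returns (T, p_value), where T is the smaller of
    the positive and negative rank sums.
    """
    diffs = [b - a for a, b in zip(x, y) if b != a]
    n = len(diffs)
    if n == 0:
        return 0.0, float("nan")  # test undefined when no pair differs

    # Assign average ranks to the absolute differences.
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1

    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    t = min(w_plus, w_minus)

    mean_t = n * (n + 1) / 4
    sd_t = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (t - mean_t) / sd_t
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided tail probability
    return t, p_value


# Hypothetical paired scores: 6 of 25 listeners improve from 96 to 100.
live = [100] * 19 + [96] * 6
recorded = [100] * 25
t, p_value = wilcoxon_signed_rank(live, recorded)
print(t, round(p_value, 4))  # 0 0.0277
```

A result below the 0.05 significance level, as in this hypothetical example, would be read as a significant difference between the two methods for that ear.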



Results

The sex-based sample distribution across the five collection counties showed that 73 individuals were women, most prevalent in Curitiba, whereas 52 were men, most prevalent in Rio de Janeiro.

The mean age of participants per county was 29.7 years in Salvador, 24.7 years in Rio de Janeiro, 42.1 years in Porto Alegre, 35.8 years in Florianópolis and 36.4 years in Curitiba. Rio de Janeiro thus had the youngest participants and Porto Alegre the oldest.

The comparison between the live voice and recorded CD methods for each ear in the Brazilian counties ([Table 1]) showed no change between methods in the right ear in Salvador, although the recorded test showed the best results in the left ear. Rio de Janeiro showed superior results in the recorded test in both ears. In Porto Alegre and Florianópolis, the right ear did not present a significant difference between methods, whereas the recorded CD test applied to the left ear showed improved results. In Curitiba, there was no significant difference between methods in either ear.

Table 1

Descriptive statistics of the Speech Recognition Index (%) by city

| City | Measure | n | Mean | Median | Minimum | Maximum | Standard deviation |
|---|---|---|---|---|---|---|---|
| Salvador | SRI RE live voice | 25 | 99.36 | 100.00 | 96.00 | 100.00 | 1.50 |
| Salvador | SRI LE live voice | 25 | 98.72 | 100.00 | 92.00 | 100.00 | 2.51 |
| Salvador | SRI RE recorded CD | 25 | 99.68 | 100.00 | 96.00 | 100.00 | 1.11 |
| Salvador | SRI LE recorded CD | 25 | 99.84 | 100.00 | 96.00 | 100.00 | 0.80 |
| Rio de Janeiro | SRI RE live voice | 25 | 97.60 | 100.00 | 92.00 | 100.00 | 2.83 |
| Rio de Janeiro | SRI LE live voice | 25 | 99.20 | 100.00 | 96.00 | 100.00 | 1.63 |
| Rio de Janeiro | SRI RE recorded CD | 25 | 98.88 | 100.00 | 96.00 | 100.00 | 1.83 |
| Rio de Janeiro | SRI LE recorded CD | 25 | 100.00 | 100.00 | 100.00 | 100.00 | 0.00 |
| Porto Alegre | SRI RE live voice | 25 | 99.68 | 100.00 | 96.00 | 100.00 | 1.11 |
| Porto Alegre | SRI LE live voice | 25 | 98.72 | 100.00 | 88.00 | 100.00 | 2.99 |
| Porto Alegre | SRI RE recorded CD | 25 | 99.68 | 100.00 | 96.00 | 100.00 | 1.11 |
| Porto Alegre | SRI LE recorded CD | 25 | 100.00 | 100.00 | 100.00 | 100.00 | 0.00 |
| Florianópolis | SRI RE live voice | 25 | 98.56 | 100.00 | 92.00 | 100.00 | 2.80 |
| Florianópolis | SRI LE live voice | 25 | 98.72 | 100.00 | 92.00 | 100.00 | 2.76 |
| Florianópolis | SRI RE recorded CD | 25 | 99.52 | 100.00 | 88.00 | 100.00 | 1.33 |
| Florianópolis | SRI LE recorded CD | 25 | 100.00 | 100.00 | 88.00 | 100.00 | 0.00 |
| Curitiba | SRI RE live voice | 25 | 98.72 | 100.00 | 92.00 | 100.00 | 2.23 |
| Curitiba | SRI LE live voice | 25 | 98.56 | 100.00 | 92.00 | 100.00 | 2.27 |
| Curitiba | SRI RE recorded CD | 25 | 97.92 | 100.00 | 88.00 | 100.00 | 3.49 |
| Curitiba | SRI LE recorded CD | 25 | 98.24 | 100.00 | 88.00 | 100.00 | 3.07 |

Abbreviations: CD, compact disc; LE, left ear; RE, right ear; SRI, speech recognition index.
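The columns of Table 1 can be computed from a vector of individual scores as sketched below. The scores are hypothetical, the function name is illustrative, and the use of the sample standard deviation (n - 1 denominator) is an assumption, since the article does not state which formula was applied.

```python
from statistics import mean, median, stdev


def describe(scores):
    """Summarize SRI scores (%) with the columns used in Table 1."""
    return {
        "n": len(scores),
        "mean": round(mean(scores), 2),
        "median": median(scores),
        "minimum": min(scores),
        "maximum": max(scores),
        # Sample standard deviation (n - 1 denominator); an assumption.
        "sd": round(stdev(scores), 2),
    }


# Hypothetical scores: 23 listeners at 100% and 2 listeners at 96%.
scores = [100] * 23 + [96] * 2
summary = describe(scores)
print(summary)
```

With these hypothetical scores the mean is 99.68 and the standard deviation rounds to 1.11, while the median, like every median in Table 1, stays at 100 because most listeners score at ceiling.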


Thus, there was a significant difference between the live voice and the recorded CD tests in both ears only in Rio de Janeiro, whereas Salvador, Porto Alegre and Florianópolis showed a significant difference in the left ear only. Wherever the difference was significant, the recorded CD method showed the better results ([Table 2]).

Table 2

Wilcoxon test comparing SRI live voice vs. SRI recorded CD by city

| City | SRI RE live voice vs. SRI RE recorded CD | SRI LE live voice vs. SRI LE recorded CD |
|---|---|---|
| Salvador | Not applicable | p = 0.0277* |
| Rio de Janeiro | p = 0.0180* | p = 0.0431* |
| Porto Alegre | p = 1.0000 | p = 0.0431* |
| Florianópolis | p = 0.0759 | p = 0.0277* |
| Curitiba | p = 0.2863 | p = 0.5002 |

Abbreviations: CD, compact disc; LE, left ear; RE, right ear; SRI, speech recognition index.
* Significant at the 0.05 level.


With respect to the frequency of better or worse responses with the recorded CD method relative to the live voice method in the five cities ([Table 3]), 7 (28%) out of 25 individuals in Salvador presented improved responses with the recorded method, and none worsened. Ten (40%) out of 25 individuals in Rio de Janeiro showed better results with the recorded method, and again none worsened. In Porto Alegre, 7 (28%) out of 25 individuals presented improved responses, whereas 2 (8%) presented worsened responses with the recorded method. Seven (28%) out of 25 individuals in Florianópolis presented better responses, whereas only 1 (4%) presented worsened responses with the recorded method. Finally, 3 (12%) out of 25 individuals in Curitiba showed better responses, whereas 6 (24%) presented worse results in the recorded CD-based SRI test.

Table 3

Frequency of better or worse responses with the recorded CD method

| City | n | Better frequency | Worse frequency |
|---|---|---|---|
| Salvador | 25 | 7 (28%) | 0 (0%) |
| Rio de Janeiro | 25 | 10 (40%) | 0 (0%) |
| Porto Alegre | 25 | 7 (28%) | 2 (8%) |
| Florianópolis | 25 | 7 (28%) | 1 (4%) |
| Curitiba | 25 | 3 (12%) | 6 (24%) |
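The counts in Table 3 can be understood as a per-listener comparison of the two methods. The sketch below assumes a single score per listener and method; the article does not state how the two ears were combined, so this simplification, the function names and the example data are all hypothetical.

```python
def compare_methods(live_scores, recorded_scores):
    """Count listeners whose recorded-CD score is better, worse or unchanged."""
    counts = {"better": 0, "worse": 0, "same": 0}
    for live, recorded in zip(live_scores, recorded_scores):
        if recorded > live:
            counts["better"] += 1
        elif recorded < live:
            counts["worse"] += 1
        else:
            counts["same"] += 1
    return counts


def as_percent(count, n):
    """Express a count as a percentage of n listeners."""
    return 100 * count / n


# Hypothetical paired scores for 25 listeners (not study data).
live = [96] * 7 + [100] * 18
recorded = [100] * 7 + [96] * 2 + [100] * 16
counts = compare_methods(live, recorded)
better_percent = as_percent(counts["better"], len(live))
worse_percent = as_percent(counts["worse"], len(live))
print(counts, better_percent, worse_percent)
```

In this hypothetical case 7 of 25 listeners (28%) improve and 2 (8%) worsen with the recorded method, which is how a row of Table 3 would be read.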



Discussion

Hearing depends on individuals' innate biological ability and on their experience in the environment in which they were born and grew up. To give meaning to the received acoustic signal, it is necessary to associate it with previously acquired information and experiences.[8]

Speech consists of low- and high-frequency sounds whose intensity changes continuously, which may interfere with individuals' communicative performance when audiological evaluation shows changes in their auditory thresholds. Most communicative situations involve speech intelligibility; thus, speech therapists must include speech tests in their routine to assess patients under conditions close to the ones they face in their daily lives.[9]

Measuring the speech recognition level is essential, since it makes it possible to gauge individuals' communication difficulties and to address the possible causes and manifestations of each case in a personalized way, helping to improve the quality of life and well-being of these individuals.[10]

Tests capable of simulating real listening situations are therefore an instrument to assess the difficulties faced by individuals with hearing deficit.[11]

As presented in the results, the age of participants in the present study ranged from 20 to 70 years. The age differences among the studied counties resulted from the difficulty of applying the standardized test in distinct locations under an inclusion criterion that required participants who considered themselves normal-hearing and who had no auditory complaints.

Similar to the current study, Angra and Miranda (2008) adopted the speech recognition test based on the masking technique to evaluate the SRI of individuals aged 34 to 80 years. Fortes et al (2012) assessed 43 individuals (31 women and 12 men) aged 9 to 82 years to compare the performance of patients in live voice- and recorded CD-based logoaudiometry tests. There were no exclusion criteria regarding sex and age in these studies, since the goal was to compare the methods (live voice and recorded voice) without taking into account the audiological profile of the participants.[2] [12]

The present research adopted the logoaudiometry lists by Russo and Santos (1993) published in the book by Ribas (2009), which also includes the CD with the recorded material. This material was also adopted in the studies by Franzoso (2010) and Chiesorin (2011), who evaluated the SRI test with both methods.[6] [7] [13] [14]

Several studies in the reviewed literature did not identify the method used in the speech test; thus, a comparative analysis with their results was not possible.[15] [16] [17] [18] For example, Costa-Guarisco, Fernandes and Sousa (2014) investigated the aspects of audiometric configuration that influence speech discrimination in cases of downward-sloping sensorineural hearing loss. They analyzed values recorded for individuals subjected to SRI and SRT, but did not describe the adopted methodology (live or recorded voice).[15]

Soares (2014) conducted a literature review and observed that recent studies investigating individuals' speech and its regional features often lacked a description of the data collection methodology. This may compromise the interpretation of results, since speech rate and the context in which spontaneous and controlled speech data are collected, among other factors, directly influence individuals' responses.[19]

It is therefore essential that published studies describe the logoaudiometric method adopted (live voice or recorded material), since both methods are available and the choice may influence the methodological context and bias the reproducibility of the studies.

Cunha and Silva (2011) conducted a study aimed at describing the regional variation in the intonation of statements by speakers from Brazilian capital cities. They applied the speech perception test to four individuals each from Recife, Rio de Janeiro and Florianópolis and found significant differences in pronunciation contours among the investigated cities. According to these authors, language is not a rigid system and, despite having a hard core, its phonetic implementation rules fit the sociocultural profile of its speakers. The intonational contours recorded in the investigated capital cities corroborate the malleability of the language.[20]

Cunha and Silvestre (2013) investigated the intonational behavior of assertive statements in the Portuguese spoken in Natal, Rio de Janeiro and Porto Alegre and found little influence of social factors such as age group, sex and schooling on the prosody of the studied individuals; however, location emerged as an important factor of variation. Thus, the behavior of the intonational curve should be carefully investigated to better understand the regional intonation profile of these capital cities.[21]

Battisti and Carvalho (2013) investigated socio-historical issues related to the profile of Porto Alegre residents to study the way these individuals produce tonic vowels and to compare differences in the pronunciation of each region. Their findings showed how these sociodemographic aspects influence the acoustic quality of sounds produced in spoken Brazilian Portuguese, which may lead to differences when a list of words is read live in cities of Southern Brazil in comparison with cities in the Northern region.[22]

Andrade et al. (2016) assessed the SRI of 19 individuals (13 men and 6 women, aged 16 to 59 years) with post-lingual, bilateral, symmetrical, mild-to-moderate sensorineural hearing loss, using monosyllables presented through live voice and recorded material, as well as recorded words accompanied by pictures. These authors found that 26% of the individuals subjected to the live voice monosyllable-based SRT test presented adequate results in both ears; however, none of the individuals subjected to the recorded material-based test presented adequate bilateral results, thus showing higher compatibility with the tonal audiometry.[23]

Based on their results, the authors pointed out that recorded stimuli should be used routinely, since they standardize the evaluation process, allow participants' performance to be compared at different moments, and reduce extrinsic redundancy and the influence of the evaluator on the final result.[23]

Vaucher et al. (2017) conducted a study aimed at validating new monosyllable lists for logoaudiometric evaluation and reinforced the need to standardize the logoaudiometric evaluation in order to control variables inherent to the presentation of spoken (live voice) words.[24]

Duque, Garcia-Moreno and Soria-Urios (2011) conducted a study using a list of Spanish words applied to post-lingual young people to validate previously recorded material, and emphasized the importance of using standardized material to ensure reliable results.[25]

In none of the cities where the SRI test was applied was the recorded material significantly worse than the live voice technique ([Tables 2] and [3]). This result allows the inference that the Paraná State material can be adopted throughout Brazil without affecting individual results.

As a remote clinical examination inserted in the basic audiological evaluation, the logoaudiometry test should be given more status because it provides essential data about the auditory analysis of the oral aspects of the speech. It is expected that logoaudiometry tests based on pre-standardized recorded material will gain space and become part of the clinical routine of audiologists to assure reliable results and to enable further comparative studies.

The use of protocols based on recorded materials in speech therapy studies allows the presentation of the investigated stimuli to be standardized, reducing the number of variables interfering with the findings. Current test application methods in speech therapy increasingly rely on standardized instruments, which shows audiologists' awareness of the value of such instruments.



Conclusion

In light of the data collected in the current study, and under the experimental conditions adopted here, it was possible to conclude that:

  1. In most of the studied cities, the recorded CD method improved the hit rate (%) in SRI responses in comparison with the live voice technique;

  2. Among the investigated cities, Rio de Janeiro presented the best hit rate with the recorded method;

  3. According to the test results, the influence of regional aspects on the recorded material was not significant, since the recorded material from Paraná State showed improved results when it was used in cities located in the Brazilian Southeastern region.



No conflict of interest has been declared by the author(s).

