DOI: 10.1055/a-2759-7466
Reply to Letter to the Editor: One Size Doesn't Fit All: Four Score in the Pediatric ICU
To the Editor,
I read the letter to the editor titled “One Size Doesn't Fit All: Four Score in the Pediatric ICU” regarding my article published in Neuropediatrics, “Comparison of FOUR score and GCS score for prediction of outcome in children with impaired consciousness.”
I want to clarify the concerns raised by Dr. Indar Kumar Sharawat, DM, Additional Professor and Chief, Pediatric Neurology Division, Department of Pediatrics, All India Institute of Medical Sciences, Rishikesh, Uttarakhand, India 249203.
Query: A primary concern lies in the application of the FOUR score across a broad pediatric age spectrum without prior validation or adaptation for younger children. The FOUR score was initially designed for adult populations and includes parameters such as brainstem reflexes and respiratory patterns that are age-dependent and less reliable in infants and toddlers. While the GCS has well-established pediatric modifications, no such validated adaptation of the FOUR score exists for children under 5 years of age. The authors included children aged 1 to 14 years but did not perform stratified analysis or discuss age-related validity, which raises concerns about content and construct validity. Previous studies have acknowledged the limited applicability of the FOUR score in preverbal children without age-specific calibration.
Reply: We included children aged 2 to 18 years; the age range of 1 to 14 years cited in the letter is incorrect. Most previous studies enrolled a similar age group and assessed the prognostic value of the FOUR score in children aged ≥2 years.[1] [2] [3] [4] [5]
Query: Furthermore, interrater reliability, which is critical in observational studies involving subjective scoring tools, was not assessed in this work. The GCS and FOUR scores are known to have variable interobserver agreement, particularly when performed by different cadres of clinicians (e.g., nurses, residents, intensivists). Czaikowski et al[6] demonstrated that interrater agreement for the FOUR score in pediatric patients can be excellent when scorers are adequately trained (weighted κ ≈ 0.95), but significantly lower when untrained or inconsistent scorers are involved. The current study notes that senior residents performed the assessments, but does not mention any standardization or agreement testing, introducing the possibility of measurement bias that could affect the results.
Reply: Interrater reliability did not need to be assessed because, in our study, both scores were measured solely by the principal investigator (a junior resident).
Query: We are also concerned by the use of the unmodified adult FOUR score in preverbal children, particularly as this group constitutes a significant portion of the sample. In this age group, certain components such as eye tracking or specific respiratory patterns may not be assessable or may not correlate with underlying brain injury severity in the same way they do in older children. A pediatric version or, at a minimum, a justification for unaltered application of the adult version would have improved confidence in the score's relevance. Prior literature has suggested the development of pediatric-adapted tools like the Pediatric FOUR (P-FOUR) score, which could have been a better choice than the original FOUR score for this study.[4]
Reply: A Pediatric FOUR (P-FOUR) score has not yet been developed. "Preverbal" refers to the stage of early human development before speech or articulate language emerges. The term is typically used for infants who have not yet acquired spoken language but can communicate in other ways, such as gestures, vocalizations, and facial expressions. All previous studies used the unmodified FOUR score in children.[1] [2] [3] [4] [5]
The reference cited for the P-FOUR score is wrong. Büyükcam et al[5] also used the same score in children aged 2 to 17 years for the prediction of outcome in children with head trauma.
Query: We also note that mortality was the sole outcome assessed, while functional outcomes among survivors were not explored. In pediatric critical care, survival alone is insufficient—recovery with meaningful neurologic function is a primary objective. Validated scales such as the Pediatric Cerebral Performance Category or Functional Status Scale are frequently used to evaluate neurologic outcome in survivors. Previous work by Fink et al[7] has demonstrated the relevance of such scales in PICU populations and emphasized their importance over crude binary outcomes like death or survival. Incorporating functional status would have provided a richer understanding of the scores' prognostic power.
Reply: In our study, we used the modified Rankin scale (MRS) to assess recovery of neurologic function, and we reported functional outcomes in the survivor group. The Pediatric Cerebral Performance Category, the Functional Status Scale, and the MRS are all frequently used to evaluate neurologic outcome in survivors.
Query: We also observed that the study lacks a priori sample size or power calculation, despite comparing the predictive accuracy of two tools via an ROC analysis. With only 78 children enrolled and a small number of outcome events, the study may be underpowered to detect significant differences in the area under the curve (AUC). This is especially concerning because the reported AUCs had overlapping confidence intervals, and no statistical comparison (e.g., DeLong's test) was provided to validate any superiority claims. Without such analysis, conclusions about the relative performance of FOUR versus GCS remain speculative.
Reply: Sample size was calculated using the following formula:

$$n = \frac{Z_{1-\alpha/2}^{2}\, P\,(1 - P)}{E^{2}}$$

where Z, standard normal variate
P, prevalence rate
E, allowable error
n, required minimum sample size
Here, $Z_{1-\alpha/2}$ = 1.96 at the 95% confidence level
P = 5.34%
E = 5%
Hence, n ≈ 77.6, rounded up to 78
Therefore, the sample size of our study was 78.
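The arithmetic behind this figure can be checked with a short sketch. It assumes the standard single-proportion formula n = Z²·P·(1 − P)/E², which is consistent with the symbols defined above and reproduces the reported value from the stated inputs. The function name `sample_size_single_proportion` is purely illustrative and is not taken from the original study.

```python
import math


def sample_size_single_proportion(p, e, z=1.96):
    """Minimum n for estimating a proportion.

    Assumed formula (not quoted verbatim from the article):
        n = z^2 * p * (1 - p) / e^2
    p: expected prevalence (as a fraction)
    e: allowable absolute error (as a fraction)
    z: standard normal variate for the chosen confidence level
    """
    return (z ** 2) * p * (1 - p) / (e ** 2)


# Inputs as stated in the reply: P = 5.34%, E = 5%, Z = 1.96
raw_n = sample_size_single_proportion(p=0.0534, e=0.05)

# Round up, since a sample cannot contain a fraction of a participant
required_n = math.ceil(raw_n)

print(f"raw n = {raw_n:.2f}, required n = {required_n}")
# prints: raw n = 77.67, required n = 78
```

Note that this calculation sizes a study for estimating a single proportion with a given precision; it is not a power calculation for comparing two AUCs.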
Query: Finally, score assessments were not standardized in terms of timing relative to sedation or resuscitation. The use of sedatives, paralytics, or recent seizure activity can transiently alter consciousness levels, potentially leading to score misclassification. Standard protocols—such as assessing after a sedation hold or at a uniform time postresuscitation—would enhance comparability and reliability. Previous studies have highlighted how even minor timing discrepancies can lead to significant score variation, particularly in pediatric populations.[3] [5]
Reply: The study excluded children with traumatic brain injury, spinal cord injury, or intellectual, motor, visual, or hearing impairment, as well as those who had a seizure in the preceding hour. Patients receiving neuromuscular blocking agents or under heavy sedation were also excluded.
Publication History
Article published online:
10 December 2025
© 2025. Thieme. All rights reserved.
Georg Thieme Verlag KG
Oswald-Hesse-Straße 50, 70469 Stuttgart, Germany
References
- 1 Almojuela A, Hasen M, Zeiler FA. The Full Outline of UnResponsiveness (FOUR) score and its use in outcome prediction: A scoping review of the pediatric literature. J Child Neurol 2019; 34 (04) 189-198
- 2 Mittal K, Kaushik JS, Dwivedi KD. Predictive value of full outline of unresponsiveness (FOUR) score and Glasgow Coma scale (GCS) in outcome of children aged 1–14 years admitted with altered sensorium. J Pediatr Crit Care 2020; 7: 14-21
- 3 Khajeh A, Fayyazi A, Miri-Aliabad G, Askari H, Noori N, Khajeh B. Comparison between the ability of Glasgow Coma scale and full outline of unresponsiveness score to predict the mortality and discharge rate of pediatric intensive care unit patients. Iran J Pediatr 2014; 24 (05) 603-608
- 4 Cohen J. Interrater reliability and predictive validity of the FOUR score coma scale in a pediatric population. J Neurosci Nurs 2009; 41 (05) 261-267, quiz 268-269
- 5 Büyükcam F, Eken C, Kartal M. et al. Comparison of FOUR score and GCS in predicting morbidity and mortality in children with head trauma. Eur J Emerg Med 2012; 19 (02) 98-102
- 6 Czaikowski BL, Cohen J, Reeder RW. et al. Pediatric FOUR score coma scale: interrater reliability and predictive validity. J Neurosci Nurs 2014; 46 (02) 79-87
- 7 Fink EL, Kochanek PM, Clark RSB. et al. Defining the functions of outcome scores in pediatric critical care: lessons from the Functional Status Scale. Crit Care Med 2010; 38 (04) 1470-1477
