Key words
bone age estimation - medical issues - Greulich and Pyle - hand MRI - imaging technique
Introduction
The determination of skeletal age has been an important aspect of diagnosis in endocrinology
and pediatrics for decades. It relies on the close connection between biological maturation
processes and skeletal development, which can be used to determine the biological age of children
and adolescents [1] [2]. Indications for determining skeletal age include the prediction of adult
body height in small and tall children as well as the diagnosis and follow-up of hormone treatments
in endocrine diseases and in adolescents with premature or delayed puberty (pubertas
praecox/tarda) [3] [4].
The method according to Greulich and Pyle (GP) is the most commonly used method for
determining bone age. In a survey published in 2016, 97 % of surveyed radiologists
used the GP atlas for bone age determination in the age group between 3 and 18 years
[5]. This atlas, which includes reference images for various age groups for girls and
boys, was published in 1959 [6]. According to the GP method, a conventional X-ray image of the left hand and
wrist is acquired and compared to the gender-specific reference images in the atlas.
The GP atlas with its reference images can still be used today to determine bone age
[7]. In 1962, Tanner and Whitehouse (TW) published another method for determining skeletal
age based on conventional X-rays. The revised versions (TW2 and TW3) are still used
today [8]. In the TW method, the maturity of various bones in the hand and wrist is categorized
according to stages. Using a point system, the skeletal age is then calculated from
the stages [9]. Studies comparing the GP method and the TW2 method concluded that the GP method
is more suitable for clinical practice due to the shorter application time [10].
In medical settings, follow-up examinations are often performed, for example during hormone
therapy. Even if the dose of an individual X-ray image is low, higher cumulative
doses can accrue over time. Because pediatric tissue is generally more sensitive
to radiation, ionizing radiation must be used restrictively [11]. Ultrasound and MRI have been examined in most studies as possible alternatives
that do not use ionizing radiation [12] [13] [14] [15]. Since there is no separate MRI atlas with reference images, the authors of these studies used the
GP or TW method.
The goal of the present prospective study is to examine whether MRI is a suitable
alternative to conventional X-ray images for determining bone age using the GP method
in medical issues. The bone ages determined with each method were compared. Moreover,
the goal of the study is to examine whether relevant advantages and disadvantages
of various MRI sequences can be identified and whether there are differences regarding
the time requirement for evaluation.
Materials and Methods
The study was approved by the local ethics committee (no. 351/16). The parents of
the patients were provided with information about the study both verbally and in writing
and gave their informed consent.
Patients
Fifty children and adolescents with growth and/or developmental disorders, e.g., as a result
of endocrine disease, were included in the study. Written informed consent from the
parents was required for participation. The exclusion criteria were prior surgery
or prior fractures of the hand or wrist, upper extremity implants, and general (relative)
contraindications to MRI, e.g., claustrophobia. A total of 19 female and 31 male participants with
an age range of 5.08 years to 17.50 years were examined. The mean chronological
age at the time of examination was 11.87 years. No child needed sedation to undergo
MRI examination. Prior to the examination, the children and adolescents as well as
their parents received a precise explanation of the examination procedure and any
questions were answered.
Imaging
A conventional X-ray image of the left hand was used to determine the bone age of
patients during diagnosis/follow-up. An MRI examination was also performed on the
same day ([Fig. 1], [2]).
Fig. 1 Images of a male subject aged 7 years and 9 months. Conventional X-ray, T1-weighted VIBE
and T1-weighted TSE. VIBE = Volumetric Interpolated Breath-hold Examination, TSE = Turbo-Spin-Echo.
Fig. 2 Images of a female subject aged 12 years and 11 months. Conventional X-ray, T1-weighted
VIBE and T1-weighted TSE. VIBE = Volumetric Interpolated Breath-hold Examination, TSE = Turbo-Spin-Echo.
The X-ray images of the left hand were acquired in a single plane in anteroposterior
projection (a.p.) on a digital X-ray device (Samsung Electronics GC 70, Samsung Healthcare,
Seoul, South Korea) (tube voltage 50 kV, tube current-time product 1 mAs). MRI examinations were
performed on a 3-Tesla scanner (Magnetom Skyra, Siemens Healthcare, Erlangen, Germany).
The examination was performed in a prone position with the arm extended. The left
hand was positioned in a 16-channel hand coil (hand/wrist 16, Siemens Healthcare,
Erlangen, Germany). One T1-weighted turbo-spin echo sequence (TSE) and one T1-weighted
volumetric interpolated breath-hold examination sequence (VIBE) were acquired. The
sequence parameters are shown in [Table 1].
Table 1
Sequence parameters for the MRI acquisition protocols.
| parameter | T1-VIBE | T1-TSE |
| matrix | 512 × 384 | 512 × 384 |
| voxel size | 0.4 × 0.4 × 0.9 mm | 0.4 × 0.4 × 2.0 mm |
| field of view (FOV) | 200 mm | 200 mm |
| slice thickness | 0.9 mm | 2.0 mm |
| repetition time (TR) | 14 ms | 450 ms |
| echo time (TE) | 5.94 ms | 13 ms |
| flip angle | 15° | 180° |
| fat saturation | spectral | none |
| acquisition time | 2:45 min | 3:48 min |
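The nominal in-plane resolution and the acquisition times in Table 1 can be cross-checked with a few lines of arithmetic. The following is a minimal sketch only: it assumes that the 200 mm field of view spans the 512-sample matrix axis, which the table does not state explicitly, and the function names are introduced here purely for illustration.

```python
# Values taken from Table 1. Mapping the 200 mm FOV onto the 512-sample
# matrix axis is an assumption made for this illustration.
FOV_MM = 200.0
MATRIX_READOUT = 512


def in_plane_resolution(fov_mm: float, matrix: int) -> float:
    """Return the nominal pixel edge length in millimetres."""
    return fov_mm / matrix


def to_seconds(mm_ss: str) -> int:
    """Convert an acquisition time written as 'm:ss' into seconds."""
    minutes, seconds = mm_ss.split(":")
    return int(minutes) * 60 + int(seconds)


pixel_mm = in_plane_resolution(FOV_MM, MATRIX_READOUT)
vibe_s = to_seconds("2:45")
tse_s = to_seconds("3:48")

print(f"nominal in-plane pixel size: {pixel_mm:.2f} mm")
print(f"T1 VIBE: {vibe_s} s, T1 TSE: {tse_s} s, both sequences: {vibe_s + tse_s} s")
```

Under this assumption the computed pixel size rounds to the 0.4 mm listed in the table, and the two sequences together account for roughly six and a half minutes of pure acquisition time.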
A coronal T1 VIBE sequence was acquired in all 50 children and adolescents, and a coronal
T1 TSE sequence was additionally acquired in 34 participants. The image material was
archived in a picture-archiving and communication system (PACS) (IMPAX EE R20, Agfa
Healthcare, Mortsel, Belgium) for evaluation.
Bone age determination
The method according to Greulich and Pyle was used to determine skeletal age [6]. The image material was compared to the corresponding male or female standardized
reference images from the GP atlas. The skeletal age of the children and adolescents
was determined in four regions of interest (ROIs): the distal forearm, the
carpals, the metacarpals, and the phalanges. The shape and size of the ossification
centers and the degree of ossification of the epiphyseal plates were analyzed. The
mean of the results from the four ROIs was defined as the calculated bone age. The
images shown in the GP atlas are conventional X-ray images of the left hand. There
is no corresponding atlas for MRI images. Therefore, the MRI images used in the study
were evaluated using the GP atlas based on the described ROI method. The carpograms
and MRI examinations were evaluated 2 weeks apart. The two MRI sequences
were likewise evaluated 2 weeks apart.
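The ROI-based calculation described above amounts to an unweighted mean of four atlas-based estimates. It can be sketched as follows; the function name and the example ages are hypothetical and serve only to illustrate the averaging step, not to reproduce study data.

```python
from statistics import mean

# The four regions of interest named in the text.
ROIS = ("distal forearm", "carpals", "metacarpals", "phalanges")


def bone_age_from_rois(roi_ages: dict[str, float]) -> float:
    """Average the GP atlas ages (in years) assigned to the four ROIs."""
    missing = [roi for roi in ROIS if roi not in roi_ages]
    if missing:
        raise ValueError(f"missing ROI estimates: {missing}")
    return mean(roi_ages[roi] for roi in ROIS)


# Hypothetical example values, not data from the study.
example = {
    "distal forearm": 11.0,
    "carpals": 10.0,
    "metacarpals": 11.0,
    "phalanges": 12.0,
}
calculated_bone_age = bone_age_from_rois(example)
print(f"calculated bone age: {calculated_bone_age:.2f} years")
```

Requiring all four regions before averaging mirrors the description in the text, in which the calculated bone age is always the mean of four ROI results.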
Statistics
The data were evaluated using IBM SPSS Statistics 24 (IBM Corporation, Armonk, NY).
The acquired data were checked for normal distribution using the Shapiro-Wilk test.
The correlations between the results of the various modalities were tested using Pearson
correlation coefficients. The interrater variability was calculated using Spearman
rank correlation coefficients. The intrarater variability was calculated using
Pearson correlation coefficients. The results were displayed as scatter
plots. A line of origin (line of identity, y = x) and an adjustment line (fitted regression line) were drawn. Bland-Altman plots were
used to compare the methods to one another and for graphic representation of the interrater
variability (Altman and Bland, 1983). The mean difference between the results and
the limits of agreement (LoA), defined as the mean difference ± 1.96 times the standard
deviation (SD) of the differences, were plotted as reference lines.
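The Bland-Altman quantities described above (mean difference, limits of agreement, and the share of values within the limits) can be sketched as follows. This is an illustration under stated assumptions rather than the SPSS procedure used in the study: the sample standard deviation (n - 1) is assumed, the function name is hypothetical, and the example values are invented.

```python
from statistics import mean, stdev


def bland_altman(method_a: list[float], method_b: list[float]) -> dict:
    """Return bias, SD, limits of agreement, and fraction of points inside them."""
    if len(method_a) != len(method_b) or len(method_a) < 2:
        raise ValueError("need two equally long series with at least two values")
    diffs = [a - b for a, b in zip(method_a, method_b)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample SD (n - 1); an assumption of this sketch
    lower, upper = bias - 1.96 * sd, bias + 1.96 * sd
    inside = sum(lower <= d <= upper for d in diffs)
    return {
        "bias": bias,
        "sd": sd,
        "loa": (lower, upper),
        "fraction_within": inside / len(diffs),
    }


# Hypothetical skeletal ages in years, not data from the study.
xray = [8.0, 10.5, 12.0, 13.5, 15.0]
mri = [8.5, 11.0, 12.5, 14.5, 15.0]

result = bland_altman(xray, mri)
print(f"mean difference: {result['bias']:.3f} years")
print(f"LoA: {result['loa'][0]:.3f} to {result['loa'][1]:.3f} years")
print(f"within LoA: {100 * result['fraction_within']:.0f} %")
```

A negative mean difference in this convention (X-ray minus MRI) indicates that MRI yields the older estimate on average.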
Results
The children and adolescents were divided into five age groups with an increment of
three years. Most participants (n = 41) were between the ages of 9.5 and 15.49 years.
[Table 2] shows the absolute and relative distribution.
Table 2
Distribution of children and adolescents by age group.
| age group | frequency | percentage | valid percentage | cumulative percentage |
| 1 | 2 | 4.0 | 4.0 | 4.0 |
| 2 | 4 | 8.0 | 8.0 | 12.0 |
| 3 | 22 | 44.0 | 44.0 | 56.0 |
| 4 | 19 | 38.0 | 38.0 | 94.0 |
| 5 | 3 | 6.0 | 6.0 | 100.0 |
| total | 50 | 100.0 | 100.0 | |
age groups: 1 = 4–6.49 years; 2 = 6.5–9.49 years; 3 = 9.5–12.49 years; 4 = 12.5–15.49 years; 5 = 15.5–18.5 years
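The grouping and the percentages in Table 2 can be reproduced with a short sketch. The frequencies are taken from the table; the binning function is a hypothetical helper, and it assumes that ages falling between two listed bounds (for example 6.495 years) belong to the lower group.

```python
from bisect import bisect_right

# Lower bounds of age groups 1-5 as listed below Table 2 (years).
LOWER_EDGES = [4.0, 6.5, 9.5, 12.5, 15.5]
UPPER_LIMIT = 18.5


def age_group(age_years: float) -> int:
    """Map a chronological age to its age group number (1-5)."""
    if not LOWER_EDGES[0] <= age_years <= UPPER_LIMIT:
        raise ValueError("age outside the range covered by the age groups")
    return bisect_right(LOWER_EDGES, age_years)


# Frequencies per age group from Table 2.
counts = {1: 2, 2: 4, 3: 22, 4: 19, 5: 3}
total = sum(counts.values())

rows = []
cumulative = 0
for group, n in counts.items():
    cumulative += n
    rows.append((group, n, 100 * n / total, 100 * cumulative / total))

for group, n, pct, cum_pct in rows:
    print(f"group {group}: n = {n:2d}, {pct:5.1f} %, cumulative {cum_pct:5.1f} %")
```

Groups 3 and 4 together contain 41 participants, which matches the number given in the text for the range from 9.5 to 15.49 years.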
Comparison of skeletal age determined using conventional carpogram versus MRI
Since no significant differences between the two observers were seen (see interrater
variability), the results of the two observers were averaged for the comparison of the skeletal age calculated using conventional
carpogram versus MRI. The Pearson correlation coefficient was 0.986
for T1 VIBE and 0.982 for T1 TSE ([Fig. 3A, B]). In addition, the average of the results from both sequences was correlated to
the average of the conventional carpograms. The Pearson correlation coefficient was
0.987 ([Fig. 3C]).
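For orientation, the Pearson correlation coefficient reported here can be computed as in the following minimal sketch. The function name and the example ages are hypothetical. The second call illustrates why a high correlation alone does not rule out a systematic offset between two methods, which is the reason the Bland-Altman analysis is reported in addition.

```python
from math import sqrt
from statistics import mean


def pearson_r(x: list[float], y: list[float]) -> float:
    """Pearson product-moment correlation of two equally long series."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equally long series with at least two values")
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / (sx * sy)


# Hypothetical skeletal ages in years, not data from the study.
xray = [8.0, 10.5, 12.0, 13.5, 15.0]
mri = [8.5, 11.0, 12.5, 14.5, 15.0]
r_methods = pearson_r(xray, mri)

# A constant offset of half a year leaves the correlation at 1.
shifted = [age + 0.5 for age in xray]
r_shifted = pearson_r(xray, shifted)

print(f"r (X-ray vs. MRI): {r_methods:.3f}")
print(f"r (X-ray vs. X-ray + 0.5 years): {r_shifted:.3f}")
```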
Fig. 3 Comparison of the average calculated skeletal age using conventional carpogram versus
MRI T1 VIBE A, conventional X-ray versus MRI T1 TSE B, and conventional carpogram versus average of MRI sequences C. The results were averaged for observers A and B, respectively. A line of origin
is shown in blue. For points above the line of origin, the calculated skeletal age
by MRI exceeds the calculated skeletal age by conventional X-ray. An adjustment line
(red) illustrates the trend of the values.
The average difference between the skeletal age determined using conventional carpogram
and the skeletal age calculated using T1 VIBE was 0.51 years with a standard deviation
of 0.492 years. On average, the skeletal age was estimated to be older using T1 VIBE
than conventional carpogram. The average difference between the skeletal age determined
using conventional carpogram and the skeletal age calculated using T1 TSE was 0.18
years with a standard deviation of 0.566 years. Younger children were estimated to
be older in the case of T1 TSE, and the skeletal age of older children was estimated
to be older when using conventional carpogram. The results for conventional carpogram
versus T1 VIBE and T1 TSE, respectively, are shown as a Bland-Altman plot ([Fig. 4A, B]). The results of the two observers were averaged for this purpose. In the case of
T1 VIBE, 95 % of values were within the LoA, which corresponds to a sufficiently symmetrical
distribution ([Fig. 4A]). In the case of T1 TSE, all values were within the LoA ([Fig. 4B]).
Fig. 4 Bland-Altman plot for method comparison of skeletal age determination using conventional
X-ray and MRI T1 VIBE A as well as conventional X-ray and T1 TSE B. The difference between the results obtained by the two methods is plotted against
the average of the results obtained by the two methods. Results averaged for both
observers. The mean difference between the results (mean = –0.5010 years and –0.0858
years, respectively) and the limits of agreement (LoA), plotted above and below the mean at
1.96 times the standard deviation (SD = 0.474 years and 0.497 years, respectively), are shown as reference lines.
95 % A and 100 % B of the values are within the LoA. Distribution of the data is sufficiently symmetrical
over all sections.
Interrater variability
The skeletal ages determined by both observers were compared using the Spearman rank
correlation coefficient. The interrater correlation was significant at the 0.01 level, with a Spearman's rho
of 0.985 for the 50 carpograms, 0.966 for T1 VIBE, and 0.971 for T1 TSE.
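The Spearman rank correlation used for the interrater comparison is the Pearson correlation of the rank-transformed values. A minimal sketch is given below; tied values receive the average of their ranks, which is the usual convention but an assumption here, and the function names and observer values are hypothetical.

```python
from math import sqrt
from statistics import mean


def average_ranks(values: list[float]) -> list[float]:
    """Assign 1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the current run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman_rho(x: list[float], y: list[float]) -> float:
    """Spearman rank correlation as the Pearson correlation of the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equally long series with at least two values")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sqrt(sum((a - mx) ** 2 for a in rx))
    sy = sqrt(sum((b - my) ** 2 for b in ry))
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / (sx * sy)


# Hypothetical skeletal ages (years) assigned by two observers.
observer_a = [8.0, 10.5, 12.0, 13.5, 15.0]
observer_b = [8.5, 10.0, 12.5, 13.0, 15.5]
rho = spearman_rho(observer_a, observer_b)
print(f"Spearman's rho: {rho:.3f}")
```

Because only the rank order enters the calculation, two observers who order all subjects identically reach a rho of 1 even if their absolute estimates differ.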
The interrater variability was shown with the help of a Bland-Altman plot ([Fig. 5]). The percentage of values within the LoA was 92 % for conventional carpograms ([Fig. 5A]), 98 % for T1 VIBE ([Fig. 5B]) and 91 % for T1 TSE ([Fig. 5C]). Sufficiently symmetrical distribution was seen in all plots.
Fig. 5 Bland-Altman plot to show interrater variability, conventional X-ray A, T1 VIBE B, and T1 TSE C. The difference between the skeletal ages calculated by observers A
and B is plotted against the average of the skeletal ages calculated by observers A and B. As reference
lines, the mean difference between the estimates of the two observers (mean = –0.311
years A, –0.101 years B, and –0.626 years C) and the limits of agreement (LoA) are plotted above and below the mean at 1.96 times the standard
deviation (SD = 0.729 years A, 0.527 years B, and 0.806 years C). 95 % of the values are within the LoA. Distribution of the data is sufficiently
symmetrical over all sections. Interrater correlation with a Spearman rank correlation
coefficient of 0.985 A, 0.966 B, and 0.971 C, significant at the 0.01 level.
Intrarater variability
After four weeks, each observer reevaluated ten examinations (conventional carpograms,
T1 VIBE, and T1 TSE). The datasets were processed in an anonymized and blinded manner.
The observers were blinded to the chronological age and the skeletal age calculated
in the first evaluation. Attention was paid to uniform distribution according to age
and gender. The observers assessed the children and adolescents in the second evaluation
on average to be 0.04 years (observer A) and 0.11 years (observer B) younger.
For observer A, there was a total intrarater Pearson correlation of 0.995. Categorized
by examination type, an intrarater correlation with a Pearson correlation coefficient
of 0.994 was seen for conventional carpograms, 0.995 for T1 VIBE, and 0.998 for T1 TSE.
For observer B, there was a total intrarater Pearson correlation of 0.988. Categorized
by examination type, an intrarater correlation with a Pearson correlation coefficient
of 0.994 was seen for conventional carpograms, 0.993 for T1 VIBE, and 0.994 for T1
TSE. All correlations are significant at the 0.01 level.
Time requirement
Examination time
The average examination time for a conventional carpogram including informed consent
discussion with the children/adolescents and their parents, positioning, image acquisition,
and post-processing was approximately 3 minutes. The MRI examination, also including
informed consent discussion, positioning, image acquisition, and post-processing,
took approximately 15 minutes.
Evaluation time
For both observers, the time needed to determine bone age for 15 evaluations (conventional
carpograms, T1 VIBE, and T1 TSE) was measured. The average time requirement for the
evaluation of a conventional carpogram was 147 seconds for observer A and 127 seconds
for observer B. The maximum time requirement was 191 seconds for observer A and 174
seconds for observer B. The average time requirement for the evaluation of an MRI
sequence was 205 seconds for observer A and 163 seconds for observer B. The maximum
time requirement was 252 seconds for observer A and 215 seconds for observer B. The
minimum time requirement was 162 seconds for observer A and 135 seconds for observer
B. Thus, the difference between the average time requirements is 58 seconds for observer
A and 36 seconds for observer B.
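The additional evaluation time for MRI follows directly from the reported averages. The short sketch below reproduces the differences and also expresses them relative to the carpogram evaluation time; the relative figures are derived here for illustration and are not reported in the study.

```python
# Average evaluation times in seconds, as reported in the text.
mean_eval_s = {
    "A": {"carpogram": 147, "mri": 205},
    "B": {"carpogram": 127, "mri": 163},
}

extra_s = {}
relative_increase = {}
for observer, times in mean_eval_s.items():
    # Absolute and relative additional time needed for one MRI sequence.
    extra_s[observer] = times["mri"] - times["carpogram"]
    relative_increase[observer] = 100 * extra_s[observer] / times["carpogram"]

for observer in mean_eval_s:
    print(
        f"observer {observer}: +{extra_s[observer]} s "
        f"({relative_increase[observer]:.1f} % longer than for a carpogram)"
    )
```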
Discussion
To avoid the use of X-ray and CT examinations to determine bone age, alternative methods
have been used in some studies. In this context, ultrasound and MRI of the medial clavicular
epiphysis as well as MRI of the knee joint were studied [16] [17] [18] [19] [20]. Nonetheless, MRI of the hand was used as the alternative method of bone age diagnosis
in most studies [21] [22] [23]. Hojreh et al. (patients n = 10, test subjects n = 50) and Urschler et al. (patients
n = 18) examined the extent to which MRI in direct comparison to conventional carpograms
can be used to determine bone age for medical issues [12] [24].
Correlation between conventional carpograms and MRI
In the present study, the comparison of MRI with conventional carpogram was analyzed
for the first time on the basis of a greater number of cases (n = 50). There was a
very good correlation regarding the determined bone age between conventional carpograms
and MRI (T1 TSE: 0.976; T1 VIBE: 0.975). On average, the age of the children and adolescents
was overestimated in the case of both MRI sequences compared to the conventional carpograms.
The average difference was slightly higher for T1 VIBE (0.51 years) than for T1 TSE
(0.18 years). All age groups were estimated to be older with T1 VIBE while older children
were estimated to be slightly younger with T1 TSE. The difference between the two
MRI sequences may be due to the difference in the appearance of bony structures. 95 %
of the values for T1 VIBE and 100 % of the values for T1 TSE were within the limits
of agreement.
In their study including 18 patients with growth disorders, Urschler et al. also achieved
a highly significant correlation (0.98) between the skeletal age calculated using
T1 VIBE and the one calculated using conventional carpograms. The skeletal age calculated
using T1 VIBE was less than the one calculated using conventional carpograms with
an average difference of -0.25 years [12]. In the study by Hojreh et al., bone age was estimated higher with T1 VIBE by one
examiner with an average difference of 0.175 years, with only minimal differences
being seen for a second examiner (0.05) [24].
As already discussed by Urschler et al., one possible explanation for the differences
in bone age determination between the two modalities could be the two-dimensional
representation particularly of the growth plates in conventional carpograms compared
to the non-overlapping three-dimensional representation in the case of MRI. The additional
MRI visualization of cartilaginous and soft-tissue structures could have a further
influence even if this was not directly taken into consideration in the GP method
[12].
Interrater and intrarater variability
The reproducibility of this good correlation was seen in the analysis of the interrater
variability. Very good interrater agreement was seen for carpograms (0.985) as well
as for T1 VIBE (0.966) and T1 TSE (0.971). The percentage of values within the limits
of agreement was 93 % for conventional carpograms, 98 % for T1 VIBE and 95 %
for T1 TSE. It should be noted that most studies used the conventional GP method with selection of the
most suitable standard, which yields more exact agreement between different
observers than the ROI method used in the present study. Nonetheless,
the present results can be considered as good as, or very good compared with, the
interrater correlations of other studies [21] [22] [24].
There were also very good correlations regarding intrarater variability both for observer
A (r = 0.995) and observer B (r = 0.988). There were also no relevant differences
in the categorization according to different methods. The results showed that bone
age determination using conventional carpogram as well as MRI can be reliably reproduced
in repeated evaluations and between different examiners.
Differences among sequences
The present study also examined whether advantages or disadvantages can be identified
in the evaluation using T1 VIBE and T1 TSE. According to the currently available studies,
T1 VIBE should be used even if the sequences have comparable results [25]. Urschler et al. acquired three MRI sequences in their subjects: T1-weighted 3D
VIBE, T1-weighted SE, and T2-weighted GRE. Due to the better visualization of the
epiphyseal structures, the authors chose to use T1 VIBE for the determination of skeletal
age [12]. Hojreh et al. acquired three different sequences (T1-weighted TIRM, T1-weighted
3D VIBE WE, and T1-weighted SE) and had them evaluated by two radiologists regarding
quality and suitability for skeletal age determination. The authors chose T1 VIBE,
with the better contrast of cartilaginous components compared to other
sequences being a particular advantage [24].
However, the benefit of additional information regarding the development of cartilaginous
structures in the determination of skeletal age using the atlas from Greulich and
Pyle is questionable. Cartilaginous structures cannot be visualized on the conventional
carpograms that serve as reference images in the atlas. As a result, the additional
information must be independently interpreted by the observers and theoretically integrated
into the course of development. In the long term, the introduction of an atlas based
on MRI sequences would address this limitation. In particular, in such an atlas, the
carpals could be observed starting with early cartilaginous development and integrated
into the evaluation [23].
Examination time
It takes significantly less time to perform a conventional carpogram than an MRI examination.
This fact must also be considered with regard to reimbursement. Restricting the
protocol to a single sequence and future developments with shorter acquisition
times could decrease the difference in examination time between the two modalities.
Evaluation time
According to Horter et al., the average time requirement for the evaluation of a conventional
X-ray image according to Greulich and Pyle is 46.7 seconds with a standard deviation
of 15.2 seconds [10]. On average, observer A needed 147 seconds to evaluate a conventional carpogram
and observer B required 127 seconds. However, the ROI method was used in the present
study, while Horter et al. used the most suitable reference image in the GP method.
This fact could explain the difference in evaluation time. With an average time requirement
of 205 seconds (observer A) and 163 seconds (observer B), the evaluation of an MRI
sequence takes more time than the evaluation of a conventional X-ray image. The time
required for evaluation varies between the observers. The difference regarding the
average time requirement is 58 seconds for observer A and 36 seconds for observer
B. The additional time needed for the evaluation of an MRI image can be explained
by the number of different slices.
Outlook
Systems with artificial intelligence (AI) were used in some studies to evaluate conventional
carpograms for determining bone age. These had good results compared to the established
methods according to GP and TW [26] [27] [28]. Initial studies have already addressed the use of AI systems for bone age determination
based on MRI of the left hand or the knee joint [29] [30] [31]. The reduction of MRI acquisition times was also already examined in initial studies
[32] [33], resulting in promising approaches for future studies.
Limitations
The present study is a monocentric study. A multicentric study with a larger number
of cases is needed to confirm the results. In addition, the ethnicity of the included
patients was not analyzed. This aspect should also be taken into consideration in
subsequent studies.
Conclusion
MRI is a reliable method for determining skeletal age without the use of ionizing
radiation. The results can be reproduced with a high degree of accuracy at different
points in time and by different observers. The GP atlas can be used to evaluate MRI
images. However, a new atlas based on MRI reference images should be developed in
order to take into account the additional information provided by the visualization
of cartilaginous structures on MRI. Both sequences (T1 VIBE and T1 TSE) yielded comparably
good results. The more definitive visualization of cartilaginous structures on T1
VIBE is advantageous particularly with regard to an MRI-based atlas.
- MRI is a reliable method for determining bone age.
- T1 VIBE and T1 TSE provide comparable results with slight advantages for T1 VIBE.
- The results can be reproduced with a high degree of accuracy at different points in
  time and by different observers.