J Wrist Surg 2025; 14(06): 500-508
DOI: 10.1055/a-2674-3914
Special Review: AI in Wrist Surgery

AI-driven Technologies for Wrist Fracture Prediction: A Narrative Review of Emerging Approaches

Authors

  • Stefania Briano

    1   Hand and Upper Limb Surgery Unit, IRCCS Ospedale Policlinico San Martino, Genoa, Italy
  • Maria Cesarina May

    2   Department of Integrated Surgical Diagnostic Sciences (DISC), Orthopedic Clinic, University of Genoa, Genoa, Italy
  • Giacomo Demontis

    1   Hand and Upper Limb Surgery Unit, IRCCS Ospedale Policlinico San Martino, Genoa, Italy
  • Giulia Pachera

    1   Hand and Upper Limb Surgery Unit, IRCCS Ospedale Policlinico San Martino, Genoa, Italy
  • Vittoria Mazzola

    2   Department of Integrated Surgical Diagnostic Sciences (DISC), Orthopedic Clinic, University of Genoa, Genoa, Italy
  • Federico Vitali

    2   Department of Integrated Surgical Diagnostic Sciences (DISC), Orthopedic Clinic, University of Genoa, Genoa, Italy
  • Alessandra Galuppi

    1   Hand and Upper Limb Surgery Unit, IRCCS Ospedale Policlinico San Martino, Genoa, Italy
  • Emanuela Dapelo

    1   Hand and Upper Limb Surgery Unit, IRCCS Ospedale Policlinico San Martino, Genoa, Italy
  • Andrea Zanirato

    2   Department of Integrated Surgical Diagnostic Sciences (DISC), Orthopedic Clinic, University of Genoa, Genoa, Italy
  • Matteo Formica

    2   Department of Integrated Surgical Diagnostic Sciences (DISC), Orthopedic Clinic, University of Genoa, Genoa, Italy

Funding None.
 

Abstract

Wrist fractures account for approximately 18% of all fractures and are especially common in older adults with osteoporosis and in younger patients following high-energy trauma. Predicting healing outcomes in these cases remains clinically challenging due to variability in fracture types, patient-specific factors, and treatment pathways. Although artificial intelligence (AI) systems have already demonstrated diagnostic accuracies exceeding 95% in detecting and classifying wrist fractures on radiographs, their use in prognostic modeling is still emerging.

This narrative review examines recent developments in AI-driven approaches aimed at improving clinical prognosis following wrist fractures. Advanced models—such as convolutional neural networks (CNNs), transformers, and hybrid architectures—can identify subtle imaging and clinical features associated with complications like malunion, delayed healing, or nonunion. The integration of multimodal data, including comorbidities, imaging, and even osteogenomic profiles, shows promise in enhancing risk stratification and guiding more personalized follow-up strategies.

Emerging technologies such as explainable AI, synthetic data generation, and federated learning offer potential solutions to challenges related to data availability, interpretability, and model generalization across care settings. Despite encouraging results, further validation in real-world clinical environments and standardization of outcome definitions are needed.

In summary, AI-based prognostic tools for wrist fractures could support orthopedic decision-making by identifying high-risk patients early, tailoring follow-up protocols, and improving long-term outcomes through more individualized care.


Artificial intelligence (AI) is undergoing rapid expansion, with increasingly significant applications in the medical field. This technological advancement offers promising tools to address the growing complexity and volume of clinical data.[1] Notably, AI has demonstrated considerable success in medical imaging, diagnostics, and the planning of personalized treatments, contributing to more efficient and accurate healthcare delivery.[2]

Among medical specialties, orthopedics has particularly benefited from AI integration, especially in fracture detection and classification.[3] The adoption of advanced techniques such as artificial neural networks (ANNs) and convolutional neural networks (CNNs) has enhanced diagnostic accuracy to levels comparable to expert radiologists, reducing errors, streamlining clinical workflows, and improving patient outcomes.[4] [5] [6] However, although diagnostic applications are now well established, the use of AI for predictive analytics—particularly for forecasting postoperative recovery trajectories—remains underutilized.

Wrist fractures, accounting for approximately 18% of all fractures, occur across all age groups but are especially common in older adults with osteoporosis and in younger individuals following high-impact trauma.[7] [8] The wide variability in fracture patterns, treatment options, and patient-specific factors makes prognosis challenging and often reliant on complex, subjective assessments by clinicians.[9] In this context, AI could provide valuable support by enhancing risk stratification, anticipating complications such as nonunion or delayed healing, and guiding personalized follow-up plans.[10] [11]

This narrative review explores recent advances in AI-driven prognostic modeling for wrist fractures, with a focus on tools that go beyond diagnosis to support outcome prediction and risk-based patient management. It critically assesses current capabilities, identifies key limitations, and outlines future directions, including real-world validation and interdisciplinary integration.

Ultimately, the goal is to support the transition from standardized orthopedic care to a personalized approach, where predictive analytics inform clinical decision-making, optimize follow-up strategies, and improve recovery pathways for patients across diverse healthcare settings.

Materials and Methods

Although this work is framed as a critical narrative review—which, by definition, does not require a formal systematic approach—a structured and transparent methodology was adopted to enhance scientific rigor and ensure reproducibility.

Search Strategy and Databases

A structured literature search was conducted across PubMed, Scopus, and Web of Science, covering the period from January 2015 to April 2025. The search strategy combined controlled vocabulary (e.g., MeSH terms) and free-text keywords using Boolean operators. The main search terms included “wrist fracture,” “distal radius fracture,” “artificial intelligence,” “machine learning,” “deep learning,” “neural networks,” “transformer models,” “radiomics,” “prognostic model,” and “risk prediction.” The search was restricted to peer-reviewed articles in English, focusing on human subjects and clinical applications of AI in orthopedics.


Inclusion and Exclusion Criteria

Studies were considered eligible if they presented original research or high-quality reviews investigating the application of AI in the detection, classification, prognosis, or risk stratification of wrist or musculoskeletal fractures. Only studies that reported quantitative performance metrics—such as accuracy, area under the curve (AUC), sensitivity, or specificity—were included. Particular attention was given to those employing advanced machine learning approaches, including CNNs, residual networks (ResNets), transformers, ensemble models, or multimodal frameworks integrating diverse clinical data. Articles were excluded if they were non-peer-reviewed (e.g., preprints, editorials, or conference abstracts), lacked sufficient methodological detail or performance evaluation, or were based on animal models or in vitro experiments.


Study Selection and Data Extraction

All records retrieved from the database search were imported into Rayyan QCRI, a platform designed to support systematic review workflows, enabling blinded and independent screening by two reviewers. The selection process involved an initial screening of titles and abstracts, followed by full-text assessment of potentially relevant studies. In addition, the reference lists of selected articles were manually reviewed to identify further pertinent publications. Discrepancies between reviewers were resolved through discussion and consensus.

For each included study, data were extracted on the type of AI model used, the input modalities (such as radiographic images, clinical variables, or genomic data), the predicted outcomes (e.g., fracture classification, healing trajectory, or complications), and the validation strategy adopted (including internal cross-validation, external datasets, or prospective cohorts). The results were synthesized narratively and thematically, highlighting key trends in model architecture, multimodal data integration, and explainable AI approaches.

A summary of the study selection process is presented in [Fig. 1].

Fig. 1 Literature selection flowchart, summarizing the literature selection process followed in this narrative review.


Clinical Applications and Impact on Orthopedic Practice

AI is transforming orthopedic workflows, particularly in the detection of fractures in emergency situations. Tools such as SmartUrgence, BoneView, and Rayvolve can identify wrist fractures on X-rays, speeding up diagnosis and reducing waiting times.[12] [13] AI can also reduce misdiagnosis, particularly for subtle or nondisplaced fractures, and improve radiologists' performance when used as a support tool.[4] [14] These AI solutions have already been CE-marked and integrated into the workflows of emergency departments in several European hospitals, where they perform triage support functions or act as second readers.[12] [13] [14]

AI has applications that extend beyond detection into personalized post-surgical management. By predicting the risk of complications, it enables targeted follow-up, ensuring that high-risk patients are monitored closely while unnecessary imaging for low-risk cases is reduced.[15] This optimizes the use of resources, minimizes radiation exposure, and facilitates shared decision-making.[16]

In order to transition AI-based prognostic models from research to clinical practice, their effectiveness in real-world settings must be validated through prospective studies. Several clinical centers have launched pilot programs testing AI-driven risk stratification tools for post-fracture monitoring, integrating them into real-time decision-making workflows. Once validated, these models could replace rigid postoperative protocols with more flexible, personalized approaches, thereby optimizing clinical outcomes and costs.[17]


AI Architectures for Prognosis: From Image Analysis to Comparative Evaluation

Early applications of deep learning in orthopedics have leveraged standard CNN architectures—most notably VGG, ResNet, and DenseNet—adapted from ImageNet-pretrained models for the classification of fractures in radiographs.[18] CNNs are particularly effective for analyzing medical images, as they automatically extract hierarchical features, from basic edges to complex anatomical structures, without manual feature engineering.[19] These architectures have been extensively evaluated in fracture classification tasks, each demonstrating specific advantages. VGG employs a simple, sequential structure; ResNet introduces residual connections to facilitate the training of deeper networks; and DenseNet promotes efficient feature reuse through dense connectivity. Although performance differences are often modest, DenseNet models frequently outperform others, particularly in small or heterogeneous datasets.[20] In the “DeepWrist” study, multiple CNNs achieved area-under-the-curve (AUC) scores exceeding 0.95 in internal validation.[21] [Fig. 2] illustrates the architectural differences and representative diagnostic outputs of VGG, ResNet, and DenseNet models when applied to wrist radiographs.

Fig. 2 Architectural overview of three convolutional neural network (CNN) models applied to wrist radiographs: VGG (sequential layers), ResNet (residual blocks with skip connections), and DenseNet (dense layer connections). Diagnostic outputs simulate common fracture classifications such as “no fracture” or “comminuted fracture.”
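The hierarchical feature extraction described above begins with simple learned filters such as edge detectors. The following minimal sketch, using an entirely synthetic 8×8 "radiograph" and a hand-written kernel (no trained network or real pixel data), illustrates how a single convolutional filter responds to a simulated fracture line:

```python
import numpy as np

def conv2d(image, kernel):
    """Valid 2D cross-correlation, as computed by a CNN's convolutional layer."""
    kh, kw = kernel.shape
    H, W = image.shape
    out = np.zeros((H - kh + 1, W - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# Toy "radiograph": uniform bone intensity with a dark horizontal
# fracture line in row 4 (values are illustrative, not real pixels).
xray = np.ones((8, 8))
xray[4, :] = 0.2

# A horizontal-edge kernel, the kind of low-level filter that early
# CNN layers learn automatically during training.
edge_kernel = np.array([[ 1.0,  1.0,  1.0],
                        [ 0.0,  0.0,  0.0],
                        [-1.0, -1.0, -1.0]])

response = conv2d(xray, edge_kernel)
# The filter responds most strongly around the intensity discontinuity
# (the "fracture") and stays silent over homogeneous bone.
strongest_row = np.unravel_index(np.argmax(np.abs(response)), response.shape)[0]
```

Deeper layers compose many such responses into increasingly abstract anatomical features, which is what the VGG, ResNet, and DenseNet variants above refine in different ways.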

More recent approaches have improved basic CNNs using ensemble models and architectural enhancements to boost sensitivity and robustness.[22] A multi-center study reported 97% accuracy using a hybrid DenseNet201-VGG16 model.[23] Object detection networks like YOLOv8 also enable real-time localization, even in complex cases such as pediatric wrist fractures.[24]

Despite their success, traditional CNNs are limited in their capacity to model global contextual relationships across an image. Transformer-based models, originally designed for natural language processing, overcome this limitation through self-attention mechanisms that capture long-range dependencies. This ability to integrate information across an entire image enhances diagnostic accuracy in complex or subtle cases.[25] Furthermore, multi-view deep learning models are now being employed to process and combine information from different projections—such as frontal and lateral X-rays—mirroring clinical reasoning by synthesizing complementary spatial cues.[18]

One example of transformer-based modeling in orthopedics is the Cross-View Deformable Transformer, which integrates multiple projections to detect non-displaced hip fractures using deformable self-attention.[26] Given the standard dual-view protocol for wrist radiographs, similar models could enhance 3D fracture characterization. Meanwhile, object detection networks like YOLO and Faster R-CNN also perform well in fracture localization; in pediatric wrist imaging, they match radiologists in accuracy, though false-positive rates vary across models.[27] [Fig. 3] provides a conceptual comparison between traditional CNNs and transformer-based models in the analysis of wrist radiographs.

Fig. 3 Comparison of local and global feature processing. (A) Convolutional neural networks (CNNs) extract visual features locally through small receptive fields. (B) Transformers apply global self-attention, linking spatially distant regions to integrate anatomical context and improve detection of subtle fractures.
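The self-attention mechanism underlying these transformer models can be sketched in a few lines. In this toy example the query/key/value projections are left as identities (a real model learns these weight matrices), and the patch embeddings are hand-constructed so that two spatially distant "patches" are similar, mimicking matching cortical margins at opposite corners of an image:

```python
import numpy as np

def self_attention(X):
    """Scaled dot-product self-attention over a sequence of patch embeddings.

    X: (n_patches, d) array. For simplicity the Q/K/V projections are
    identities; a real transformer learns them.
    """
    d = X.shape[1]
    Q, K, V = X, X, X
    scores = Q @ K.T / np.sqrt(d)                  # pairwise patch similarities
    scores -= scores.max(axis=1, keepdims=True)    # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=1, keepdims=True)  # softmax: each row sums to 1
    return weights @ V, weights

# Six image "patches" with 4-d embeddings; patches 0 and 5 are similar
# even though they are far apart in the image.
X = np.zeros((6, 4))
X[0] = [2, 0, 0, 0]                                # cortical-edge patch
X[5] = [2, 0, 0, 0]                                # similar patch, far away
X[1:5] = [[0, 1, 0, 0], [0, 0, 1, 0],
          [0, 0, 0, 1], [0, 1, 1, 0]]

out, attn = self_attention(X)
# Patch 0 attends strongly to the distant-but-similar patch 5 and only
# weakly to unrelated patches: the long-range dependency CNNs miss.
```

This global weighting is exactly the "contextual awareness" that distinguishes panel B of Fig. 3 from the local receptive fields in panel A.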

Multimodal and Multitask Models in the Orthopedic Context

The decision-making process for fractures is inherently multimodal, integrating radiographs, clinical data, demographics, and contextual information about the trauma. Combining these heterogeneous sources significantly improves the performance of artificial intelligence models.[28] Multimodal architectures process each type of data through dedicated neural networks, the outputs of which are then combined to make the final prediction. For instance, features extracted from images using CNNs can be combined with clinical information processed by a multilayer perceptron (MLP), thereby enhancing predictive accuracy.[29]

The PRAISE project (Predicting Fracture Outcomes from Clinical Registry Data using Artificial Intelligence Supplemented models for Evidence-Informed Treatment) is an example of multimodal modeling that combines radiographic images with unstructured data (e.g., surgical and radiological reports) to improve outcome prediction and risk assessment of complications in wrist fractures.[30] Additionally, multitask learning (MTL) is emerging as a promising strategy, in which a single model is trained to solve multiple related tasks simultaneously. In the context of wrist fractures, for instance, a multitask network can estimate the probability that a fracture will heal with or without complications while concurrently predicting the time to radiographic union. Rather than constructing distinct models for each outcome, MTL employs a shared feature extractor with separate prediction “heads” for each task, a strategy that reduces the risk of overfitting and enhances generalization.[31] [Fig. 4] illustrates a representative multitask and multimodal deep learning architecture applied to wrist fracture prognosis, combining radiographic and structured clinical data to simultaneously predict fracture class and revisit frequency.

Fig. 4 Multimodal and multitask deep learning model for wrist fracture prognosis. Radiographic images are processed via an encoder, and structured clinical data via a multilayer perceptron (MLP). Feature fusion is followed by a recurrent neural network (RNN) and fully connected (FC) layers to simultaneously predict fracture class and revisit frequency.
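As a concrete illustration of the fusion strategy sketched in Fig. 4, the following toy example combines a hypothetical CNN embedding with a small clinical branch and a single prediction head. All weights and inputs below are invented for illustration, not trained or clinical values:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical inputs: a 4-d embedding from an image encoder and a 3-d
# clinical vector (scaled age, osteoporosis flag, smoker flag).
img_features = np.array([0.8, 0.1, 0.5, 0.3])     # CNN-derived embedding
clinical = np.array([0.72, 1.0, 0.0])             # structured clinical data

# Dedicated branch for the clinical data: a one-layer MLP stand-in.
W_clin = np.array([[0.4, -0.2, 0.1],
                   [0.3,  0.5, 0.2]])
clin_features = np.tanh(W_clin @ clinical)        # 2-d clinical embedding

# Late fusion: concatenate both embeddings, then apply one linear
# prediction head with a sigmoid to obtain a complication-risk score.
fused = np.concatenate([img_features, clin_features])
w_head = np.array([0.5, -0.3, 0.2, 0.1, 0.6, 0.4])
risk = sigmoid(w_head @ fused)                    # probability in (0, 1)
```

In a real system both branches and the head would be trained end to end, so the fusion layer learns how much weight to give each modality per prediction.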

Finally, a growing frontier in multimodal AI is the integration of osteogenomics—the analysis of genetic factors affecting bone health—into predictive models.[32] Combining genomic data with imaging, clinical records, and biomechanics allows for a more accurate diagnosis and treatment.[33] For example, incorporating SNP profiles associated with fracture risk could facilitate personalized implant selection or pharmacological strategies. The PRECISE study exemplifies this precision-medicine approach by integrating diverse, patient-specific data to inform optimal timing and treatment decisions in orthopedic trauma cases.[34]


Comparative Evaluation of AI Models

AI models often match or outperform radiologists in fracture detection tasks, particularly when CNNs are used to assist in clinical decision-making. However, there is no universally superior architecture. CNNs excel at extracting localized features from images, while transformers offer greater contextual awareness through self-attention. Ultimately, model performance depends more on the quality of the training data, task design, and evaluation metrics than on the architecture of the model alone.[35] Traditional approaches such as logistic regression and decision trees have been used with some success in predicting outcomes,[36] [37] [38] but ensemble learning techniques tend to offer superior accuracy and robustness in internal and external validations.

Among ensemble methods, random forests build multiple independent decision trees on random subsets of data, thereby improving generalizability and reducing the risk of overfitting.[39] In contrast, gradient boosting builds trees sequentially, with each tree correcting the errors of its predecessor. This allows the model to capture complex nonlinear interactions between variables, such as a patient's age or comorbidities.[40] In a comparative study of postoperative complications after distal radius fracture fixation, gradient boosting outperformed both logistic regression and random forests, demonstrating its potential as a clinically useful tool for risk stratification.[15]
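The sequential error-correction idea behind gradient boosting can be demonstrated from first principles. The sketch below repeatedly fits a regression stump to the current residuals, adding each stump with a shrinkage factor; the age-versus-risk data are entirely synthetic and chosen only to contain a nonlinear jump that a linear model would miss:

```python
import numpy as np

def fit_stump(x, residual):
    """Best single-split regression stump on one feature (exhaustive search)."""
    best = None
    for t in np.unique(x):
        left, right = residual[x <= t], residual[x > t]
        if len(left) == 0 or len(right) == 0:
            continue
        pred = np.where(x <= t, left.mean(), right.mean())
        err = np.mean((residual - pred) ** 2)
        if best is None or err < best[0]:
            best = (err, t, left.mean(), right.mean())
    return best[1:]  # threshold, left value, right value

# Toy cohort: "age" vs a synthetic complication-risk score with a
# nonlinear jump around age 65 (illustrative, not real patient data).
age = np.array([25, 30, 40, 50, 60, 65, 70, 75, 80, 85], dtype=float)
risk = np.where(age >= 65, 0.8, 0.2) + 0.05 * np.sin(age)

pred = np.zeros_like(risk)
lr = 0.5  # shrinkage: each tree contributes only a fraction of its fit
for _ in range(20):
    resid = risk - pred                 # errors of the current ensemble
    t, lv, rv = fit_stump(age, resid)   # next tree targets those errors
    pred += lr * np.where(age <= t, lv, rv)

mse = np.mean((risk - pred) ** 2)       # falls sharply as trees accumulate
```

Libraries such as XGBoost add regularization, subsampling, and second-order gradients to this same core loop, which is what drives their strong performance in the tabular clinical settings cited above.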

Recent advances have focused on multimodal models that integrate images and structured clinical data in order to improve predictive accuracy and generalizability.[41] [42] [43] [44] [45] Theoretical and empirical work supports the superiority of these models over unimodal approaches, since they draw on a more comprehensive patient profile. For instance, imaging features derived from CNNs can be combined with clinical variables processed through fully connected neural networks to improve outcome prediction. However, the success of multimodal AI depends heavily on how data sources are fused; naive or poor integration strategies can reduce performance, particularly when incorporating noisy or incomplete variables. A study comparing models using only imaging data and models using only clinical data demonstrated that imaging inputs are often more predictive in certain fracture-related tasks.[46] Nevertheless, early implementations of multimodal systems, such as real-time pipelines integrating X-rays with electronic health record (EHR) data in emergency settings, have demonstrated improved triage efficiency and decision support in orthopedic consultations.

[Table 1] summarizes the most common types of AI models for fracture prediction, detailing their inputs, strengths, and limitations in clinical and imaging contexts.

Table 1

Summary of common input data types, key strengths, and principal limitations of each model architecture, including both classical machine learning approaches (e.g., logistic regression, random forest, gradient boosting) and modern deep learning frameworks (e.g., convolutional neural networks, transformers, multitask learning)

| Model | Input | Advantages | Limitations | Technical notes |
|---|---|---|---|---|
| Convolutional neural network (CNN) | Imaging | Excellent for medical image analysis; enhances sensitivity and specificity in fracture detection | Requires large, annotated datasets; may be less effective in capturing long-range spatial dependencies | U-Net-based architectures with heatmap output for fracture localization |
| Logistic regression | Clinical data | High interpretability; easy to implement; useful in identifying significant risk factors | Limited in modeling nonlinear relationships; reduced performance with high-dimensional or sparse data | Linear model with L2 regularization and sigmoid activation |
| Random forest | Clinical data | Handles missing data and categorical variables effectively; reduces overfitting compared to single decision trees | Can become complex with many trees; requires more computational resources | Ensemble of decision trees built on random data subsets using bagging |
| Gradient boosting (e.g., XGBoost) | Clinical data | Excellent at capturing complex variable interactions; provides feature importance; high predictive accuracy | Sensitive to hyperparameters; risk of overfitting if not properly regularized | Builds trees sequentially, correcting previous errors; uses shrinkage and regularization |
| Multitask learning (MTL) | Imaging + clinical data | Shares representations across related tasks (e.g., fracture detection and risk stratification); improves generalization | Challenging to implement; requires balanced task weighting and well-annotated multimodal datasets | Weight-sharing architectures; uses joint loss functions for simultaneous task optimization |

MTL often improves model performance by leveraging shared representations.[47] [48] [49] However, unrelated or noisy tasks can degrade accuracy, making careful task selection and weighting critical. As Lin et al have noted, only closely related tasks should be included to maximize the benefits of MTL in orthopedic prediction.[37] Preliminary MTL frameworks that predict both healing time and follow-up needs in post-fracture care have been tested in retrospective cohorts, and early results suggest greater clinical relevance compared to single-output networks.
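The joint loss function that makes MTL work reduces to a weighted sum of per-task losses computed over a shared representation. The sketch below (illustrative weights and labels, not a trained model) pairs a classification head for complications with a regression head for weeks to radiographic union, the two related tasks discussed above:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Shared feature extractor (one illustrative linear layer) feeding two
# task-specific heads. All numbers are made up for demonstration.
x = np.array([0.6, 0.3, 0.9, 0.1])          # fused patient features
W_shared = np.array([[ 0.2,  0.5, -0.1, 0.3],
                     [ 0.4, -0.2,  0.6, 0.1],
                     [-0.3,  0.1,  0.2, 0.5]])
h = np.tanh(W_shared @ x)                   # shared representation

w_cls = np.array([0.7, -0.4, 0.5])          # complication-risk head
w_reg = np.array([3.0,  2.0, 4.0])          # time-to-union head (weeks)

p_complication = sigmoid(w_cls @ h)
weeks_to_union = 6.0 + w_reg @ h            # offset around a typical value

# Joint objective: weighted sum of a cross-entropy and a squared-error
# term; gradients from both tasks update the shared extractor.
y_cls, y_reg = 1.0, 9.0                     # example labels
loss_cls = -(y_cls * np.log(p_complication)
             + (1 - y_cls) * np.log(1 - p_complication))
loss_reg = (weeks_to_union - y_reg) ** 2
alpha = 0.6                                 # task weighting
joint_loss = alpha * loss_cls + (1 - alpha) * loss_reg
```

Choosing alpha poorly, or adding an unrelated task, shifts the shared gradients away from both objectives, which is precisely the task-selection caveat raised by Lin et al.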


Challenges and Limitations in AI-driven Fracture Healing Prediction

Despite notable advances, there are still several challenges hindering the clinical translation of AI models in fracture healing. A major limitation is the scarcity and poor quality of data. Deep learning requires large, well-annotated datasets; however, complications such as nonunion or hardware failure are relatively rare, which introduces class imbalance and skews predictions toward uncomplicated healing. For instance, a study of distal radius surgery found that only 21.5% of patients experienced short-term complications, which limits the generalizability of the results to other populations.[15] Furthermore, outcome labels frequently lack standardization and can vary in meaning between centers, particularly when long-term functional recovery, such as mobility or pain, is more difficult to quantify than radiographic healing. To address this issue, ongoing initiatives such as the PRAISE registry aim to collect longitudinal, multimodal data to support real-world model training and benchmarking.[30]

To compensate for sparse and imbalanced datasets, the generation of synthetic data has emerged as a powerful complementary strategy. Techniques such as generative adversarial networks (GANs) can simulate realistic musculoskeletal images with labeled fractures, while synthetic patient profiles help to capture variability across underrepresented demographics.[50] These artificially constructed datasets enhance model generalizability, particularly when access to data is restricted due to privacy concerns.[51] [52] In orthopedics, synthetic data have been used to simulate rare fracture types, surgical outcomes, and diverse patient trajectories. Several developers have begun incorporating such synthetic cases into hospital validation environments, enabling robust stress-testing of AI systems under challenging clinical conditions.[53]
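As a lightweight stand-in for the GAN-based generators discussed above, even simple jitter-based oversampling illustrates how synthetic cases can rebalance a skewed cohort. All data below are simulated; the two features stand for arbitrary scaled predictors, not a real registry:

```python
import numpy as np

rng = np.random.default_rng(42)

# Imbalanced toy cohort: 100 uncomplicated healings vs 20 nonunions,
# each described by two features (e.g., scaled age and fracture gap).
majority = rng.normal(loc=[0.0, 0.0], scale=0.5, size=(100, 2))
minority = rng.normal(loc=[1.5, 1.2], scale=0.5, size=(20, 2))

# Jitter-based synthesis: resample minority cases and perturb them
# slightly; a GAN would instead learn the minority distribution and
# sample genuinely novel cases from it.
idx = rng.integers(0, len(minority), size=len(majority) - len(minority))
synthetic = minority[idx] + rng.normal(scale=0.1, size=(len(idx), 2))

X = np.vstack([majority, minority, synthetic])
y = np.concatenate([np.zeros(100), np.ones(20), np.ones(len(synthetic))])
# The rebalanced training set now contains equal class counts, so the
# model is no longer biased toward predicting uncomplicated healing.
```

The same principle applies to synthetic images: balancing rare complication classes before training counteracts the skew toward uncomplicated healing described above.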

Beyond data limitations, a lack of interpretability remains a critical barrier to building clinical trust. Clinicians are more likely to adopt AI tools if they can understand the rationale behind the models' outputs. However, deep learning models often operate as “black boxes,” which makes it difficult to explain individual risk predictions. Explainable AI (XAI) tools, such as Grad-CAM heatmaps, can highlight the image regions most responsible for prediction. For example, they can show fracture fragments associated with a higher risk of complications.[54] These visual cues build trust and aid real-time decision-making. Without such transparency, even accurate predictions may be met with skepticism. There are also ethical and legal risks arising from false positives or negatives, which underscores the need for clear clinical guidelines and responsible implementation pathways.[55] Early pilot studies have begun embedding XAI visualizations into picture archiving and communication system (PACS) workstations, enabling clinicians to review explanatory overlays during radiological assessments.
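The Grad-CAM computation mentioned above is itself compact: each convolutional feature map is weighted by its spatially pooled gradient, the weighted maps are summed, and a ReLU keeps only regions that contribute positively to the prediction. A minimal sketch on toy activations (no trained network; the maps and gradients are hand-set for illustration):

```python
import numpy as np

def grad_cam(feature_maps, gradients):
    """Grad-CAM: weight each map by its pooled gradient, sum, then ReLU.

    feature_maps: (K, H, W) activations from the last conv layer.
    gradients:    (K, H, W) gradients of the risk score w.r.t. those maps.
    """
    weights = gradients.mean(axis=(1, 2))              # global average pooling
    cam = np.einsum('k,khw->hw', weights, feature_maps)
    return np.maximum(cam, 0)                          # keep positive evidence

# Toy example: 3 feature maps on a 4x4 grid. Map 0 activates on a
# hypothetical fragment region and carries a positive pooled gradient,
# so the heatmap should light up there and stay dark elsewhere.
maps = np.zeros((3, 4, 4))
maps[0, 1:3, 1:3] = 1.0                 # "fragment" activation
grads = np.zeros((3, 4, 4))
grads[0] = 0.5                          # this map drives the risk score

heatmap = grad_cam(maps, grads)
```

Upsampled and overlaid on the radiograph, such a heatmap is the explanatory overlay that the PACS pilots described above present to clinicians.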

Finally, technical and regulatory hurdles complicate the integration of AI into daily practice. Predictive models must interface seamlessly with hospital systems, such as PACS and EHRs, to automatically retrieve relevant data and provide clear, actionable outputs, e.g., “85% malunion risk.” This requires infrastructure upgrades, workflow redesign, and clinician training.[56] Furthermore, regulatory approval is essential before any AI system can be used to guide care.[57] [58] Even the most accurate models may be subject to slow adoption due to these barriers. However, some tools are progressing: BoneView, for example, has received CE certification and is being piloted in European hospitals. Clinical studies are also underway to test the effectiveness of incorporating risk calculators into radiology workflows for real-time fracture management.[59]

These three interrelated barriers—data availability, interpretability, and clinical integration—are summarized in [Fig. 5], which provides a visual overview of the main technical obstacles to implementation alongside the most promising current strategies.

Fig. 5 Visual summary of the key technical barriers to clinical adoption of AI-driven fracture healing prognostic models. The infographic highlights three domains—data scarcity and quality, interpretability and trust, and integration and regulation—each accompanied by typical challenges and current solution strategies.

Conclusion

AI is set to transform orthopedic care, shifting the focus from reactive treatment to proactive, personalized management. Besides detecting fractures, AI can now predict healing trajectories, identify high-risk patients early, and recommend personalized follow-up plans in real time. For instance, a patient identified by an AI model as being at high risk of malunion following distal radius fixation could undergo more intensive monitoring or begin rehabilitation earlier, while those at low risk could safely avoid unnecessary diagnostic imaging, thereby reducing clinical burden and costs.

In a hospital setting, AI tools integrated with PACS and EHRs can highlight complications directly on postoperative images, supporting rapid decision-making during rounds or multidisciplinary meetings. In occupational medicine, such systems can stratify injured workers based on their potential for recovery, thereby guiding return-to-work planning and resource allocation.

Several pilot studies and validation activities are already underway to support the clinical translation of these technologies. The PRAISE registry is collecting multimodal and longitudinal data to train and compare prognostic models, while PACS platforms integrated with XAI are being tested in radiology departments to improve model interpretability and user confidence. Additionally, AI-based risk stratification tools are being tested in real-world orthopedic workflows at several centers in Europe and North America, offering a clear route from research to routine care.

Ultimately, AI will not replace clinical judgment but enhance it. The prospect of every fracture benefiting from both surgical experience and predictive analysis is becoming increasingly tangible. Through continued interdisciplinary collaboration, rigorous prospective validation, and responsible implementation, AI has the potential to ensure safer, more efficient, and more outcome-oriented fracture management.



Conflict of Interest

None declared.

Authors' Contributions

All listed authors meet the ICMJE criteria for authorship. Each author contributed substantially to the conception, research, drafting, and final review of the manuscript. No writing assistance was used.


Ethical Approval

This is a narrative review article and does not involve original research with human or animal subjects. The authors have adhered to the ethical standards set by the Committee on Publication Ethics (COPE) and the International Committee of Medical Journal Editors (ICMJE). All sources have been appropriately cited to ensure academic integrity and transparency. This manuscript is original, has not been published previously, and is not under consideration for publication elsewhere.



Address for correspondence

Maria Cesarina May, MD
Orthopedic Clinic, University of Genoa, IRCCS Ospedale Policlinico San Martino
Largo Rosanna Benzi, 10, 16132 Genoa
Italy   

Publication History

Received: 20 May 2025

Accepted: 31 July 2025

Article published online:
20 August 2025

© 2025. Thieme. All rights reserved.

Thieme Medical Publishers, Inc.
333 Seventh Avenue, 18th Floor, New York, NY 10001, USA

