J Neurol Surg B Skull Base 2022; 83(S 01): S1-S270
DOI: 10.1055/s-0042-1743643
Presentation Abstracts
Podium Abstracts

Automatic Assessment of Surgical Performance Using Intraoperative Video and Deep Learning: A Comparison with Expert Surgeon Video Review

Dhiraj J. Pangal (1), Guillaume Kugener (1), Yichao Zhu (2), Aditya Sinha (3), Vyom Unadkat (2), David J. Cote (1), Arman Roshannai (1), Ben Strickland (1), Martin Rutkowski (4), Andrew Hung (5), Animashree Anandkumar (6), X. Y. Han (7), Vardan Papyan (8), Bozena Wrobel (9), Gabriel Zada (1), Daniel A. Donoho (10)

1   Department of Neurosurgery, University of Southern California, Los Angeles, California, United States
2   Viterbi School of Engineering, University of Southern California, California, United States
3   Google, Inc.
4   Medical College of Georgia, Georgia, United States
5   Department of Urology, University of Southern California, Los Angeles, California, United States
6   California Institute of Technology, California, United States
7   Operations Research and Information Engineering, Cornell University, Ithaca, New York, United States
8   Department of Mathematics, University of Toronto, Toronto, Canada
9   Department of Head and Neck Surgery, University of Southern California, Los Angeles, California, United States
10   Division of Neurosurgery, Children's National Medical Center, Washington, Dist. of Columbia, United States

    Introduction: Intraoperative video contains sufficient information for experts to assess surgeon skill and provide feedback, but expert evaluation is rarely available. Deep neural networks (DNNs) capable of interpreting video can provide rich feedback but often require large amounts of training data. The ability of a DNN to predict the outcome of an attempt at surgical hemostasis and to accurately quantify blood loss has never been compared with that of expert skull base surgeons.

    Methods: Simulated Outcomes following Carotid Artery Laceration (SOCAL) is a publicly available video dataset with 154 videos of surgeons managing a simulated carotid artery injury during endoscopic endonasal surgery. A model was developed that first extracted features from each video frame individually using a DNN; these features were then passed in sequence to a long short-term memory (LSTM) recurrent neural network. The model was trained on 134 videos (1 minute in length) and tested on 20 videos to predict two outcomes: participant hemorrhage control ability (dichotomous success/failure) and cumulative blood loss (mL). Four skull base neurosurgeons with endoscopic expertise viewed the 20 test videos and predicted hemorrhage control ability, blood loss, and overall technical skill (Likert scale).
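The two-stage design described in the Methods (per-frame features passed in sequence to an LSTM) can be illustrated with a minimal, untrained sketch in plain Python. Everything here is an assumption made for illustration and not the authors' implementation: the class name TinyLSTM, the feature and hidden sizes, the random weights, and the use of two separate output heads (one for success probability, one for blood loss). Random vectors stand in for the features that a frame-level DNN would produce.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def linear(weights, bias, vec):
    """Dense layer: one output value per weight row."""
    return [sum(w * v for w, v in zip(row, vec)) + b
            for row, b in zip(weights, bias)]


def random_matrix(rows, cols, rng):
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


class TinyLSTM:
    """Hypothetical single-layer LSTM with two output heads (untrained)."""

    def __init__(self, input_size, hidden_size, seed=0):
        rng = random.Random(seed)
        self.hidden_size = hidden_size
        # One (weights, bias) pair per gate, applied to [x_t, h_{t-1}].
        self.gates = {
            name: (random_matrix(hidden_size, input_size + hidden_size, rng),
                   [0.0] * hidden_size)
            for name in ("input", "forget", "cell", "output")
        }
        # Assumed heads: success/failure classification and blood loss regression.
        self.outcome_head = (random_matrix(1, hidden_size, rng), [0.0])
        self.blood_head = (random_matrix(1, hidden_size, rng), [0.0])

    def step(self, x, h, c):
        """Standard LSTM update for one frame's feature vector."""
        xh = x + h  # list concatenation of input and previous hidden state
        i = [sigmoid(z) for z in linear(*self.gates["input"], xh)]
        f = [sigmoid(z) for z in linear(*self.gates["forget"], xh)]
        g = [math.tanh(z) for z in linear(*self.gates["cell"], xh)]
        o = [sigmoid(z) for z in linear(*self.gates["output"], xh)]
        c = [fj * cj + ij * gj for fj, cj, ij, gj in zip(f, c, i, g)]
        h = [oj * math.tanh(cj) for oj, cj in zip(o, c)]
        return h, c

    def predict(self, frame_features):
        """Consume frame features in order; read both heads off the last state."""
        h = [0.0] * self.hidden_size
        c = [0.0] * self.hidden_size
        for x in frame_features:
            h, c = self.step(x, h, c)
        p_success = sigmoid(linear(*self.outcome_head, h)[0])
        blood_loss_ml = linear(*self.blood_head, h)[0]
        return p_success, blood_loss_ml


# Stand-in for DNN frame features: 30 frames, 8 values each (made-up sizes).
feature_rng = random.Random(1)
video_features = [[feature_rng.random() for _ in range(8)] for _ in range(30)]
model = TinyLSTM(input_size=8, hidden_size=4)
p_success, blood_loss_ml = model.predict(video_features)
```

Because the weights are random, the outputs carry no meaning; the sketch only shows how a sequence of frame-level features is reduced to one dichotomous prediction and one continuous prediction per video.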

    Results: Surgeons correctly predicted the outcome of the hemorrhage control attempt in 14 of 20 trials (sensitivity = 79%, specificity = 56%, positive predictive value [PPV] = 69%, and negative predictive value [NPV] = 71%). The interrater reliability for predicting outcome between surgeons was 0.95. After training, the model correctly predicted the outcome of the same videos in 17 of 20 trials (sensitivity = 100%, specificity = 66%, PPV = 79%, and NPV = 100%). The interrater reliability between the model and the expert cohort was 0.43. Expert surgeons estimated blood loss with a root mean square error (RMSE) of 350 mL (R² = 0.64), while the model had a lower (superior) RMSE of 295 mL (R² = 0.74). We validated the model by inputting video segments of known high and low technical proficiency, as adjudicated by the experts, and the model predicted success and failure appropriately in every case. In further validation testing, we explored two trials in which the experts and the model both incorrectly predicted the outcome. Providing the model with video containing the critical error from these trials resulted in correct prediction of task failure.
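The summary statistics reported in the Results follow from standard formulas, sketched below in plain Python. The confusion-matrix counts for the model (11 true positives, 3 false positives, 6 true negatives, 0 false negatives, with "success" treated as the positive class) are not stated in the abstract; they are inferred here as the counts consistent with 17 of 20 correct, 100% sensitivity, and 79% PPV. With these counts specificity is 6/9, about 66.7%. The blood loss values are made-up placeholders, and the R² shown is the coefficient of determination, which may differ from the definition the authors used.

```python
import math


def diagnostic_metrics(tp, fp, tn, fn):
    """Accuracy, sensitivity, specificity, PPV, and NPV from confusion counts."""
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


def rmse(y_true, y_pred):
    """Root mean square error between observed and predicted values."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot.

    Undefined (division by zero) if all observed values are identical.
    """
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Inferred (assumed) counts for the model on the 20 test videos.
model_metrics = diagnostic_metrics(tp=11, fp=3, tn=6, fn=0)

# Hypothetical blood loss values in mL, for illustration only.
observed_ml = [200.0, 450.0, 900.0, 1300.0]
predicted_ml = [250.0, 400.0, 1000.0, 1200.0]
example_rmse = rmse(observed_ml, predicted_ml)
example_r2 = r_squared(observed_ml, predicted_ml)
```

A lower RMSE means predictions sit closer to measured blood loss in absolute terms, which is the sense in which the abstract describes the model's 295 mL as superior to the surgeons' 350 mL.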

    Conclusion: DNNs can be trained to generate accurate predictions of hemorrhage control outcome and blood loss using small surgical video datasets. After training, the DNN demonstrated similar outcome prediction and superior blood loss prediction compared with expert surgeons. Validation testing revealed that the DNN's predictions improved when it was provided with critical moments, just as a human expert's would. Broad collection, classification, and annotation of surgical video could help develop DNNs capable of predictions across a wide range of surgical tasks.

    Fig. 1
    Fig. 2
    Fig. 3

    No conflict of interest has been declared by the author(s).

    Publication History

    Article published online:
    15 February 2022

    © 2022. Thieme. All rights reserved.

    Georg Thieme Verlag KG
    Rüdigerstraße 14, 70469 Stuttgart, Germany

     