Appl Clin Inform 2025; 16(05): 1837-1849
DOI: 10.1055/a-2765-6792
Research Article

A Sequence Clustering Approach to Mining Sleep Trajectories from Nursing Narratives and Structured Clinical Data

Authors

  • Alejandro García-Rudolph

    1   Departamento de Investigación e Innovación, Institut Guttmann, Institut Universitari de Neurorehabilitació adscrit a la UAB, Badalona, Barcelona, Spain
    2   Universitat Autònoma de Barcelona, Bellaterra (Cerdanyola del Vallès), Spain
    3   Fundació Institut d'Investigació en Ciències de la Salut Germans Trias i Pujol, Badalona, Barcelona, Spain
  • Alicia Romero Marquez

    1   Departamento de Investigación e Innovación, Institut Guttmann, Institut Universitari de Neurorehabilitació adscrit a la UAB, Badalona, Barcelona, Spain
    2   Universitat Autònoma de Barcelona, Bellaterra (Cerdanyola del Vallès), Spain
    3   Fundació Institut d'Investigació en Ciències de la Salut Germans Trias i Pujol, Badalona, Barcelona, Spain
  • Mónica López Andurell

    1   Departamento de Investigación e Innovación, Institut Guttmann, Institut Universitari de Neurorehabilitació adscrit a la UAB, Badalona, Barcelona, Spain
    2   Universitat Autònoma de Barcelona, Bellaterra (Cerdanyola del Vallès), Spain
    3   Fundació Institut d'Investigació en Ciències de la Salut Germans Trias i Pujol, Badalona, Barcelona, Spain
  • Laura Jimenez Pérez

    1   Departamento de Investigación e Innovación, Institut Guttmann, Institut Universitari de Neurorehabilitació adscrit a la UAB, Badalona, Barcelona, Spain
    2   Universitat Autònoma de Barcelona, Bellaterra (Cerdanyola del Vallès), Spain
    3   Fundació Institut d'Investigació en Ciències de la Salut Germans Trias i Pujol, Badalona, Barcelona, Spain
  • Susana Guillén Gazapo

    1   Departamento de Investigación e Innovación, Institut Guttmann, Institut Universitari de Neurorehabilitació adscrit a la UAB, Badalona, Barcelona, Spain
    2   Universitat Autònoma de Barcelona, Bellaterra (Cerdanyola del Vallès), Spain
    3   Fundació Institut d'Investigació en Ciències de la Salut Germans Trias i Pujol, Badalona, Barcelona, Spain
  • Marc Navarro Berenguel

    1   Departamento de Investigación e Innovación, Institut Guttmann, Institut Universitari de Neurorehabilitació adscrit a la UAB, Badalona, Barcelona, Spain
    2   Universitat Autònoma de Barcelona, Bellaterra (Cerdanyola del Vallès), Spain
    3   Fundació Institut d'Investigació en Ciències de la Salut Germans Trias i Pujol, Badalona, Barcelona, Spain
  • Eloy Opisso

    1   Departamento de Investigación e Innovación, Institut Guttmann, Institut Universitari de Neurorehabilitació adscrit a la UAB, Badalona, Barcelona, Spain
    2   Universitat Autònoma de Barcelona, Bellaterra (Cerdanyola del Vallès), Spain
    3   Fundació Institut d'Investigació en Ciències de la Salut Germans Trias i Pujol, Badalona, Barcelona, Spain
  • Elena Hernandez-Pena

    1   Departamento de Investigación e Innovación, Institut Guttmann, Institut Universitari de Neurorehabilitació adscrit a la UAB, Badalona, Barcelona, Spain
    2   Universitat Autònoma de Barcelona, Bellaterra (Cerdanyola del Vallès), Spain
    3   Fundació Institut d'Investigació en Ciències de la Salut Germans Trias i Pujol, Badalona, Barcelona, Spain

Abstract

Background

Sleep quality critically influences recovery in neurological patients, yet its longitudinal monitoring during hospitalization remains limited. Nursing narrative notes are an underutilized resource for tracking sleep trajectories objectively over time.

Objectives

To propose and apply a formal pipeline that integrates structured clinical data and unstructured nursing annotations to monitor sleep trajectories during post-acute inpatient neurorehabilitation, relying exclusively on free-to-use software tools and without increasing nursing workload.

Methods

A total of 17,039 nighttime nursing annotations were extracted and categorized into four sleep quality states. Two expert raters manually labeled a training set of 2,000 annotations (κ = 0.84). A random forest classifier achieved 0.93 sensitivity and 0.94 specificity and was used to classify the remaining notes. Sleep sequences were constructed and clustered using sequence analysis (TraMineR) and hierarchical clustering (AGNES, Ward's method). The resulting clusters (silhouette = 0.40) were compared across clinical, functional, and social variables using non-parametric statistics in a cohort of 303 consecutive post-acute neurorehabilitation inpatients.
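As an illustration of the agreement and classification metrics reported above, the following minimal Python sketch computes Cohen's kappa for two raters and the sensitivity and specificity of one state against the rest. It is not the authors' code: the function names, the label names ("good", "mid", "poor"), and the toy data are hypothetical, and the one-vs-rest treatment of a four-state problem is an assumption, since the abstract does not state how the multi-class metrics were aggregated.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa: (p_o - p_e) / (1 - p_e) for two raters on the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty set of items")
    n = len(rater_a)
    # Observed agreement: share of items given the same label by both raters.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: sum over labels of the product of marginal frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    p_e = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if p_e == 1:
        # Both raters used a single identical label; agreement is trivially perfect.
        return 1.0
    return (p_o - p_e) / (1 - p_e)


def sensitivity_specificity(y_true, y_pred, positive):
    """One-vs-rest sensitivity and specificity for the state `positive` (assumed scheme)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical toy labels; the real study used 2,000 doubly labeled annotations.
rater_a = ["good", "good", "poor", "poor", "mid", "mid", "good", "poor"]
rater_b = ["good", "good", "poor", "mid", "mid", "mid", "good", "poor"]

kappa = cohen_kappa(rater_a, rater_b)
sens, spec = sensitivity_specificity(rater_a, rater_b, positive="poor")
print(f"kappa={kappa:.3f} sensitivity={sens:.3f} specificity={spec:.3f}")
```

Kappa corrects raw agreement for the agreement expected by chance, which is why it is usually reported instead of simple percent agreement when label frequencies are unbalanced.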

Results

Four distinct sleep trajectory clusters were identified, each characterized by a unique functional and socio-environmental profile. The first group (n = 102; 33.7%) combined high functional independence, strong social support, a stable economic situation, short hospitalization, and favorable sleep quality. The second group (n = 76; 25.1%) presented moderate functional independence, precarious economic conditions, and the highest proportion of poor sleep quality. The third group (n = 76; 25.1%) exhibited severe functional impairment, long hospitalization, and poor housing conditions, yet, paradoxically, the highest proportion of good sleep quality. The fourth group (n = 49; 16.2%) showed profound disability, relatively favorable socio-economic conditions, and a predominance of intermediate sleep quality, likely influenced by medication. Distinctive sets of social and functional keywords emerged for each cluster.

Conclusion

This pipeline identified clinically meaningful sleep profiles from nursing notes, highlighting the role of functional and social determinants in shaping sleep trajectories during neurorehabilitation.

Protection of Human and Animal Subjects

This study involved secondary analysis of routinely collected nursing narrative annotations about patients' sleep. The protocol was reviewed and approved by the Institut Guttmann Ethics Committee. Data were extracted by authorized personnel and de-identified before analysis; no direct identifiers were available to the research team. All procedures complied with the Declaration of Helsinki and applicable data-protection regulations. Data were stored on secure, access-controlled servers at Institut Guttmann.


Data Availability Statement

The datasets generated for this study are available upon reasonable request to the corresponding author.


AI Disclosure Statement

During the preparation of this work, the authors used ChatGPT-4o (OpenAI) to assist in drafting and editing text. After using this tool, the authors reviewed and edited the content as needed and take full responsibility for the content of the publication.


Contributors' Statement

A.G.-R.: conceptualization, methodology, formal analysis, investigation, data curation, software, visualization, writing – original draft; A.R.M.: conceptualization, methodology, formal analysis, investigation, data curation, validation, writing – review and editing; M.L.A.: conceptualization, methodology, formal analysis, investigation, data curation, validation, writing – review and editing; L.J.P.: conceptualization, methodology, formal analysis, investigation, data curation, validation, writing – review and editing; S.G.G.: conceptualization, methodology, formal analysis, investigation, data curation, validation, writing – review and editing; M.N.B.: investigation, data curation, software, visualization, writing – review and editing; E.O.: conceptualization, project administration, formal analysis, software, supervision, writing – review and editing; E.H.-P.: conceptualization, project administration, formal analysis, supervision, writing – review and editing.




Publication History

Received: 29 April 2025

Accepted: 04 December 2025

Article published online: 18 December 2025

© 2025. Thieme. All rights reserved.

Georg Thieme Verlag KG
Oswald-Hesse-Straße 50, 70469 Stuttgart, Germany