DOI: 10.1055/a-2617-6572
Summarize-then-Prompt: A Novel Prompt Engineering Strategy for Generating High-Quality Discharge Summaries
Funding The study was funded by the U.S. Department of Health and Human Services, National Institutes of Health, National Institute of Diabetes and Digestive and Kidney Diseases (fund no.: K08DK131286).

Abstract
Background
Accurate discharge summaries are essential for effective communication between hospital and outpatient providers, but generating them is labor-intensive. Large language models (LLMs), such as GPT-4, have shown promise in automating this process, potentially reducing clinician workload and improving documentation quality. A recent study using GPT-4 to generate discharge summaries from concatenated clinical notes found that while the summaries were concise and coherent, they often lacked comprehensiveness and contained errors. To address these limitations, we evaluated a structured prompting strategy, summarize-then-prompt, which first generates concise summaries of individual clinical notes and then combines them to create a more focused input for the LLM.
Objectives
The objective of this study was to assess the effectiveness of a novel prompting strategy, summarize-then-prompt, in generating discharge summaries that are more complete, accurate, and concise than those produced by simply concatenating clinical notes.
Methods
We conducted a retrospective study comparing two prompting strategies: direct concatenation (M1) and summarize-then-prompt (M2). A random sample of 50 hospital stays was selected from a large hospital system. Three attending physicians independently evaluated the generated hospital course summaries for completeness, correctness, and conciseness using a 5-point Likert scale.
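To make the two strategies concrete, the sketch below illustrates how direct concatenation (M1) and summarize-then-prompt (M2) might be implemented. It is a minimal illustration only: the OpenAI Python client, the "gpt-4" model name, and the prompt wording are assumptions for this example and do not reflect the study's actual pipeline or prompts.

```python
# Minimal sketch of the two prompting strategies (M1 vs. M2).
# Assumptions: OpenAI Python SDK and model name "gpt-4"; prompt text is
# illustrative only, not the prompts used in the study.
from openai import OpenAI

client = OpenAI()

def ask_gpt4(prompt: str) -> str:
    """Send a single prompt to GPT-4 and return the text response."""
    response = client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content

def m1_direct_concatenation(notes: list[str]) -> str:
    """M1: concatenate all clinical notes into a single prompt."""
    prompt = (
        "Write a concise hospital course summary for the discharge summary, "
        "based on the following clinical notes:\n\n" + "\n\n---\n\n".join(notes)
    )
    return ask_gpt4(prompt)

def m2_summarize_then_prompt(notes: list[str]) -> str:
    """M2: summarize each note individually, then combine the summaries."""
    note_summaries = [
        ask_gpt4("Summarize the key clinical events in this note:\n\n" + note)
        for note in notes
    ]
    prompt = (
        "Write a concise hospital course summary for the discharge summary, "
        "based on these per-note summaries:\n\n" + "\n\n---\n\n".join(note_summaries)
    )
    return ask_gpt4(prompt)
```

In this sketch, M2 shortens the material seen by the final summarization call, which is the mechanism by which a more focused input is intended to improve completeness and accuracy.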
Results
The summarize-then-prompt strategy outperformed the direct concatenation strategy in both the completeness (4.28 ± 0.63 vs. 4.01 ± 0.69, p < 0.001) and correctness (4.37 ± 0.54 vs. 4.17 ± 0.57, p = 0.002) of the hospital course summaries. The two strategies did not differ significantly in conciseness (p = 0.308).
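For illustration, the sketch below shows one way paired physician ratings for the two strategies could be compared. The abstract does not state which statistical test the study used; the paired t-test, the example ratings, and the variable names here are assumptions for this example only.

```python
# Hedged sketch of a paired comparison of Likert ratings (M1 vs. M2).
# The test used in the study is not stated in the abstract; a paired t-test
# on per-stay mean ratings is shown purely for illustration (a Wilcoxon
# signed-rank test would be a common non-parametric alternative).
import numpy as np
from scipy import stats

# Hypothetical per-stay mean completeness ratings (not the study's data).
m1_completeness = np.array([4.0, 3.7, 4.3, 3.3, 4.7, 4.0])  # direct concatenation
m2_completeness = np.array([4.3, 4.0, 5.0, 4.0, 4.7, 4.7])  # summarize-then-prompt

t_stat, p_value = stats.ttest_rel(m2_completeness, m1_completeness)
print(f"mean M1 = {m1_completeness.mean():.2f}, mean M2 = {m2_completeness.mean():.2f}")
print(f"paired t-test: t = {t_stat:.2f}, p = {p_value:.3f}")
```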
Conclusion
Summarizing individual notes before concatenation improves LLM-generated discharge summaries, enhancing their completeness and accuracy without sacrificing conciseness. This approach may facilitate the integration of LLMs into clinical workflows, offering a promising strategy for automating discharge summary generation and reducing clinician burden.
Protection of Human and Animal Subjects
Human and animal subjects were not included in the project.
* Equal Contribution as Senior Author.
Publication History
Received: 15 February 2025
Accepted: 20 May 2025
Accepted Manuscript online: 21 May 2025
Article published online: 10 October 2025
© 2025. Thieme. All rights reserved.
Georg Thieme Verlag KG
Oswald-Hesse-Straße 50, 70469 Stuttgart, Germany