DOI: 10.1055/a-2742-2349
AI Prompt Engineering for Neurologists and Trainees
Authors
Funding Information
P.H. has received research support from the NIH (UE5/R25) and the American Academy of Neurology/American Epilepsy Society/American Brain Foundation/Epilepsy Foundation (Susan S. Spencer, MD, Clinical Research Training Fellowship in Epilepsy). L.M.V.R.M. has received research support from the NIH (5R01AG073410-02, 2R01AG082693-01), the CDC (5U48DP006377-04-00), and the Epilepsy Foundation (Consultant to the CEO and Director of the Epilepsy Learning Healthcare System).
Abstract
Large language models (LLMs) have transformative potential in neurology, impacting clinical decision-making, medical training, and research. Prompt engineering, the strategic design of inputs to optimize LLM performance, is essential for neurologists and trainees seeking to effectively integrate these powerful tools into practice. Carefully crafted prompts enable LLMs to summarize complex patient narratives, generate differential diagnoses, and support patient education. In training, structured prompts enhance diagnostic reasoning, board preparation, and interactive case-based learning. Neurological research also benefits, with LLMs aiding in data extraction, computed phenotype generation, and literature synthesis. Despite their promise, challenges remain, including hallucinations, data bias, privacy concerns, and regulatory complexities. This review synthesizes current advances and highlights best practices, including two structured prompt engineering frameworks tailored to neurology: Role-Task-Format (RTF) for routine use and our newly developed BRAIN (Background, Role, Aim, Instructions, Next steps) for complex tasks. We offer practical guidance to maximize accuracy, safety, and equity in LLM outputs, ensuring reliable support for neurologists and trainees.
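As an illustration of the two frameworks named above, the following is a minimal sketch of how their components could be assembled into a prompt string. Only the component names (Role, Task, Format; Background, Role, Aim, Instructions, Next steps) come from the abstract. The class names, the `render` helper, the "Label: text" layout, and the example wording are assumptions made for illustration and are not the authors' templates.

```python
from dataclasses import dataclass, fields


@dataclass
class RTFPrompt:
    """Role-Task-Format components (names from the abstract)."""
    role: str
    task: str
    format: str


@dataclass
class BRAINPrompt:
    """Background, Role, Aim, Instructions, Next steps (names from the abstract)."""
    background: str
    role: str
    aim: str
    instructions: str
    next_steps: str


def render(prompt) -> str:
    """Hypothetical helper: emit one 'Label: text' line per component, in field order.

    Raises ValueError if any component is empty, so an incomplete prompt
    is caught before it is sent to a model.
    """
    lines = []
    for f in fields(prompt):
        value = getattr(prompt, f.name).strip()
        if not value:
            raise ValueError(f"missing component: {f.name}")
        label = f.name.replace("_", " ").capitalize()
        lines.append(f"{label}: {value}")
    return "\n".join(lines)


# Illustrative content only; no real patient data.
routine = RTFPrompt(
    role="You are a neurology attending teaching a resident.",
    task="Give a differential diagnosis for the de-identified case below.",
    format="Numbered list, most likely diagnosis first.",
)
print(render(routine))
```

A `BRAINPrompt` is rendered the same way with its five components. Keeping each component as a separate field makes it easy to check that none has been left out before the prompt is used.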
Keywords
large language models - prompt engineering - neurology - artificial intelligence - clinical decision support
‡ These authors are co-senior authors.
Publication History
Received: 26 July 2025
Accepted: 10 November 2025
Accepted Manuscript online: 11 November 2025
Article published online: 03 December 2025
© 2025. Thieme. All rights reserved.
Thieme Medical Publishers, Inc.
333 Seventh Avenue, 18th Floor, New York, NY 10001, USA