Keywords
clinical informatics - education - analytics - professional training - workforce
Background and Significance
Data analytics is continuing to transform many industries, including health care.[1]
[2] In the business world, companies with sophisticated integration of analytics into
their business models and decision-making tend to outcompete their competitors over
time.[1] With increasing pressure to decrease costs and increase efficiency while improving
quality of care, health care enterprises need analytics to thrive.[3] Effective analytics requires a critical mass of biomedical and health informatics
(BMHI) professionals trained to access, analyze, and interpret data to glean insights
that can be used to transform care and operational processes.[1]
[3]
[4]
Many educational programs have been developed to fill the workforce need in this burgeoning
field through formal degree and certificate programs.[5] These programs focus on BMHI core knowledge, knowledge of medicine and health, and
informatics such as programming languages and database management.[5]
[6] Extensive research has been devoted to the development of curricula that communicate
core BMHI knowledge and skills,[7]
[8]
[9]
[10]
[11]
[12]
[13] but data on continuing education and development of the existing workforce is lacking.
Given the rapidly evolving nature of the field, BMHI analysts need to engage in continuous
life-long learning,[5]
[14]
[15] with adult learning theory dictating that working professionals prefer that content
be directly relevant to work and connected to prior knowledge.[16] Quality information technology professionals continue to be in high demand and short
supply,[15]
[17]
[18]
[19]
[20] so enterprises need to retain and enhance the skills of existing staff even as they
seek to hire additional staff.[4]
The Leadership in Analytics and Data Science (LEADS) program was developed at Johns
Hopkins Medicine (JHM) in 2017 to improve organizational capacity and effectiveness
in analytics through didactics, analytics skills practice, and mentoring ([Table 1]). The course meets once a week over lunch for a 6-month duration, teaching analytics
and data stewardship skills to participants as well as fostering interpersonal networks
within the enterprise. Interpersonal networks are a source of important knowledge
in the BMHI workplace[21] and can impact job productivity[22]
[23] and creativity.[24] A functioning BMHI social network can evolve into a community of practice[25]
[26]
[27] where group members have a shared interest and commitment, engage in joint activities,
help one another, and work together to develop a shared repertoire of resources and
tools for problem solving.[26]
[28] Thus, a central goal of the LEADS course was to foster a community of practice.
Table 1
Goals for the Leadership in Analytics and Data Science course
1. Improve knowledge and practice of technical analytical skills
2. Improve knowledge of specific enterprise applications and shared data sources
3. Improve data stewardship practices and knowledge of institutional policies
4. Enhance active networking and mentorship to foster an analytics community of practice
To understand whether the LEADS course was successful at achieving these stated goals,
course effectiveness was evaluated using the Kirkpatrick Model[29]
[30]
[31] (see [Fig. 1]), which assesses four levels of course effectiveness including participant reaction,
learning, behavior, and institutional impact. Precourse, intracourse, and postcourse
surveys were administered to participants.
Fig. 1 The Kirkpatrick Model for evaluation of course effectiveness. The model is illustrated
as a pyramid to demonstrate that participant reaction to an educational intervention
influences learning, which influences behavior change that determines results.
Objective
This study investigated the effectiveness of the LEADS course, whose goals were to
improve analyst technical skills; increase knowledge of applications, data stewardship,
and policies; and create an analytics community of practice. Course effectiveness was
evaluated using scheduled course assessments that examined participant reaction, learning,
and behaviors.
Methods
Setting
JHM is an integrated global health enterprise with over 40,000 full-time employees,
2.8 million outpatient visits, and 110,000 inpatient admissions per year.[32] The JHM Data Trust (DT) was established with the mission “to provide JHM with the
technical infrastructure, standards, policies and procedures, and organization needed
to bring together patient and member-related data from across the health system.”[33] The DT cataloged data sources within JHM as comprising more than 500 databases,
70,000 tables, and 35 terabytes of data (personal communication from Paul Nagy, 2019).
The DT created a hybrid reporting structure where employees work in business units
within the organization as well as serve on functional analytic teams to coordinate
activities and provide data stewardship. The teams are built around functional analysis
of data including (1) ambulatory operations, (2) hospital operations, (3) care coordination
and utilization management, (4) finance-integrated analytics, (5) population health,
(6) quality, safety, and service, (7) research/center for clinical data analysis,
(8) technology innovation center, and (9) planning and market analysis. There are
currently over 300 individuals serving as members on the DT analytics teams.
The LEADS course was offered to BMHI professionals with roles as data analysts working
within JHM. The target participants for the LEADS course were current or future members
of DT analytic teams. The program was created as part of a strategic goal by the DT
to advance the training of analytic teams across the enterprise and foster employee
collaboration, development, and engagement in a difficult-to-recruit workforce. The
Technology Innovation Center (TIC) (https://tic.jh.edu/) developed and implemented the program.
Program leadership comprised a program director, two program managers, an advisor,
and a coordinator. Thirty-three key influencers of the JHM analytics community were
identified by the course organizers and DT team leaders and invited to act as faculty
in the program. Faculty members were asked to provide lectures and mentorship and were
invited to participate in class discussions. They were also encouraged to attend the
weekly lectures to interact with colleagues and to learn what other analytics activities
were taking place.
Interested analysts employed at JHM applied to the program through an open enrollment
period. The application required a letter of recommendation from their supervisor,
tuition cost center information, and a letter of commitment of the time required to
complete the course. The program was capped at 30 students each year to provide close
mentoring and maintain class engagement. The $4,000 tuition was charged to participants'
departments. Classes were held over lunch, and catering was provided.
Curriculum Development
The LEADS course was developed with the help of a multidisciplinary team featuring
TIC administrators, physician informaticians, DT team leaders, and experts in assessment.
This group developed the course curriculum and goals based on literature survey of
core BMHI skills,[5]
[8]
[12]
[13] prior experience, and an informal survey of DT team leaders to determine key employee
development needs. Institutional BMHI leadership was also involved in the process
to ensure that course content aligned with institutional goals, existing data sources,
and upcoming software implementations. [Table 2] contains an outline of the curriculum.
Table 2
Overview of the leadership in analytics and data science curriculum
Data Trust and Analytic Teams | Data structures and warehousing concepts | Data analysis | Project planning | Practicum
1. Data sources used by each analytics team | 1. Database relationships | 1. Developing key performance indicators | 1. Requirements gathering | 1. Team-based project analyzing deidentified EMR data
2. Types of requests (clinical, operational, and research) | 2. Structured Querying Language | 2. Scorecards and dashboards | 2. Project management | 2. Assignments for data cleaning, joining, and analysis
3. Types of analysis for customers | 3. ETL processes and provenance | 3. Information visualization | 3. Design thinking | 3. Hands-on environments (Jupyter, Tableau, and SQL Server)
4. Data trust policies | 4. Data quality assessment | 4. Inferencing and spurious correlations | 4. Business intelligence | 4. Engage with faculty and members of varying functional units
5. Data governance | 5. Working with EMR transactional data | 5. Machine learning with Python | |
6. Data security | 6. Hadoop and precision medicine | | |
Abbreviations: EMR, electronic medical record; ETL, Extract, Transform, Load.
In addition to lectures and workshops, students were divided into small groups with
mentored faculty supervision to complete a practical data analytics project. A skill
assessment performed prior to the course was utilized to gauge the analytics skills
of each student. Students were then split into teams of 5 members with a mix of technical
expertise, and were supported by 2 faculty mentors. Team members shadowed their peers
and documented their work processes to share skills. Each team developed a data product
that was composed of a social network analysis of LEADS faculty members and their
professional networks. Teams performed team building activities, created a normalized
data model, cleaned and transformed data in a SQL server environment, used Tableau
to create innovative data visualizations, and generated a final data product. An award
was given at the end of the course to the team with the best project based on faculty
vote. See [Supplementary Appendix A] (available in the online version) for small group instructions.
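As an illustration of the "normalized data model" step of the practicum, the following minimal sketch loads a flat list of faculty connections into two related tables and counts connections per faculty member. It is not the course's actual schema: the table names, column names, and sample rows are hypothetical, and Python's built-in sqlite3 module stands in for the SQL Server environment used in the course.

```python
import sqlite3


def build_network_db(rows):
    """Load (faculty_name, contact_name, contact_team) rows into a normalized model.

    Hypothetical schema: each person is stored once in `person`, and each
    faculty-to-contact link is stored once in `connection`.
    """
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE person (
            person_id INTEGER PRIMARY KEY,
            name      TEXT NOT NULL UNIQUE,
            team      TEXT
        );
        CREATE TABLE connection (
            faculty_id INTEGER NOT NULL REFERENCES person(person_id),
            contact_id INTEGER NOT NULL REFERENCES person(person_id),
            PRIMARY KEY (faculty_id, contact_id)
        );
        """
    )
    for faculty, contact, team in rows:
        # Basic cleaning: trim stray whitespace so the same name is not stored twice.
        faculty, contact = faculty.strip(), contact.strip()
        conn.execute("INSERT OR IGNORE INTO person (name) VALUES (?)", (faculty,))
        conn.execute(
            "INSERT OR IGNORE INTO person (name, team) VALUES (?, ?)", (contact, team)
        )
        # Duplicate links are ignored because of the composite primary key.
        conn.execute(
            "INSERT OR IGNORE INTO connection "
            "SELECT f.person_id, c.person_id FROM person f, person c "
            "WHERE f.name = ? AND c.name = ?",
            (faculty, contact),
        )
    conn.commit()
    return conn


def degree_by_faculty(conn):
    """Return (faculty_name, number_of_connections) pairs, sorted by name."""
    return conn.execute(
        "SELECT p.name, COUNT(*) FROM connection c "
        "JOIN person p ON p.person_id = c.faculty_id "
        "GROUP BY p.name ORDER BY p.name"
    ).fetchall()
```

Storing people and links separately means a renamed team or corrected name is fixed in one place, and the per-faculty counts can feed a visualization tool directly.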
Assessment
The effectiveness of the course was evaluated using a framework based on the Kirkpatrick
Model[29]
[30]
[31] (see [Fig. 1]). The Kirkpatrick Model seeks to assess four levels of course effectiveness: (1)
reaction, (2) learning, (3) behavior, and (4) results. The first level seeks to assess
participant reaction to a session's instructor, setting, materials, and content, which
are important prerequisites for learning.[29] The second level assesses the extent of learning or expansion of knowledge.[34] The third level of evaluation determines the extent of application of new skills
in the workplace.[29]
[34] The fourth level measures the organizational impact of the course.[34] Our evaluation of the LEADS course focused on levels 1 to 3.
Assessment tools were developed with the help of the office of assessment and evaluation
at JHM. Participant reaction was assessed through weekly and whole-course class satisfaction
surveys. Participant learning was assessed through a pre- and postcourse multiple-choice
knowledge assessment and self-reported capability assessment. Participant behavior
was assessed through a pre- and postcourse self-reported practice assessment. The
professional network was assessed through a pre- and postcourse self-reported social
network analysis (see [Supplementary Appendix B] [available in the online version] for assessment instruments). All surveys were
administered electronically on an internally hosted MachForm (https://www.machform.com/) survey engine. [Table 3] contains the schedule of assessment tools.
Table 3
Schedule of assessment tools
Assessment | Precourse | Weekly | Postcourse
Class satisfaction | | x |
Knowledge | x | | x
Capability | x | | x
Practice | x | | x
Professional network | x | | x
Course assessment | | | x
Data Analysis
The LEADS program manager and coordinator organized the surveys and replaced student
identifiers with research identity numbers. Data were submitted as comma-separated
values (CSV) files and analysis was conducted in Python 3.5.2 using Jupyter Python
notebooks. Graphs were generated with the plotting libraries matplotlib version 2.1.0
and seaborn version 0.8.1. Data manipulation was done using pandas version 0.21.0
and mathematical analysis was done with the numpy library version 1.13.3. Wilcoxon
signed rank test was used to calculate the significance of the difference in pre-
and postcourse capability, practice, and social network size. Effect sizes were computed
using the z-statistic from scipy version 1.0.0, using the simple difference formula.[35]
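The paired comparison described above can be sketched as follows. This is a minimal illustration, not the authors' notebook code: it assumes the effect size is r = z / sqrt(N), with z recovered from the two-sided p-value of the Wilcoxon signed-rank test, which is one common convention. The exact computation used in the study is not detailed in the text, and the function name is hypothetical.

```python
import numpy as np
from scipy import stats


def paired_wilcoxon(pre, post):
    """Wilcoxon signed-rank test on paired ordinal scores with an r effect size.

    Assumption: r = z / sqrt(N), where z is the standard normal quantile
    matching the two-sided p-value and N is the number of pairs.
    """
    pre = np.asarray(pre, dtype=float)
    post = np.asarray(post, dtype=float)
    if pre.shape != post.shape:
        raise ValueError("pre and post must be paired and equally long")
    diff = post - pre
    if not np.any(diff):
        # The test is undefined when every pair is tied; report no effect.
        return {"n": int(diff.size), "p": 1.0, "z": 0.0, "r": 0.0}
    p = float(stats.wilcoxon(pre, post).pvalue)
    z = float(stats.norm.isf(p / 2.0))  # two-sided p-value -> |z|
    r = z / np.sqrt(diff.size)
    return {"n": int(diff.size), "p": p, "z": z, "r": float(r)}
```

Under this convention, the thresholds of 0.1, 0.3, and 0.5 correspond to small, medium, and large effects.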
Results
Reaction
In the first year of the program, LEADS had 42 applicants, with 30 analysts accepted
into the program, and 29 completing the course. One participant left the organization
during the program. Attendance was tracked through a sign-in sheet; mean attendance
was 84%, and no class had < 70% attendance. Class and instructor evaluations used a
4-point Likert scale (not at all, somewhat, very, and extremely). Each class was assessed
for whether it was interesting, engaging, applicable, and important. Students were emailed
optional weekly surveys over the 21 class sessions. There was a 23% response rate to
these surveys. The sessions
with the highest scores were finance and analytics, emergency medicine and readmissions,
data mart, business intelligence and intro to Tableau, Epic reporting and analysis
tools, the nature of data in Epic, project management, and data structure fundamentals.
After completion of the course, a final course evaluation was administered, in
which participants evaluated the course and reported barriers to applying the knowledge
they gained in the course as part of their job. All course
participants completed this survey. Evaluation of the course was positive, with 100%
of responses in the positive range for interest, engagement, applicability, driving
career growth, course organization, and logistics. Thirty percent of responses were
in the highest category, extremely positive. Ninety-six percent of respondents reported
that they were “very” or “extremely” likely to recommend the course. The most common
organizational barriers to applying the skills learned were lack of applicability to job
tasks and lack of depth in learning of skills, with 58% of respondents reporting these as barriers. See [Supplementary Appendix C] (available in the online version) for final course evaluation results.
Learning
Learning was assessed with a multiple choice, 10-question knowledge assessment delivered
at the beginning and end of the course. Eighty-six percent (24/28) of participants
in the course completed both the pre- and postcourse knowledge test. Scores moved
in a positive direction with a precourse mean score of 61% (standard deviation [SD]
17%) and a postcourse mean score of 69% (SD 14%). This difference was significant
(paired t-test, p = 0.003). Of the 24 participants who completed both knowledge assessments,
63% had a score increase, 17% had an equal score, and 20% had a score decrement (see
[Fig. 2]). The questions with the largest improvement in correct answers were on data visualization
(21% improvement), enterprise analytics tools (23%), and Python (37%). The questions
with mild score decrement include SQL (5% decrement) and R (2% decrement).
Fig. 2 Knowledge assessment. Histogram of knowledge assessment scores on the pretest versus
posttest.
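The knowledge comparison reported above (paired t-test plus the share of participants whose scores rose, stayed equal, or fell) can be sketched as below. The function name and the sample scores in the usage note are hypothetical; only `scipy.stats.ttest_rel` and standard NumPy calls are used.

```python
import numpy as np
from scipy import stats


def summarize_knowledge(pre, post):
    """Summarize paired pre/post test scores (percent correct) for the same people."""
    pre = np.asarray(pre, dtype=float)
    post = np.asarray(post, dtype=float)
    diff = post - pre
    t_stat, p_value = stats.ttest_rel(post, pre)  # paired (dependent) t-test
    return {
        "mean_pre": float(pre.mean()),
        "sd_pre": float(pre.std(ddof=1)),    # sample standard deviation
        "mean_post": float(post.mean()),
        "sd_post": float(post.std(ddof=1)),
        "t": float(t_stat),
        "p": float(p_value),
        "n_increased": int(np.sum(diff > 0)),
        "n_equal": int(np.sum(diff == 0)),
        "n_decreased": int(np.sum(diff < 0)),
    }
```

For example, `summarize_knowledge([50, 60, 70, 40, 80, 60], [60, 70, 70, 50, 70, 80])` reports four increases, one equal score, and one decrease for six made-up participants.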
Behavior
Before and after the LEADS course, participants were asked to self-report their current
capabilities, goal capabilities, and current practice patterns with a variety of key
BMHI skills. The self-assessment for capabilities asked participants to rank themselves
as a beginner, novice, intermediate, or expert for particular skills. Goals used the
same scale. For the purposes of calculation of change, beginners through experts were
assigned numerical values between 1 and 4. Participants' self-report of capabilities
significantly improved through the course with regard to data stewardship, data policies,
use of the Epic data warehouse, data governance, knowledge of institutional review
board (IRB) policy, and predictive analytics. There was no change in self-report of
skill with Excel, databases, project management, or use of Tableau. Changes in capabilities
and goals are reported in [Table 4].
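The coding of self-assessment labels to the numerical values 1 through 4 can be sketched as follows. This is a minimal illustration assuming responses arrive as free-text labels; the function and constant names are hypothetical, and unrecognized labels are left as missing values rather than guessed.

```python
import pandas as pd

# Ordinal coding described in the text: beginner through expert map to 1 through 4.
CAPABILITY_SCALE = {"beginner": 1, "novice": 2, "intermediate": 3, "expert": 4}


def code_responses(responses, scale=CAPABILITY_SCALE):
    """Convert survey labels to ordinal scores; unknown labels become NaN."""
    labels = pd.Series(list(responses), dtype="object").str.strip().str.lower()
    return labels.map(scale)
```

The same helper could be reused for the practice survey by passing a different mapping, for example `{"rarely": 1, "sometimes": 2, "often": 3, "always": 4}`.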
Table 4
Pre- and postcourse student self-evaluation of current capabilities and goal capabilities
BMHI skill | Capability versus goal | Precourse | Postcourse | p-Value (significant < 0.05) | Effect size (small = 0.1, medium = 0.3, large = 0.5)
Excel | Capability | 3.1 | 3.1 | 0.009 | 0.526
Excel | Goal | 3.7 | 3.8 | 0.034 | 0.425
Databases | Capability | 2.6 | 2.8 | 0.419 | 0.162
Databases | Goal | 3.4 | 3.1 | 0.008 | 0.532
Tableau | Capability | 2.2 | 2.4 | 0.092 | 0.336
Tableau | Goal | 3.5 | 3.3 | 0.014 | 0.489
Project management | Capability | 2.2 | 2.4 | 0.19 | 0.262
Project management | Goal | 3.4 | 3.3 | 0.019 | 0.468
Data stewardship | Capability | 1.6 | 2.2 | 0.007 | 0.537
Data stewardship | Goal | 3.1 | 3.2 | 0.199 | 0.257
Data Trust policy | Capability | 1.6 | 2.3 | < 0.001 | 0.799
Data Trust policy | Goal | 3.4 | 3.3 | 0.064 | 0.371
Epic data warehouse | Capability | 1.5 | 2 | 0.017 | 0.476
Epic data warehouse | Goal | 3 | 3.1 | 0.522 | 0.128
Data governance | Capability | 1.8 | 2.4 | 0.003 | 0.597
Data governance | Goal | 3.3 | 3.2 | 0.163 | 0.279
Predictive analytics | Capability | 1.4 | 1.8 | 0.007 | 0.539
Predictive analytics | Goal | 3 | 3.1 | 0.391 | 0.171
IRB rules | Capability | 1.4 | 1.8 | 0.005 | 0.557
IRB rules | Goal | 2.9 | 2.7 | 0.118 | 0.313
Abbreviations: BMHI, biomedical and health informatics; IRB, institutional review
board.
Between the pre- and postcourse evaluations, the goals individuals set for several skills
decreased even as self-reported capability increased. For example, an individual might
begin the course with the goal of becoming an expert at databases and, once the course
was completed, downgrade that goal to intermediate. To understand this shift, a goal-gap
was computed as the difference between the goal and the current capability for each skill,
and this gap was compared pre- and postcourse. The mean goal-gap across all 10 skills was
1.34 before the course and 0.92 postcourse, meaning that student goals and capabilities
were more closely aligned in the end-of-course survey than at the beginning.
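The goal-gap arithmetic can be checked approximately against the rounded skill-level means in Table 4. The sketch below assumes the gap is simply goal minus capability, averaged over the 10 skills; the function and column names are hypothetical. Because it uses rounded group means rather than participant-level responses, it gives about 1.33 and 0.89, close to but not identical to the reported 1.34 and 0.92.

```python
import pandas as pd

# Rounded means transcribed from Table 4.
TABLE_4_MEANS = pd.DataFrame(
    [
        ("Excel", 3.1, 3.1, 3.7, 3.8),
        ("Databases", 2.6, 2.8, 3.4, 3.1),
        ("Tableau", 2.2, 2.4, 3.5, 3.3),
        ("Project management", 2.2, 2.4, 3.4, 3.3),
        ("Data stewardship", 1.6, 2.2, 3.1, 3.2),
        ("Data Trust policy", 1.6, 2.3, 3.4, 3.3),
        ("Epic data warehouse", 1.5, 2.0, 3.0, 3.1),
        ("Data governance", 1.8, 2.4, 3.3, 3.2),
        ("Predictive analytics", 1.4, 1.8, 3.0, 3.1),
        ("IRB rules", 1.4, 1.8, 2.9, 2.7),
    ],
    columns=["skill", "cap_pre", "cap_post", "goal_pre", "goal_post"],
)


def mean_goal_gap(df):
    """Return (precourse, postcourse) mean of goal minus capability across skills."""
    pre = (df["goal_pre"] - df["cap_pre"]).mean()
    post = (df["goal_post"] - df["cap_post"]).mean()
    return float(pre), float(post)


pre_gap, post_gap = mean_goal_gap(TABLE_4_MEANS)
print(round(pre_gap, 2), round(post_gap, 2))  # 1.33 0.89
```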
A pre- and postcourse practice assessment was administered to determine whether participants'
overall application of skills increased. The survey used a 4-point scale to determine the frequency
of skill use in terms of rarely, sometimes, often, and always. For the purposes of
calculation of change, rarely through always were assigned numerical values between
1 and 4. In terms of practice, there was a significant increase in self-report of
the frequency of use of skills with Excel, data stewardship, data policies, Epic data
warehouse, data governance, and employment of IRB policy. While other skills suggested
an improvement, they were not statistically significant (see [Table 5]).
Table 5
Pre- and postcourse evaluation of student practice patterns
BMHI skill | Precourse | Postcourse | p-Value (significant < 0.05) | Effect size (small = 0.1, medium = 0.3, large = 0.5)
Excel | 3.3 | 3.6 | 0.048 | 0.395
Databases | 2.5 | 2.9 | 0.277 | 0.217
Tableau | 2.5 | 2.8 | 0.118 | 0.313
Project management | 2.1 | 2.6 | 0.109 | 0.321
Data stewardship | 1.6 | 2.2 | 0.001 | 0.649
Data Trust policy | 2.5 | 3.7 | < 0.001 | 0.814
Epic data warehouse | 1.7 | 1.7 | 0.016 | 0.279
Data governance | 2.3 | 3.6 | < 0.001 | 0.732
Predictive analytics | 1.5 | 1.7 | 0.687 | 0.081
IRB rules | 1.7 | 2.3 | 0.044 | 0.403
Abbreviations: BMHI, biomedical and health informatics; IRB, institutional review
board.
Professional Network
A professional network questionnaire was administered pre- and postcourse. The survey
listed the members of the DT by analytic team and asked participants to indicate which
members they knew and identified as part of their professional network.
The professional network size participants identified ranged between 0 and 32 with
an average of 11 before the course and between 6 and 44 with an average of 22 after
the course. Social network analysis revealed an average doubling of individuals' professional
networks from an average of 11 to 23. All participants reported an increase in the
size of their social networks. See [Fig. 3] for a histogram of individual increase in network size.
Fig. 3 Histogram of professional network growth. Change in size of professional network
on the precourse evaluation versus postcourse evaluation.
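Per-participant network size and its change can be derived from questionnaire responses along the following lines. This is a sketch under assumptions: the long-format layout (one row per participant, named contact, and survey phase) and all column names are hypothetical, not the study's actual data format.

```python
import pandas as pd


def network_sizes(nominations):
    """Count distinct contacts per participant in each phase and the change.

    `nominations` needs columns: participant, contact, phase ("pre" or "post").
    Participants who named nobody in either phase do not appear in the input,
    so they would need to be added separately with a size of zero.
    """
    sizes = (
        nominations.drop_duplicates()
        .groupby(["participant", "phase"])["contact"]
        .nunique()
        .unstack("phase", fill_value=0)
        .reindex(columns=["pre", "post"], fill_value=0)
    )
    sizes["change"] = sizes["post"] - sizes["pre"]
    return sizes
```

The resulting `change` column is the quantity a histogram of network growth would plot, and the `pre` and `post` columns can be passed to a paired test.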
Small Group Projects
For the class project, each team of students was provided a CSV file that detailed
the professional network of each course faculty member. Teams were asked to build
an interactive map of the data analytics community at JHM, with the dual goal of teaching
team members how to create an innovative data visualization and providing them a useful
living document to use to find answers to analytics questions in the future using
contacts in their professional network. See [Fig. 4] for an example of a student-generated data visualization.
Fig. 4 Example of student small group professional network visualization using Tableau.
Red stars indicate course faculty. Colored dots indicate individuals on data trust
analytic teams. Gold lines indicate social network connections among individuals and
course faculty.
Discussion
LEADS was the first course targeted toward continuing education of working BMHI professionals
at a large academic center to undergo formal evaluation for effectiveness. The course
goals were split between didactic, practical, and social components, with many course
activities focused on fostering a community of practice where participants developed
shared rules, norms, and artifacts such as specific work products. The evaluation
of course effectiveness using the Kirkpatrick Model sought to address each of these
course components in context.
Course participants rated the most popular lectures as those that could help data
analysts be maximally effective in their jobs, such as lectures on key data sources,
financial analytics, and data visualization. This matches with theories of adult learning,
which suggest that content should be relevant, useful to the learner's life, and connected
to prior knowledge.[16] In LEADS, the content was tailored specifically to the needs of the DT teams by
the leaders of these teams, and the most popular lectures were the most applicable
to the participant's daily work. That said, response rates to the weekly
class reaction surveys were low at 23%, making generalization of respondent feedback
difficult. Future iterations of the course will incorporate weekly feedback into
class participation and will focus most heavily on lectures that impart practical
skills.
The knowledge assessments showed limited change in knowledge from precourse to postcourse.
The test assessed basic aptitude only and was not built based upon psychometric analysis,
which leaves the opportunity for rewriting of test items in future iterations of LEADS.
The slight decrement in scores on test questions involving SQL and R points toward
the opportunity to offer in-depth seminars on technical topics in future iterations
of the class. The knowledge test was useful in that it helped the course faculty to
conduct an initial needs assessment and organize participants into balanced groups
based on technical skills and knowledge.
The capability and practice assessments mirrored one another in showing improvement in
ability with and use of data policies, data stewardship, data governance, IRB policy, and
a key enterprise data source, the Epic data warehouse. This may represent the successful
fostering of a community of practice with shared rules and norms of behaviors through
shared social time, lectures, and team-based data analytics experience. It points
to the effectiveness of the LEADS course at transmitting standards of data governance
and stewardship and therefore in supporting the transformation of the organization
by providing institution context and imparting institutional values.
Technical skills, including use of Tableau and database management, did not change
significantly during the course. This may be due to the introductory nature of the course,
with relatively short lectures given on technical topics. This points to an opportunity
to offer in-depth seminars with intensive skills training in future iterations of
the class. Additionally, some of the skills and methodologies taught in LEADS may
not yet have diffused fully across the institution, so the actual job tasks of analysts
may not have allowed for increased use or practice of acquired skills over the relatively
short 6-month timeframe. The narrowing of the gap in self-reported technical capabilities
and goals may show that participants realized that these technical skills take a long
time to acquire as participants began to move from a lack of awareness of what they
did not know toward knowledge of just how much they had to learn.
The size of social networks of the participants expanded from precourse to postcourse.
Course comments showed that the pre- and postclass networking time was helpful for
participants, and this may partly explain this expansion. The networking changes suggest
that analysts participating in the course may have expanded their ability to work
across teams to accomplish high-quality work. In future iterations of the course,
effort will be invested in determining whether specific social network connections
enhance work productivity and whether participant demographics, such as time employed
at JHM, influence change in social network size. The team will also investigate whether
the enhancement of social network during the LEADS course portends greater employee
satisfaction and retention. In the future, it is possible that an abbreviated symposium
version of this course could be offered to BMHI professionals outside of JHM, though
the social network benefits would not accrue to these students.
The course had several limitations in addition to those mentioned above. Participant
job productivity after the LEADS course was not assessed. Future iterations of the
course will address this important metric through follow-up evaluation on work productivity
and retention. Future assessments will also incorporate psychometric testing and improved
Likert scales. Another significant limitation was the multiple confounding factors
influencing the work environment of students, thus altering their familiarity with
software and technical processes regardless of what was going on in the course. Despite
these limitations, the course was effective in achieving the goals of fostering the
development of technical skills and helping to create a community of practice among
BMHI professionals.
Conclusion
The LEADS course was effective at achieving its four goals: improving participant
data analytical skills, knowledge of enterprise applications and data sources, and skill
in data stewardship, and fostering a community of practice, as evidenced by changes
in pre- and postcourse assessments of knowledge, capabilities, practice, and social
networks. The LEADS course provides a template for continuing education of BMHI professionals
to enhance workplace effectiveness and aid in the diffusion of standardized practices
across a medical enterprise.
Clinical Relevance Statement
Health systems can deliberately manage and build analytics capacity through building
communities of practice and the creation of focused learning and growth opportunities.
This study shows that working BMHI professionals can improve technical analytics skills,
knowledge of enterprise applications, institutional policies, and active networking
to foster creation of an informatics community of practice through a targeted continuing
education program. Consistent with adult learning theory, participants are most interested
in learning skills that are pertinent to their everyday work.
Multiple Choice Questions
-
Which of the following practices increased significantly for participants in the LEADS
course?
Correct Answer: The correct answer is option d, data governance. The LEADS course impacted several
practical skills including data stewardship, IRB rules and institutional policy, and
shared data resources such as the data warehouse. The course did not impact the use
of specific technologies by BMHI professional participants, which points toward the
need for more in-depth instruction for those interested in these complex tools.
-
Each of the following is a metric of educational course effectiveness in the Kirkpatrick
Model except:
-
Reaction.
-
Reflection.
-
Learning.
-
Behavior.
Correct Answer: The correct answer is option b, reflection. The four levels of the Kirkpatrick Model
of course assessment are reaction, learning, behavior, and results. The immediate
reaction of participants to course content is related to the amount that participants
learn, which impacts changes in behavior and ultimately alters results or outcomes.
This model was used to evaluate the LEADS course and is helpful because it assesses
each level in turn, allowing for modifications of specific course components in future
course iterations.
-
What is an analytics community of practice?
-
A social group of analysts designed to foster workplace friendships and facilitate
job retention.
-
An online forum for sharing useful code and ideas that is specific to a given workplace.
-
A group whose members develop shared rules and interests through engagement in joint
activities with similar resources.
-
A community developed by enterprise leadership to facilitate promotion of talented
individuals.
Correct Answer: The correct answer is option c. The definition of an analytics community of practice
is a group where members have a shared interest and commitment, engage in joint activities,
help one another, and work together to develop a shared repertoire of resources and
tools for problem solving. The LEADS course was designed to foster this through shared
social time, lectures, and team-based data analytics experience.