Appl Clin Inform 2021; 12(05): 1014-1020
DOI: 10.1055/s-0041-1736628
Research Article

Heuristic Evaluation of a Top-Rated Diabetes Self-Management App

Linda Harrington
1  Harris College of Nursing & Health Sciences, Texas Christian University, Fort Worth, Texas, United States
Cheryl Parker
2  College of Nursing and Health Sciences, The University of Texas at Tyler, Texas, United States
Kathleen Ulanday
3  Texas Children's Hospital, Houston, Texas, United States
Craig Harrington
4  Independent Informatics Consultant, Chapel Hill, North Carolina, United States
 

Abstract

Objective The purpose of this study was to evaluate the usability of a top-rated diabetes app. Such apps are intended to markedly support the achievement of optimal health and financial outcomes by providing patients with substantive and continual support for self-management of their disease between periodic clinician visits. Poor usability can deter use, which is especially concerning in patients with diabetes because of the prevalence of the disease and the impact of self-management on long-term prognosis.

Methods A diabetes app was selected due to the prevalence and seriousness of the disease. A heuristic evaluation was then performed to collect and analyze data on the usability of the app based on Nielsen's heuristics. Pareto analysis was used to illustrate the contribution of each type of heuristic violation, augmented by a stacked bar chart illuminating associated severity.

Results There were 51 heuristic violations on the opening screen, violating 6 of Nielsen's 10 heuristics. Pareto analysis revealed 29 (57%) of the heuristic violations involved a match between system and real world and 8 (16%) aesthetic and minimalist design. Severity ratings ranged from 1.0 to 4.0 (mean: 3.01) with 80% comprising a major usability problem and 6% a usability catastrophe.

Conclusion Studies show that people with diabetes are more likely to benefit from diabetes apps that are easy to use. The number and severity of heuristic violations in this study suggest that the commercialization of mobile health apps may be a factor in bypassing experts in clinical informatics during the design phase of development. Usability and the associated benefits of mobile health apps can be enhanced by debugging the user interface of identified heuristic violations during design. Waiting to correct ongoing usability issues while apps are in production can result in patients disengaging from digital health tools, engendering poorer outcomes.



Background and Significance

Diabetes is a serious chronic disease with diffuse complications and increased risk of premature death.[1] Estimates of the prevalence of diabetes show an increase of 88% during the past 13 years, from 246 million people in 2006 to 463 million in 2019.[1] An estimated 4.2 million adults ranging in age from 20 to 70 years died from diabetes in 2019, accounting for 11.3% of deaths from all causes.[1] Glycemic management not at personal goals is the greatest determinant of complications, underlining the importance of effective tools to help patients manage diabetes.[2]

Effective self-management tools are especially important in diabetes as the long-term prognosis is highly dependent on self-care behaviors employed to manage this complex chronic condition.[3] Mobile health apps are one of the digital technologies gaining momentum in fulfilling the need for self-management tools in this patient population.[4] These apps have the potential to simplify daily living by monitoring and providing feedback on glucose and lifestyle data 24/7.[3] Diabetes apps afford significant promise as 4.88 billion people, comprising 62.07% of the world's population, own a smartphone.[5] Unfortunately, not every diabetes app is helpful.[4]

A key barrier preventing the full potential of mobile health apps from improving people's lives with diabetes is usability.[4] Usability refers to “the extent to which specified users can use a product to achieve specified goals with effectiveness, efficiency, and satisfaction in a specified context of use.”[6] Major usability problems have serious potential for confusing mobile health app users or causing them to make errors when using the system, while minor issues may slow down the interaction or inconvenience users unnecessarily.[7]

Studies show users want mobile health apps that are easy to use, reduce self-management burden, and are well suited for the intended purpose.[8] [9] These findings are supported by studies illustrating that users are more likely to remain engaged with diabetes apps if they are simple to use and require limited time to use.[10] The significance of usability and engagement goes beyond user preference to include the impact of engagement on patient outcomes. A systematic review and meta-analysis of patients with noncommunicable diseases, in which seven of the nine included studies involved patients with diabetes only, found that higher levels of patient engagement were associated with better outcomes.[11] HbA1c was lower in short-term use (3–6 months; p = 0.02) with low heterogeneity (I² = 41%) across studies and statistically significant in long-term use (10–12 months; p = 0.009) with no heterogeneity (I² = 0%).[11]

A vital step to improve the usability of a mobile health app to keep people engaged is a heuristic evaluation. The method is used to uncover usability problems in a user interface design by having a few knowledgeable and skilled evaluators examine the interface and determine its compliance with accepted usability principles, known as heuristics.[12] Findings from the heuristic evaluation are used to fix usability problems and thus can be viewed as a method for debugging user interfaces.[7]

Commercializing mobile health apps adds a unique and important aspect to usability in digital health tools used to self-manage disease. First, the direct-to-consumer (DTC) business model often bypasses the involvement of experts in clinical informatics, resulting in apps where consumer ratings do not correlate with clinical utility or usability.[13] Second, mHealth developers often lack the resources to fund premarket prospective studies on usability, which is sometimes compounded by pressure from early investors to demonstrate quick product growth by rapid entry into the marketplace.[14] Third, the issue is further compounded by insufficient regulation of mobile health apps by the U.S. Food and Drug Administration, whose mission is to protect the public but which lacks a policy on the usability of mobile health apps.[15] As a result, unsuspecting consumers download apps directly on the basis of high ratings, even though those apps may have poor clinical utility and usability, with undesired consequences ranging from low value to harm.[16]

The purpose of this study was to describe heuristic violations in a top-rated, DTC mobile health app intended to support self-management of patients with diabetes. The app is multifunctional, focused on the documentation, monitoring, and decision making on blood glucose, HbA1c, medication, diet, activity, and weight. The use of this type of diabetes app is intended to serve as an important tool supporting self-management leading to improved health outcomes and quality of life.[4] The study is important as poor usability can deter use, which is especially concerning due to the prevalence of the disease and the impact of self-management on long-term prognosis.



Methods

This study was a heuristic evaluation whereby four usability experts examined the user interface of a mobile health app to determine compliance with a set of heuristics. Created by Nielsen and Molich in 1990, a heuristic evaluation affords unique insights into the usability of a user interface through the lens of experts.[12] Usability experts combine knowledge of user needs, human–computer interaction, interface design, information architecture, cognitive and perceptual psychology, and more to identify heuristic violations, severity of the violations, missing features, and design strategies that can improve usability.[17]

The research team consisted of four informaticists. Three were informatics nurse specialists and one biomedical informaticist, all with formal graduate education and certifications in informatics. Three to five usability experts are recommended to perform heuristic evaluations.[12] [18] Early work in heuristic evaluations demonstrated that three to five single-domain experts, such as informaticists, can identify 74 to 87% of usability problems.[7] [19] Dual-domain experts, such as informaticists who are clinicians with subject matter expertise, can detect 81 to 90% of usability problems.[7] [19]

In the current study, one researcher was a single-domain expert in informatics, specifically usability testing. Three researchers were dual-domain experts in informatics, one focusing on usability testing, and all three in nursing with experience in the care of people with diabetes. Important to note, experienced evaluators can also sometimes overlook easy-to-identify usability problems in a user interface.[12] In contrast, less experienced evaluators can identify complex usability problems, making the composition of the research team essential to consider.[12]

Nielsen's usability heuristics were selected as the guiding framework for the study.[20] The framework guided researchers in identifying which of 10 heuristics were violated in the design of the user interface. The 10 heuristics are visibility of system status; the match between system and the real world; user control and freedom; consistency and standards; error prevention; recognition rather than recall; flexibility and efficiency of use; aesthetic and minimalist design; help users recognize, diagnose, and recover from errors; and help and documentation.[20]

Setting

The user interface of a top-rated, consumer mobile health app provided the setting. The app is available on iOS and Android platforms and can be downloaded from the Google Play Store and the Apple App Store. It is rated 4.0/5.0 by more than 14K raters in the Google Play Store and 4.8/5.0 by 19K raters in the Apple App Store.

Each researcher accessed the app with a smartphone using the iOS platform. The app's free version was used, which requires data to be input manually in metric or imperial units. A subscription can be purchased to access additional features, such as a customizable display, enhanced reporting functionality, and connectivity to a glucose monitor.

Preparations for the study included recording the version of the app to be tested and obtaining screenshots of the user interface where data were to be collected. This was done to provide a historical record and ensure consistency of the user interface that researchers independently evaluated as mobile health apps can be frequently updated.

An Excel spreadsheet was created to capture data identified as heuristic violations in the user interface and included the following five column headings: Location of Problem, Problem Description, Heuristic(s) Violated, Severity Score, and Suggested Solutions. Location of the usability problem is important for two reasons. It ensures researchers are making comparisons on the same feature when later identifying, discussing, and agreeing on which heuristic was violated as well as the severity of the problem. Location is also helpful for programmers enabling them to readily locate where the heuristic violation is to fix it. Problem description provides an explanation of each problem. The column for heuristic violated allows independent researchers to identify the specific type of heuristic violated which later will be used to identify solutions. Similarly, the column for severity scoring allows researchers to identify and compare the level of urgency needed to fix each heuristic violation. Lastly, suggested solutions allow usability experts to share their expertise in recommending improvement strategies specific to each usability problem identified.
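The five-column layout described above can be pictured as a simple record type. The following is a minimal sketch only: the class name `Finding`, its field names, the example location, and the example solution are hypothetical and do not come from the study's spreadsheet; only the column headings and the quoted problem description are taken from the article.

```python
import csv
import io
from dataclasses import dataclass, field
from typing import List, Optional

# Column headings as listed in the text.
COLUMNS = [
    "Location of Problem",
    "Problem Description",
    "Heuristic(s) Violated",
    "Severity Score",
    "Suggested Solutions",
]


@dataclass
class Finding:
    """One spreadsheet row (hypothetical field names)."""
    location: str
    description: str
    heuristics: List[str] = field(default_factory=list)  # filled in Step 2
    severity: Optional[float] = None                     # filled in Step 3
    solutions: str = ""                                  # filled in Step 4

    def to_row(self) -> List[str]:
        return [
            self.location,
            self.description,
            "; ".join(self.heuristics),
            "" if self.severity is None else f"{self.severity:g}",
            self.solutions,
        ]


def to_csv(findings: List[Finding]) -> str:
    """Serialize findings to CSV text with the five headings as the first row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(COLUMNS)
    for finding in findings:
        writer.writerow(finding.to_row())
    return buffer.getvalue()


# Illustrative row; the location and solution wording are assumptions.
example = Finding(
    location="Opening screen, glucose entry field",
    description="Glucose of 99,999 mg/dL accepted without an alert",
    heuristics=["Error prevention"],
    severity=4.0,
    solutions="Warn the user when a value is implausible",
)
csv_text = to_csv([example])
```

Keeping the later columns optional mirrors the stepwise procedure, in which heuristics, severity, and solutions are added in separate passes.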



Procedures

Step 1

Each researcher independently identified the location and described each heuristic violated, documenting their findings in the Excel spreadsheet. The lists generated by the four researchers were then compiled into a single list. The researchers independently examined the compiled list for clarity and to identify redundancies including potential redundancies that may have been expressed differently. The compiled list was then distributed among the researchers for discussion and final approval of the complete list of usability problem locations and associated descriptions.
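In the study, redundancies were found by the researchers reading the compiled list. As an illustration of how a simple text comparison could assist such a review, the sketch below flags pairs of descriptions that look alike. It is an assumption-laden aid, not the authors' method: the function name, the example descriptions, and the 0.8 similarity threshold are all hypothetical, and a human reviewer would still decide what counts as a duplicate.

```python
from difflib import SequenceMatcher
from itertools import combinations


def flag_possible_duplicates(descriptions, threshold=0.8):
    """Return (i, j, similarity) for description pairs that look alike.

    The threshold is an arbitrary illustrative choice, not a value
    taken from the study.
    """
    flagged = []
    for (i, a), (j, b) in combinations(enumerate(descriptions), 2):
        similarity = SequenceMatcher(None, a.lower(), b.lower()).ratio()
        if similarity >= threshold:
            flagged.append((i, j, similarity))
    return flagged


# Hypothetical entries from different evaluators.
compiled = [
    "Insulin icon is a pill",
    "insulin icon is a pill",
    "App crashes when Insights is tapped",
]
possible_duplicates = flag_possible_duplicates(compiled)
```

Character-level similarity catches only near-identical wording; redundancies "expressed differently" in meaning would still need the discussion step described above.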



Step 2

The compiled list of usability problem locations and descriptions was then independently evaluated by each of the four researchers to identify and record the Nielsen heuristic violated. The four lists were then compiled into a single list. The researchers then independently examined the compiled list for any discrepancies. Any discrepancies were discussed, and a consensus was used to create a final list of heuristics violated.
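Finding the items that need discussion amounts to checking where the four independent assignments are not unanimous. A minimal sketch, assuming each evaluator assigns one heuristic per item; the item identifiers, rater labels, and assignments below are hypothetical.

```python
from collections import Counter


def find_discrepancies(assignments):
    """Return items whose raters did not all choose the same heuristic.

    assignments maps item id -> {rater: heuristic}. The result maps each
    non-unanimous item id to a tally of the heuristics chosen.
    """
    flagged = {}
    for item_id, by_rater in assignments.items():
        tally = Counter(by_rater.values())
        if len(tally) > 1:
            flagged[item_id] = dict(tally)
    return flagged


# Hypothetical assignments from four raters (R1 to R4).
assignments = {
    "item-01": {
        "R1": "Error prevention",
        "R2": "Error prevention",
        "R3": "Error prevention",
        "R4": "Error prevention",
    },
    "item-02": {
        "R1": "Consistency and standards",
        "R2": "Recognition rather than recall",
        "R3": "Consistency and standards",
        "R4": "Consistency and standards",
    },
}
to_discuss = find_discrepancies(assignments)
```

The tally only identifies where views differ; in the study the final assignment came from discussion and consensus, not from a majority count.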



Step 3

The compiled list of heuristic violations with descriptions and locations in the Excel spreadsheet was then independently evaluated by each of the four researchers for severity. Severity is determined by the frequency with which the problem occurs, the impact if the problem occurs, and how persistent the problem is. It was measured on a scale of 0 to 4: 0 = “I don't agree that this is a usability problem at all,” 1 = “Cosmetic problem only: need not be fixed unless extra time is available on project,” 2 = “Minor usability problem: fixing this should be given low priority,” 3 = “Major usability problem: important to fix, so should be given high priority,” and 4 = “Usability catastrophe: imperative to fix this before product can be released.”[21] Severity scores from each researcher for each heuristic violation were compiled.
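The five-point scale lends itself to an enumeration. The sketch below is illustrative: the member names are shorthand for the scale labels, and the `classify` helper, which rounds a mean score to the nearest level with halves rounded up, is an assumption, since the article does not state how mean scores were assigned to categories.

```python
from enum import IntEnum


class Severity(IntEnum):
    """Severity levels 0 to 4 (member names are shorthand labels)."""
    NOT_A_PROBLEM = 0
    COSMETIC = 1
    MINOR = 2
    MAJOR = 3
    CATASTROPHE = 4


def classify(mean_score: float) -> Severity:
    """Map a mean score to the nearest level (assumed rule: halves round up)."""
    if not 0 <= mean_score <= 4:
        raise ValueError("severity scores must lie between 0 and 4")
    return Severity(min(4, int(mean_score + 0.5)))


level = classify(3.01)
```

Under this assumed rule, a mean of 3.01 falls in the major category and only scores of 3.5 or above reach the catastrophe category.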



Step 4

Lastly, the researchers independently made suggestions for solutions to fix each usability problem identified, listing them on the Excel spreadsheet. The compiled list was shared for clarity and agreement. A completed Excel spreadsheet was then generated showing the problem location, description, heuristic violated, severity, and suggested solution.



Data

Data from the opening screen of the top-rated diabetes app were used to report findings in this heuristic evaluation. This provides context for readers to appreciate the number and severity of heuristic violations in an individual screen. The opening screen of an app is one that all users must see and navigate through.

Qualitative data were used to identify the location of the problem on the screen, description of the problem, and suggested solutions. These allowed for count or frequency statistics. Quantitative data were used to identify heuristics violated and severity per the severity scale, allowing for mathematical calculations.



Statistical Analysis

The statistical analysis focused on the types of heuristics violated and the associated severity. Pareto analysis was performed to determine the frequency of each type of heuristic violation and the contribution of each type to the total. This type of statistical analysis is widely used to identify critical factors leading to defects in a process.[22] The analysis uses a Pareto chart, a bar chart of frequencies typically sorted left to right with the highest-to-lowest frequency.[22]

A mean severity score for each heuristic violation was then calculated as the sum of the severity scores from all researchers divided by the number of researchers (4). An overall mean severity for the 51 heuristic violations was determined by dividing the sum of these per-violation mean scores by the total number of violations (51). These findings were summarized in a stacked bar chart. These charts are useful when comparing quantities across items, such as a severity score for each type of heuristic violated, as well as the contribution of each type to the total.[23]
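The two averaging steps can be sketched in a few lines. This assumes one reading of the description above: each violation first receives the mean of its four raters' scores, and the overall mean is the average of those per-violation means. The function names and the ratings shown are hypothetical and are not study data.

```python
def violation_mean(scores):
    """Mean severity for one violation across its raters."""
    return sum(scores) / len(scores)


def overall_mean(all_scores):
    """Average of the per-violation means across all violations."""
    per_violation = [violation_mean(scores) for scores in all_scores]
    return sum(per_violation) / len(per_violation)


# Hypothetical ratings: one inner list per violation, one score per rater.
ratings = [
    [4, 4, 4, 4],
    [3, 3, 2, 4],
    [1, 2, 1, 2],
]
per_violation_means = [violation_mean(r) for r in ratings]
grand_mean = overall_mean(ratings)
```

Because every violation here has the same number of raters, the average of per-violation means equals the mean of all individual scores; the two would differ only if some ratings were missing.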



Results

There were 51 heuristic violations identified on the opening screen of the top-rated diabetes app examined, as illustrated in the Pareto chart ([Fig. 1]). The horizontal axis represents the six different types of heuristics where violations were identified out of the 10 Nielsen heuristics. These were: error prevention, user control and freedom, recognition rather than recall, consistency and standards, aesthetic and minimalist design, and match between system and real world. The left vertical axis on the Pareto chart represents the counts of heuristic violations for each type. The right vertical axis represents the cumulative counts expressed in percentages of the total count of heuristic violations.

Fig. 1 Pareto analysis of heuristic violations.

The Pareto analysis revealed 29 (57%) of the heuristic violations involved a problem with the match between system and the real world, 8 (16%) aesthetic and minimalist design, 6 (12%) consistency and standards, 4 (8%) recognition rather than recall, 3 (6%) user control and freedom, and 1 (2%) error prevention ([Fig. 1]).
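The share and cumulative share behind a Pareto chart can be recomputed from the counts reported above. The sketch below uses those six counts; the function name `pareto_table` and the output layout are illustrative choices, not part of the study's analysis.

```python
# Violation counts per heuristic type, as reported in the text.
counts = {
    "Match between system and the real world": 29,
    "Aesthetic and minimalist design": 8,
    "Consistency and standards": 6,
    "Recognition rather than recall": 4,
    "User control and freedom": 3,
    "Error prevention": 1,
}


def pareto_table(counts):
    """Return (name, count, percent, cumulative percent), highest count first."""
    total = sum(counts.values())
    rows = []
    running = 0
    for name, n in sorted(counts.items(), key=lambda kv: kv[1], reverse=True):
        running += n
        rows.append((name, n, 100 * n / total, 100 * running / total))
    return rows


rows = pareto_table(counts)
for name, n, pct, cum in rows:
    print(f"{name:<42} {n:>3} {pct:6.1f}% {cum:6.1f}%")
```

The count column corresponds to the left vertical axis of the chart and the cumulative percentage column to the right vertical axis.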

The severity of the 51 heuristic violations ranged from 1.0 to 4.0 with a mean of 3.01/4.0 ([Fig. 2]). Eighty percent comprised a major usability problem. An additional 6% were considered a usability catastrophe.

Fig. 2 Stacked bar chart of heuristic violation severity.

What is underneath the above numbers and why is usability so important in health-related apps? [Table 1] provides insights into the six areas of heuristic violations identified on the opening page of a top-rated diabetes app. This table shows an example and description of error prevention, user control and freedom, recognition rather than recall, consistency and standards, aesthetic and minimalist design, and match between system and the real world. As identified in the table, the consequences of these heuristic violations provide clear evidence of how poor usability can lead users to disengage.

Table 1

Examples of heuristic violations from the study

| Heuristic violated | Description | Examples from study app | Consequences |
| --- | --- | --- | --- |
| Error prevention[a] | Two types of errors: (1) slips, which are unconscious errors caused by inattention, and (2) mistakes, which are conscious errors caused by a mismatch between the user's mental model and the design | Abnormal values, such as glucose of 99,999 mg/dL, are allowed to be entered by users without an alert, flag, or highlight | Can create safety issues such as users overmedicating themselves |
| User control and freedom[b] | Users want to feel in control of apps and that if they make a mistake, they can get out of it | Clicking on “Insights” consistently crashes the app | Users feel confused, trapped, reluctant to explore features, and become dissatisfied with the app |
| Recognition rather than recall[c] | Recognition involves familiarity; recall involves more details from memory | The app heavily uses icons and uses different icons for the same thing. Universal icons are rare, making recognition and meaning of icons difficult.[d] For example, serum glucose is shown as a solid black drop icon or an orange-outline drop icon | Unclear icons create confusion and frustration, impeding users from completing their task |
| Consistency and standards[e] | Colors, symbols, words, content, and layout should be consistent throughout, creating familiarity in users | Serum glucose is associated with solid black drop icon, orange-outline drop icon, word “glucose,” word “blood sugar,” and abbreviation BG | Lack of consistency confuses users and makes apps more difficult to learn |
| Aesthetic and minimalist design[f] | Designs should be aesthetically pleasing with high informational value | Fork and knife icon combined with word “Food” whereby “Food” is sufficient, and icon is irrelevant | An icon can replace words if it adds value. Otherwise, icons become clutter slowing user progress |
| Match between system and the real world[g] | Content should be in user's language and concepts, and navigation should be logical to them | The icon of insulin is a pill when insulin is an injection | Inappropriate icons create confusion leaving users wondering what to do |

a Laubheimer P. Preventing user errors: avoiding unconscious slips. Available at: https://www.nngroup.com/articles/slips/. Accessed April 5, 2021.


b Rosala M. User control and freedom usability heuristic #3. Available at: https://www.nngroup.com/articles/user-control-and-freedom/. Accessed April 5, 2021.


c Budiu R. Memory recognition and recall in user interfaces. Available at: https://www.nngroup.com/articles/recognition-and-recall/. Accessed April 5, 2021.


d Harley A. Icon usability. 2014. Available at: https://www.nngroup.com/articles/icon-usability/. Accessed April 5, 2021.


e Krause R. Maintain consistency and adhere to standards: usability heuristic #4. Available at: https://www.nngroup.com/articles/consistency-and-standards/. Accessed April 5, 2021.


f Fessenden T. Aesthetic and minimalist design: usability heuristic #8. Available at: https://www.nngroup.com/articles/aesthetic-minimalist-design/. Accessed April 5, 2021.


g Kaley A. Match between the system and the real world: the 2nd usability heuristic explained. 2014. Available at: https://www.nngroup.com/articles/match-system-real-world/. Accessed April 2, 2021.




Discussion

The current study highlights many heuristic violations on the opening screen of a top-rated diabetes app. This is the first time a focused heuristic evaluation has been introduced in the literature. The findings are important as the first screen is required for navigation for all users, and poor usability makes it more likely for them to disengage from a digital tool supporting self-management.

Summary of Main Findings

The Pareto analysis helped illustrate that 57% of heuristic violations involved a problem with the match between system and the real world. This allows clinical informaticists to focus improvement efforts on better matching user interfaces with the users' real world, such as using words, phrases, and familiar concepts, thereby capitalizing on their existing knowledge to make the user interface easier to learn.[24]

It is important to note that using a Pareto analysis to identify improvement strategies does require more than looking at percentages and focusing on high contributing types of heuristic violations. One of the potential limitations to using Pareto analysis involves the overshadowing of critical information. This effect can be seen in the current study when the visualization of high contributing types of heuristic violations outshone an important, low contributing type of heuristic violation, error prevention, with the highest possible severity rating ([Figs. 1] and [2]). In fact, there were only three heuristic violations with severity of 4.0/4.0 in the current study, each occurring in the lowest contributing types of heuristic violations identified. This requires a sound strategy for visualizing severity that minimizes the deficit that can be seen with Pareto analyses.

A stacked bar chart was chosen to illustrate severity levels within the different types of heuristic violations ([Fig. 2]). Classical bar charts are useful in visualizing multiple attributes.[25] The length of each bar indicates the sum of the attribute, in this study, the type of heuristic violation, and provides a linkage to the Pareto analysis. The segments within each bar indicate how the attribute of severity level contributes to the total in each heuristic violation. The different colors of bar segments ease attribute recognition and comparison.[25]

As shown in [Fig. 2], error prevention comprised only 1 (2%) of the severity issues, but the issue was rated 4.0/4.0 in severity, which makes it imperative that it be seen and fixed before the app is released. The usability problem involved the ability of users to input abnormal values without an alert, flag, or other notification, thereby preventing a recheck or correction of data input. For example, a glucose value of 99,999 mg/dL or an A1C of 99.1% could be consciously or accidentally input without the user being warned. More important, an unrealistically high value may result in a patient self-administering an overdose of medication. Using the stacked bar chart enabled the user interface's low contributing type of heuristic violation to be more readily apparent as a high-contributing severity ([Fig. 2]). This illustrates the importance of viewing the results through two different lenses.
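The kind of safeguard whose absence is described above can be as simple as a plausibility check at data entry. The sketch below is a hypothetical illustration and not a description of any real app: the function and class names are invented, and the range bounds are placeholder assumptions chosen only so that the two example values from the text are rejected. They are not clinical guidance, and real limits would need to be set with clinical input.

```python
from dataclasses import dataclass

# Placeholder plausibility bounds (assumptions for illustration only).
GLUCOSE_MG_DL_RANGE = (10.0, 1000.0)
A1C_PERCENT_RANGE = (3.0, 20.0)


@dataclass
class CheckResult:
    ok: bool
    message: str = ""


def check_entry(value, bounds, label):
    """Flag a value outside the plausibility bounds so the user can recheck it."""
    low, high = bounds
    if not (low <= value <= high):
        return CheckResult(
            False,
            f"{label} {value:g} is outside the expected range "
            f"{low:g} to {high:g}. Please recheck the entry.",
        )
    return CheckResult(True)


# The two implausible entries mentioned in the text.
glucose_check = check_entry(99999, GLUCOSE_MG_DL_RANGE, "Glucose (mg/dL)")
a1c_check = check_entry(99.1, A1C_PERCENT_RANGE, "A1C (%)")
```

Returning a message rather than silently rejecting the value follows the intent of the heuristic: the user is alerted and given the chance to recheck or correct the entry.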



Comparison to Existing Literature

The current study was consistent with previous findings that conclude incorporating heuristic design principles can improve diabetes app design. In a study by Fu and colleagues, a total of 314 heuristic violations were identified among four top-rated diabetes apps.[26] Heuristic principles violated most often were “Help and Documentation” (n = 50), followed by “Error Prevention” (n = 45) and “Aesthetic and Minimalist Design” (n = 43).[26] Researchers concluded that the four top-rated diabetes apps had “marginally acceptable” to “completely unacceptable” usability with significant opportunities to improve.[26]

Findings from the current study are also consistent with research on the relationship between users' star rating and usability of the mobile health app. AlBesher and Stone tested three mobile apps with low (2.5/5.0), medium (3.5/5.0), and high (4.8/5.0) user ratings using the System Usability Scale.[27] Researchers found that the user star rating was not correlated with usability.[27] The lowest rated app had the highest usability score.[27] Equally important, task completion time and the number of errors committed while completing the task were significantly correlated to usability score.[27]



Limitations

While the study indicated poor usability of a top-rated diabetes app, the impact on use or self-management of diabetes cannot be determined from a heuristic evaluation. The findings do suggest a lack of input on usability by experts knowledgeable in clinical informatics. Longer-term studies are needed to provide insights into the impact of usability experts as well as poor usability on engagement with the app and outcomes related to self-management using the app.

Another limitation concerns missing interface elements. An early finding by Nielsen referenced the difficulty in identifying missing elements, which he defined simply as something that should be included in a user interface but is not.[7] Nielsen's findings suggested that heuristic violations related to missing elements were harder to identify in paper prototypes than in running computer-based systems, postulating at the time that in a running system missing elements would cause evaluators to get stuck and prevent them from moving forward.[7] No missing interface elements were identified in the current study. Deliberate efforts to identify missing elements in user interfaces of apps should be considered by researchers.



Suggestions for Future Work

Long-term or impact studies of diabetes mobile health apps are needed to understand the impact on self-management, quality of life, length of engagement, and clinical utility. It is crucial to simultaneously study both sides of the user interface equation. Studies focused solely on user characteristics independent of the user interface and vice versa prohibit the identification of comprehensive and thereby effective solutions. Findings from these studies should be used to inform policy for improved oversight of these critical digital tools.

Additional knowledge can be gained using other usability evaluation methods such as cognitive walkthroughs. These provide unique insights from the perspective of app users. Studies focused on specific populations of people based on age, ethnicity, and disabilities, as well as factors such as health and digital literacy, among others, may also prove beneficial to the successful use of health-related apps.



Conclusion

The number and severity of heuristic violations in the small opening screen of this top-rated mobile health app support the need for significant improvement in usability and involvement of clinical informaticists in this growing area of health care delivery. The heuristic evaluation used was important in uncovering where usability problems are located, descriptions of each usability issue, types of heuristics violated and associated severity, as well as solutions for improvement. Findings from the study suggest that other factors beyond usability may be impeding success with mobile health apps including commercialization which allows for bypassing experts in clinical informatics during the design and testing phases of development. The DTC pathway for these apps increases clinical risks, requiring further exploration and action.[28] The lack of regulatory oversight further accentuates the issue by removing normal safeguards and proliferating poor usability.[13] Technology and data can create essential and highly beneficial tools in the self-management of chronic diseases going forward. Success will require a comprehensive strategy for sound mobile health app design and testing, better pathways for commercialization, and regulatory oversight to ensure safe, efficient, and effective tools.



Clinical Relevance Statement

Advances in technology enable the delivery of health care beyond the traditional bricks and mortar, resulting in better access to digital tools for disease self-management. This study presents a cautionary example about designing these essential tools by illustrating significant usability issues in a top-rated mobile health app. The seriousness of the consequences of poor usability highlights the importance of the involvement of clinical informatics experts in the development and design of mobile health apps.



Multiple Choice Questions

  1. Which of the following is a consequence of poor usability in mobile health apps?

    a. Improved self-management.

    b. Patient disengagement.

    c. Ease of use.

    d. Reduced errors.

    Correct Answer: The correct answer is option b. Improved self-management of chronic illnesses, such as diabetes, is enhanced by good usability in mobile health apps. This is demonstrated by previous studies. Research has also shown that mobile health app users desire apps that are easy to use and prevent them from making errors when managing their disease. Poor usability causes patients to become discouraged and disengaged from using apps to support their self-management.

  2. Which of the following plays a key role in enhancing the usability of mobile health apps?

    a. Food and Drug Administration.

    b. User ratings.

    c. Clinical informaticists.

    d. Direct-to-consumer marketing.

    Correct Answer: The correct answer is option c. The Food and Drug Administration does not currently oversee mobile health apps. Research has shown, and this study corroborates, that user ratings are not correlated with enhanced usability. Direct-to-consumer marketing circumvents the enhancement of usability bypassing the expertise of clinical informaticists that can improve mobile health app usability.



Conflict of Interest

None declared.

Protection of Human and Animal Subjects

The study was performed in compliance with the World Medical Association Declaration of Helsinki on Ethical Principles for Medical Research Involving Human Subjects, and was deemed exempt by Texas Christian University, Institutional Review Board Chair.



Address for correspondence

Linda Harrington, PhD, DNP, FAMIA
Harris College of Nursing & Health Sciences, Texas Christian University
Fort Worth, Texas 76109
United States   
Email: [email protected]   

Publication History

Received: 21 June 2021

Accepted: 13 September 2021

Publication Date:
03 November 2021 (online)

© 2021. Thieme. All rights reserved.

Georg Thieme Verlag KG
Rüdigerstraße 14, 70469 Stuttgart, Germany

