By the end of the 20th century, patients’ interest in participating in decisions regarding their own health had increased. This required easier access to information relevant to healthcare. Among the most important types of such information are laboratory test results, as a significant number of clinical decisions are based on them. A study performed by the European Federation of Clinical Chemistry and Laboratory Medicine Working Group on Patient Focused Laboratory Medicine showed that about 30% of patients wanting additional information on laboratory results would refer to the Internet.
On the other hand, the rapid increase in global smartphone use, owing to their portability, facilitates access to health information at any time. The technical capabilities of smartphones have advanced rapidly, enabling many additional functions beyond simple communication. The introduction of smartphones and their software platforms (Apple’s iOS, Google’s Android, Microsoft Windows Phone) was a starting point for designing special-purpose applications, known as mobile applications (apps). Many of these apps are used for delivering health interventions and involve, in different ways, laboratory medicine data.
Assessing the quality of health-related apps, on the other hand, is more challenging. There have been attempts to apply tools developed for evaluating the quality of web sites, as well as methods for systematic review of the literature, to identifying and evaluating health apps for effective health promotion and chronic disease management.
Considering the importance of laboratory medicine data in overall healthcare decision-making, this survey searched the online app stores for Android (Google Play) and iPhone (iTunes), two of the most widely used smartphone platforms, for health apps that manage users’ laboratory medicine data in any way. The aim was to review them and perform a quality evaluation.
Materials and methods
We used the method for analyzing the content of mobile health apps proposed by Grundy et al., who recommended the reporting standards of BinDhim et al. and the use of multidimensional quality assessment tools, such as the Mobile App Rating Scale (MARS). The apps for the analysis were collected throughout the period January 2017–May 2018, and the app search was conducted within Google Play and iTunes in Serbia. The search criteria entered in Google Play and the App Store were “Lab results blood work”, “Lab results”, and “Health apps”. These criteria yielded 1247 potentially relevant apps, 747 for Android and 500 for iPhone, out of which 16 were available for both platforms. Apps were included if they dealt with laboratory medicine data in any way, had an English-language interface, and were available for smartphones. Out of 1247 potentially relevant apps, 95 were included in further analysis (29 for Android, 50 for iPhone, and 16 for both), after the exclusion of “prank apps”, which allegedly measure the blood concentration of glucose or cholesterol through a finger scan, and apps that only mention laboratory data but do not actually operate with them. In this phase, information about the apps was collected from their descriptions in the store. To review every selected app, the apps were downloaded and analyzed based on their content and functioning. During this process we had to further exclude apps that required payment of a fee prior to download, apps that could not be started after the download, and those that required specific login data. Finally, we were able to analyze the content and function of 52 selected apps – 27 for the Android platform, nine for iPhone, and 16 available on both. The flowchart of the selection process is presented in Figure 1.
Review and content analysis of the apps
Descriptive data collected about each app were app name, version, developer, price, average rating, total number of ratings, affiliation, category, and the year of the last update. Technical aspects of the app included whether it allowed sharing on social media, had an app community, allowed password protection, required login, sent reminders, and needed web access to function. For the technical aspects, the total number of aspects present was calculated and used for further analysis. For review and content analysis of the apps we used a multidimensional tool for classifying and rating the quality of mobile health apps – MARS, developed by Stoyanov et al. The app quality criteria are systematized into four categories divided into 19 subcategories representing the 19 individual MARS items: 1. Engagement (entertainment, interest, customization, interactivity, and target group), 2. Functionality (performance, ease of use, navigation, gestural design), 3. Aesthetics (layout, graphics, visual appeal), and 4. Information quality (accuracy of app description, goals, quality and quantity of information, visual information, credibility, evidence base). Each item uses a 5-point scale (1-Inadequate, 2-Poor, 3-Acceptable, 4-Good, 5-Excellent). If an item is not applicable to all apps, there is a “Not applicable” option. MARS is scored by calculating the mean score for each of the five categories. The fifth category describes the app’s subjective quality, scored as the mean of the raters’ personal opinions on whether they would recommend the app, how many times they would use it, whether they would pay for it, and their overall star rating of the app. The MARS scoring was conducted independently by two co-authors (SJ and JS), both laboratory medicine specialists. The internal consistency of MARS ratings was estimated by calculating Cronbach’s alpha.
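The internal-consistency estimate can be sketched with the standard Cronbach’s alpha formula; a minimal illustration (the score matrices below are hypothetical, not the study’s data):

```python
import numpy as np

def cronbach_alpha(scores):
    """Cronbach's alpha for an (n_apps, n_items) matrix of item scores.

    alpha = k/(k-1) * (1 - sum of item variances / variance of total score)
    """
    scores = np.asarray(scores, dtype=float)
    k = scores.shape[1]                          # number of items (e.g. MARS items)
    item_vars = scores.var(axis=0, ddof=1)       # sample variance of each item
    total_var = scores.sum(axis=1).var(ddof=1)   # variance of the summed score
    return k / (k - 1) * (1 - item_vars.sum() / total_var)

# Hypothetical example: two perfectly consistent items give alpha = 1.0
print(cronbach_alpha([[1, 1], [2, 2], [3, 3]]))  # -> 1.0
```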
Interrater reliability was determined by calculating the intraclass correlation coefficient (ICC). Values of ICC<0.5 indicate poor interrater reliability, values between 0.5 and 0.75 are considered moderate, 0.75–0.9 good, and >0.9 indicates excellent agreement in MARS scoring. ICCs were calculated with a two-way mixed-effects, average-measures model with absolute agreement.
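The two-way, average-measures, absolute-agreement model corresponds to ICC(A,k) in the McGraw–Wong notation, which can be computed directly from the two-way ANOVA mean squares. A minimal numpy sketch (the rating matrix is hypothetical):

```python
import numpy as np

def icc_a_k(ratings):
    """Average-measures ICC with absolute agreement, ICC(A,k),
    from the two-way ANOVA mean squares of an (n_apps, k_raters) matrix."""
    y = np.asarray(ratings, dtype=float)
    n, k = y.shape
    grand = y.mean()
    ss_rows = k * ((y.mean(axis=1) - grand) ** 2).sum()   # between apps
    ss_cols = n * ((y.mean(axis=0) - grand) ** 2).sum()   # between raters
    ss_err = ((y - grand) ** 2).sum() - ss_rows - ss_cols
    msr = ss_rows / (n - 1)
    msc = ss_cols / (k - 1)
    mse = ss_err / ((n - 1) * (k - 1))
    return (msr - mse) / (msr + (msc - mse) / n)

# A constant offset between raters lowers absolute agreement:
print(icc_a_k([[1, 2], [2, 3], [3, 4]]))  # -> 0.8
```

Under a consistency (rather than absolute-agreement) definition, the rater offset in the example would be ignored; the absolute-agreement form penalizes it via the MSC term.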
Calculated scores were then used for appropriate statistical analysis. We used Spearman’s correlation analysis to analyze the relationships among users’ ratings found in the app stores, quality elements defined by the MARS score, technical aspects of the app, and the affiliation of the developer. Kruskal-Wallis analysis was also performed to test for differences in the number of technical aspects present, the MARS score items, and users’ ratings across the identified categories of apps using laboratory medicine data. Statistical analysis was carried out using IBM® SPSS® Statistics (version 20) and MedCalc Software (version 18).
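Both analyses map directly onto SciPy routines; a sketch using hypothetical scores (all variable names and values here are illustrative, not the study’s data):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

# Hypothetical scores for 52 apps
mars_total = rng.uniform(2.0, 5.0, size=52)    # MARS total scores
store_rating = rng.uniform(1.0, 5.0, size=52)  # users' store ratings
category = rng.integers(1, 6, size=52)         # five app categories

# Spearman's rank correlation between MARS totals and store ratings
rho, p_rho = stats.spearmanr(mars_total, store_rating)

# Kruskal-Wallis test of MARS totals across the app categories
groups = [mars_total[category == c] for c in np.unique(category)]
h_stat, p_kw = stats.kruskal(*groups)

print(f"Spearman rho={rho:.2f} (p={p_rho:.3f}); "
      f"Kruskal-Wallis H={h_stat:.2f} (p={p_kw:.3f})")
```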
All of the 95 eligible apps could be categorized into one of the seven categories according to their purpose described in the app store. These categories were:
apps that offer medical advice about symptoms and health queries with the possibility to upload laboratory test results, which can be seen, stored and shared (9/95, 9.5%),
reference ranges of selected analysis with basic information about the causes of increase or decrease designed for patients (15/95, 15.8%),
quick reference for laboratory tests for medical students and doctors (30/95, 31.6%),
apps for monitoring the state of user’s health through a wide range of health parameters, including glucose and/or cholesterol as laboratory data (19/95, 20.0%),
apps that provide access to patients’ laboratory results to physicians (11/95, 11.6%),
apps that enable patients to access their laboratory test results directly from the diagnostic center (4/95, 4.2%), and
electronic health records apps that include laboratory test results (7/95, 7.4%).
However, all of the apps classified into the categories providing physicians access to patients’ results, and the electronic health record apps, allowed access after download only with specific login data, and therefore could not be fully evaluated. Thus, instead of seven, we could classify the analyzed apps into only five categories. Also, from the category of apps that allowed patients access to their laboratory test results directly from the providing laboratory, only one app could be evaluated following the download. All of the included apps (n=52) are listed in Table 1.
Summary characteristics of the selected apps
The descriptive data for the analyzed apps are presented in Table 2.
Most of the apps had none of the technical aspects examined (29/52); one aspect was next most frequent (10/52), then two (6/52) and three (3/52), while four and five aspects were each present in only two apps (2/52). None of the examined apps had all six technical features. Among the fewer than half of the examined apps that did have one or more of the features of interest, most enabled data sharing, and the fewest had an app community. MARS ratings performed by the two independent raters demonstrated high internal consistency (Cronbach’s alpha for individual MARS categories ranged from 0.5 to 0.88, and for the total MARS score it was 0.93). Interrater reliability was moderate to good for individual MARS categories (ICC between 0.5 and 0.83), as for the total MARS score (ICC 0.86). The median MARS total score was 3.8 out of 5 (IQR 0.8), ranging from 2.2 (Blood Tests Result by King of Story App) to 4.8 (Biolab). The highest-scoring individual MARS item was Functionality, 5.0 (IQR 0.0), and the lowest was Engagement, with a median score of 2.6 (IQR 1.2). Aesthetics and Information quality had similarly high scores of 3.7 (IQR 0.7) and 3.8 (IQR 1.0), respectively. Interestingly, the Subjective quality score was lower than the total MARS score, with a value of 3.2 (IQR 1.0).
Spearman’s correlation analysis of the relationships among users’ ratings found in the app stores, quality elements defined by the MARS score, technical aspects of the app, and the affiliation of the developer is presented in Table 3. All MARS individual scoring items were significantly and positively associated with each other, as well as with the Subjective quality and the number of technical aspects (except Functionality and the number of technical aspects, which showed no significant correlation). The total MARS score and individual items were also significantly correlated with the affiliation of the app developers, as expected. However, there were no significant correlations between the ratings from the app stores and the MARS scores. Users’ ratings correlated significantly only with the corresponding number of ratings in each app store.
According to Kruskal-Wallis analysis (Table 4), there were significant differences in the number of technical aspects employed, the MARS Engagement and Information quality score items, the MARS total score, and the MARS subjective quality. The lowest values for all of these items were in category 2, i.e. the category of apps providing reference ranges of selected analyses, with basic information about the causes of increase or decrease, designed for patients. Moreover, the post hoc test showed that the difference was statistically significant between the values in category 2 and those in all other categories of apps. Affiliation of app providers was not associated with these differences, as the χ2 test of independence showed independent distribution of affiliation categories and app classification, with p=0.3414 (data not shown).
To the best of our knowledge, an analysis of the number and quality of available smartphone apps using laboratory medicine data in any way has not yet been performed. These mobile apps are not subject to regulatory oversight, as they are not considered medical devices. Considering the widespread use of smartphones and the availability of mobile apps, an increasing impact of their use, and of the information they provide, on overall health-related behavior is expected. Therefore, we wanted to evaluate the quality and reliability of information regarding laboratory medicine data in mobile apps present on the app market.
As seen from the presented data, mobile apps on the Android and iOS market that involve laboratory medicine data in any way represent only a small percentage. This number is much lower than, for example, that for apps for the prevention, detection, and management of cancer, where out of a total of 1314 applications, 295 met the selection criteria. However, the apps dealing exclusively with laboratory medicine data and their interpretation make up half of the total number (categories 2 and 3), and only half of these are intended for patient use. When searching PubMed, we found no articles that evaluated any applications dealing with laboratory medicine data, which is why we have no data on their actual use and effect on patients’ habits and overall health.
Although several possible criteria for evaluating mobile app content can be found throughout the literature, we used a multidimensional tool for classifying and rating the quality of mobile health apps, the MARS app quality criteria. The MARS rating was validated on weight management and mental health apps, and was recommended by Grundy et al. as an innovative method in the study of commercially available health-related apps. The median MARS total score was well above the scale midpoint, with a value of 3.8 out of 5. The highest-scoring individual MARS item was Functionality, 5.0, and the lowest was Engagement, with a median score of 2.6. The analysis of MARS score values across the different categories of applications using laboratory medicine data revealed the poorest performance and quality among the apps intended for patients. This, together with the significant issues concerning the security of personal information used by the apps, and the questionable affiliation of developers, which was either commercial or unknown, without referencing the source of the information cited in the app, reveals the poor quality of apps intended to inform patients who want to know more about their laboratory test results. Furthermore, there were no significant correlations between the ratings from the app stores and the MARS scores. Users’ ratings from the app stores could be misleading, as users are unlikely to have enough knowledge of health care, and ratings often include those made by services hired by publishers to provide positive ratings and reviews for their apps.
A limitation of this analysis is that we did not evaluate the impact these apps have on their intended users, i.e. patients and medical professionals. It is important to identify whether the use of these apps leads to a change in behavior – increased health awareness, understanding of laboratory test results, and their impact on personal health. Also, we used the MARS rating for app quality assessment, a general tool not yet evaluated on laboratory medicine apps. As app developers can restrict distribution of apps to specific countries, their availability for download may depend on the location of the user. Therefore, the fact that the app search was conducted within Google Play and iTunes in Serbia may be considered a limitation of this study. Further work in this direction would provide a definition of quality benchmarks for laboratory medicine apps, and possibly a modification of the MARS items for their adequate evaluation.
In conclusion, mobile apps available on the app market that deal exclusively with the interpretation of laboratory test results represent half of the apps selected with the defined search criteria. Apps designed for patients, a vulnerable group, are of the poorest quality, considering the overall quality of the content and information they provide, as estimated using the MARS tool. This estimation needs to be validated for laboratory medicine apps, and eventually modified after consideration of specific quality benchmarks.
Hallworth MJ, Epner PL, Ebert C, Fantz CR, Faye SA, Higgins TN, et al. on behalf of the IFCC Task Force on the Impact of Laboratory Medicine on Clinical Management and Outcomes. Current evidence and future perspectives on the effective practice of patient-centered laboratory medicine. Clin Chem 2015;61:589–99.
Watson ID, Oosterhuis WP, Jorgensen PE, Dikmen ZG, Siodmiak J, Jovicic S, et al. A survey of patients’ views from eight European countries of interpretive support from specialists in laboratory medicine. Clin Chem Lab Med 2017;55:1496–500.
Gagliardi A, Jadad AR. Examination of instruments used to rate quality of health information on the internet: Chronicle of a voyage with an unclear destination. Br Med J 2002;324:569–73.
US National Library of Medicine. Evaluating Internet Health Information: A tutorial from the National Library of Medicine. 2016. http://medlineplus.gov/webeval/webeval.html. Accessed 3 Jun 2018.
Xie B, Su Z, Zhang W, Cai R. Chinese cardiovascular disease mobile apps’ information types, information quality, and interactive functions for self-management: systematic review. JMIR Mhealth Uhealth 2017;5:e195.
Grundy QH, Wang Z, Bero LA. Challenges in assessing mobile health app quality. A systematic review of prevalent and innovative methods. Am J Prev Med 2016;51:1051–9.
Stoyanov SR, Hides L, Kavanagh DJ, Zelenko O, Tjodronegoro D, Mani M. Mobile App Rating Scale: a new tool for assessing the quality of health mobile apps. JMIR Mhealth Uhealth 2015;3:e27.
US Food and Drug Administration (FDA). Mobile medical applications: Guidance for industry and Food and Drug Administration staff. Silver Spring, MD: US FDA; 2015.
Bender JL, Yue RY, To MJ, Deacken L, Jadad AR. A lot of action, but not in the right direction: systematic review and content analysis of smart phone applications for the prevention, detection, and management of cancer. J Med Internet Res 2013;15:e287.
Bardus M, van Beurden SB, Smith JR, Abraham C. A review and content analysis of engagement, functionality, aesthetics, information quality, and change techniques in the most popular commercial apps for weight management. Int J Behav Nutr Phys Act 2016;13:35.
App review services. Available at: www.appreviewservice.com/. Accessed 2 Jun 2018.
Lecture given by Dr. Snežana Jovičić at the 2nd EFLM Strategic Conference, 18–19 June 2018 in Mannheim (Germany) (https://elearning.eflm.eu/course/view.php?id=38).
About the article
Published Online: 2018-11-29
Published in Print: 2019-02-25
Author contributions: All the authors have accepted responsibility for the entire content of this submitted manuscript and approved submission.
Research funding: None declared.
Employment or leadership: None declared.
Honorarium: None declared.
Competing interests: The funding organization(s) played no role in the study design; in the collection, analysis, and interpretation of data; in the writing of the report; or in the decision to submit the report for publication.