Meaningful use of COMSAE Phase 1 in preparation for COMLEX-USA Level 1.

CONTEXT
The Comprehensive Osteopathic Medical Licensing Examination of the United States (COMLEX-USA) is a three level national standardized licensure examination for the practice of osteopathic medicine. The Comprehensive Medical Self Assessment Examination (COMSAE) is a three phase self assessment tool designed to gauge the base knowledge and ability of candidates preparing for COMLEX-USA.


OBJECTIVES
To investigate how COMSAE Phase 1 (Phase 1) was used by candidates and how completing Phase 1 impacted their performance on the COMLEX-USA Level 1 (Level 1) examination.


METHODS
Using data from the 2018-2019 administration of Level 1 and Phase 1 examinations, we counted the frequency of the unique Phase 1 forms taken by the candidates and calculated the correlation between the candidates' first attempt Phase 1 scores and the number of Phase 1 forms taken. We then calculated the correlation between the Level 1 scores and the Phase 1 scores. Next, we applied a multilevel regression model to examine the candidates' score improvement on the multiple Phase 1 forms taken. Finally, we investigated the effect of practicing through Phase 1 on the candidates' Level 1 performance using logistic regression models.


RESULTS
The majority of candidates took one (2,414; 33.9%) to two (2,196; 30.8%) timed Phase 1 forms prior to the Level 1 examination. There was a significant negative correlation (r=-0.48, t (6,505)=-44.05, p<0.001) between the candidates' first attempt Phase 1 scores and the number of Phase 1 forms taken. There was a strong and positive correlation (r=0.66 to 0.74, p<0.001) between Phase 1 and Level 1 scores. With other variables controlled, on average, candidates' Phase 1 scores increased 23.2 points on one attempt from the previous attempt. Having the most recent Phase 1 score controlled, a greater number of Phase 1 forms taken was associated with an improvement on the Level 1 performance.


CONCLUSIONS
The significant correlation between Phase 1 and Level 1 performance provided validity evidence for Phase 1. Moreover, our results suggested that candidates, especially those with lower performance on their initial Phase 1 attempt, might improve their Level 1 performance by taking multiple Phase 1 forms to monitor their academic improvement and gauge their readiness for Level 1.

The Comprehensive Osteopathic Medical Licensing Examination of the United States (COMLEX-USA) is a three level national, standardized, high stakes licensure examination for the practice of osteopathic medicine. COMLEX-USA Level 1 (Level 1) assesses the knowledge of foundational biomedical sciences and other medical knowledge relevant to solving clinical problems and promoting and maintaining health in providing osteopathic medical care to patients [1]. It is administered in a timed setting and typically taken by the candidates of accredited osteopathic medical schools at the end of their second year. There are multiple test forms administered every test cycle. All forms are comparable in content and statistical specifications.
The Comprehensive Medical Self Assessment Examination (COMSAE) is a three phase self assessment designed to gauge the base knowledge and ability of candidates preparing for COMLEX-USA. Each COMSAE examination is presented in a format and structure similar to the corresponding COMLEX-USA examination. COMSAE Phase 1 (Phase 1) is built from the same blueprint as Level 1 and is used as a self assessment tool to measure candidates' readiness to pass or achieve a certain score on Level 1 [2]. It has multiple test forms available for candidates or schools to purchase. Candidates or schools may choose to take the same test form multiple times, or they can order different forms. Schools can purchase any available forms, whereas candidates can purchase a limited number of forms.
Since COMSAE is designed as a self assessment tool for COMLEX-USA and the Phase 1 examination was ranked by 113 students among the most helpful preparation tools for Level 1 [3], we studied how Phase 1 was used by the candidates and how taking Phase 1 impacted their performance on the Level 1 examination. We hope the findings from this study can provide guidance to candidates regarding how their practice through Phase 1 can assist with their Level 1 preparation. In this study, we intended to answer the following questions: How many Phase 1 forms did candidates take prior to Level 1? What was the correlation between Level 1 and Phase 1 scores? Did Phase 1 performance improve as more forms were taken? Did the Phase 1 test taking experience help with the preparation for Level 1?

Methods
The study was reviewed and deemed exempt by the institutional review board of the National Board of Osteopathic Medical Examiners in July 2020.

Data
The Phase 1 data for this study were collected from November 2017 to April 2019, the Phase 1 test window for students who planned to take the 2018-19 cycle of the Level 1 examination. We prepared the data by removing "outlier" scores from students who answered fewer than 90% of the questions and by excluding tests completed under the untimed setting. Since candidates could repeat the same test form multiple times, we used their responses on the initial attempt for each unique form. In keeping with the purpose of the study, only candidates who took Phase 1 prior to Level 1 were included.
All data analyses were conducted using R version 3.6.3 [4].

Number of Phase 1 forms taken
We counted the frequency of the unique Phase 1 forms taken by the candidates. We also calculated the correlation between the candidates' first attempt Phase 1 scores and the numbers of Phase 1 forms taken.

Correlation between Level 1 and Phase 1 scores
We then calculated the correlation between Level 1 scores and Phase 1 first attempt scores, last attempt scores, and mean scores across all forms.
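The correlations in this study are reported together with t statistics, which follow directly from r and the degrees of freedom (n - 2). As a minimal sketch in plain Python, not the authors' R code, the helpers below compute Pearson's r and the corresponding t value; the function names and the toy scores are illustrative assumptions.

```python
import math

def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need two sequences of equal length >= 3")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def t_from_r(r, df):
    """t statistic for testing r = 0, with df = n - 2 degrees of freedom."""
    return r * math.sqrt(df / (1.0 - r * r))

# Toy scores (hypothetical, not study data)
phase1 = [420, 480, 510, 560, 600]
level1 = [430, 470, 540, 550, 640]
r = pearson_r(phase1, level1)
t = t_from_r(r, len(phase1) - 2)
```

Calling t_from_r(0.69, 6505) with the rounded correlation for last attempt scores gives roughly 76.9, close to the reported 76.86; small differences are expected because r is rounded to two decimals.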

Improvement of Phase 1 performance
Next, we examined the candidates' improvement on Phase 1. We modeled the relationship between the sequence of each unique form taken by a candidate (sequence) and the scores obtained on that form (comsae.score). This analysis was only based on the candidates who took more than two forms. Given that the scores were nested within the individual candidates, we calculated the intraclass correlation, which was 0.47. This value justified the application of a multilevel model. A random intercept model was specified as follows:
comsae.score ~ sequence + year_comsae + (1|ID)
As indicated, the candidates' Phase 1 scores were predicted by the exam level predictor sequence while controlling for the years spent in medical school when each Phase 1 form was taken (year_comsae).
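In conventional multilevel notation, the random intercept model above can be sketched as follows. The symbols \beta, u_j, and e_{ij} are notation introduced here for illustration, and the variance-ratio definition of the intraclass correlation is the standard one for a random intercept model, assumed rather than stated in the text.

```latex
\begin{aligned}
\text{comsae.score}_{ij} &= \beta_0 + \beta_1\,\text{sequence}_{ij} + \beta_2\,\text{year\_comsae}_{ij} + u_j + e_{ij},\\
u_j &\sim N(0, \sigma_u^2), \qquad e_{ij} \sim N(0, \sigma_e^2),\\
\text{ICC} &= \frac{\sigma_u^2}{\sigma_u^2 + \sigma_e^2}
\end{aligned}
```

Here i indexes the forms taken by candidate j. Under this definition, an intraclass correlation of 0.47 would mean that roughly 47% of the score variance lies between candidates rather than within them.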
Year_comsae was computed based on the assumptions that all the candidates attended 4 year colleges and that schooling began on August 1 of the year they began attending the program. Year_comsae was computed by deducting the number of days left at school, which was calculated based on the candidates' school reported expected years of graduation, from the 4 year length of education and then divided by 365. (1|ID) indicated a multilevel structure of the exam scores under individual candidates.
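The year_comsae computation described above can be sketched as follows. This is an illustration rather than the authors' code: the function name, the argument names, and the choice to end the program exactly 4 x 365 days after the assumed August 1 start are assumptions made here.

```python
from datetime import date, timedelta

PROGRAM_DAYS = 4 * 365  # assumed 4-year program length, in days

def year_comsae(test_date, expected_grad_year):
    """Years spent in medical school when a Phase 1 form was taken."""
    # Assumption from the text: schooling begins on August 1 of the entry year.
    start = date(expected_grad_year - 4, 8, 1)
    # Assumption made here: the program ends 4 * 365 days after the start.
    end = start + timedelta(days=PROGRAM_DAYS)
    days_left = (end - test_date).days
    # Deduct the days left from the program length, then convert to years.
    return (PROGRAM_DAYS - days_left) / 365

# A candidate expected to graduate in 2021 who tests on August 1, 2018
example = year_comsae(date(2018, 8, 1), 2021)
```

For this hypothetical candidate, example evaluates to 1.0, that is, one full year into the program.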

Improvement of the Level 1 performance
We investigated the effect of taking Phase 1 on candidates' Level 1 performance through a series of logistic regression models. The models described the relationship between the candidates' likelihood of passing Level 1 and a binary variable indicating whether additional Phase 1 forms were taken, given the same current Phase 1 performance. One model was fitted for each I∈{1, 2, 3, 4} (Models 1 to 4). In these models, p.comlex indicated the probability of passing Level 1, comsae.score_i indicated the ith-attempt Phase 1 score, (n.comsae>I) was a binary variable indicating whether the candidate took more than I forms, and timegap was the difference in years between the first Phase 1 exam and the Level 1 exam. For example, Model 2 examined the difference in the Level 1 performance between the candidates who took exactly two Phase 1 forms (n.comsae=2) and those who took more than two forms (n.comsae>2), controlling for the second attempt Phase 1 score and the time gap between the first Phase 1 and the Level 1 examinations. Notably, each model in the series used a subset of the data from the previous model.
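From the variable definitions above, the general form of Model I can be sketched as below. The coefficient symbols are notation introduced here, and the absence of interaction terms is an assumption; the exact specification may differ.

```latex
\operatorname{logit}(\text{p.comlex})
  = \ln\frac{\text{p.comlex}}{1 - \text{p.comlex}}
  = \beta_0 + \beta_1\,\text{comsae.score}_{I}
  + \beta_2\,\mathbb{1}(\text{n.comsae} > I)
  + \beta_3\,\text{timegap},
  \qquad I \in \{1, 2, 3, 4\}
```

In this notation, \beta_2 is the change in the log odds of passing Level 1 associated with taking more than I forms, compared with stopping at exactly I forms, for candidates with the same Ith-attempt score and time gap.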

Results
Number of Phase 1 forms taken
The majority of candidates took one (2,414; 33.9%) to two (2,196; 30.8%) timed Phase 1 forms prior to the Level 1 examination.

Correlation between Level 1 and Phase 1 scores
Table 2 reports the Pearson's product moment correlations between Level 1 scores and first attempt, last attempt, and mean scores of all Phase 1 forms. All correlations were strong and significantly positive. The highest correlation occurred between the Level 1 scores and the mean Phase 1 scores (r=0.74, t(6,505)=87.51, p<0.001), followed by last attempt Phase 1 scores (r=0.69, t(6,505)=76.86, p<0.001). Figure 1 shows the Level 1 scores by Phase 1 mean score. There is a clear positive relationship between the two sets of scores. A similar pattern was observed between the Level 1 scores and first/last attempt Phase 1 scores.
Figure 2 shows the scatterplots of Level 1 scores and Phase 1 average scores for the candidates who took one, two, or three Phase 1 forms (i.e., the majority of candidates). The Pearson's product moment correlations were 0.69 (t(2,412)=47.37, p<0.001) and 0.66 (t(2,194)=40.88, p<0.001) for the candidates who took one and two Phase 1 forms, respectively. The correlation was highest (r=0.81, t(1,248)=49.20, p<0.001) for candidates who took three Phase 1 forms.

Improvement of Phase 1 performance
Figure 3 shows that first attempt scores were negatively correlated with the number of forms taken. While the group of candidates who only took one form (n=2,414) had a mean first attempt score of 546, the group of candidates who took five forms (n=145) had a mean score of 351. Within each subgroup, the mean Phase 1 score increased as another form was taken. We also observed that the mean score on the last forms taken was over 500 for all subgroups. In other words, the candidates seemed to target a score of 500 on Phase 1 as a threshold to stop taking another form.
The linear regression results of Phase 1 score improvement are displayed in Table 3. The sequence of the forms taken was a significant (t(6,273)=29.35, p<0.001) predictor of Phase 1 scores. Specifically, with the timing of the test controlled, a one unit increase in the sequence of the test was related to an increase of 23.2 points in Phase 1 score (i.e., the expected second attempt score would be 23.2 points higher than the first attempt score).

Improvement of the Level 1 performance
As shown in Table 4, taking extra Phase 1 forms had either a significantly positive effect (Models 1 and 3; p<0.05) or no significant effect (p=0.841 for Model 2 and p=0.053 for Model 4) on Level 1 performance. For example, based on Model 1, holding other variables constant, the candidates who took more than one Phase 1 form (n.comsae>1) had log odds of passing the Level 1 examination that were an expected 0.5 higher than those of candidates who took only one form (n.comsae=1). In comparison, based on Model 2, holding other variables constant, the candidates who took more than two Phase 1 forms (n.comsae>2) did not have significantly higher log odds of passing Level 1 than those who took exactly two forms (n.comsae=2). Figure 4 illustrates the probability curves for the two groups of candidates (n.comsae=1 vs. n.comsae>1) based on Model 1 when the time gap was fixed at its mean value of 0.21 years (77 days). In this example plot, for the candidates who scored 400 on their first Phase 1 exam, those who stopped taking new forms had an 85% probability of passing the Level 1 examination, while those who took more forms had a 90% probability of passing Level 1. The plot (Figure 4) also shows that candidates with lower scores benefited from taking extra forms more than candidates with higher scores.
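To see how a change in log odds translates into a change in probability, the sketch below applies a log-odds shift to a baseline passing probability. The function name is hypothetical, and 0.5 is the rounded Model 1 effect quoted above, so the result only approximates the values read from Figure 4.

```python
import math

def shift_probability(p, delta_log_odds):
    """Return the probability after adding delta_log_odds to logit(p)."""
    logit = math.log(p / (1.0 - p))
    return 1.0 / (1.0 + math.exp(-(logit + delta_log_odds)))

# Baseline 85% chance of passing, plus the rounded 0.5 log-odds effect
p_more_forms = shift_probability(0.85, 0.5)

# The same shift yields a smaller gain when the baseline is already high
gain_low = shift_probability(0.85, 0.5) - 0.85
gain_high = shift_probability(0.99, 0.5) - 0.99
```

With these inputs, p_more_forms is about 0.90, in line with the 85% versus 90% comparison in the text, and gain_low exceeds gain_high. This is consistent with candidates who have lower Phase 1 scores showing a larger benefit in probability terms, because the same log-odds shift moves a probability less as it approaches 1.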

Discussion
There is scant research exploring how medical students use COMSAE self assessment tools to prepare for the COMLEX-USA examinations. The purpose of this study was to investigate the candidate experience with Phase 1 to determine whether and how it helped them prepare for the Level 1 examination. The results showed a strong relationship between Phase 1 and Level 1 scores. Phase 1 and Level 1 are built from the same blueprint. Phase 1 questions are repurposed questions from the Level 1 examination and are presented with a similar format and structure. It is unsurprising to see a significant correlation between the two sets of scores. Jackson et al. [5] analyzed Phase 1 and Level 1 score information from 102 survey participants from the class of 2019 at the University of New England College of Osteopathic Medicine and found a similar correlation of 0.61 between participants' Level 1 scores and their scores on a Phase 1 form taken nearest to their Level 1 examination. However, they also found a low correlation of 0.25 between participants' Level 1 scores and their scores on a Phase 1 form taken earlier [5]. They then concluded that Phase 1 was not a meaningful preparation tool for Level 1 [5]. We argue that the low correlation was likely a measurement error caused by their small sample size. Contrary to Jackson et al.'s conclusion, our findings supported the predictive validity of Phase 1.
Most candidates in our study took one or two Phase 1 forms, and the mean Phase 1 last attempt score was over 500 among all the subgroups taking different numbers of Phase 1 forms. We infer that students tended to continue taking the Phase 1 examination until they reached a score of 500 or above; thus, it seems that the candidates used Phase 1 as a tool to monitor their academic progress until they felt confident that they would pass or reach a certain score (i.e., a target score for the application of certain competitive residency or training programs) on Level 1.
For the candidates in our study with the same recent Phase 1 performance, taking more forms was generally related to an increase in the probability of passing Level 1. However, note that this relationship does not indicate causality. It may not be that taking Phase 1 alone helped with better Level 1 performance. Instead, it is plausible that these candidates kept studying, monitored their progress through additional Phase 1 forms, and achieved better Level 1 results, as suggested by the improvement in Phase 1 scores. We encourage future candidates, especially those with lower performance on their initial Phase 1 attempts, to use Phase 1 to gauge their readiness for Level 1 and monitor their improvement.
A limitation to our study is that we treated all Phase 1 test taking in the same way, with the belief that candidates might benefit from exposure to Phase 1 content and format, regardless of why they took the examination. We acknowledge that some schools use Phase 1 as a barrier to qualify students to take Level 1. The reasons why candidates took Phase 1 can be another parameter we explore in a future study. Another practical consideration is that Phase 1 is a variant examination, allowing different modes of timing and proctoring and involving different levels of candidate motivation. Thus, relying on Phase 1 scores alone to identify candidates with "outlier" scores for early intervention may not be feasible. Future studies can collect more data, such as demographic information (e.g., gender, ethnicity, or socioeconomic status) and classroom performance data, so that more accurate predictions of Level 1 performance can be achieved. Further, we applied the data from Phase 1 and Level 1 examinations built from the original test blueprint [6], which was composed of nine patient presentation domains and six physician tasks. From the 2019-2020 administration, the Level 1 examination adopted a new test blueprint [7], which consists of seven competency domains and 10 clinical presentations. To examine the impact of the change in test blueprint on the results, we plan to reexamine the findings in May 2021, when we have sufficient data.

Conclusions
This study examined how candidates used COMSAE Phase 1 as a self assessment tool to prepare for the COMLEX-USA Level 1 examination. The significant correlation between Phase 1 and Level 1 performance provides validity evidence for Phase 1. Moreover, results suggested that candidates, especially those with lower performance on their initial Phase 1 attempt, might improve their Level 1 performance by taking multiple Phase 1 forms.
Acknowledgements: The authors would like to thank National Board of Osteopathic Medical Examiners staff members, Robby Biegalski, Caitlin Brown, Lisa Mysker, and Evelyn Ronkowski, who provided comments on prior versions of this manuscript.
Research funding: None reported.
Author contributions: All authors provided substantial contributions to conception and design, acquisition of data, or analysis and interpretation of data; all authors drafted the article or revised it critically for important intellectual content; all authors gave final approval of the version of the article to be published; and all authors agree to be accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved.
Competing interests: The authors are employees of the National Board of Osteopathic Medical Examiners and therefore have a financial stake in the success of the Comprehensive Medical Self-Assessment Examination.