Since the inception of the United States Medical Licensing Examination (USMLE) in 1992, sitting for this multi-part examination has been a rite of passage for all physicians wishing to practice medicine in the United States. But has it become too important? The examination is sponsored by the Federation of State Medical Boards (FSMB) and the National Board of Medical Examiners (NBME), and its original purpose was to ensure that there was a national standard (set by representatives of state medical boards, educators, and the public) against which all physicians seeking to be licensed to practice in the United States could be compared. For three of the four steps in the current iteration of the USMLE examinations (USMLE Step 1, Step 2 Clinical Knowledge (CK), and Step 3), scores are reported using a numeric scale; Step 2 Clinical Skills (CS) is graded pass/fail.

While the intended purpose of the USMLE exams was to help state medical boards make decisions about granting licenses to physicians, USMLE scores are also used in a variety of other, unintended ways. The use that has gained the most attention is as a screening tool or data point in the selection of candidates for residency training. As a result, the USMLE exams – especially Step 1, where the score is typically available for all residency applicants – have become even more high-stakes. This has raised a range of concerns about the effect of the examination on medical students’ training and wellness.

To fully explore these concerns, a recent conference, the Invitational Conference on USMLE Scoring (InCUS), was sponsored by the American Medical Association (AMA), the Association of American Medical Colleges (AAMC), the Educational Commission for Foreign Medical Graduates (ECFMG), FSMB, and NBME. The group of 65 invited attendees – representing the breadth of individuals and organizations with a stake in USMLE scoring – met to discuss the challenges of the status quo and develop recommendations for addressing them. Additionally, pre-conference input was gathered from over 200 professional organizations, societies, and state medical boards. At the end of the conference, an outline of the problems associated with the current use of USMLE scores in residency selection and a list of preliminary recommendations were issued; these are available at https://www.usmle.org/inCus/ and public comments are being solicited until July 26, 2019.

What are the problems associated with the current scoring system of the USMLE examination?

The concerns come primarily from the community of medical students and medical school educators. As the USMLE examination has become increasingly important in the residency selection process, medical students have chosen to focus their energy on preparing for the exam. As a result, educators complain that curricular reform and innovative educational methods are difficult to implement. Student engagement in educational activities that do not directly prepare them for the USMLE examination – but that are nevertheless important experiences in the training of future physicians – has dropped as students instead spend time answering multiple-choice questions to prepare for the USMLE examination. Students also report that the high-stakes nature of the examination creates considerable stress and adversely affects their well-being. Finally, a low score on the USMLE examination may dissuade students from seeking certain residencies that are perceived as highly competitive and as requiring high USMLE scores. The impact of USMLE scores on students’ career choices can be substantial.

On the other hand, program directors in residency programs report that they are under external pressures that have pushed them to rely heavily on USMLE scores when selecting and evaluating residency candidates. Program directors acknowledge that USMLE scores reflect only a narrow part of a candidate’s attributes – specifically, their medical knowledge and ability to perform on a standardized exam – and may miss other attributes that a more holistic review would reveal. This is a particular concern for Step 1, as the content that is tested is the scientific foundations of medicine, which is only indirectly applicable to medical practice and thus of unclear correlation to clinical performance. There is also a dearth of data to show that USMLE scores (Steps 1, 2, and 3) correlate with better clinical performance in residency or in practice. However, there are few other objective measurements that program directors can use. Medical schools use different grading systems that cannot be readily compared, and more and more medical schools have moved away from grades altogether, with all or most courses being pass/fail. The Dean’s Letter, which is supposed to give insights into the breadth of the student’s achievements, suffers from selective inclusion of positive statements, excessive length, and a lack of standardization. Add the variation inherent in grades and letters for residency candidates who trained at medical schools outside of the United States, and the program director is left with a near-impossible task.

In addition, the number of applications that a residency program director considers has increased over the past decades. The Electronic Residency Application Service (ERAS) makes it easy to apply to a large number of programs, to the point that the average residency applicant now applies to 90 residency programs. As a result, the average residency program, which has 6.7 slots, receives almost 1000 applications, and the average internal medicine residency program receives almost 3100. Reviewing each application in detail is simply not possible, leading program directors to use a USMLE score cutoff as a convenient objective measurement to reduce the number of applications that they review.

Various groups have put forth suggestions to resolve these concerns. A suggestion that has been particularly popular among medical students and medical school faculty has been to change the scoring of the USMLE exams to a pass/fail system. The backlash from the graduate medical education community has been intense. That community has instead proposed alternative solutions that would put more pressure on applicants (requiring additional information in the application, such as results of examinations administered by a third party, or requiring visiting rotations at the residency institution).

Working toward a resolution

In the tense situation that has developed, the InCUS attendees had a difficult task: How to resolve this situation without creating winners and losers? An important consensus from the attendees was that the current transition system from medical school to residency is flawed; therefore, unilateral changes to USMLE alone will not be sufficient. In order to fully resolve this situation, the InCUS attendees recommended that this transition be optimized – specifically by “convening a cross-organizational panel to create solutions for the assessment and transition challenges from UME to GME, targeting an approved proposal, including scope/timelines by end of calendar year 2019.” The attendees discussed some ideas to improve program directors’ ability to more holistically evaluate candidates, for example through developing new tools for GME selection or optimizing the currently available tools. They also discussed other possible solutions, including limiting the number of residency applications per applicant or creating a multistage resident matching process.

Attendees also made recommendations to consider changes to the USMLE exams. In terms of the scoring, they considered a variety of possible solutions, including categorical scoring (pass/fail, quartiles, quintiles, and so on) and composite scoring across several of the exams. They considered changing the timing of score release or changing the recipients of the released scores. In the end, the conference did not make a firm recommendation for a path forward; however, the attendees recommended “reducing the adverse impact of the current overemphasis on USMLE performance in residency screening and selection through consideration of changes such as pass/fail scoring.”

Furthermore, as the USMLE exam scores are used by residency programs as a predictor of performance in the program, the attendees recommended “accelerating research on the correlation of USMLE performance to measures of residency performance and clinical practice.”

Finally, there are racial differences in USMLE performance that can adversely affect the career progression of certain demographic groups. The final recommendation was therefore to strive to “minimize racial demographic differences that exist in USMLE performance.”

Identifying the problems that have arisen in the current system and creating these recommendations are good first steps toward creating a new system where the USMLE exam scores play a more appropriate role. However, resolving the situation will require large changes in the status quo from both the undergraduate and graduate medical education communities. These complex issues could not be fully resolved in the time provided by a brief conference, but hopefully the conference will serve as a call to action, and the positive dialogue that began at InCUS will continue.

Ole-Petter Riksfjord Hamnvik, MB, BCh, BAO, MMSc is an endocrinologist and educator at Brigham and Women’s Hospital and Harvard Medical School in Boston. He is a core faculty member in the HMS endocrine course, the program director for the endocrinology fellowship program at Brigham and Women’s Hospital, and the education editor for the NEJM Group; he is also involved in several continuing medical education courses.
