December 8, 2020

The Importance of Post-Exam Quality Assurance

by Karmen Garey, PharmD, PGY-1 Baptist Memorial Hospital – North Mississippi Pharmacy Resident, University of Mississippi School of Pharmacy

From the students’ perspective, the work is over once they hit “submit” at the end of an exam: “Thank goodness that’s done!” For teachers, however, there is still some critical work to do. Now it’s time to review the performance data to ensure the examination was fair and measured what was intended. Here are a few tips and strategies to assess the quality of an exam.

Make certain the exam (as a whole) is a “good” one 

Before the exam is administered to students, a good exam should be written with the following goals in mind:1,2

  • An exam should address multiple levels of Bloom’s taxonomy — from knowledge recall to application and analysis.
  • The exam should include a variety of questions that test a range of concepts that map back to the learning objectives.
  • The consistency of the exam's performance over time is important. An exam should routinely perform the same from year to year despite some changes to the questions.
  • An exam should measure the learning outcomes and course material it was designed to test.

Make certain the questions included on the exam are “good” ones

There are two types of questions that should be included on exams: mastery questions and discriminating questions.  Mastery questions are those that students are expected to excel on.3 This type of question is typically a “knowledge level” question in Bloom’s taxonomy. These questions often test factual recall and the recognition of fundamental material.2  Students might call them “gimme questions”; however, teachers include them to ensure that students have a firm understanding of basic but essential concepts or facts.  Discriminating questions, on the other hand, are intended to identify students who have a deeper knowledge of the material and to separate students into different performance levels (e.g., identify "A", "B", and "C" students).  Higher-performing students are expected to answer these questions correctly more often than lower-performing students.  This type of question often targets the comprehension, application, analysis, synthesis, or evaluation cognitive level in Bloom’s taxonomy and requires an in-depth knowledge of the subject matter.2

Next, let’s look at the distractors.  Does each question include appropriate distractors?3 A distractor is an answer choice that, while wrong, sounds and appears like it could be plausible. A good distractor should be clear and concise and should be similar in structure and content to the correct response. Savvy test-takers have learned to spot answers that seem different in some way, so even small variations in the style, subject matter, and length of the answer choices can provide clues. 

Next, is the question stem clearly written?  Is it clear what the learner is being asked?  Or is the question open to interpretation?  When writing questions, it is important to ensure that the question cannot be misconstrued.  Sometimes students will overthink a question and try to find a hidden meaning when there is none. To avoid this problem, use unambiguous words and avoid phrasing that could be cryptic.

Finally, is the answer to the question correctly keyed?  If a lot of students selected the “wrong” answer, it's possible that the question was miskeyed.  While this does not happen often, it does happen! So it is always a good idea to double-check that the correct answer was selected on the answer key. 

There are some other things to consider as you look at the post-exam performance data.  How did the exam scores look last year? While a group of students performing much better or much worse than previous years’ students is not always an indication that the exam is invalid, it should prompt additional questions.

  • Was the material taught in a manner that was different from previous years?
  • Was the exam formatted or delivered differently?
  • Could the students this year have been less (or better) prepared in some way to comprehend the material?
  • Is cheating suspected?
  • If there are multiple instructors, did students receive different messages about the content?

The answers to these questions may not be obvious or even relevant, but they are worth keeping in mind.

Use the post-exam statistical analysis to identify problem questions3

As technology becomes a more integral part of exam delivery, it enables a wealth of data that can be used for post-exam quality assurance. Most post-exam statistical analysis tools report similar elements; however, the names may be slightly different. ExamSoft is among the most common exam delivery tools available today and routinely reports these statistics:

  • Item Difficulty represents the difficulty of a question. It reports the proportion of students who answered the question correctly (e.g., 0.66 means 66% of students). The lower the value, the more difficult the question. There is no set target for item difficulty; instead, check that the value matches the intent behind the question. For example, if the teacher wants the item to be a mastery question, the difficulty should be 0.90 to 1.00, with very few students getting the question wrong.  If the question is meant to separate those who have a firm grasp on the material from those who don’t, lower values are acceptable. An instructor may have a difficulty “cutoff” in mind where anything below 0.6 (for example) prompts additional analysis of the question.
  • Upper/Lower 27%, Discrimination Index, and Point Biserial are each calculated differently but they report a similar concept. Stated simply, they all determine whether the top performers on the exam achieved better results on a question compared to those who did not perform well. If the top performers don’t out-perform the poor performers, the question should be assessed to determine why.
    • Upper 27% / Lower 27% - what percentage of the top 27%  vs. the bottom 27% of performers got the question correct.
    • Discrimination Index – this represents the difference in performance between the best performers vs. the lowest performers.
    • Point Biserial – indicates whether answering a specific item correctly correlates with doing well on the exam overall.  In other words, does performance on this question predict whether a student did well (or not so well) on the exam?
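To make the definitions above concrete, here is a minimal Python sketch of how these statistics could be computed from a table of right/wrong answers. It is an illustration under stated assumptions, not ExamSoft's actual method: the function name `item_statistics` and the 0/1 data layout are hypothetical, the total score includes the item being analyzed, and ties at the 27% boundary are broken by the order students appear in the data. Commercial tools may handle these details differently.

```python
from statistics import mean, pstdev


def item_statistics(responses, item):
    """Compute basic item statistics for one question.

    responses: one list per student, holding 1 (correct) or 0 (incorrect)
               for every question on the exam (hypothetical layout).
    item:      index of the question to analyze.
    """
    n = len(responses)
    totals = [sum(row) for row in responses]        # overall exam scores
    item_scores = [row[item] for row in responses]  # 0/1 on this question

    # Item difficulty: proportion of students who answered correctly.
    difficulty = mean(item_scores)

    # Upper/lower 27%: rank students by total score, then compare how the
    # top and bottom groups did on this question.
    k = max(1, round(0.27 * n))
    ranked = sorted(range(n), key=lambda i: totals[i], reverse=True)
    upper = mean(item_scores[i] for i in ranked[:k])
    lower = mean(item_scores[i] for i in ranked[-k:])

    # Discrimination index: difference between the two groups.
    discrimination = upper - lower

    # Point biserial: Pearson correlation between the 0/1 item score and
    # the total score. Undefined if either has no variation.
    sd_item, sd_total = pstdev(item_scores), pstdev(totals)
    if sd_item == 0 or sd_total == 0:
        point_biserial = None
    else:
        m_item, m_total = mean(item_scores), mean(totals)
        covariance = mean(
            (x - m_item) * (t - m_total) for x, t in zip(item_scores, totals)
        )
        point_biserial = covariance / (sd_item * sd_total)

    return {
        "difficulty": difficulty,
        "upper_27": upper,
        "lower_27": lower,
        "discrimination": discrimination,
        "point_biserial": point_biserial,
    }
```

Note that when every student answers a question correctly, the point biserial cannot be calculated at all, which is one reason mastery questions tend to show weak correlations.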

 

| Correlation with Overall Exam | Point Biserial |
| Very good | >0.3 |
| Good | 0.2-0.29 |
| Moderate | 0.09-0.19 |
| Poor | <0.09 |
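The bands above can be expressed as a small lookup, sketched here in Python. The function name `classify_point_biserial` is hypothetical. The published bands leave small gaps (for example, values between 0.29 and 0.3, and exactly 0.3), so this sketch assumes each band extends up to the start of the next one; that is an assumption, not part of the original table.

```python
def classify_point_biserial(value):
    """Map a point biserial value to the descriptive bands in the table.

    Assumption: gaps between the published bands are assigned to the
    lower band (e.g., 0.295 and exactly 0.3 are treated as "Good").
    """
    if value > 0.3:
        return "Very good"
    if value >= 0.2:
        return "Good"
    if value >= 0.09:
        return "Moderate"
    return "Poor"


for value in (0.35, 0.28, 0.10, 0.05):
    print(value, classify_point_biserial(value))
```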



So, let’s look at the statistical analysis from two example questions. 

  • This was a mastery question — students are expected to do well on this question. It’s a fundamental concept that all students should know.
  • The Discrimination Index = 0.04, which indicates almost no discrimination between the top and bottom performers. Because it’s a mastery question, we expected all students to perform well, so we don’t expect this question to discriminate between the best and worst performers.
  • The Point Biserial = 0.10, indicating this question only moderately correlates with doing well on the exam overall. Again, the top and bottom performers performed quite similarly on this question, so there won’t be a strong correlation between performance on this question and the overall exam.
  • If this question was not intended to be a mastery question, perhaps the material was taught particularly well … or maybe there was cheating involved.

Now let’s take a look at a question where only 66% of the students selected the correct response.

  • Item difficulty = 0.66 so 66% of the students selected the correct response. This is not a bad thing but it is important to make sure the students who understood the material were more likely to get this question right.
  • This is intended to be a discriminating question, so let’s make certain it’s actually discriminating between the best and worst performers.
  • Look at the Upper vs. Lower 27%: 82% of the top performers got this question correct. Only 46% of those who performed the poorest on this exam got this question correct.
  • Discrimination Index: 0.36. This question did a good job discriminating between the best and worst performers on this exam.
  • Point Biserial = 0.28. Performance on this question has a good correlation with the student’s overall exam performance.
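The checks walked through above can be written down as a short routine so they are applied the same way to every question. The following Python sketch is only one way to do it: the function name `review_flags` is hypothetical, and the cutoffs (0.90 for mastery questions, 0.6 for item difficulty, 0.09 for point biserial) are the example values mentioned in this post rather than fixed rules. Each instructor should set their own in advance.

```python
def review_flags(intent, difficulty, upper_27, lower_27, point_biserial):
    """Return the discrimination index and a list of reasons to re-examine
    a question. Cutoffs are illustrative examples, not standards."""
    flags = []
    # Discrimination index: top 27% minus bottom 27% (as proportions).
    discrimination_index = round(upper_27 - lower_27, 2)

    if intent == "mastery":
        # Mastery questions should be answered correctly by nearly everyone.
        if difficulty < 0.90:
            flags.append("mastery question with difficulty below 0.90")
    else:
        # Discriminating questions may be harder, within a preset cutoff.
        if difficulty < 0.60:
            flags.append("difficulty below the 0.6 cutoff")
        if discrimination_index <= 0:
            flags.append("top performers did not out-perform bottom performers")
        if point_biserial < 0.09:
            flags.append("poor correlation with overall exam")

    return discrimination_index, flags


# Values from the second example question above.
index, flags = review_flags("discriminating", 0.66, 0.82, 0.46, 0.28)
print(index, flags)  # 0.36 []
```

An empty list means the question behaved as intended under these example cutoffs; any entry in the list is a prompt to look at the question again, not an automatic reason to discard it.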

While there are no hard rules for how to analyze an examination, the strategies I’ve outlined in this blog post are some of the best practices every teacher should follow. It is important to follow a systematic process and establish “cut-offs” in advance. The key is to be clear and consistent from exam to exam.

References

  1. Brame C. Writing Good Multiple Choice Test Questions. 2013. Accessed December 3, 2020.
  2. Omar N, Haris SS, Hassan R, Arshad H, Rahmat M, Zainal NFA, et al. Automated Analysis of Exam Questions According to Bloom's Taxonomy. Procedia - Social and Behavioral Sciences. 2012;59:297–303. Accessed December 1, 2020.
  3. Ermie E. Psychometrics 101: Know What Your Assessment Data Is Telling You. Examsoft. 2015. Accessed November 18, 2020.

A Hopeful Pharmacist-Led Educational Program to Reduce Prescription Errors

by Spencer Harris, Doctor of Pharmacy Candidate, University of Mississippi School of Pharmacy

Summary and Analysis of:  Gursanscky J, Young J, Griffett K, Liew D, Smallwood D. Benefit of targeted, pharmacist-led education for junior doctors in reducing prescription writing errors - a controlled trial. Journal of Pharmacy Practice and Research. 2018;48(1):26–35.

Writing a safe and properly formatted prescription is no easy task. Not only does the prescriber need to include the patient’s name, date of birth or address, the date of writing, the name of the drug, the dose, the dosage form, the instructions on how to take it, the quantity, the number of refills, and the signature of the authorizing provider, but the prescription must also be safe for the patient. Factor in the multitude of patients a physician sees, the innumerable questions she receives, the monotony of writing dozens of prescriptions every day, and the many other variables that add stress to her shoulders, and it's understandable that there will be an error here and there. While understandable, it is not something that can be accepted or overlooked. Each year, according to the FDA’s MedWatch website, more than one hundred thousand reports about medication errors are documented. A subset of these reports relates to prescribing errors, both in the sense of missing information and of prescribing inappropriate therapy.  These errors affect patient health outcomes; this is inexcusable. I have witnessed these errors firsthand, as I am sure nearly every person who has worked in a pharmacy has.

Educational programs might be one way to address this problem. But an educational program must be efficient and compatible with the constant bustle of healthcare, where there is no time to waste. It is for this reason that I read the study by Gursanscky and his colleagues from Monash University in Australia with high hopes.


The investigators implemented a pharmacist-led approach to teaching junior physicians (who write a notably large proportion of prescriptions in teaching hospitals) about prescription writing.  They compared this approach to an online education program (based on the National Inpatient Medication Chart Training course) and to a control group that did not receive any additional instruction. The study was a cluster-randomized trial that enrolled all junior doctors in the general medical units at an Australian tertiary hospital (twelve interns and four registrars). The junior physicians were divided equally into four-person groups, which were randomly assigned to the pharmacist-led intervention (one group), the e-learning intervention (one group), or the control arm (two groups).

The pharmacist-led intervention consisted of three very brief (10-minute) sessions per week for four weeks.  During these sessions, a clinical pharmacist discussed types of errors, their frequency, and severity. Over the four weeks, the pharmacist discussed each error type, why it was unsafe, its consequences, and how to avoid it. Following each tutorial, the pharmacist addressed participant questions. A full report on the intervention can be found in the original study.

Data were collected for three weeks before the intervention and for four weeks during the intervention. The outcome measured was the prescription error rate in each group. An error was defined as a prescription that had incomplete patient or prescriber details or that was “illegible, incomplete, or incorrect.” The error rates for the pre- and post-intervention periods were then compared using a Chi-square analysis.
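For readers unfamiliar with this kind of comparison, here is a minimal Python sketch of a Pearson Chi-square test on a 2x2 table (errors vs. no errors, before vs. after). The function name `chi_square_2x2` is hypothetical, the counts in the usage example are invented purely for illustration and are not the study's data, and the sketch applies no continuity correction; the original paper may have used a different variant.

```python
from math import erfc, sqrt


def chi_square_2x2(errors_pre, total_pre, errors_post, total_post):
    """Pearson Chi-square test (1 degree of freedom, no continuity
    correction) comparing an error rate between two periods."""
    observed = [
        [errors_pre, total_pre - errors_pre],
        [errors_post, total_post - errors_post],
    ]
    row_totals = [sum(row) for row in observed]
    col_totals = [observed[0][j] + observed[1][j] for j in range(2)]
    grand_total = sum(row_totals)

    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            # Count expected in each cell if the period made no difference.
            expected = row_totals[i] * col_totals[j] / grand_total
            chi2 += (observed[i][j] - expected) ** 2 / expected

    # For 1 degree of freedom, the p-value equals erfc(sqrt(chi2 / 2)).
    p_value = erfc(sqrt(chi2 / 2))
    return chi2, p_value


# HYPOTHETICAL counts, not taken from the study: 58 errors in 100 orders
# before the intervention and 37 errors in 100 orders after it.
chi2, p = chi_square_2x2(58, 100, 37, 100)
print(f"chi-square = {chi2:.2f}, p = {p:.4f}")
```

Because each prescription is treated as an independent observation, a test like this does not account for the fact that the prescriptions came from only a handful of prescribers.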

The results (n=9,657 prescriptions analyzed) showed that the pharmacist-led group had a significantly lower rate of errors in the post-intervention period. Interestingly, the error rates in both the control group and the e-learning group increased significantly in the post-intervention period.

Table 1: Rate of Errors per Total Orders Before and After the Intervention Period

 

| | Control | E-learning | Pharmacist-led |
| Pre-intervention | 0.49 | 0.58 | 0.58 |
| Post-intervention | 0.59 | 0.63 | 0.37 |
| p-value | <0.001 | 0.025 | <0.001 |
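To put the rates in Table 1 in perspective, the short Python sketch below computes the absolute and relative change in the error rate for each group. The rates are taken directly from the table; the helper name `change` and the rounding choices are my own.

```python
# Error rates (errors per total orders) from Table 1: (pre, post).
rates = {
    "Control": (0.49, 0.59),
    "E-learning": (0.58, 0.63),
    "Pharmacist-led": (0.58, 0.37),
}


def change(pre, post):
    """Return (absolute change, relative change in percent)."""
    absolute = round(post - pre, 2)
    relative_pct = round(100 * (post - pre) / pre, 1)
    return absolute, relative_pct


for group, (pre, post) in rates.items():
    absolute, relative_pct = change(pre, post)
    print(f"{group}: {absolute:+.2f} absolute, {relative_pct:+.1f}% relative")
```

By this calculation, the error rate rose by about 20% in the control group and about 9% in the e-learning group, and fell by about 36% in the pharmacist-led group.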

This study addresses a real-world problem that negatively impacts patients and places a substantial burden on the healthcare system. Additionally, the study clearly describes the design of the educational intervention and the outcome measures (e.g., the definition of a prescription writing error and the methods of data collection).  The number of prescriptions analyzed over the course of the study is very large (n=9,657). With a sample that large, random error in the measured error rates is likely to be small, but there is always the possibility of bias in the selection process. This study also has some flaws that may weaken it in the eyes of reasonable readers. Specifically, the sample of providers is small: only sixteen physicians, four per group.  The study duration was also relatively short, approximately two months. These shortcomings may have led to the odd and significant increase in the error rate in the e-learning and control groups. Why would a course designed by professionals to instruct providers on how to write prescriptions result in a higher prescription error rate? Of course, the e-learning course could be poorly designed in some way, but I believe the more likely reason is the small number of participants in the group.  Thus, the changes in error rates observed in the control and pharmacist-led intervention groups might be due to chance as well.

Personally, I believe a pharmacist-led approach can and should result in a lower error rate, but this study must be replicated on a larger scale before any conclusions can be drawn about the effectiveness of this approach. Nonetheless, the study is still relevant. The reason is simple: preventable medication errors are being made all over the world, and they lead to problems that directly affect patients. Until this problem is solved, we should be looking for answers and taking action to find good practices for reducing the errors. While this study is not of the highest quality, the intervention is simple and practical to implement.

Therefore, I urge those who are involved in the training of prescribers to use this study as a template to provide pharmacist-led instruction on prescription-writing. A successful program should include frequent but brief tutorials with an opportunity to ask questions. We must actively make efforts to provide our patients with the high-quality healthcare that they deserve.

References

  1. Gursanscky J, Young J, Griffett K, Liew D, Smallwood D. Benefit of targeted, pharmacist-led education for junior doctors in reducing prescription writing errors - a controlled trial. Journal of Pharmacy Practice and Research. 2018; 48(1):26–35.
  2. Working to Reduce Medication Errors [Internet]. U.S. Food and Drug Administration. FDA; 2019.  Accessed October 23, 2020.