Educational Theory and Practice

April 11, 2013

I, I and II, or I, II and III?

by Adrian Wong, Pharm.D., PGY Pharmacy Practice Resident, The Johns Hopkins Hospital

Recently graduated and staring at the computer screen in front of me, I once again repeated what I had done many times in pharmacy school – crammed. I had received warnings from all my mentors and peers about how horrific the Multistate Pharmacy Jurisprudence Examination (MPJE) was. I was truly dreading the outcome. Examinations were never my strong suit and I feared those multiple-multiple choice questions that seem to appear on these high-stakes exams all too frequently. Regardless of the name they are given – K-type, complex multiple-choice (CMC), or complex-response questions – they all evoke the same feeling of dread. If I need to jog your memory, an example is shown here:

Question: Based on the best available evidence, which of the following is the most appropriate medication to initiate for management of this patient’s congestive heart failure?

I. Metoprolol succinate

II. Metoprolol tartrate

III. Atenolol

a. I only

b. III only

c. I and II only

d. II and III only

e. I, II and III

After my experience with these questions, it always seems to come down to one of two answers. Even using an educated guess, I never seemed to get the “right” answer. From my experience with multiple-choice questions, the answer is rarely ever all of the above. So why did this format of question come to be? Who came up with this traumatizing format? What is the data behind this torture?

Based on my research, the complex multiple-choice (aka K-type) question was introduced by the Educational Testing Service in 1978.¹ This question format was designed to accommodate situations in which there is more than one correct choice - much as in real life. These questions also appear to be more difficult than comparable “traditional” multiple-choice questions.² Therefore, in the world of health professionals, where multiple correct answers may exist, and in an attempt to increase the difficulty of board examination questions, the CMC format was adopted by many professional testing services and persists today.

Weaknesses of this format exist. Albanese evaluated the use of Type-K questions and identified several limitations including:²

1. Increased likelihood of “cluing” of secondary choices

2. Lower test score reliability

3. Greater difficulty constructing questions

4. Extended time required to answer questions

“Cluing” results when a test-taker is able to narrow down choices based on the wording of the question or the available answer options. For example, my thought process for the question above helped me to narrow down the choices solely by looking at the question or “stem.” The question is looking for only one “most appropriate” answer (assuming, of course, that the test-writer has written a grammatically correct statement), as denoted by “is” versus “are.” Thus, as a savvy test-taker, I would gravitate toward choices “a” and “b.” An additional clue is that there are two similar choices (metoprolol succinate vs. tartrate), one of which is likely to be the correct answer. Thus, cluing may lead to lower test score reliability, and the results may depend on how skilled a “test-taker” one is at picking up these clues.²

Additional studies have further illustrated the limitations of this assessment format. One study examined the amount of time needed to complete a CMC-based test compared to a multiple true-false (MTF) test.³ On average, it took 3.5 times as long to complete a CMC-based test as an MTF test.

However, after evaluating this literature, I will begrudgingly admit that CMC questions, under certain circumstances, could be effective despite their inherent weaknesses. Researchers at one pharmacy school evaluated the use of CMC questions using a partial-credit scoring system and compared it to traditional dichotomous (right vs. wrong) scoring.⁴ The instructors designed a test to examine student knowledge regarding nonprescription drugs. The test was administered to 150 student pharmacists in their second professional year. The purpose of this study was to optimize the measurement of student pharmacist knowledge without penalty for guessing or incorrect responses. Partial-credit scoring was accomplished by assigning a tiered score based on descending “best” answers. Test items were sent to an external review panel to assess content validity. Parameters evaluated in this study included item difficulty, item discrimination (i.e., the ability to distinguish low-ability from high-ability students), and the coefficient of effective length (CEL), a measure that determines how many more questions a test would need in order to produce the same reliability as another scoring method. The authors found that with partial-credit scoring, the test was more reflective of actual student knowledge. There were no statistically significant differences between the two methods with regard to item discrimination, but there was greater CEL with dichotomous scoring. Indeed, the findings indicate that dichotomous scoring would require 26% more questions to achieve the same reliability in measuring students’ actual knowledge of the subject matter. The authors recommend more studies regarding this partial-credit scoring method for CMC questions, including its ability to predict student achievement and its effect on student confidence.
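To make the difference between the two scoring methods concrete, here is a minimal sketch in Python. It assumes that option “a” is keyed as the best answer to the heart failure question above, and the tiered weights are purely hypothetical illustrations, not the values used in the cited study. The `equivalent_length` helper simply applies the reported 26% figure to a hypothetical 50-item test.

```python
import math

# Assumption: option "a" (I only) is keyed as the best answer.
BEST_ANSWER = "a"

# Hypothetical tiered weights in descending order of "best" answers.
# These numbers are illustrative only; they are not from the cited study.
PARTIAL_CREDIT = {"a": 1.0, "c": 0.5, "e": 0.25, "b": 0.0, "d": 0.0}


def dichotomous_score(choice: str) -> float:
    """Right vs. wrong: full credit only for the keyed answer."""
    return 1.0 if choice == BEST_ANSWER else 0.0


def partial_credit_score(choice: str) -> float:
    """Tiered credit: answers closer to the keyed answer earn more."""
    return PARTIAL_CREDIT.get(choice, 0.0)


def equivalent_length(n_items: int, extra_percent: int) -> int:
    """Items needed if a scoring method requires extra_percent more questions."""
    return math.ceil(n_items * (100 + extra_percent) / 100)


# Four hypothetical students answer the same item.
responses = ["a", "c", "e", "b"]
dichotomous_total = sum(dichotomous_score(r) for r in responses)
partial_total = sum(partial_credit_score(r) for r in responses)

print(dichotomous_total, partial_total)
print(equivalent_length(50, 26))
```

Under these assumed weights, the student who chose “c” is distinguished from the student who chose “b,” whereas dichotomous scoring treats them identically.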

Alternatives to traditional multiple-choice testing that have been evaluated in the literature include the use of open-ended, uncued (UnQ) items, which allow the test-taker to select an answer from over 500 responses. This type of test has been used for Family Practice board examinations.⁵ One study conducted in over 7,000 family practice residents found the UnQ to be a more reliable method for determining a physician’s competence.

The best mode of assessment probably depends on the material being tested. In my experience, the open-response format is the best indicator of a student’s knowledge - but like any test, the questions must be carefully worded. The biggest weakness of open-response essay-type exams is the time required to grade them [Editor’s note: As well as the inherent subjectivity required when judging the “correctness” of the student’s answers]. To my chagrin, the use of CMC questions will likely continue for licensing examinations for healthcare professionals.

References:

1. Haladyna TM. The effectiveness of several multiple-choice formats. Appl Measure Educ 1992;5:73-88.

2. Albanese MA. Type K and other complex multiple-choice items: an analysis of research and item properties. Educ Measure Issues and Practice 1993;12:28-33.

3. Frisbie D, Sweeney DC. The relative merits of multiple true-false achievement tests. Journal of Educational Measurement 1982;19:29-35.

4. Wongwiwtthananukit S, Popovich NG, Bennett DE. Assessing pharmacy student knowledge on multiple-choice examinations using partial-credit scoring of combined-response multiple-choice items. Am J Pharm Educ 2000;61:1-10.

5. Veloski JJ, Rabinowitz HK, Robeson MR, Young PR. Patients don’t present with five choices: an alternative to multiple-choice tests in assessing physicians’ competence. Acad Med 1999;74:539-46.

April 10, 2013

Using Positive Psychology to Enhance Well-being

By Arthur Graber, Pharm.D., PGY1 Pharmacy Practice Resident, Medstar Georgetown University Hospital

Do we really want our students to merely learn how to be successful in their work environments … or do we want them to thrive and have a high sense of well-being? Recently, the study of “positive” psychology (PP) has taken flight. This is a new type of psychology that focuses on the achievement of well-being. In traditional psychology, there is a focus on solving a problem and correcting weaknesses. However, the positive psychology perspective states that if you focus on your strengths and talents, you will indirectly address your weaknesses and achieve success.¹ PP is an exciting area of focus that can help people discover meaning in their lives and build resilience in tough times.

Residency training and medical education are some of the psychologically hardest times for young professionals. Not exactly the best time to learn how to flourish into a well-rounded, patient-centered practitioner. Indeed, the rate of emotional exhaustion and depersonalization among medical residents and students has been estimated to be between 40% and 70%.² Burnout adversely affects judgment and clinical decision-making and undermines satisfaction.¹ In order to prevent burnout, it is important for educators, preceptors, and (most importantly) students to understand how they can improve their experience, develop a sense of well-being, and make the most of the educational experience.

Well-being encompasses a wide range of factors including the following components: Positive emotion, engagement, meaning, positive relationships, and accomplishment.³ Put into play, these elements create an environment that allows students to flourish.

Positive emotion, or, as Dr. Seligman, one of the founders of PP, likes to describe it, living the pleasant life, is the component that encompasses happiness and life satisfaction. Engagement is also pleasurable but we realize it only in retrospect. “Wow, I got lost in what I was doing!” Engagement occurs when a person uses their personal strengths and virtues in a manner that causes them to experience “flow.”³ For a teacher, the use of “strength-based education” is a shift in paradigm from focusing on the clinical expertise of the instructor to one of cooperative management and exploration of the skills, knowledge, and resources of each student.¹ Students can be coached to determine what their strengths are and evaluate whether potential projects, work settings, and other life pursuits provide opportunities to use those strengths.⁴ Meaning, belonging to and serving something greater than yourself, is another measure of well-being. Positive relationships add significantly to a person’s well-being. Relationships allow us to perform acts of kindness for others. It is through acts of kindness that we can produce the most reliable increase in our sense of well-being.³ And finally, accomplishment can be objectively measured and describes what people pursue in order to increase their well-being.³

Activities that allow us to build our positive emotions, engagement, meaning, positive relationships, and accomplishment substantially increase our sense of well-being. Those who have a sense of well-being are in a better position to learn and to perform well in their responsibilities. The following exercises can be applied by teachers to increase student well-being:⁵

Reflective Learning Exercises

Three Good Things

Activity: Every night for one week, write down three positive events that happened during the workday and answer the question of why each positive outcome occurred.
Benefit: Provides changes in perspective, a cognitive-behavioral strategy, to improve stress management and self-care. It can also be used as a reflection strategy to understand how to approach future endeavors in order to get more positive results. These lessons can be shared with group members.

Not Always, Not Everything

Activity: Reflect on negative outcomes and attribute them to factors that are not innate to the student and are temporary and specific.
Benefit: This allows students to see that failure may be situational instead of innate and allows them to perform better in subsequent attempts.

Satisfice more

Activity: Purposely choose to do what is good enough as opposed to maximizing, i.e., trying to obtain the best possible outcome by default. Determine what features are desired and pick the first option that meets those criteria.
Benefit: Teaching this skill to students builds efficiency. It can teach students the importance of prioritizing effort for major projects while doing less for minor ones, still obtaining positive outcomes while lowering stress levels.

Team-Building Exercises

Capitalization

Activity: Spend time sharing positive news and events that occurred such as the events stated in the above activity.
Benefit: For residency programs, the development of a structured support group that meets regularly in order to share positive events and accomplishments instead of just problems may improve and enhance the quality of the work/education setting.

Personal Development Exercises

Signature Strengths

Activity: Identify your top strengths by visiting www.authentichappiness.org. Aim to use these strengths intentionally every day or choose one to develop.
Benefit: This is one of the best methods of developing skills and overcoming weaknesses while increasing active learning. Using these strengths purposely for one week has been associated with increased happiness and less depression up to 3 months into the future.

Since fit between academic interest and career reduces the risk of burnout, using activities aimed at cultivating positive behaviors and the development of personal strengths will benefit students and teachers. These benefits are related to achieving greater student engagement as well as promoting mindful empathy, critical thinking, professionalism, and stress management.¹

References

1. Pedrals NG, et al. Applying positive psychology to medical education. Rev Med 2011; 139: 941-9.

2. Thomas NK. Resident burnout. JAMA 2004; 292: 2880-9.

3. Seligman MEP. Flourish. New York: Free Press, 2011.

4. Skerrett K. Extending family nursing: concepts from positive psychology. Journal of Family Nursing 2010; 16: 487-502.

5. Hershberger PJ. Prescribing happiness: positive psychology and family medicine. Family Medicine 2005; 37: 630-4.

April 3, 2013

What Do Elvis Presley and Aristotle Have in Common? Metaphor!

by Peggy Kraus, Pharm.D., Clinical Pharmacy Specialist, the Johns Hopkins Hospital

Aristotle once observed that “Those words are most pleasant which give us new knowledge. Strange words have no meaning for us; common terms we know already. It is metaphor which gives us most of this pleasure.”

The word metaphor comes from the Greek words meta or “over, across” and pherein or “to carry.”¹ Metaphors are often used in education to help “bridge the distance” (a metaphor) between what students already know and what they need to know.¹ Every metaphor highlights one aspect of the concept, just as it hides another.¹ George Lakoff, Professor of Linguistics at University of California, Berkeley and Mark Johnson, Knight Professor of Liberal Arts and Sciences at University of Oregon, called this “metaphorical systematicity.”¹ Are we not “bridging the distance” through distance learning (yep, a metaphor) thanks to the power of technology in our own class?

Metaphors can be used to create a pattern and expectations, shape the way we think, and influence the decisions or thoughts of others.¹,²,³ In education, are we not trying to do these very things? We need to relay complex, often abstract concepts. Metaphors can help students understand the concepts.⁴

Using “the web” as a metaphor for the Internet highlights some of its essential characteristics while making other, non-web-like qualities less apparent.¹ Information is education (another metaphor).¹ Some assume information transmission is the main purpose of education, or that the content of education is synonymous with information.¹ But if this were the case, the Internet would do away with the need for schools and colleges.¹ What is lost or hidden in this metaphor is that attending too closely to information overlooks the social context that helps people understand what the information means and why it matters.¹ This information needs to be assimilated, understood, and made sense of, and that understanding is different depending on the learner.¹

We use metaphors as a bridge to understand educational contexts. Researchers and participants often draw on pre-existing knowledge to explain current experiences.³ Metaphors accomplish this by enabling the connection of information about a familiar concept to another familiar concept.³ That can lead to a new understanding in which the comparison between the two concepts acts as a generator for new meaning (another metaphor).³ They can be used to take knowledge that is already held and build the scaffold (another one) to teach or learn a newer concept.⁴

In order to examine the use of metaphor, Devon Jensen classified metaphors into four categories: active, inactive, dead, and foundational.³ Active metaphors carry saliency between the topic and vehicle terms. For example, “This school is a real melting pot.”³ In inactive metaphors, the topic term must be interpreted through the vehicle term, as in “The car race ended in a massacre.”³ In dead metaphors, the saliency between the topic and the vehicle terms is not apparent due to a lack of knowledge or experience with the characteristics of the vehicle term. For example, “working downtown is a real rat race” is only understandable to modern man when the concept of a “rat race” is explained; few of us today have experienced or witnessed rats in a frenzy.³ In the last category, foundational, the metaphor defines the centrally important features of the concept. Example: “organization as machine.”³

Jensen then used these classifications and searched for studies that used metaphors and metaphor analysis as their central method.³ He found 1,128 studies, which surprised me. He then classified the studies into five major themes. Studies in theme one attempted to raise awareness of modern metaphors that legitimized social process with regard to power and politics.³ Studies in the second theme examined metaphoric usage within an educational setting and led to change in education practice, policy, and/or roles.³ The third theme was a group of studies that examined techniques and procedures for measuring, understanding, and interpreting metaphors in educational and literary writing.³ Theme four examined the usage, implementation, and/or analysis of metaphor in student, school, and institutional writing.³ And the last theme was qualitative education research, characterized by studies that look at how participants use metaphor to describe existing educational states.³ Metaphors can be myths that limit growth or new ideas that expand possibilities.³

One must be careful about the use of metaphors because they can lead to confusion or misunderstanding.⁴ This is particularly true when there are cultural differences between students and instructors or when the metaphor is too old for a younger audience to understand. Metaphors mean different things to people of different cultures and ages.

James Geary, a writer and the former European Editor of Time, claims in his TED talk entitled "Metaphorically Speaking" that we use six metaphors a minute.² I really did not spend much time thinking about metaphors until I started working on this blog, but I now recognize that I use them a lot without even realizing it. Geary starts his presentation by analyzing the many metaphors found in Elvis Presley’s song, “All Shook Up.” It might sound a little weird but it’s an interesting analysis. Later in his presentation, he draws parallels between René Descartes’ famous philosophical declaration and Elvis’ song. “I think therefore I am” was translated into English from the Latin “cogito ergo sum.” But according to Geary, the literal translation should be “I shake things up, therefore I am.” So perhaps Elvis was trying to tell us something really deep through the use of metaphor!

References:

1. Meyer, K.A. Common Metaphors and Their Impact on Distance Education: What They Tell Us and What They Hide. Teachers College Record. 2005; 107 (8):1601-1625.

2. Geary J. Metaphorically speaking. TED.com. Accessed February 20, 2013.

3. Jensen, D.F.N. Metaphors as a bridge to understanding educational and social contexts. International Journal of Qualitative Methods 2006, 5(1), Article 4. Accessed March 30, 2013.

4. Ritcher, R. The use of metaphors in teaching and learning. The teaching tomtom. Accessed March 30, 2013.

April 2, 2013

Computerized Adaptive Testing

by David Cannon, Pharm.D., Clinical Instructor, University of Maryland School of Pharmacy

Unique assessment tools have always been fascinating to me. Once, when I was taking a practice exam consisting of 25 questions on pharmacy law, the following message appeared: “You scored a 56%, you passed!” How could that be, I thought? Surely the minimum passing score for a state law exam could not be that low! But, as it turned out, this exam was an adaptive test. While the computer was reporting the percentage of questions I answered correctly, behind the scenes it was doing calculations based on the difficulty and weight of the questions. Once I began to peel back the surface of these complicated algorithms, I wanted to learn more. But first, let’s review some basics about assessment …

The purpose of an exam, including high-stakes exams used to make state licensure decisions, is to use the assessment data (answers to test items) to make inferences about the learner. Assessment is best approached by first considering what the end requirements of the learner are. Then think about what actions, jobs, or thoughts would illustrate mastery of the desired requirements. Deciding what the goals of the assessment are makes the process of actually creating it much easier.¹

Evidence-Centered Design (ECD) utilizes a series of key questions to analyze the assessment design. Table 1 is a good example of a set of questions recommended by Mislevy, et al.:¹

Table 1:

a. Why are we assessing?

b. What will be said, done, or predicted on the basis of the assessment results?

c. What portions of a field of study or practice does the assessment serve?

d. Which knowledge and proficiencies are relevant to the field of study or practice?

e. Which knowledge or proficiencies will be assessed?

f. What behaviors would indicate levels of proficiency?

g. How can assessment tasks be contrived to elicit behavior that discriminates among levels of knowledge and proficiency?

h. How will the assessment be conducted and at what point will sufficient evidence be obtained?

i. What will the assessment look like?

j. How will the assessment be implemented?

Taken from Automated Scoring of Complex Tasks in Computer-based Testing¹

ECD draws parallels to instructional design in that these questions do not necessarily need to be asked in order, the outputs to each question should be considered when examining the others, and these questions should be repeated as necessary.¹ To understand how evidence-centered design is utilized in creating assessments, the assessment tool must be broken down into its individual components.

When designing assessments used for licensing examinations, many domains of knowledge are tested. A domain is a complex of knowledge or skills that is valued, with identifiable features of good performance, situations in which proficiency can be exhibited, and relationships between knowledge and performance.¹ In a high-stakes examination, like a state board licensure exam, it is not sufficient for an examinee to be competent in only one domain but not the others. To test proficiency in each of the domains, smaller subunits of the assessment called “testlets” are used. Testlets typically contain a group of related assessment items that elicit the behaviors associated with the domain.³ It is vital to understand how these examinations are designed from an evidence-based perspective in order to evaluate the validity of computerized adaptive testing.

So what is a computerized adaptive test (CAT) anyway? CAT is an assessment tool that utilizes an iterative algorithm with the following steps:²

1) Search the available items in the testlet domain for an optimal item based on the student’s ability

2) Present the chosen item to the student

3) The student either gets the item right or wrong

4) Using this information as well as the responses on all prior items, an updated ability estimate of the student is determined

5) Repeat steps 1-4 until a termination criterion is met
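The five steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it uses the one-parameter (Rasch) logistic model, a standard normal prior on ability, and a standard-error stopping rule, none of which are specified in this post. The item bank, the function names, and the simulated examinee are all hypothetical.

```python
import math


def p_correct(ability, difficulty):
    """Rasch model: probability of answering an item correctly."""
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))


def pick_item(ability, bank, used):
    """Step 1: choose the unused item whose difficulty is closest to the
    current ability estimate (the most informative item under Rasch)."""
    candidates = [i for i in range(len(bank)) if i not in used]
    return min(candidates, key=lambda i: abs(bank[i] - ability))


def update_ability(responses, steps=25):
    """Step 4: re-estimate ability from all responses so far, using
    Newton-Raphson on the posterior with a standard normal prior
    (the prior keeps the estimate finite after all-right or all-wrong runs)."""
    theta = 0.0
    for _ in range(steps):
        grad, info = -theta, 1.0          # contributions of the prior
        for difficulty, correct in responses:
            p = p_correct(theta, difficulty)
            grad += (1.0 if correct else 0.0) - p
            info += p * (1.0 - p)
        theta += grad / info
    # Standard error from the information at the final estimate.
    info = 1.0 + sum(p_correct(theta, d) * (1.0 - p_correct(theta, d))
                     for d, _ in responses)
    return theta, 1.0 / math.sqrt(info)


def run_cat(bank, answer, max_items=20, se_stop=0.5):
    """Steps 1-5: loop until the estimate is precise enough or items run out."""
    theta, se, used, responses = 0.0, float("inf"), set(), []
    while len(used) < min(max_items, len(bank)) and se > se_stop:
        i = pick_item(theta, bank, used)           # step 1
        correct = answer(bank[i])                  # steps 2 and 3
        used.add(i)
        responses.append((bank[i], correct))
        theta, se = update_ability(responses)      # step 4
    return theta, se, len(used)


# Hypothetical bank: difficulties from -2.5 to 2.5 in steps of 0.25.
bank = [i * 0.25 - 2.5 for i in range(21)]
# Hypothetical examinee who gets every item easier than 1.0 correct.
theta, se, n_items = run_cat(bank, lambda difficulty: difficulty < 1.0)
print(round(theta, 2), round(se, 2), n_items)
```

Real testing programs use larger calibrated item banks and more elaborate selection and stopping rules; the point here is only the shape of the select, present, score, and update loop.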

CAT is utilized in many high-stakes licensing examinations such as the National Council Licensure Examination (NCLEX), which is required by most states for nurses before they can practice. In the case of NCLEX, after each item is answered by the examinee a calculation is done in the background that determines an estimate of the person’s competency based on the difficulty of the item answered. The computer then asks a slightly more difficult question and applies the algorithm again, creating a new estimate of the candidate’s competency. This is repeated until the computer reaches a predetermined cutoff (with a 95% confidence interval in the case of the NCLEX³) for minimum competency or until the number of test items has been exhausted. Put another way, the exam will cease when the algorithm has determined with 95% certainty that the student’s ability falls above or below a minimum competency standard. Check out this VIDEO of how the algorithm works behind the NCLEX.³
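A stopping rule of this kind can be sketched as a simple confidence-interval check. The sketch below is an assumption-laden illustration, not the NCLEX implementation: it assumes the ability estimate and its standard error are on the same scale as the passing standard, uses a two-sided 95% interval (z = 1.96), and places the passing standard at a hypothetical value of 0.0.

```python
def pass_fail_decision(ability, standard_error, passing_standard=0.0, z=1.96):
    """Return 'pass', 'fail', or 'continue' depending on whether the
    95% confidence interval around the ability estimate clears the
    passing standard. All default values are illustrative assumptions."""
    lower = ability - z * standard_error
    upper = ability + z * standard_error
    if lower > passing_standard:
        return "pass"        # whole interval is above the standard
    if upper < passing_standard:
        return "fail"        # whole interval is below the standard
    return "continue"        # interval straddles the standard: ask another item


print(pass_fail_decision(1.0, 0.3))
print(pass_fail_decision(-1.0, 0.3))
print(pass_fail_decision(0.2, 0.3))
```

Note that the decision depends on the estimated ability and its precision, not on the raw percentage of items answered correctly, which is how a score like 56% can still be a pass on an adaptive test.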

Now you might be wondering how you create an adaptive test. It’s a pretty complicated process that involves breaking down the subject areas you want to test into different domains. Then you’d need to develop an item bank for each domain. Content experts would come together and decide what items should be included while at the same time evaluating their appropriateness and difficulty/weight. A great free resource for creating your own adaptive test can be found here.⁴ The NCLEX is a great example of how computerized adaptive testing brings together the ideas of evidence-centered design and instructional design by helping educators assess their students with greater accuracy.

References:

1. Mislevy RJ, Steinberg LS, Almond RG, Lukas JF. Concepts, Terminology, and Basic Models of Evidence-Centered Design. In: Williamson D, Mislevy R, Bejar I (eds). Automated Scoring of Complex Tasks in Computer-based Testing (1st ed). Mahwah, NJ: Lawrence Erlbaum Associates, 2006 (pp 15-47).

2. Thissen D, Mislevy RJ. Testing Algorithms. In: Wainer H (ed). Computerized Adaptive Testing: A Primer. Mahwah, NJ: Lawrence Erlbaum Associates, 2000.

3. Computerized Adaptive Testing (CAT). National Council of State Boards of Nursing. Accessed on: March 11, 2013.

4. Software for developing computer-adaptive tests. Assessment Focus. Accessed on: March 26, 2013.