In this episode, we sit down with Ken Phillips, a seasoned professional who has spent over 40 years in the Learning and Development (L&D) field as an independent consultant. Ken’s journey has seen a significant evolution from focusing on performance management and sales performance to becoming a leading expert in measuring and evaluating learning.
Join us as we delve into the transformative changes happening in the L&D landscape, driven by technology, and explore how these changes can be harnessed for positive impact. Discover how innovations like microlearning, virtual training, AI, and ChatGPT are shaping the future of L&D and why the key lies in using them correctly.
Ken also shares exciting developments on the horizon, including upcoming speaking engagements and his work on two groundbreaking books. The first book centers on predictive learning analytics, offering insights into tackling scrap learning, while the second one delves into integrating the four-level evaluation model for a more comprehensive approach to measuring the effectiveness of training programs.
We also recap the key takeaways from the recent webinar, Creating Level 2 Quizzes and Tests That Actually Measure Something, from the art and science of crafting valid multiple-choice test questions to the significance of measuring job application in assessments. This leads to a content-rich Q&A session, where Ken answers some of the burning questions about creating effective test questions and test design, providing valuable insights into best practices and pitfalls to avoid.
If you’re in the world of Learning and Development or simply curious about the evolving landscape of education and training, this episode is a must-listen. Join us for a deep dive into the fascinating world of L&D with a true industry expert.
Full show notes: Creating Level 2 Quizzes and Tests That Actually Measure Something
00:04
Welcome to this week’s episode of the HRDQ-U In Review podcast, where we bring you the latest insights and practical tools for enhancing soft skills training within your organization. This podcast is brought to you by HRDQU.com, and I am your host, Sarah, Learning Events Manager at HRDQ-U. And today I have Ken Phillips joining me to discuss the webinar, Creating Level 2 Quizzes and Tests That Actually Measure Something.
00:32
So thanks so much for joining me today, Ken. Well, thank you for asking me, Sarah. I’ve been looking forward to this ever since we first started talking about a date to get together and do some follow-up to the webinar that I did for HRDQ-U last week.
00:54
Yeah, we had a really engaged audience, and Ken, you’ve done numerous webinars with us in the past. You’re the author of the Coaching Skills Inventory that’s available over at HRDQstore. But I believe this is your first time joining me on the podcast. Am I right? Yes, that’s right. I’ve done several webinars, probably four or five, I think, but this is the first podcast.
01:17
Yeah, so for our audience tuning in, can you share a little bit about who you are, what you do, and your background? OK, sure. To start with, I’ve been in the learning and development field for a little over 40 years, so that’s a long time. I started out doing consulting and training
01:44
in two areas: one was sales performance, and the other was performance management. I did that for probably, I don’t know, close to 30 years. I was consulting with corporations to help them implement performance management systems, doing training around all the skills needed to implement those systems, and doing sales performance training as well.
02:11
During that period, while I was working on sales performance and performance management, I had submitted proposals to speak at the ATD International Conference a number of times. Never got accepted. But I didn’t quit. In 2008, I submitted a proposal around
02:38
measuring and evaluating training. It focused on Level 1 evaluations and how to create more valid, scientifically sound Level 1 evaluations, and I got accepted to speak. So that was my first time speaking at the ATD International Conference, but it also moved me in the direction of measurement and evaluation. I mean, with all the learning instruments that HRDQ publishes that I’ve authored,
03:08
I was dabbling in measurement and evaluation, but I really wasn’t fully invested in it. That experience pushed me in that direction, and since then all my work has focused on measurement and evaluation of learning. I do speaking, consulting, and writing; I’ve got articles and other pieces out there. So that’s basically what I’m doing now.
03:36
Great. And this is a question I love to ask all of my guests who come on the podcast: what changes do you see happening in the L&D space right now? Oh, I’m probably not the first one to say this, since you’ve asked that question before, but it’s technology, technology, technology. And I look at it as basically a good thing.
04:00
What it’s done is open up lots of different ways of providing learning: cutting down travel time and travel expenses, allowing for things like little microlearning modules for reinforcement, and enabling virtual synchronous and asynchronous training. And now
04:28
with all the stuff around AI and ChatGPT, Lord knows where that’s going to end up, but it looks like it has a promising future. I think the key, though, is that these things have to be used appropriately. There are a lot of bells and whistles with the technology, and
04:55
if it’s not used appropriately or effectively, it probably doesn’t help all that much, even though the possibilities are there. Yeah, that is the overwhelming response: AI, and just how fast the technology has changed, how it’s changing every single month, there’s always something new, and how fast
05:24
that’s impacting our workforce. Yep. And Ken, what exciting things are you up to next? At the end of November, I’m going to be doing a Mastering Measurement and Evaluation Certificate Program for Training Magazine. It’s an online certificate program, and it consists of four three-hour modules spread over a two-week period.
05:53
I’m also speaking at the end of November at the ATD Core 4 conference. It’s a virtual conference, and I’m presenting on Level 1 evaluation; the title of the session is Add Muscle to Your Level 1 Evaluations with Predictive Questions. So I’ll be doing that.
06:18
And then the next thing I’ve got coming up probably isn’t until February, when I’m going to be speaking at the Training Magazine annual Conference and Expo in Florida in the middle of February. I’ve also been asked to do my Mastering M&E certificate program workshop
06:45
as a pre-conference workshop. So I’ll actually be doing the pre-conference workshop and then speaking during the conference. Well, it sounds like you have some exciting projects in your pipeline. Yeah, and I’ve got some other things that I didn’t want to mention because they aren’t finalized yet. But my goal every year is to do at least
07:12
a minimum of 15 conferences, local chapter talks, or things of that nature. I’m out there submitting proposals and doing all that kind of work so I have the opportunity to present this information that I’ve been working on for a number of years now.
07:38
Yeah, absolutely. And so we recently did the webinar together titled Creating Level 2 Quizzes and Tests That Actually Measure Something. Can you share the key takeaways from that event for folks who maybe haven’t had the opportunity to tune into the webinar just yet? Yeah, I think there were two key ones when I thought about your question. The first has to do with
08:07
the fact that just because people have taken multiple-choice tests doesn’t mean they know how to create multiple-choice test questions. There are a number of common errors that people who aren’t savvy in the art and science of creating test questions make inadvertently, not intentionally, that either
08:33
give away or offer clues to the correct answer, or create questions that the test takers, or the participants if you’re talking about learning, view as either tricky or overly difficult and don’t see as fair. The net result is that you end up with some learners who are
09:03
frustrated with the test because they don’t think it’s really a fair representation of what they know. On the other side, it also gets in the way of the validity of the data you’re collecting. You may end up collecting a lot of data that looks like people learned something when they didn’t, because the test questions contained lots of clues to the correct answers. Or, if you wrote
09:26
tricky or overly difficult questions, it may look like people didn’t learn anything when in fact they did. So it’s about striking that balance and avoiding these common errors so that the data you’re collecting is valid, sound, and credible. Yeah, it sounds like crafting effective test questions is both an art and a science. Right.
09:54
Right. And the other thing we talked about, the second key takeaway, was the whole idea of writing test questions that not only measure recall, whether people can remember what was covered in the training, but also whether they know how to apply it: writing what are referred to as job-application-focused test questions. You get
10:23
so much better insight and data if you can write job-application-focused test questions, because you’re not only measuring whether people learned something, you’re also measuring whether they know how to apply it. And that’s key if you decide to do a Level 3 evaluation: if you collect your Level 3 data and it says, oops, people aren’t applying it, then you can go back to that Level 2 data and say, hey,
10:51
it wasn’t the training program that was the problem. Something happened after people went back on the job that got in the way and prevented them from applying it, because when we gave them the knowledge test, they knew how to apply it; we used job-application-focused test questions. And why should you wait one to three weeks following a training program before administering a Level 2 knowledge test?
11:19
Well, I think there are two reasons. One is credibility. Business executives wouldn’t send their employees or associates to a training program if they didn’t think they were going to learn something new. So if you administer your Level 2 knowledge test immediately after the training program is over,
11:44
even if you come back with really outstanding results, where 95% of the people scored 89% or better on the knowledge test, the business executives tend to dismiss that. They may not say it to you, but they’re thinking: well, I wouldn’t have sent my associates to your training program if I didn’t think they were going to learn something. So the fact that you’re telling me
12:10
that they did learn something is kind of interesting, but it doesn’t impress me all that much. But if you wait one or two or perhaps even three weeks, then if you come in with those same kinds of results, 95% of the people scoring 89% or better, now you’ve got a story, because the business executives know,
12:34
probably because they’ve experienced it themselves and have sent other people to training, that when people come back from training, some of them just don’t apply what they’ve learned. So now, if you’re telling me that they really learned this, and you’ve got data that you collected after the forgetting curve came into play, now you’ve got a story.
12:59
And why does the correct answer often contain the most words when creating multiple-choice test questions? Well, that’s one of those problems that happens just naturally when people write test questions. When you write them, especially if you were the one who designed the training, you know a lot more about the correct answer
13:27
than you do about the information in the other distractors for your multiple-choice test question. So we tend to put more information in the correct answer simply because we know more about it; it happens naturally. My suggestion and recommendation is: look, don’t worry about that when you initially write your test questions. Write them all out. Then go back
13:55
and look at the correct answer for each question and make sure it’s roughly equivalent in word length to the other distractors. It doesn’t have to be exactly the same, but don’t get hung up on it when you’re initially writing, because it’ll just bog you down. Get it all out there, then go back and edit your correct answers or add additional information to the distractors.
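As a rough illustration of that editing pass, here’s a minimal Python sketch that flags questions whose correct answer runs noticeably longer than its distractors. The question format and the 1.5x threshold are illustrative assumptions, not something prescribed in the webinar:

```python
# Illustrative sketch (not from the webinar): flag multiple-choice items
# whose correct answer is noticeably wordier than the average distractor,
# following the "roughly equivalent word length" guideline. The data
# layout and the 1.5x ratio are assumptions for demonstration purposes.

def word_count(text: str) -> int:
    return len(text.split())

def flag_wordy_answers(questions: list[dict], ratio: float = 1.5) -> list[str]:
    """Return the stems of questions whose correct answer exceeds the
    average distractor word length by more than `ratio`."""
    flagged = []
    for q in questions:
        correct_len = word_count(q["correct"])
        avg_distractor = (
            sum(word_count(d) for d in q["distractors"]) / len(q["distractors"])
        )
        if correct_len > ratio * avg_distractor:
            flagged.append(q["stem"])
    return flagged

quiz = [
    {
        "stem": "When should a Level 2 knowledge test be administered?",
        "correct": "One to three weeks after the training program, once the "
                   "forgetting curve has come into play",
        "distractors": ["Immediately after training",
                        "Six months later",
                        "During the program"],
    },
]

print(flag_wordy_answers(quiz))  # stems worth a second editing pass
```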
14:22
And how can you tell whether the response options for a multiple-choice test question are viewed as plausible? Yeah, it’s fairly easy to do, but most people in L&D, at least in my experience, don’t do it. They write the test, create the test questions, and administer the test.
14:49
But the easy way to do it is to collect data from learners who have taken the test, and you don’t need a large number, 15 or 20, and then go through and do an item analysis: look at each of your test questions and its response options, and track, of those 15 or 20 learners you collected data from,
15:17
how many of them chose each of the response options. What you’re looking for is some kind of distribution across all the response options; assuming, let’s say, you use four, you’d want people to have selected all four of them. If you end up with some response options that are significantly under-selected, or that nobody selects,
15:47
you’ll know those response options aren’t seen as plausible. People look at the test question and know right away: I can eliminate this one, this one, and this one, because those aren’t plausible. So even if they slept through your training program, they’d be able to get the correct answer. Or even if they could only eliminate two
16:11
out of the four, they’ve now got a 50-50 chance of guessing the right answer. So that’s why you need to write plausible response options, and the only way you know whether yours are plausible is to collect a little bit of data and do the analysis.
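To make the mechanics concrete, here’s a minimal Python sketch of that kind of item analysis: tally how many learners chose each response option per question, then flag options that few or none selected. The response format and the flagging threshold are illustrative assumptions, not Ken’s actual tooling:

```python
# Illustrative sketch (not Ken's actual tooling): count how many learners
# chose each response option per question, then flag options that were
# never or rarely selected -- likely implausible distractors to rewrite.
from collections import Counter

# Each learner's answers: question id -> chosen option letter.
responses = [
    {"Q1": "A", "Q2": "C"},
    {"Q1": "A", "Q2": "B"},
    {"Q1": "B", "Q2": "C"},
    {"Q1": "A", "Q2": "C"},
]

def item_analysis(responses, options=("A", "B", "C", "D"), min_count=1):
    """Count selections per option for each question; list options chosen
    fewer than `min_count` times as candidates for rewriting."""
    question_ids = sorted({q for r in responses for q in r})
    report = {}
    for q in question_ids:
        counts = Counter(r[q] for r in responses if q in r)
        review = [opt for opt in options if counts[opt] < min_count]
        report[q] = {"counts": dict(counts), "review": review}
    return report

for q, info in item_analysis(responses).items():
    print(q, info["counts"], "needs review:", info["review"])
# Q1 {'A': 3, 'B': 1} needs review: ['C', 'D']
# Q2 {'C': 3, 'B': 1} needs review: ['A', 'D']
```

With a real test you’d want all four options to draw at least a few selections; anything in the review list is a candidate for a more plausible rewrite.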
16:41
And what about all of the above? Why is that not a recommended response option? The problem is that savvy test takers know that whenever they see an all-of-the-above response option, more than likely it’s the correct answer. So in order to avoid that, the key thing is not to use all of the above, because, as I said, savvy test takers will know it’s the correct answer even though they may have slept through your training. Or the other option would be to use
17:08
all of the above with some questions where it’s not the correct answer, so that you balance it out and end up with credible, valid data. And Ken, what I’m gathering here is that you’re probably a savvy test taker yourself. And what’s the problem with using “not” or negatively worded multiple-choice test questions?
17:35
Yeah, again, I see lots of test questions that L&D people create using that construct, what’s called a “not” or negatively worded test question. There are two problems with it. One is that a lot of people see negatively worded test questions as tricky, because we’re asking which option is not something. And if they miss the test question,
18:04
then they feel like it’s not fair. And if that happens with a number of the learners, you end up with a whole group of learners who feel the test wasn’t very fair, and they end up dismissing the results: wait a second, I didn’t do very well on this test, but it wasn’t my fault, it was all these tricky test questions. The other problem comes in if you’re thinking about
18:31
using your test questions, as we talked about earlier, for reinforcement of the training content. That was when we talked about waiting one to three weeks after the training was over before administering your test. Why in the world would you ever want to reinforce something you don’t want people to remember? From a learning standpoint,
18:56
that just doesn’t make any sense, and it’s not going to get you where you want to go. So you want to avoid the use of negatively worded test questions. And Ken, before I let you go today, can you share where listeners can go to learn more about your work? Yeah, you can go to my website, which is www.
19:23
Phillips, my last name, P-H-I-L-L-I-P-S, Associates, all spelled out, dot com. On my website are a whole host of articles that I’ve written, all free to download. I’ve also got some free eBooks there, and
19:47
some blogs that are there for free as well. So if you’re interested in getting more information, that’s one place. I’m also on LinkedIn, and I send out updates about things I’m doing, things I’ve written, or places I’m speaking. So if you want to connect with me, you can just send me a connection request and I’ll accept it.
20:15
We’ll be connected, and then you’ll be on the list of people who receive that information. So if you’re ever at a conference where I’m speaking, we can connect, or if you have questions about anything I’m sending out, you can reach out to me and we’ll talk about it. Well, great. Thank you so much for your time today, Ken. Oh, you’re welcome, Sarah. Thank you. I hope you have a great weekend.
20:40
You as well. And if you haven’t listened to the webinar yet, you can click the link below in the description. It was a really engaging and interactive session, so make sure to check it out. We hope you enjoy listening to the HRDQ-U In Review podcast, available on all major streaming platforms. If you enjoyed today’s episode, give us a follow and leave us a review. We look forward to seeing you all next week.
Learn tips and best practices on how to write Level 2 quizzes and tests that produce valuable data. Emphasis is also given to writing test questions that measure job application, not mere recall of facts.

Ken Phillips
Ken Phillips is the founder and CEO of Phillips Associates and the creator of the Predictive Learning Analytics™ evaluation methodology. He is also a measurement and evaluation master, having spoken, to rave reviews, at the ATD International Conference on measuring and evaluating learning every year since 2008. He has also presented at the annual Training Conference and Expo every year since 2013 on similar topics.
Ken has pooled his measurement and evaluation knowledge and experience into a series of presentations designed explicitly for L&D professionals. The presentations are highly engaging, practical, and filled with relevant content most L&D professionals haven’t heard before. In short, they are not a rehash of traditional measurement and evaluation theory, but fresh ideas and solutions.
The HRDQ-U In Review Podcast, brought to you by HRDQU.com, brings you the latest insights and practical tools for enhancing soft-skills training in your organization. As a learning community for trainers, coaches, consultants, managers, and anyone passionate about performance improvement, we interview subject matter experts and thought leaders from recent webinars they presented with us to take a deeper dive into the content they shared and answer all your questions. Join us as we explore new ideas and industry trends, share success stories, and discuss challenges faced by professionals.
The HRDQ-U In Review Podcast is intended for HR and training professionals, organizational development practitioners, and anyone interested in improving workplace performance and productivity.
New episodes of HRDQ-U In Review are released every week.
The length of the episodes varies, but they typically range from 15 to 30 minutes.
The podcast covers a wide range of topics related to HR and organizational development, including leadership development, team building, communication skills, conflict resolution, employee engagement, and more.
HRDQ-U In Review is completely free to listen to.
You can listen to any available HRDQ-U In Review Podcast right on our website at HRDQU.com via the embedded Spotify player on the related webinar page. You can also find the HRDQ-U In Review Podcast on many popular streaming services.