Ubuntu Certified Professionals

Alan McKinnon alan at linuxholdings.co.za
Fri Apr 7 18:01:58 UTC 2006


On Friday 07 April 2006 11:26, Daniel Carrera wrote:
> email.listen at googlemail.com wrote:
> >>I need to dispel this myth that somehow multiple choice
> >> questions are inherently poor quality and that practical exams
> >> must be better.
>
> It depends entirely on what you are measuring. Some things might
> lend themselves to multiple choice and some things don't. I've
> taught mathematics for several years and I wouldn't use multiple
> choice on a math exam (I think that the SATs are worthless btw).
> Likewise, you can't replace an essay question with multiple choice.
> I've seen some good physics and astronomy exams that were multiple
> choice. It's not as simple as "multiple choice is ok" or "multiple
> choice is bad". It has to be apt for what you are testing.

Correct. Both methods can be valid if used correctly. For maths, I'd 
want to see the complete worked-out solution (making it an essay 
question), and for practical tasks like flying planes nothing beats a 
practical test. Practical tests are excellent for answering this 
question: "Can the candidate follow the exact steps of the protocol 
needed to produce a known desired result?" 

LPI can't use this method, for two reasons:

1. The exams are designed to be distro-neutral, so there is no 
single "exact sequence of steps" to test. Which distro's steps 
would you use?
2. Practical tests are expensive to deliver and maintain. LPI's 
mandate is to deliver good tests to many people in a cost-effective 
manner.

> Btw, I also work for a computer certificate company (though we aim
> at a much lower level than LPI - roughly high-school level). Our
> certificate is awarded based on direct observation and samples of
> work because those are just more appropriate ways of measuring the
> things we are trying to measure. This is also very good pedagogy
> because it gives the instructor flexibility to build a course
> around criteria that suit his teaching style and
> his students' needs.
>
> http://theingots.org
>
> >>The truth is that practical exams are just as removed from
> >> reality as multiple choice.
>
> Not the ones we do :)  But I guess that's because we don't require
> a single "exam day". The assessment is based on prolonged
> observation.

End-user testing is best done practically, as you are not testing the 
cognitive domain; you want to know whether the candidate can print a 
document from Writer or enter the correct formula in Calc, for 
example. There's only one way to do the task, so you check whether 
he/she can do it.

> Btw, we don't sell courses, we just do certification.
>
> >> And practical exams are not reproducible,
>
> I don't see why a practical exam is less reproducible than a
> multiple choice. You still have to ask the pupil to either tick the
> boxes again, or perform a task again, and you can still get
> different results because the testee was having a bad day.

Ah, but the RHCE doesn't work like that: you don't get the same exam 
the second time round, you get a completely different set of tasks to 
perform, and to the best of my knowledge no-one has yet found a way 
to normalize the results to a common reference point. Red Hat has 
been asked many times to show the psychometric validity of their 
exams, and to date they have not been able to do so, which means 
that the results are not scientifically validated. That doesn't mean 
it's a bad exam (it's not); it means that RH can't scientifically 
prove what they claim, even if the claim happens to be correct.

Research shows that practical tests are influenced by the testee's 
mood: having a bad day can ruin your exam results. But measurement 
shows that multiple choice doesn't suffer from this.
>
> >>You can't validly compare two people
> >>writing different practical exams and prove you are comparing
> >> apples and apples.
>
> You can't compare two people writing different multiple choice
> exams and prove you are comparing apples and apples.

But you can, using the Angoff standard-setting method. First, the 
testee profile is defined, then every question in the database is 
rated by subject matter experts as to what percentage of testees are 
expected to get it right. The exam objectives themselves are assigned 
weights according to their relative importance. This gives a common 
base from which to start.
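
To see how the numbers combine, here's a toy Python sketch. All the 
weights and ratings below are invented for illustration; this is not 
LPI's actual data or process, just the arithmetic of weighted Angoff 
ratings:

import statistics

# Each item: (objective weight, SME ratings of the probability that
# a minimally competent candidate answers correctly).
# All numbers are invented for illustration, not LPI data.
items = [
    (2.0, [0.80, 0.75, 0.85]),  # e.g. a basic "what does ls do" item
    (1.0, [0.40, 0.50, 0.45]),  # a harder, less central item
    (1.5, [0.60, 0.65, 0.55]),
]

def angoff_cut(items):
    """Weighted mean of per-item SME ratings -> expected passing
    proportion for a minimally competent candidate."""
    total_w = sum(w for w, _ in items)
    weighted = sum(w * statistics.mean(r) for w, r in items)
    return weighted / total_w

print(round(angoff_cut(items), 2))   # about 0.66 here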

Then the exam goes through a beta test phase, and the results are 
statistically analysed. If 80% of testees are predicted to know what 
ls does, then you check that about 80% of them get those questions 
right. If they don't, then something is wrong with the rating or the 
question and it has to be fixed. Then you look for anomalies, like a 
question where most testees who got it right failed the exam as a 
whole and vice versa (this actually happened at least once), and that 
question gets fixed or dropped.
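
The same two checks in sketch form. The tolerance, the candidate 
data and the scores are all hypothetical; the point is only the 
shape of the analysis:

from statistics import mean

def item_check(predicted, answers, totals, tolerance=0.10):
    """answers: 1/0 per candidate for this item;
    totals: each candidate's overall exam score."""
    observed = mean(answers)
    if abs(observed - predicted) > tolerance:
        print(f"re-rate or rewrite: predicted {predicted:.2f}, "
              f"observed {observed:.2f}")
    # crude discrimination check: did the candidates who got this
    # item right do better on the exam as a whole?
    right = [t for a, t in zip(answers, totals) if a]
    wrong = [t for a, t in zip(answers, totals) if not a]
    if right and wrong and mean(right) < mean(wrong):
        print("anomaly: item correctness anti-correlates with "
              "overall score; fix or drop the question")

# five hypothetical candidates on the "what does ls do" item
item_check(0.80, [1, 1, 0, 1, 0], [620, 540, 480, 590, 700])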

In LPI's case the correlation between predicted and observed results 
is in the range 0.88 to 0.92, which is quite exceptional - probably 
because a large number of experienced and respected Linux sysadmins 
wrote the questions. Collectively they have many hours of real-life 
experience to draw from.

> Why should it? If it's a *different* exam, then it's not apples and
> apples.

That's not the idea. Instead you want to show statistically that if 
candidate X scored 500 on any arbitrary set of 65 exam questions, 
then the statistical probability is that they would score about 500 
on any other arbitrary set of 65 questions from the same pool.

Granted, this would only fail if the mathematics of statistics and 
probability wasn't valid, but I don't think there's much doubt about 
that :-)
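
In code terms the claim is just sampling theory. Here's a toy 
simulation; the pool size, ability model and scaling are all made 
up and have nothing to do with LPI's real scoring:

import random

random.seed(1)

# 400-item pool; each value is the chance that this one candidate
# answers that item correctly (invented ability model)
pool = [random.uniform(0.4, 0.9) for _ in range(400)]

def sit_exam(n=65):
    subset = random.sample(pool, n)
    correct = sum(random.random() < p for p in subset)
    return 200 + round(600 * correct / n)   # naive 200-800 scaling

print(sorted(sit_exam() for _ in range(10)))
# the ten scores cluster around a common mean rather than
# scattering across the scale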

> >>There has to be $ changing hands, unless the sabdfl is prepared
> >> to finance the cert in perpetuity. There are courier costs for
> >> the written exams, Prometric and Thomson Vue want their slice
> >> for computer delivered exams. Clerical staff need to be paid,
> >> proctors have to be transported to the exam venue to supervise,
> >> etc, etc. $100 is dirt cheap, compare what it costs to write
> >> some other exams.
>
> Reminder: The certification I work with aims at a totally different
> market than LPI, but I want to talk about it anyways :)
>
> Because we use a different assessment method we remove 95% of the
> bureaucracy and can afford to give the exams at $10 :)  One
> disadvantage of multiple-choice exams is that they automatically
> force quite a bit of bureaucracy. Instead, we train teachers (who
> are already professional instructors) for a fee, and they then
> incorporate our practical certificate in their course and pupils
> pay only about $10 for the certificate. The teacher time is
> financed by the school anyway, so overall it's cheap for the
> school and cheap for the pupils. We do spot checks on statistically
> representative samples to improve quality control (more thorough
> checks for more advanced levels).

Looks like INGOTs has a streamlined operation, so well done for that. 
Why do you say multiple choice exams produce bureaucracy? I know the 
Microsoft ones do, but that's because that program is designed to 
generate revenue and the staff are working to the business model.

LPI gets by with something like 10 staff in Canada. There's a team of 
about 6 to 10 working on exam development, and a network of volunteer 
proctors who deliver the exams in their respective areas. You can't 
get much leaner than that :-)

Historically, computer-based LPI exams have cost $100, and more than 
half of that goes to Prometric or Thomson Vue. Paper-based exams cost 
$25, and that includes the cost of couriering them all over the 
world. For security reasons, exams are printed and marked only in 
Canada, hence the need to courier stuff.

-- 
Alan McKinnon
alan at linuxholdings dot co dot za
+27 82, double three seven, one nine three five



