You can always rely on Harry for good ideas, just as you can rely on me
to make a mountain out of a molehill.....

The rational scale idea makes eminent sense, but the use of such an
approach to scaling makes far more sense if:

a) the test has many items (where the impact of the rational weighting
scheme on potential measurement error is minimal, on an item-wise
basis), and
b) the test is not counting for a lot.

If the client is expecting a handful of items to do a lot of work for
them, then perhaps something more precise, in the way of weighting, is
called for. But if it has a lot of items (and I won't commit to where
the inflection point between "some" and "a lot" lies), and the test does
not have critical implications, when other things are factored in, and
if you have the appropriate assurances from SMEs, then I think you're
good to go.

Mark Hammer

I'd suggest a rational scale, something on the order of:

+2 = essential to do
+1 = nice to do
0 = no harm/no benefit
-1 = bad to do
-2 = terrible to do

Get a bunch of SMEs together and come to consensus.
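
As a rough illustration of how the scale above might be applied to one
multiple-answer item, here is a minimal Python sketch. The option labels,
the weights, and the function names are made up for the example; real
weights would come from the SME consensus.

```python
# Hypothetical SME consensus weights for one multiple-answer item,
# using the +2 .. -2 rational scale. Labels and values are invented.
ITEM_WEIGHTS = {"A": 2, "B": 1, "C": 0, "D": -1, "E": -2}


def score_item(selected, weights):
    """Sum the consensus weights of the options the test taker selected."""
    return sum(weights[option] for option in selected)


def max_item_score(weights):
    """Best possible score: select every positively weighted option."""
    return sum(w for w in weights.values() if w > 0)


if __name__ == "__main__":
    print(score_item({"A", "B"}, ITEM_WEIGHTS))        # both useful options
    print(score_item(set(ITEM_WEIGHTS), ITEM_WEIGHTS)) # everything selected
    print(max_item_score(ITEM_WEIGHTS))
```

With these particular weights, selecting every option nets zero, so
blanket selection is not rewarded; that depends entirely on the weights
the SMEs settle on.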

Hope that works for you.

Harry Brull

I am reviewing an on-the-job knowledge test developed by a client for
in-house use (a pseudo-certification test, I guess you could say). Many
of the items are multiple-answer, with the number of right and wrong
answers varying across items. They want to give partial credit for
getting some of the right answers correct. To do that, they'll also need
to penalize for selecting wrong answers (otherwise a test taker need
only select all answers--right or wrong--to get full credit). So...I'm
trying to help them figure out a reasonable way to do this. If the wrong
answers are worth the same points as the right answers (albeit with a
negative value), there are cases where a test taker could lose more
points than they'd earn. The only practice I can think to apply is the
correction for guessing, but I've never used such formulas, and it seems
like it won't be so straightforward, given that a) we're talking about
the possibility of multiple wrong answers for each question and b) the
number of right and wrong answers varies across items (i.e., I can't
arrive at a single formula for the whole test; might have to be a
formula for each item? Is there such a thing?).
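
For illustration only, a per-item rule of the kind being asked about
could be sketched as below. It is not the classical correction-for-
guessing formula and not anything the client has adopted; it simply
assumes each right answer is worth 1/R of the item and each wrong answer
costs 1/W, where R and W are that item's counts of right and wrong
options, with an optional floor at zero so an item cannot go negative.
The function and parameter names are hypothetical.

```python
def item_score(selected, correct, options, floor_at_zero=True):
    """Partial-credit score for one multiple-answer item, in [0, 1] when floored.

    Assumed rule (an illustration, not an established standard):
    each right answer selected earns 1/R, each wrong answer selected
    costs 1/W, where R and W are this item's counts of right and wrong options.
    """
    selected, correct = set(selected), set(correct)
    wrong = set(options) - correct
    if not correct:
        raise ValueError("item must have at least one right answer")

    hits = len(selected & correct)       # right answers the taker picked
    false_picks = len(selected & wrong)  # wrong answers the taker picked

    score = hits / len(correct)
    if wrong:
        score -= false_picks / len(wrong)
    return max(0.0, score) if floor_at_zero else score


if __name__ == "__main__":
    options = ["A", "B", "C", "D", "E"]
    correct = ["A", "B", "C"]
    print(item_score(["A", "B"], correct, options))  # partial credit
    print(item_score(options, correct, options))     # selected everything
    print(item_score(["A", "D"], correct, options))  # one right, one wrong
```

Under this assumed rule, selecting every option earns nothing, selecting
exactly the right answers earns full credit, and the floor keeps a test
taker from losing more on an item than they could earn on it.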

Any thoughts?

To be clear, I did not develop this test. I'm just being asked to weigh
in after the fact.

Thanks in advance,

