How do I compare observed numbers correct to those expected by chance in a multi-choice task?

Suppose a questionnaire has N questions, each with k possible responses, exactly one of which is correct. If each of the k responses to each question is equally likely by chance, the number of correct responses expected by chance equals N/k.

Suppose we observe x $$\leq$$ N correct responses for a patient. We wish to see how likely x correct responses are if the patient responds at random among the k choices on each of the N questions.

To do this we can assume the number of correct responses, x, follows a Poisson distribution, Po(m) of general form:

$$P(X=x|\mbox{random responses}) = \frac{m^x}{x!} e^{-m}$$

where $$m$$ is the expected total number of correctly answered questions from the N questions. Since this equals N/k we can rewrite the above as:

$$P(X=x|\mbox{random responses}) = \frac{(N/k)^x}{x!} e^{-(N/k)}$$

The one-tailed p-value is $$P(X \leq x) = \sum_{i=0}^{x} P(X=i|\mbox{random responses})$$, which is approximately half the two-tailed p-value.
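The one-tailed sum can be sketched in a few lines of Python using only the standard library. The function names and the figures in the final call (20 questions, 4 choices, 2 correct) are hypothetical, chosen purely for illustration.

```python
import math

def poisson_pmf(x, m):
    """P(X = x) for a Poisson distribution with mean m."""
    return m ** x / math.factorial(x) * math.exp(-m)

def poisson_lower_tail(x, n_questions, k_choices):
    """One-tailed p-value P(X <= x) under random responding, with m = N/k."""
    m = n_questions / k_choices
    # Sum the Poisson probabilities for 0, 1, ..., x correct responses.
    return sum(poisson_pmf(i, m) for i in range(x + 1))

# Hypothetical illustration: 20 questions, 4 choices each, 2 correct.
print(round(poisson_lower_tail(2, 20, 4), 3))
```

Note that the sum starts at zero correct responses, since a score of zero is also a possible outcome under random responding.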

For large N the Poisson distribution is approximately Normal with mean and variance both equal to N/k, so we can analogously obtain a one-tailed p-value as:

$$P(X \leq x) \approx \mbox{Probit}\left(\frac{x-(N/k)}{\sqrt{N/k}}\right)$$

where Probit here denotes the standard Normal cumulative distribution function.
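A minimal Python sketch of this Normal approximation follows. It builds the standard Normal cumulative distribution function from `math.erf`; the function names are illustrative and no continuity correction is applied, matching the formula above.

```python
import math

def normal_cdf(z):
    """Standard Normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def normal_lower_tail(x, n_questions, k_choices):
    """Normal approximation to P(X <= x), mean and variance both N/k."""
    m = n_questions / k_choices
    z = (x - m) / math.sqrt(m)
    return normal_cdf(z)

# 14 questions, 3 choices each, 3 correct.
print(round(normal_lower_tail(3, 14, 3), 2))
```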

Example

Suppose we have 14 questions, each with 3 possible responses of which only one is correct, and a patient gets a total score of 3 correct responses. We wish to determine how likely it is that we would observe 3 or fewer correct responses given the patient has responded at random (one-sided p-value).

The expected number of correct responses assuming the patient is responding at random is 14/3=4.67.

Using the Poisson distribution

P(X $$\leq$$ 3)

= P(X = 0) + P(X = 1) + P(X = 2) + P(X = 3)

$$= \left(1 + 4.67 + \frac{4.67^2}{2} + \frac{4.67^3}{6}\right) e^{-4.67}$$

= 0.31.

We can think of the two-sided p-value as the sum of the probabilities of all scores that are no more likely, assuming random responses, than the observed one. It turns out that scores of 6 or more are each less likely than the observed 3 correctly answered questions, given random responses, whereas scores of 4 and 5 are more likely. So the exact two-sided p-value is the sum of the Poisson probabilities P(X $$\leq$$ 3) and P(X $$\geq$$ 6), which equals 0.64.
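The exact two-sided calculation can be sketched in Python as below. The function names are illustrative, and the `upper` cut-off on the infinite Poisson sum is an assumption made for convenience; probabilities beyond it are negligible for means of this size.

```python
import math

def poisson_pmf(x, m):
    """P(X = x) for a Poisson mean m > 0, computed in log space."""
    return math.exp(x * math.log(m) - m - math.lgamma(x + 1))

def poisson_two_sided(x_obs, m, upper=1000):
    """Sum P(X = i) over all i that are no more likely than x_obs.

    `upper` truncates the infinite Poisson support (an assumption).
    """
    p_obs = poisson_pmf(x_obs, m)
    total = 0.0
    for i in range(upper + 1):
        p = poisson_pmf(i, m)
        # Small relative tolerance so exact ties are counted.
        if p <= p_obs * (1 + 1e-9):
            total += p
    return total

# 14 questions, 3 choices each, 3 correct: mean is 14/3.
print(round(poisson_two_sided(3, 14 / 3), 2))
```

For the example, the scores that enter the sum are 0 to 3 and 6 upwards, which reproduces the two tails described above.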

P(X $$\leq$$ 3) is approximately equal to P(X $$\geq$$ 6) because the Poisson distribution is roughly symmetric about its expected value, N/k (it is in fact slightly skewed to the right). The two-tailed p-value can therefore also be approximated as twice the one-tailed p-value: 0.31 x 2 = 0.62. This is close to the p-value of 0.64 obtained by summing the Poisson probabilities above.

Both the approximate and the exact p-values from the Poisson distribution indicate that there is no evidence that a score of 3 on a three-choice task of 14 questions differs from chance responding.

We can also evaluate the probability of observing 3 or fewer correct responses due to chance using the Normal approximation to the Poisson distribution:

$$P(X \leq 3) = \mbox{Probit}\left(\frac{3-4.67}{\sqrt{4.67}}\right) = \mbox{Probit}(-0.772) = 0.22$$ with a two-sided p-value of 0.44. The two-tailed probability equals $$P(X \leq 3) + P(X \geq 6.33)$$ since the Normal distribution is symmetric about the mean of 4.67. Of course we can't observe 6.33 correct responses, but this is a continuous approximation to the discrete Poisson distribution - like joining lines between frequency bars on a histogram of the number of correct responses!

As with the Poisson distribution we conclude there is no evidence to suggest getting 3 questions correct on a three-choice task of 14 questions differs from chance. The exact Poisson two-sided p-value and its Normal approximation may be computed using a spreadsheet.

In practice, for more than 30 questions (N) the Poisson calculation and its Normal approximation should closely agree. For fewer than 30 questions the Poisson is preferable as it is a discrete distribution which assumes, as in this example, that only integer values can occur (i.e. that numbers of correctly answered questions are whole numbers).
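One rough way to see how the two calculations compare for a given task is to tabulate both one-tailed p-values for several values of N. The sketch below does this in Python; the function names, the list of N values and the choice of a score about one standard deviation below chance are all assumptions made for illustration.

```python
import math

def poisson_cdf(x, m):
    """P(X <= x) for a Poisson distribution with mean m > 0."""
    return sum(math.exp(i * math.log(m) - m - math.lgamma(i + 1))
               for i in range(x + 1))

def normal_cdf(z):
    """Standard Normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def compare_tails(k_choices, n_values):
    """For each N, return (N, x, Poisson P(X <= x), Normal approximation)."""
    rows = []
    for n in n_values:
        m = n / k_choices
        # Illustrative score: about one standard deviation below chance.
        x = max(0, round(m - math.sqrt(m)))
        p_pois = poisson_cdf(x, m)
        p_norm = normal_cdf((x - m) / math.sqrt(m))
        rows.append((n, x, p_pois, p_norm))
    return rows

for n, x, p_pois, p_norm in compare_tails(3, [14, 30, 60, 120]):
    print(f"N={n:4d}  x={x:3d}  Poisson={p_pois:.3f}  Normal={p_norm:.3f}")
```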

MRC CBU Wiki
