# Probability Time Trials

It has come to my attention that I write the basic Rasch probability in half a dozen different forms; half of them are in logits (aka, log odds) and half are in the exponential metric (aka, odds). My two favorites for exposition are, in logits, exp(b-d) / [1 + exp(b-d)] and, in exponentials, B / (B + D), where B = e^b and D = e^d. The second of these I find the most intuitive: the probability in favor of the person is the person’s odds divided by the sum of the person and item odds. The first, the logit form, may be the most natural: logits are the units used for the measures, they exhibit the interval scale properties, and this form emphasizes the basic relationship between the person and the item.

There are variations on each of these forms, like [B / D] / [1 + B / D] and 1 / [1 + D / B], which are simple algebraic manipulations. The forms are all equivalent; the choice of which to use is simply convention, personal preference, or perhaps computing efficiency, but that has nothing to do with how we talk to each other, only how we talk to the computer. Computing efficiency here means minimizing the calls to the log and exponential functions, which leads me to work internally mostly in exponentials and to do input and output in logits.
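The equivalence of the forms can be checked numerically. The following is a minimal sketch in R, assuming illustrative values of b and d; the names `forms` and `same` are mine, not part of any package.

```r
# Illustrative person and item measures in logits (assumed values)
b <- 1.5; d <- -1.5
B <- exp(b); D <- exp(d)      # the same measures in the exponential (odds) metric

forms <- c(
  logit1 = exp(b - d) / (1 + exp(b - d)),
  logit2 = 1 / (1 + exp(d - b)),
  odds1  = B / (B + D),
  odds2  = (B / D) / (1 + B / D),
  odds3  = 1 / (1 + D / B)
)
print(forms)

# TRUE if every form agrees with the first to within floating-point tolerance
same <- all(abs(forms - forms[1]) < 1e-12)
print(same)
```

With b - d = 3, every form should come out near 0.9526; any differences are at the level of rounding error.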

These deliberations led to a small time trial to provide a partial answer to the efficiency question in R. I first set up some basic parameters and defined a function to compute 100,000 probabilities. (When you consider a state-wide assessment, which can range from a few thousand to a few hundred thousand students per grade, that’s not a very big number. If I were more serious, I would use a timer with more precision than whole seconds.)

> b = 1.5; d = -1.5

> B = exp(b); D = exp(d)

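> timetrial = function (b, d, N=100000, Prob) { expr = substitute(Prob); p = numeric(N); for (k in 1:N) p[k] = eval(expr) }   # Prob is captured unevaluated and re-evaluated on every pass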

Then I ran timetrial, with its 100,000 iterations, for each of seven expressions for the probability; the first three and the seventh use logits; four, five, and six use exponentials.

> date ()

 “Tue Jan 06 11:49:00 ”

> timetrial(b,d,,(1 / (1+exp(d-b))))            # 26 seconds

> date ()

 “Tue Jan 06 11:49:26 ”

> timetrial(b,d,,(exp(b-d) / (1+exp(b-d)))) # 27 seconds

> date ()

 “Tue Jan 06 11:49:53 ”

> timetrial(b,d,,(exp(b)/(exp(b)+exp(d)))) # 27 seconds

> date ()

 “Tue Jan 06 11:50:20 ”

> timetrial(b,d,,(1 / (1+D/B)))                  # 26 seconds

> date ()

 “Tue Jan 06 11:50:46 ”

> timetrial(b,d,,((B/D) / (1+B/D)))            # 27 seconds

> date ()

 “Tue Jan 06 11:51:13 ”

> timetrial(b,d,,(B / (B+D)))                     # 26 seconds

> date ()

 “Tue Jan 06 11:51:39 ”

> timetrial(b,d,,(plogis(b-d)))                  # 27 seconds

> date ()

 “Tue Jan 06 11:52:06 ”

The winners were the usual suspects, the ones with the fewest calls and operations, but the bottom line seems to be that, at least in this limited case using an interpreted language, it makes very little difference. I take that as good news: there is little reason to bother using the exponential metric in the computing.
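For finer resolution than whole seconds, base R's system.time() reports elapsed time in fractions of a second. The following is a sketch only, not the procedure used for the timings above; `time_form` is a hypothetical helper, and the three expressions are a sample of the seven.

```r
b <- 1.5; d <- -1.5
B <- exp(b); D <- exp(d)

# time_form is a hypothetical helper: it calls f(b, d) N times and
# returns the elapsed seconds reported by system.time()
time_form <- function(f, N = 100000) {
  p <- numeric(N)                       # preallocate so the loop does not grow p
  timing <- system.time(for (k in seq_len(N)) p[k] <- f(b, d))
  unname(timing["elapsed"])
}

elapsed <- c(
  logit  = time_form(function(b, d) 1 / (1 + exp(d - b))),
  odds   = time_form(function(b, d) B / (B + D)),
  plogis = time_form(function(b, d) plogis(b - d))
)
print(elapsed)
```

The actual times depend on the machine and the R version, so no particular ordering should be expected from a single run.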

The seventh form of the probability, plogis, is the built-in R function for the logistic distribution. While it was no faster, it is an R function and so can handle a variety of arguments in a call like “plogis(b-d).” If b and d are both scalars, the value of the expression is a scalar. If either b or d is a vector or a matrix, the value is a vector or matrix of the same size. If both b and d are vectors, then the argument (b-d) doesn’t work in general, because R subtracts element by element (recycling the shorter vector) rather than forming every person-item pair, but the argument outer(b,d,"-") will create a matrix of probabilities with dimensions matching the lengths of b and d. This will allow computing all the probabilities for, say, a class or school on a fixed form with a single call.
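A small sketch of the single-call approach, assuming made-up measures for three persons and four items:

```r
# Hypothetical person abilities and item difficulties, in logits
b <- c(-1, 0, 1.5)           # three persons
d <- c(-1.5, 0, 0.5, 2)      # four items

# outer(b, d, "-") builds the 3 x 4 matrix of differences b[i] - d[j];
# plogis() then converts every difference to a probability in one call
P <- plogis(outer(b, d, "-"))

print(dim(P))                # rows are persons, columns are items
print(round(P, 3))
```

Row i, column j of P is the probability that person i succeeds on item j.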

The related R function, dlogis(b-d), has the value p(1-p), which is useful in Newton’s method or when computing the standard errors. It may also be useful for impressing your geek friends or further mystifying your non-geek clients.
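As one illustration of where p(1-p) enters, here is a sketch of Newton's method for the maximum likelihood estimate of a single person measure, assuming the item difficulties are known. The difficulties and raw score are invented for the example, and this is a textbook-style iteration, not necessarily what any particular Rasch program does.

```r
d <- c(-1.5, -0.5, 0, 0.5, 1.5)   # hypothetical item difficulties, in logits
r <- 3                            # hypothetical raw score on the 5 items

b <- 0                            # starting value for the person measure
for (iter in 1:25) {
  p    <- plogis(b - d)           # probability of success on each item
  info <- sum(dlogis(b - d))      # sum of p(1-p) over the items
  step <- (r - sum(p)) / info     # Newton step: score residual over information
  b    <- b + step
  if (abs(step) < 1e-8) break     # stop once the measure has settled
}

se <- 1 / sqrt(sum(dlogis(b - d)))  # standard error from the same p(1-p) sum
print(c(measure = b, se = se))
```

At convergence the expected score, sum(plogis(b - d)), matches the raw score r. The iteration assumes a score that is neither zero nor perfect, since those have no finite estimate.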