CEMS {BradleyTerry2} | R Documentation
Community of European management schools (CEMS) data as used in the paper by Dittrich et al. (1998, 2001), re-formatted for use with BTm.
CEMS
A list containing three data frames: CEMS$preferences, CEMS$students and CEMS$schools.
The CEMS$preferences data frame has 303 * 15 = 4545 observations (15 possible comparisons, for each of 303 students) on the following 8 variables:
student
a factor with levels 1:303
school1
a factor with levels c("Barcelona", "London", "Milano", "Paris", "St.Gallen", "Stockholm"); the first management school in a comparison
school2
a factor with the same levels as school1; the second management school in a comparison
win1
integer (value 0 or 1) indicating whether school1 was preferred to school2
win2
integer (value 0 or 1) indicating whether school2 was preferred to school1
tied
integer (value 0 or 1) indicating whether no preference was expressed
win1.adj
numeric, equal to win1 + tied/2
win2.adj
numeric, equal to win2 + tied/2
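The row count follows from the design: six schools give choose(6, 2) = 15 unordered pairs, and each of the 303 students contributes one row per pair. A minimal base R sketch of that layout is given here. It builds a toy skeleton, not the packaged data; the name `layout` and the order of schools within each pair are assumptions made for illustration.

```r
# Toy reconstruction of the comparison layout (base R only; not the packaged data).
schools <- c("Barcelona", "London", "Milano", "Paris", "St.Gallen", "Stockholm")

# All unordered pairs of the six schools: a 2-row matrix, one column per pair.
pairs <- combn(schools, 2)
n_pairs <- ncol(pairs)
n_students <- 303

# One row per student-by-pair combination (pair ordering is an assumption).
layout <- data.frame(
  student = factor(rep(seq_len(n_students), each = n_pairs)),
  school1 = factor(rep(pairs[1, ], times = n_students), levels = schools),
  school2 = factor(rep(pairs[2, ], times = n_students), levels = schools)
)

n_pairs       # 15
nrow(layout)  # 4545
```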
The CEMS$students data frame has 303 observations (one for each student) on the following 8 variables:
STUD
a factor with levels c("other", "commerce"), the student's main discipline of study
ENG
a factor with levels c("good", "poor"), indicating the student's knowledge of English
FRA
a factor with levels c("good", "poor"), indicating the student's knowledge of French
SPA
a factor with levels c("good", "poor"), indicating the student's knowledge of Spanish
ITA
a factor with levels c("good", "poor"), indicating the student's knowledge of Italian
WOR
a factor with levels c("no", "yes"), whether the student was in full-time employment while studying
DEG
a factor with levels c("no", "yes"), whether the student intended to take an international degree
SEX
a factor with levels c("female", "male")
The CEMS$schools data frame has 6 observations (one for each management school) on the following 7 variables:
Barcelona
numeric (value 0 or 1)
London
numeric (value 0 or 1)
Milano
numeric (value 0 or 1)
Paris
numeric (value 0 or 1)
St.Gallen
numeric (value 0 or 1)
Stockholm
numeric (value 0 or 1)
LAT
numeric (value 0 or 1) indicating a 'Latin' city
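The six school-named columns read as indicator variables, presumably one per school with a 1 on the row for that school and 0 elsewhere. The sketch below builds such a table in base R under that assumption. The name `ind` is hypothetical, and LAT is left out because the text does not list which cities it flags.

```r
# Sketch of a school-indicator table (an assumed structure, not the packaged data).
schools <- c("Barcelona", "London", "Milano", "Paris", "St.Gallen", "Stockholm")

# Identity matrix: row i has a 1 in column i and 0 elsewhere.
ind <- as.data.frame(diag(length(schools)))
names(ind) <- schools
rownames(ind) <- schools

# Each school is flagged by exactly one column.
rowSums(ind)
```

Indicators of this kind are what terms such as Paris[..] in a model formula pick out, which lets a student covariate interact with one specific school.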
The variables win1.adj and win2.adj are provided to allow a simple way of handling ties, in which a tie counts as half a win and half a loss. This is slightly different numerically from the Davidson (1970) method used by Dittrich et al. (1998); see the examples.
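The half-win adjustment can be checked on a small toy table. The sketch below uses made-up rows, not the CEMS data, and assumes every comparison has exactly one of win1, win2 and tied equal to 1, with no missing values.

```r
# Toy comparisons: a win for school1, a win for school2, and a tie.
toy <- data.frame(
  win1 = c(1L, 0L, 0L),
  win2 = c(0L, 1L, 0L),
  tied = c(0L, 0L, 1L)
)

# A tie contributes half a win to each side.
toy$win1.adj <- toy$win1 + toy$tied / 2
toy$win2.adj <- toy$win2 + toy$tied / 2

toy

# Under the stated assumption, the adjusted outcomes always sum to 1.
all(toy$win1.adj + toy$win2.adj == 1)
```

Passing cbind(win1.adj, win2.adj) as the outcome therefore treats each tied comparison as half a win for each school.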
David Firth
Royal Statistical Society datasets website, at http://onlinelibrary.wiley.com/journal/10.1111/(ISSN)1467-9876/homepage/47_4.htm.
Davidson, R. R. (1970) Extending the Bradley-Terry model to accommodate ties in paired comparison experiments. Journal of the American Statistical Association 65, 317–328.
Dittrich, R., Hatzinger, R. and Katzenbeisser, W. (1998) Modelling the effect of subject-specific covariates in paired comparison studies with an application to university rankings. Applied Statistics 47, 511–525.
Dittrich, R., Hatzinger, R. and Katzenbeisser, W. (2001) Corrigendum: Modelling the effect of subject-specific covariates in paired comparison studies with an application to university rankings. Applied Statistics 50, 247–249.
Turner, H. and Firth, D. (2012) Bradley-Terry models in R: The BradleyTerry2 package. Journal of Statistical Software, 48(9), 1–21.
## Load the package that provides BTm and the CEMS data
library(BradleyTerry2)
##
## Fit the standard Bradley-Terry model, using the simple 'add 0.5'
## method to handle ties:
##
table3.model <- BTm(outcome = cbind(win1.adj, win2.adj),
                    player1 = school1, player2 = school2,
                    formula = ~ .., refcat = "Stockholm",
                    data = CEMS)
## The results in Table 3 of Dittrich et al. (2001) are reproduced
## approximately by a simple re-scaling of the estimates:
table3 <- summary(table3.model)$coef[, 1:2]/1.75
print(table3)
##
## Now fit the 'final model' from Table 6 of Dittrich et al.:
##
table6.model <- BTm(outcome = cbind(win1.adj, win2.adj),
                    player1 = school1, player2 = school2,
                    formula = ~ .. +
                        WOR[student] * Paris[..] +
                        WOR[student] * Milano[..] +
                        WOR[student] * Barcelona[..] +
                        DEG[student] * St.Gallen[..] +
                        STUD[student] * Paris[..] +
                        STUD[student] * St.Gallen[..] +
                        ENG[student] * St.Gallen[..] +
                        FRA[student] * London[..] +
                        FRA[student] * Paris[..] +
                        SPA[student] * Barcelona[..] +
                        ITA[student] * London[..] +
                        ITA[student] * Milano[..] +
                        SEX[student] * Milano[..],
                    refcat = "Stockholm",
                    data = CEMS)
##
## Again re-scale to reproduce approximately Table 6 of Dittrich et
## al. (2001):
##
table6 <- summary(table6.model)$coef[, 1:2]/1.75
print(table6)
##
## Not run: 
## Now the slightly simplified model of Table 8 of Dittrich et al. (2001):
##
table8.model <- BTm(outcome = cbind(win1.adj, win2.adj),
                    player1 = school1, player2 = school2,
                    formula = ~ .. +
                        WOR[student] * LAT[..] +
                        DEG[student] * St.Gallen[..] +
                        STUD[student] * Paris[..] +
                        STUD[student] * St.Gallen[..] +
                        ENG[student] * St.Gallen[..] +
                        FRA[student] * London[..] +
                        FRA[student] * Paris[..] +
                        SPA[student] * Barcelona[..] +
                        ITA[student] * London[..] +
                        ITA[student] * Milano[..] +
                        SEX[student] * Milano[..],
                    refcat = "Stockholm",
                    data = CEMS)
table8 <- summary(table8.model)$coef[, 1:2]/1.75
##
## Notice some larger than expected discrepancies here (the coefficients
## named "..Barcelona", "..Milano" and "..Paris") from the results in
## Dittrich et al. (2001). Apparently a mistake was made in Table 8 of
## the published Corrigendum note (R. Dittrich personal communication,
## February 2010).
##
print(table8)
## End(Not run)