BTm {BradleyTerry2} | R Documentation |
Fits Bradley-Terry models for pair comparison data, including models with structured scores, order effect and missing covariate data. Fits by either maximum likelihood or maximum penalized likelihood (with Jeffreys-prior penalty) when abilities are modelled exactly, or by penalized quasi-likelihood when abilities are modelled by covariates.
BTm(outcome, player1, player2, formula = NULL, id = "..", separate.ability = NULL, refcat = NULL, family = binomial, data = NULL, weights = NULL, subset = NULL, na.action = NULL, start = NULL, etastart = NULL, mustart = NULL, offset = NULL, br = FALSE, model = TRUE, x = FALSE, contrasts = NULL, ...)
outcome |
the binomial response: either a numeric vector, a factor in which the first level denotes failure and all others success, or a two-column matrix with the columns giving the numbers of successes and failures. |
player1 |
either an ID factor specifying the first player in
each contest, or a data.frame containing such a factor and possibly
other contest-level variables that are specific to the first player. If
given in a data.frame, the ID factor must have the name given in the
id argument. |
player2 |
an object corresponding to that given in
player1, for the second player in each contest. |
formula |
a formula with no left-hand-side, specifying the model for player ability. See details for more information. |
id |
the name of the ID factor. |
separate.ability |
(if |
refcat |
(if |
family |
a description of the error distribution and link
function to be used in the model. Only the binomial family is
implemented, with either |
data |
an optional object providing data required by the
model. This may be a single data frame of contest-level data or a list of
data frames. Names of data frames are ignored unless they refer to
data frames specified by player1 and player2. |
weights |
an optional numeric vector of ‘prior weights’. |
subset |
an optional logical or numeric vector specifying a subset of observations to be used in the fitting process. |
na.action |
a function which indicates what should happen when
any contest-level variables contain NA. |
start |
a vector of starting values for the fixed effects. |
etastart |
a vector of starting values for the linear predictor. |
mustart |
a vector of starting values for the vector of means. |
offset |
an optional offset term in the model. A vector of length equal to the number of contests. |
br |
logical. If |
model |
logical: whether or not to return the model frame. |
x |
logical: whether or not to return the design matrix for the fixed effects. |
contrasts |
an optional list specifying contrasts for the factors
in formula. |
... |
other arguments for fitting function (currently either
|
In each comparison to be modelled there is a 'first player' and a 'second player' and it is assumed that one player wins while the other loses (no allowance is made for tied comparisons).
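As a reminder of what is being fitted, the standard Bradley-Terry model for a contest between players i and j can be written as below. The symbols lambda_i (log-ability) and alpha_i (ability) are notation introduced here for illustration and do not appear elsewhere on this page.

```latex
\operatorname{logit}\,\Pr(i \text{ beats } j) = \lambda_i - \lambda_j,
\qquad\text{equivalently}\qquad
\Pr(i \text{ beats } j) = \frac{\alpha_i}{\alpha_i + \alpha_j},
\quad \alpha_i = \exp(\lambda_i).
```

Only differences in lambda are identified, which is why one player serves as a reference category with ability fixed at zero on the log scale.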
The countsToBinomial function is provided to convert a
contingency table of wins into a data frame of wins and losses for
each pair of players.
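A minimal sketch of that conversion, using a small made-up table. The matrix `wins` and its player names "A", "B" and "C" are hypothetical; entry [i, j] is taken to be the number of times the row player beat the column player.

```r
library(BradleyTerry2)

## Hypothetical 3-player table of wins: rows are winners, columns are losers
wins <- matrix(c(0, 3, 1,
                 2, 0, 4,
                 5, 1, 0),
               nrow = 3, byrow = TRUE,
               dimnames = list(winner = c("A", "B", "C"),
                               loser  = c("A", "B", "C")))

## One row per pair of players, with the wins for each side of the pair
wins.sf <- countsToBinomial(wins)
print(wins.sf)
```

The resulting data frame has one row per pair, so a response such as cbind(win1, win2) can be passed to BTm as in the citations example.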
The formula argument specifies the model for player ability and
applies to both the first player and the second player in each
contest. If NULL, a separate ability is estimated for each
player, equivalent to setting formula = reformulate(id).
Contest-level variables can be specified in the formula in the usual
manner, see formula. Player covariates should
be included as variables indexed by id, see examples. Thus
player covariates must be ordered according to the levels of the ID
factor.
If formula includes player covariates and there are players
with missing values over these covariates, then a separate ability
will be estimated for those players.
When player abilities are modelled by covariates, random player
effects should be added to the model. These should be specified in the
formula using the vertical bar notation of lmer, see examples.
When specified, it is assumed that random player effects arise from a N(0, sigma^2) distribution and model parameters,
including sigma, are estimated using PQL (Breslow and
Clayton, 1993) as implemented in the glmmPQL function.
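The following sketch shows a player covariate indexed by the ID factor together with a random player effect. It assumes the flatlizards dataset that ships with BradleyTerry2, a list holding contest data (winner, loser) and player-level predictors such as SVL; the choice of SVL as the single covariate is purely illustrative.

```r
library(BradleyTerry2)

## Ability modelled by the player covariate SVL, indexed by the default
## ID name "..", plus a random effect for each player, written (1 | ..)
## outcome = 1: the first-named player (winner) won every contest listed
lizModel <- BTm(1, winner, loser,
                formula = ~ SVL[..] + (1 | ..),
                data = flatlizards)

summary(lizModel)
```

Because the formula contains a random effect, the fit is obtained by PQL rather than by glm, and the summary reports an estimate of sigma alongside the fixed effects.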
An object of class c("BTm", "x"), where "x" is the class
of object returned by the model fitting function (e.g. glm).
Components are as for objects of class "x", with additionally
id |
the id argument. |
separate.ability |
the separate.ability argument. |
refcat |
the refcat argument. |
player1 |
a data frame for the first player containing the ID factor and any player-specific contest-level variables. |
player2 |
a data frame corresponding to that for player1. |
assign |
a numeric vector indicating which coefficients correspond to which terms in the model. |
term.labels |
labels for the model terms. |
random |
for models with random effects, the design matrix for the random effects. |
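A short sketch of inspecting a fitted object, reusing the citations model from the examples on this page. The particular components printed are chosen for illustration only.

```r
library(BradleyTerry2)

## Fit the standard model to the journal citation data
citations.sf <- countsToBinomial(citations)
names(citations.sf)[1:2] <- c("journal1", "journal2")
citeModel <- BTm(cbind(win1, win2), journal1, journal2, data = citations.sf)

## "BTm" comes first, followed by the class of the underlying fit
class(citeModel)

## Extra components added to the underlying fit
citeModel$id             # name of the ID factor
head(citeModel$player1)  # data frame for the first player
citeModel$term.labels    # labels for the model terms

## Estimated abilities with standard errors
BTabilities(citeModel)
```

Since abilities are modelled exactly here, the underlying fit comes from glm and the usual glm methods remain available through inheritance.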
Heather Turner, David Firth
Agresti, A. (2002) Categorical Data Analysis (2nd ed). New York: Wiley.
Firth, D. (1992) Bias reduction, the Jeffreys prior and GLIM. In Advances in GLIM and Statistical Modelling, Eds. Fahrmeir, L., Francis, B. J., Gilchrist, R. and Tutz, G., pp. 91–100. New York: Springer.
Firth, D. (1993) Bias reduction of maximum likelihood estimates. Biometrika 80, 27–38.
Firth, D. (2005) Bradley-Terry models in R. Journal of Statistical Software, 12(1), 1–12.
Stigler, S. (1994) Citation patterns in the journals of statistics and probability. Statistical Science 9, 94–108.
Turner, H. and Firth, D. (2012) Bradley-Terry models in R: The BradleyTerry2 package. Journal of Statistical Software, 48(9), 1–21.
countsToBinomial, glmmPQL, BTabilities, residuals.BTm, add1.BTm, anova.BTm
########################################################
##  Statistics journal citation data from Stigler (1994)
##  -- see also Agresti (2002, p448)
########################################################

##  Convert frequencies to success/failure data
citations.sf <- countsToBinomial(citations)
names(citations.sf)[1:2] <- c("journal1", "journal2")

##  First fit the "standard" Bradley-Terry model
citeModel <- BTm(cbind(win1, win2), journal1, journal2, data = citations.sf)

##  Now the same thing with a different "reference" journal
update(citeModel, refcat = "JASA")

##################################################################
##  Now an example with an order effect -- see Agresti (2002) p438
##################################################################

##  Simple Bradley-Terry model, ignoring home advantage:
baseballModel1 <- BTm(cbind(home.wins, away.wins), home.team, away.team,
                      data = baseball, id = "team")

##  Now incorporate the "home advantage" effect
baseball$home.team <- data.frame(team = baseball$home.team, at.home = 1)
baseball$away.team <- data.frame(team = baseball$away.team, at.home = 0)
baseballModel2 <- update(baseballModel1, formula = ~ team + at.home)

##  Compare the fit of these two models:
anova(baseballModel1, baseballModel2)

##
##  For a more elaborate example with both player-level and contest-level
##  predictor variables, see help(chameleons).
##