Nonparametric maximum likelihood estimation for survival data

Léo Belzile

2024-07-18

The longevity package includes an implementation of Turnbull’s EM algorithm for the empirical distribution function for data subject to arbitrary censoring and truncation patterns.
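
To give a sense of what such an algorithm does, here is a minimal base-R sketch of the self-consistency (EM) iteration for interval-censored data without truncation. It is not the code used by longevity: the function name turnbull_em and the incidence matrix alpha are hypothetical, and the candidate intervals are assumed to be known in advance.

```r
# Hypothetical helper, not part of longevity: self-consistency (EM) iterations
# for interval-censored data, ignoring truncation.
# alpha is an n x m 0/1 matrix; alpha[i, j] = 1 if candidate interval j
# lies inside the censoring interval of observation i.
turnbull_em <- function(alpha, tol = 1e-10, maxit = 10000L) {
  stopifnot(is.matrix(alpha), all(rowSums(alpha) > 0))
  m <- ncol(alpha)
  p <- rep(1 / m, m)  # start from equal mass on every candidate interval
  for (iter in seq_len(maxit)) {
    # probability that each observation falls in its censoring interval
    denom <- as.vector(alpha %*% p)
    # E-step: conditional probability that observation i lies in interval j;
    # M-step: average these probabilities over the observations
    p_new <- colMeans(alpha * outer(1 / denom, p))
    converged <- max(abs(p_new - p)) < tol
    p <- p_new
    if (converged) break
  }
  list(prob = p, niter = iter, converged = converged)
}

# Toy example: two candidate intervals, four observations
alpha <- rbind(c(1, 0), c(1, 0), c(1, 1), c(0, 1))
fit <- turnbull_em(alpha)
fit$prob
```

In this toy example the likelihood is proportional to p1^2 * p2 with p1 + p2 = 1, so the iterations should approach the masses 2/3 and 1/3.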

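For example, consider the interval-censored AIDS data analysed in Lindsey and Ryan (1998). The vectors left and right give, respectively, the lower and upper bounds of the interval containing each event time; an upper bound of Inf indicates a right-censored observation.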

library(longevity)
left <- c(0,15,12,17,13,0,6,0,14,12,13,12,12,0,0,0,0,3,4,1,13,0,0,6,0,2,1,0,0,2,0)
right <- c(16, rep(Inf, 4), 24, Inf, 15, rep(Inf, 5), 18, 14, 17, 15,
           Inf, Inf, 11, 19, 6, 11, Inf, 6, 12, 17, 14, 25, 11, 14)
test <- np_elife(time = left,   # left bound for time
                 time2 = right, # right bound for time
                 type = "interval2", # data are interval censored
                 event = 3) # specify interval censoring, argument recycled

plot(test)
Nonparametric maximum likelihood estimate of the distribution function for the AIDS data

We can also extract the equivalence classes, the intervals on which the estimator can place probability mass; they match the values reported in Lindsey and Ryan (1998). The summary statistics reported by the print method include the restricted mean, obtained as the area under the survival curve.

test$xval
##      left right
## [1,]    4     6
## [2,]   13    14
## [3,]   14    15
## [4,]   15    16
## [5,]   17    18
print(test)
## Nonparametric maximum likelihood estimator
## 
## Routine converged 
## Number of equivalence classes: 5 
## Mean:  10.47143 
## Quartiles of the survival function: 15.5 14 8
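
The restricted mean mentioned above can be illustrated with a small base-R sketch that integrates a step survival function. The helper restricted_mean and the survival values are made up for illustration; how longevity treats probability mass inside an equivalence class is not reproduced here, so the sketch is not expected to return the value printed above.

```r
# Hypothetical helper, not part of longevity: area under a right-continuous
# step survival function that equals 1 on [0, times[1]) and surv[k] from
# times[k] onward, integrated from 0 to `upper`.
restricted_mean <- function(times, surv, upper = max(times)) {
  stopifnot(length(times) == length(surv), !is.unsorted(times), all(times > 0))
  keep <- times < upper
  knots <- c(0, times[keep], upper)  # endpoints of the flat pieces
  heights <- c(1, surv[keep])        # survival level on each piece
  sum(diff(knots) * heights)
}

# Made-up survival curve with jumps at 2, 5 and 9
times <- c(2, 5, 9)
surv <- c(0.75, 0.4, 0)
restricted_mean(times, surv)             # 1*2 + 0.75*3 + 0.4*4 = 5.85
restricted_mean(times, surv, upper = 6)  # area up to time 6 only
```

Lowering upper truncates the integral, which is why the quantity is called a restricted mean: it stays finite even when the survival curve does not reach zero.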

References

Lindsey, Jane C., and Louise M. Ryan. 1998. “Methods for Interval-Censored Data.” Statistics in Medicine 17 (2): 219–38. https://doi.org/10.1002/(SICI)1097-0258(19980130)17:2<219::AID-SIM735>3.0.CO;2-O.