diff --git a/DESCRIPTION b/DESCRIPTION
index 5a6acf9..964a718 100644
--- a/DESCRIPTION
+++ b/DESCRIPTION
@@ -1,8 +1,8 @@
 Package: durmod
 Type: Package
 Title: Mixed Proportional Hazard Competing Risk Model
-Version: 1.1-1
-Date: 2019-07-24
+Version: 1.1-2
+Date: 2019-08-21
 Authors@R: person("Simen", "Gaure", email="simen@gaure.no", role=c("aut","cre"), comment=c(ORCID="https://orcid.org/0000-0001-7251-8747"))
 URL: https://github.com/sgaure/durmod
diff --git a/R/durmod-package.R b/R/durmod-package.R
index 4b7f0cd..f1a8f68 100644
--- a/R/durmod-package.R
+++ b/R/durmod-package.R
@@ -1,5 +1,6 @@
 #' A package for estimating a mixed proportional hazard competing risk model with the NPMLE.
 #'
+#'
 #' The main method of the package is \code{\link{mphcrm}}. It has an interface
 #' somewhat similar to \code{\link{lm}}. There is an example of use in \code{\link{datagen}}, with
 #' a generated dataset similiar to the ones in \cite{Gaure et al. (2007)}. For those who have
@@ -21,7 +22,7 @@
 #' may bias the estimates, not just increase uncertainty. To account for unobserved heterogeneity, a
 #' random intercept is introduced, so that the hazards are of the form \eqn{h_i^j(\mu_k) = exp(X_i \beta_j + \mu_k)}
 #' for \eqn{k} between 1 and some \eqn{n}. The intercept may of course be written multiplicatively as
-#' \eqn{exp(X_i \beta_j) exp(\mu_k)}, that's why they are called \emph{proportional} hazards.
+#' \eqn{exp(X_i \beta_j) exp(\mu_k)}, that is why they are called \emph{proportional} hazards.
 #'
 #' The individual likelihood depends on the intercept, i.e. \eqn{L_i(\mu_k)}, but we integrate it out
 #' so that the individual likelihood becomes \eqn{\sum p_k L_i(\mu_k)}. The resulting mixture
@@ -30,7 +31,7 @@
 #' Besides the function \code{\link{mphcrm}} which does the actual estimation, there are functions for
 #' extracting the estimated mixture, they are \code{\link{mphdist}}, \code{\link{mphmoments}} and a few more.
 #'
-#' There's a summary function for the fitted model, and there is a data set available with \code{data(durdata}} which
+#' There's a summary function for the fitted model, and there is a data set available with \code{data(durdata)} which
 #' is used for demonstration purposes. Also, an already fitted model is available there, as \code{\link{fit}}.
 #'
 #' The package may use more than one cpu, the default is taken from \code{getOption("durmod.threads")}
diff --git a/inst/NEWS.Rd b/inst/NEWS.Rd
index f5498a5..e36f4b2 100644
--- a/inst/NEWS.Rd
+++ b/inst/NEWS.Rd
@@ -1,5 +1,13 @@
 \name{NEWS}
 \title{durmod news}
+\section{Changes in version 1.1-2}{
+  \itemize{
+    \item Some adjustments to default parameters.
+    \item Use median, not mean, for the interval search for new
+    points. Use Neumaier compensated addition in computation of gradients.
+    \item Some more documentation.
+  }
+}
 \section{Changes in version 1.1-1}{
   \itemize{
     \item Fixed a bug in the interval censored gradient which slowed down
diff --git a/man/durmod-package.Rd b/man/durmod-package.Rd
index 7868bb0..78be913 100644
--- a/man/durmod-package.Rd
+++ b/man/durmod-package.Rd
@@ -13,6 +13,42 @@
 used the program used in that paper, a mixture of R, Fortran, C, and python, this is an entirely new self-contained package, written from scratch with 12 years of experience. Currently not all functionality from that behemoth has been implemented, but most of it.
 }
+\details{
+A short description of the model follows.
+
+There are some individuals with some observed covariates \eqn{X_i}. The individuals are
+observed for some time, so there is typically more than one observation of each individual.
+At any point they experience one or more hazards. The hazards are assumed to be of the form
+\eqn{h_i^j = exp(X_i \beta_j)}, where \eqn{\beta_j} are coefficients for hazard \eqn{j}.
+The hazards themselves are not observed, but an event associated with them is, i.e. a transition
+of some kind. The time of the transition, either exactly recorded, or within an interval, must also
+be in the data set. With enough observations it is then possible to estimate the coefficients \eqn{\beta_j}.
+
+However, it just so happens that contrary to ordinary linear models, any unobserved heterogeneity
+may bias the estimates, not just increase uncertainty. To account for unobserved heterogeneity, a
+random intercept is introduced, so that the hazards are of the form \eqn{h_i^j(\mu_k) = exp(X_i \beta_j + \mu_k)}
+for \eqn{k} between 1 and some \eqn{n}. The intercept may of course be written multiplicatively as
+\eqn{exp(X_i \beta_j) exp(\mu_k)}, that is why they are called \emph{proportional} hazards.
+
+The individual likelihood depends on the intercept, i.e. \eqn{L_i(\mu_k)}, but we integrate it out
+so that the individual likelihood becomes \eqn{\sum p_k L_i(\mu_k)}. The resulting mixture
+likelihood is maximized over all the \eqn{\beta}s, \eqn{n}, the \eqn{\mu_k}s, and the probabilities \eqn{p_k}.
+
+Besides the function \code{\link{mphcrm}} which does the actual estimation, there are functions for
+extracting the estimated mixture, they are \code{\link{mphdist}}, \code{\link{mphmoments}} and a few more.
+
+There's a summary function for the fitted model, and there is a data set available with \code{data(durdata)} which
+is used for demonstration purposes. Also, an already fitted model is available there, as \code{\link{fit}}.
+
+The package may use more than one cpu, the default is taken from \code{getOption("durmod.threads")}
+which is initialized from the environment variable \env{DURMOD_THREADS}, \env{OMP_THREAD_LIMIT},
+\env{OMP_NUM_THREADS} or \env{NUMBER_OF_PROCESSORS}, or parallel::detectCores() upon loading the package.
+
+For more demanding problems, a cluster of machines (from packages \pkg{parallel} or \pkg{snow}) can be
+used, in combination with the use of threads.
+
+There is a vignette (\code{vignette("whatmph")}) with more details about \pkg{durmod} and data layout.
+}
 \references{
 Gaure, S., K. Røed and T. Zhang (2007) \cite{Time and causality: A Monte-Carlo Assessment of the timing-of-events approach}, Journal of Econometrics 141(2), 1159-1195.
diff --git a/tests/sometests.Rout.save b/tests/sometests.Rout.save
index cfc79b9..21a7658 100644
--- a/tests/sometests.Rout.save
+++ b/tests/sometests.Rout.save
@@ -1,5 +1,5 @@
 
-R Under development (unstable) (2019-07-05 r76784) -- "Unsuffered Consequences"
+R Under development (unstable) (2019-08-21 r77049) -- "Unsuffered Consequences"
 Copyright (C) 2019 The R Foundation for Statistical Computing
 Platform: x86_64-pc-linux-gnu (64-bit)
 
@@ -93,7 +93,7 @@ $coefs
 job.x1 0.89634 0.00931 96.2537 0.00e+00
 job.x2 -0.80995 0.00759 -106.7509 0.00e+00
 job.alpha -0.17688 0.02670 -6.6241 3.67e-11
-job.f.2 0.07449 0.04287 1.7376 8.23e-02
+job.f.2 0.07449 0.04287 1.7377 8.23e-02
 job.f.3 -0.06195 0.04409 -1.4049 1.60e-01
 job.f.4 0.11317 0.04148 2.7287 6.37e-03
 job.f.5 0.01854 0.04311 0.4301 6.67e-01
@@ -103,7 +103,7 @@ program.x2 0.37743 0.01749 21.5830 4.31e-101
 program.f.2 0.20536 0.13447 1.5271 1.27e-01
 program.f.3 0.02068 0.13735 0.1505 8.80e-01
 program.f.4 0.00632 0.12036 0.0525 9.58e-01
-program.f.5 -0.13482 0.14185 -0.9504 3.42e-01
+program.f.5 -0.13482 0.14185 -0.9505 3.42e-01
 program.f.6 0.28122 0.14694 1.9138 5.57e-02
 program.g.2 0.03025 0.13082 0.2312 8.17e-01
 program.g.3 0.14083 0.13864 1.0157 3.10e-01
@@ -111,7 +111,7 @@ program.g.4 -0.02071 0.14072 -0.1472 8.83e-01
 program.f:g.2:2 -0.30106 0.19641 -1.5328 1.25e-01
 program.f:g.2:3 -0.24366 0.19296 -1.2627 2.07e-01
 program.f:g.2:4 0.03646 0.20306 0.1796 8.58e-01
-program.f:g.3:2 0.10878 0.19535 0.5569 5.78e-01
+program.f:g.3:2 0.10878 0.19535 0.5568 5.78e-01
 program.f:g.3:3 -0.17411 0.19189 -0.9073 3.64e-01
 program.f:g.3:4 0.16215 0.19720 0.8223 4.11e-01
 program.f:g.4:2 -0.00186 0.17351 -0.0107 9.91e-01
@@ -174,4 +174,4 @@ job 0.0172
 >
 > proc.time()
 user system elapsed
- 11.809 0.231 4.793
+ 6.87 0.24 4.31