CPH Biostat Student Presentations
at JSM, San Diego, August 2012
Title: Hosmer-Lemeshow Goodness-of-Fit Test for Multiply Imputed Data
Abstract: The Hosmer-Lemeshow (H-L) test is widely used for evaluating goodness of fit in logistic regression models. The H-L test first creates groups based on deciles of the estimated probabilities and then compares observed and expected event rates within these groups. Multiple imputation (MI) is growing in popularity as a method for handling missing data, but how to apply the H-L test after MI is not straightforward. In this paper we discuss the complexities involved in applying the H-L test to multiply imputed data, which depend on which variables have missingness. When covariates have been imputed, predicted probabilities vary across imputed data sets, and thus the boundaries of the predicted probability groupings vary as well. When the outcome has been imputed, both predicted probabilities and "observed" event rates change from one data set to the next. We then propose several methods for using the H-L test with multiply imputed data and compare them through simulation.
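For reference, the complete-data H-L statistic the abstract builds on can be sketched as follows: group observations by deciles of the fitted probabilities, then compare observed and expected event counts within each group and refer the statistic to a chi-square distribution. This is a minimal illustration on simulated data, not the paper's MI-adapted procedure; the function name and group count are my own choices.

```python
import numpy as np
from scipy.stats import chi2

def hosmer_lemeshow(y, p_hat, g=10):
    """Complete-data H-L statistic: group by deciles of the fitted
    probabilities, then compare observed vs. expected event counts."""
    order = np.argsort(p_hat)
    groups = np.array_split(order, g)  # roughly equal-sized groups
    stat = 0.0
    for idx in groups:
        obs = y[idx].sum()            # observed events in the group
        exp = p_hat[idx].sum()        # expected events in the group
        n = len(idx)
        pbar = exp / n                # mean fitted probability
        stat += (obs - exp) ** 2 / (n * pbar * (1 - pbar))
    pval = chi2.sf(stat, df=g - 2)    # conventional g - 2 degrees of freedom
    return stat, pval

# Simulated, well-calibrated example
rng = np.random.default_rng(0)
p = rng.uniform(0.05, 0.95, size=500)
y = rng.binomial(1, p)
stat, pval = hosmer_lemeshow(y, p, g=10)
```

With covariates imputed, `p_hat` (and hence the decile boundaries) would differ across the imputed data sets, which is exactly the complication the abstract addresses.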
Title: A Discriminant Function for Renal Inflammatory Activity Associated with Lupus Nephritis
Abstract: A trivariate normal model with a patterned covariance matrix is proposed to examine longitudinal patterns in a dataset of lupus nephritis patients who experience repeated renal inflammatory activity. The model accounts for the correlation among the three variables of interest as well as the correlation present across time. Conditional distributions are used to develop a discriminant function that identifies patients who are currently experiencing inflammatory activity. The training dataset provides MLEs of each parameter in the discriminant function, which can then be used for patient classification in future datasets.
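The classification step can be illustrated with a normal-theory discriminant rule: given estimated mean vectors for the "active" and "inactive" states and a common covariance, assign a patient's trivariate observation to the state with the larger log posterior. All parameter values below are hypothetical placeholders, not estimates from the study, and the rule shown is the generic equal-covariance case rather than the paper's conditional-distribution construction.

```python
import numpy as np
from scipy.stats import multivariate_normal

# Hypothetical MLEs (illustrative values only): state-specific means and
# a shared covariance for the three variables of interest.
mu_active = np.array([2.0, 1.5, 3.0])
mu_inactive = np.array([0.5, 0.4, 1.0])
Sigma = np.array([[1.0, 0.3, 0.2],
                  [0.3, 1.0, 0.3],
                  [0.2, 0.3, 1.0]])

def classify(x, prior_active=0.5):
    """Assign the observation to the state with the larger log posterior;
    equal covariances make this a linear discriminant rule."""
    la = multivariate_normal.logpdf(x, mu_active, Sigma) + np.log(prior_active)
    li = multivariate_normal.logpdf(x, mu_inactive, Sigma) + np.log(1 - prior_active)
    return "active" if la > li else "inactive"

label = classify(np.array([2.1, 1.4, 2.8]))
```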
Title: Semiparametric Bayesian Joint Modeling of a Binary and Continuous Outcome with Applications in Toxicological Risk Assessment
Abstract: Many dose-response studies collect data on correlated outcomes. For example, in developmental toxicity studies, uterine weight and presence of malformed pups are measured on the same dam. Joint modeling can result in more efficient inferences than independent models for each outcome. Most methods for joint modeling assume standard parametric response distributions. However, in toxicity studies, it is possible that response distributions vary in location and shape with dose, which may not be easily captured by standard models. We propose a semiparametric Bayesian joint model for a binary and continuous response. In our model, a kernel stick-breaking process (KSBP) prior is assigned to the distribution of a random effect shared across outcomes, allowing flexible changes in shape with dose to be shared across outcomes. The model also includes outcome-specific fixed effects to allow different location effects. In simulation studies, we found that the proposed model provides accurate estimates of toxicological risk when the data do not satisfy the assumptions of standard parametric models. We apply our method to data from a developmental toxicity study of ethylene glycol diethyl ether.
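The stick-breaking construction underlying priors of this family can be sketched in a few lines: break a unit-length stick by Beta draws, and each resulting piece becomes a mixture weight. This shows only the standard truncated stick-breaking step, not the kernel-weighted KSBP itself; the truncation level and concentration parameter are arbitrary illustrative values.

```python
import numpy as np

def stick_breaking_weights(alpha, K, rng):
    """Truncated stick-breaking: V_k ~ Beta(1, alpha) and
    w_k = V_k * prod_{j<k}(1 - V_j), so the weights sum to 1."""
    v = rng.beta(1.0, alpha, size=K)
    v[-1] = 1.0  # close the truncation so the weights sum exactly to 1
    w = v * np.concatenate(([1.0], np.cumprod(1.0 - v[:-1])))
    return w

rng = np.random.default_rng(1)
w = stick_breaking_weights(alpha=2.0, K=50, rng=rng)
```

In the KSBP, these weights are additionally modulated by kernels in the dose, which is what lets the induced random-effect distribution change shape smoothly across dose levels.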
Title: Bayesian Threshold Regression Model for Current Status Data with Informative Censoring
Abstract: In some biomedical applications, there is interest in making inferences about a time-to-event distribution, but the exact time of the event is unknown. For instance, in animal carcinogenicity studies, tumors are not discovered until the time of examination, and hence time to tumor is interval censored; this is known as current status data. Sometimes the examination time is not independent of the event time; e.g., an exam may have occurred because the animal died of a cause related to tumor development. In this case, survival analysis methods assuming independent censoring would result in biased inferences. To address this issue, we propose a Bayesian approach that jointly models time to tumor and time to death using Wiener processes which fail once they hit a boundary value. A data augmentation approach is used to sample the unobserved time to tumor. To account for informative censoring, our model allows the drift of the death process to change following the time to tumor. In addition to being conceptually appealing, our model does not require the proportional hazards assumption made by some standard methods. We demonstrate our method using time-to-lung-tumor data from an NTP study.
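The generative structure of such a threshold model can be sketched by forward simulation: each latent Wiener process is simulated on a grid, the event time is its first passage of a boundary, and the drift of the death process jumps after tumor onset, inducing informative censoring. All drift and boundary values below are illustrative assumptions, not estimates from the NTP data, and the Euler-type grid simulation is only a sketch of the model's data-generating mechanism, not the authors' Bayesian fitting procedure.

```python
import numpy as np

def simulate_hitting_time(drift, boundary, dt=0.01, t_max=100.0, rng=None):
    """Simulate a Wiener process with given drift on a time grid and
    return its first boundary-crossing time (np.inf if none by t_max)."""
    if rng is None:
        rng = np.random.default_rng()
    n = int(t_max / dt)
    steps = drift * dt + np.sqrt(dt) * rng.standard_normal(n)
    path = np.cumsum(steps)
    hits = np.nonzero(path >= boundary)[0]
    return (hits[0] + 1) * dt if hits.size else np.inf

rng = np.random.default_rng(2)
# Tumor onset: first passage of a latent tumor process (illustrative values)
t_tumor = simulate_hitting_time(drift=0.5, boundary=2.0, rng=rng)
# Death process drifts faster once a tumor is present (informative censoring)
drift_death = 0.2 if np.isinf(t_tumor) else 0.6
t_death = simulate_hitting_time(drift=drift_death, boundary=3.0, rng=rng)
```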