Academic Seminar: Christelle Zozoungbo Presents Her Thesis on Heterogeneity

It is important for policy makers to identify the social groups that are most in need of health infrastructure. Statistical and geographical techniques can be used to detect these groups through the analysis of mortality data. One such technique was presented on Tuesday, November 13th, 2018, by Christelle Zozoungbo, a Predoctoral Fellow at ASE, who discussed her thesis as part of the weekly academic seminar.

She began her presentation, titled “Spatial Heterogeneity in Survival Data: Log Normal and Gamma Frailty”, by pointing out that in mortality data only a few covariates are usually available, so unobserved heterogeneity is common in the analysis of such data. The second relevant fact she referred to is the way individuals are arranged in space: first at the family level, then at the household level and the village level, up to the country level. As a result, individuals share environmental factors, and people in nearby locations are to some extent alike. Both points therefore have to be considered in the model specification.

Her work therefore focuses on finding an accurate way of analyzing survival data while accounting for possible heterogeneity and the potential spatial arrangement of individuals. To do that, she considers the Cox proportional hazards model with an added random term called the frailty term (introduced by Vaupel, 1979), which is used to assess the degree of heterogeneity among subjects. She explained that, according to the literature, a prior distribution must be assumed for this random term.
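For readers who want to see the model written out, a common textbook form of a Cox proportional hazards model with a shared frailty term is sketched below. The notation (location index i, individual index j, covariates x, coefficients beta, frailty v) is assumed here for illustration and is not taken from the presentation.

```latex
% Hazard of individual j in location i, with a location-level frailty v_i
h_{ij}(t) = h_0(t)\,\exp\!\left(x_{ij}^{\top}\beta + v_i\right)
          = z_i\, h_0(t)\,\exp\!\left(x_{ij}^{\top}\beta\right),
\qquad z_i = \exp(v_i)
```

In this form, a Gaussian prior on v_i corresponds to a log-normal frailty z_i, while a gamma frailty model places a Gamma prior directly on z_i.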

She explored two possible scenarios for the prior distribution of the frailty. In the first scenario (log-normal frailty model), the frailty term is assumed to change from one location to another but to be similar for locations that are close to each other. In the second scenario (gamma frailty), a multivariate Gaussian is assumed as the prior distribution of the frailty term, but this prior reduces to a Gamma distribution within each location. She then simulated data under each scenario and applied the R function “survregbayes”, which is devoted to the analysis of spatially correlated survival data, to the simulated data. She found that the function fits data from the first scenario better, based on an analysis of the bias between the true and estimated values of the coefficients and on a comparison of the distributions of the simulated and estimated frailties. She concluded that this is likely due to the fact that the options in the R function are designed for data of the kind considered in the first scenario.
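To make the simulation step more concrete, here is a minimal Python sketch of how clustered survival times with a log-normal or a gamma frailty could be generated. It is not the simulation design used in the thesis: the function name `simulate_frailty_data`, all parameter values, the exponential baseline hazard, the absence of censoring, and above all the independence of frailties across locations (no spatial correlation) are simplifying assumptions made for illustration.

```python
import numpy as np


def simulate_frailty_data(n_locations=50, n_per_location=20, beta=0.5,
                          baseline_rate=0.1, frailty="lognormal",
                          variance=0.5, seed=0):
    """Simulate survival times under a proportional hazards model
    with one shared frailty per location (hypothetical helper)."""
    rng = np.random.default_rng(seed)

    # One multiplicative frailty z_i per location.
    # ASSUMPTION: frailties are independent across locations.
    if frailty == "lognormal":
        # z_i = exp(v_i) with v_i ~ Normal(0, variance)
        z = np.exp(rng.normal(0.0, np.sqrt(variance), size=n_locations))
    elif frailty == "gamma":
        # z_i ~ Gamma with mean 1 and variance `variance`
        z = rng.gamma(shape=1.0 / variance, scale=variance, size=n_locations)
    else:
        raise ValueError("frailty must be 'lognormal' or 'gamma'")

    # Assign each individual to a location and draw one covariate.
    location = np.repeat(np.arange(n_locations), n_per_location)
    x = rng.normal(size=location.size)

    # ASSUMPTION: constant (exponential) baseline hazard, no censoring.
    # Individual hazard rate: h0 * z_i * exp(beta * x_ij).
    rate = baseline_rate * z[location] * np.exp(beta * x)
    time = rng.exponential(1.0 / rate)

    return {"location": location, "x": x, "time": time, "frailty": z}


if __name__ == "__main__":
    for kind in ("lognormal", "gamma"):
        data = simulate_frailty_data(frailty=kind)
        print(kind, "median survival time:", np.median(data["time"]))
```

Data generated this way could then be passed to a frailty model fit, and the estimated coefficients and frailties compared with the simulated ones, which is the kind of bias comparison described above.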

She added at the end of the presentation that the next step will be to edit the R function “survregbayes” and adapt it to data of the form considered in the second scenario.