Three-part joint modeling methods for complex functional data mixed with zero-and-one–inflated proportions and zero-inflated continuous outcomes with skewness

Publication Type:
Journal Article
Citation:
Statistics in Medicine, 2018, 37 (4), pp. 611 - 626
Issue Date:
2018-02-20
Filename:
Li_et_al-2018-Statistics_in_Medicine.pdf
Description:
Published Version
Size:
1.26 MB (Adobe PDF)
Copyright © 2017 John Wiley & Sons, Ltd.
We take a functional data approach to longitudinal studies with complex bivariate outcomes. This work is motivated by data from a physical activity study that measured 2 responses over time in 5-minute intervals. One response is the proportion of time active in each interval, a continuous proportion with excess zeros and ones. The other response, energy expenditure rate in the interval, is a continuous variable with excess zeros and skewness. This outcome is complex because there are 3 possible activity patterns in each interval (inactive, partially active, and completely active), and those patterns, which are observed, induce both nonrandom and random associations between the responses. More specifically, the inactive pattern requires a zero value in both the proportion for active behavior and the energy expenditure rate; the partially active pattern means that the proportion of activity is strictly between zero and one and that the energy expenditure rate is greater than zero and likely to be moderate; and the completely active pattern means that the proportion of activity is exactly one and that the energy expenditure rate is greater than zero and likely to be higher. To address these challenges, we propose a 3-part functional data joint modeling approach. The first part is a continuation-ratio model to reorder the ordinal-valued 3 activity patterns. The second part models the proportions when they lie in the interval (0, 1). The third part models the skewed continuous energy expenditure rate, using Box-Cox transformations, when it is greater than zero. In this 3-part model, the regression structures are specified as smooth curves measured at various time points with random effects that have a correlation structure.
The smoothed random curves for each variable are summarized using a few important principal components, and the association of the 3 longitudinal components is modeled through the association of the principal component scores. The difficulties in handling the ordinal and proportional variables are addressed using a quasi-likelihood type approximation. We develop an efficient algorithm to fit the model that also involves the selection of the number of principal components. The method is applied to physical activity data and is evaluated empirically by a simulation study.
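The way the observed activity patterns split each interval's data across the 3 parts can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the particular continuation-ratio coding (active vs. inactive, then completely vs. partially active), and the fixed Box-Cox parameter `lam` are all assumptions made for illustration, and the functional principal component and quasi-likelihood steps are omitted.

```python
import math


def activity_pattern(proportion):
    """Return 0 (inactive), 1 (partially active), or 2 (completely active)."""
    if not 0.0 <= proportion <= 1.0:
        raise ValueError("proportion must lie in [0, 1]")
    if proportion == 0.0:
        return 0
    if proportion == 1.0:
        return 2
    return 1


def continuation_ratio_indicators(pattern):
    """Binary responses for one possible continuation-ratio coding (assumed).

    First value: is the interval active at all (pattern >= 1)?
    Second value: given that it is active, is it completely active?
    The second value is None for inactive intervals, which carry no
    information about that conditional comparison.
    """
    is_active = 1 if pattern >= 1 else 0
    is_complete = (1 if pattern == 2 else 0) if is_active else None
    return is_active, is_complete


def box_cox(y, lam):
    """Box-Cox transform of a strictly positive value y."""
    if y <= 0:
        raise ValueError("Box-Cox requires y > 0")
    if lam == 0:
        return math.log(y)
    return (y ** lam - 1.0) / lam


def split_three_parts(proportions, energy_rates, lam=0.5):
    """Split paired observations into the responses used by each part.

    lam is a hypothetical fixed Box-Cox parameter chosen for illustration.
    """
    parts = {"pattern": [], "proportion": [], "energy": []}
    for p, e in zip(proportions, energy_rates):
        pattern = activity_pattern(p)
        # Nonrandom association: inactive forces zero energy expenditure,
        # and any active pattern forces a positive rate.
        if (pattern == 0) != (e == 0):
            raise ValueError("proportion and energy rate are inconsistent")
        parts["pattern"].append(continuation_ratio_indicators(pattern))
        if pattern == 1:
            parts["proportion"].append(p)  # part 2: values in (0, 1)
        if pattern >= 1:
            parts["energy"].append(box_cox(e, lam))  # part 3: positive rates
    return parts
```

For example, `split_three_parts([0.0, 0.4, 1.0], [0.0, 2.25, 4.0])` sends all three intervals to the pattern part, only the middle interval to the proportion part, and the last two intervals to the energy expenditure part.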
Please use this identifier to cite or link to this item: