Background: Linear discriminant analysis (DA) encompasses procedures for classifying observations into groups (predictive discriminant analysis, PDA) and describing the relative importance of variables for distinguishing between groups (descriptive discriminant analysis, DDA) in multivariate data. In recent years, there has been increased interest in DA procedures for repeated measures data. PDA procedures that assume parsimonious repeated measures mean and covariance structures have been developed, but corresponding DDA procedures have not been proposed. Most DA procedures for repeated measures data rest on the assumption of multivariate normality, which may not be satisfied in biostatistical applications. For example, health-related quality of life (HRQOL) measures, which are increasingly being used as outcomes in clinical trials and cohort studies, are likely to exhibit skewed or heavy-tailed distributions. In addition, measures of relative importance based on discriminant function coefficients (DFCs) have not been proposed for DDA of repeated measures data.
Purpose: The purpose of this research is to develop repeated measures discriminant analysis (RMDA) procedures that are based on parsimonious covariance structures, including compound symmetric and first-order autoregressive structures, and that are robust (i.e., insensitive) to departures from multivariate normality. The research also extends these procedures to evaluate the relative importance of variables in multivariate repeated measures (i.e., doubly multivariate) data.
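As an illustration of the two parsimonious structures named above, the following minimal sketch builds a compound symmetric and a first-order autoregressive covariance matrix for p repeated measurements. The function names and the parameters `sigma2` (common variance) and `rho` (correlation) are illustrative assumptions and are not taken from the study.

```python
import numpy as np


def compound_symmetric(p, sigma2, rho):
    """Compound symmetric covariance: equal variances, one common correlation.

    Hypothetical helper for illustration only.
    """
    return sigma2 * ((1.0 - rho) * np.eye(p) + rho * np.ones((p, p)))


def ar1(p, sigma2, rho):
    """First-order autoregressive covariance: correlation rho**|j - k|.

    Hypothetical helper for illustration only.
    """
    idx = np.arange(p)
    lags = np.abs(idx[:, None] - idx[None, :])
    return sigma2 * rho ** lags


cs = compound_symmetric(3, sigma2=2.0, rho=0.5)
ar = ar1(3, sigma2=1.0, rho=0.5)
```

Each structure is described by two parameters regardless of p, whereas an unstructured covariance matrix requires p(p + 1)/2.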
Method: Monte Carlo studies were conducted to investigate the performance of the proposed RMDA procedures under various degrees of group mean separation, repeated measures correlation structures, departures from multivariate normality, and magnitudes of covariance mis-specification. Data from the Manitoba Inflammatory Bowel Disease Cohort Study, a prospective longitudinal cohort study of the psychosocial determinants of health and well-being, were used to illustrate the application of the procedures.
Results: The conventional maximum likelihood (ML) estimates of DFCs for RMDA procedures based on parsimonious covariance structures exhibited substantial bias and error when the covariance structure was mis-specified or when the data followed a multivariate skewed or heavy-tailed distribution. The DFCs of RMDA procedures based on robust estimators obtained from coordinatewise trimmed means and Winsorized variances were less biased and more efficient when the data followed a multivariate non-normal distribution, but were sensitive to the effects of covariance mis-specification. Measures of relative importance for doubly multivariate data based on linear combinations of the within-variable DFCs resulted in the highest proportion of correctly ranked variables.
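To make the robust estimators mentioned above concrete, the following minimal sketch computes coordinatewise trimmed means and Winsorized variances for a data matrix with observations in rows. The function name and the 20% trimming proportion are assumptions made for illustration; the abstract does not state the proportion used in the study, and the sketch does not reproduce the full RMDA procedure.

```python
import numpy as np


def trimmed_mean_and_winsorized_var(X, gamma=0.2):
    """Coordinatewise trimmed mean and Winsorized variance.

    X     : (n, p) array, observations in rows.
    gamma : proportion trimmed from EACH tail (assumed value, not from the study).
    """
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    g = int(np.floor(gamma * n))  # number of observations trimmed per tail
    if 2 * g >= n:
        raise ValueError("gamma is too large for this sample size")

    # Sort each coordinate (column) independently.
    Xs = np.sort(X, axis=0)

    # Trimmed mean: average of the middle n - 2g order statistics.
    trimmed_mean = Xs[g:n - g].mean(axis=0)

    # Winsorize: pull the g smallest and g largest values in to the
    # nearest retained order statistic, then take the sample variance.
    Xw = Xs.copy()
    Xw[:g] = Xs[g]
    Xw[n - g:] = Xs[n - g - 1]
    winsorized_var = Xw.var(axis=0, ddof=1)

    return trimmed_mean, winsorized_var


# One variable with a single extreme value.
tm, wv = trimmed_mean_and_winsorized_var([[1], [2], [3], [4], [100]], gamma=0.2)
```

In this small example the extreme value 100 is trimmed from the mean and replaced by 4 before the variance is computed, which is why such estimators are less affected by heavy tails than the ordinary mean and variance.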
Conclusions: DA procedures based on parsimonious covariance structures and robust estimators will produce unbiased and efficient estimates of the relative importance of variables in repeated measures data and can be used to test for change in relative importance over time. The choice among these RMDA procedures should be guided by preliminary descriptive assessments of the data.