2019-08-30

Speaker: Emily Slade, PhD, Department of Biostatistics, University of Kentucky

Title: "Pitfalls and Recommendations for Handling Missing Data in Canonical Correlation Analysis"

Canonical correlation analysis (CCA) provides a global test and measures of association between two multivariate sets of variables measured on the same individuals. In large multivariate settings, the proportion of subjects missing data on at least one variable can be high. In practice, missing data have typically been handled before performing CCA by naïve methods such as complete case analysis and unconditional mean imputation. I examine the performance of CCA when used with these methods as well as with more sophisticated conditional imputation methods. Unconditional mean imputation performs surprisingly well for estimation and testing, but properly implemented conditional imputation can improve estimation of the first canonical correlation. In this talk, I offer recommendations for imputation in CCA based on simulated data of wide-ranging complexity, and I apply these methods to relate dietary variables to blood lipid levels in the Nurses' Health Study and the Health Professionals Follow-up Study. I also discuss advances and challenges in using multiple imputation for canonical correlation analysis.
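For readers unfamiliar with the two naïve approaches named in the abstract, the following is a minimal sketch, not the speaker's code or simulation design. It assumes NumPy, data missing completely at random, and full-column-rank data matrices; the function names (`canonical_correlations`, `complete_cases`, `mean_impute`) and all simulation settings are hypothetical and chosen only for illustration.

```python
import numpy as np

def canonical_correlations(X, Y):
    """Canonical correlations between column sets X and Y (rows = subjects).

    Uses the QR/SVD approach: the singular values of Qx^T Qy, where Qx and Qy
    are orthonormal bases of the centered data, are the canonical correlations.
    Assumes no missing values and full column rank.
    """
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    Qx, _ = np.linalg.qr(Xc)
    Qy, _ = np.linalg.qr(Yc)
    s = np.linalg.svd(Qx.T @ Qy, compute_uv=False)
    return np.clip(s, 0.0, 1.0)  # guard against tiny floating-point overshoot

def complete_cases(X, Y):
    """Complete case analysis: keep only subjects observed on every variable."""
    keep = ~(np.isnan(X).any(axis=1) | np.isnan(Y).any(axis=1))
    return X[keep], Y[keep]

def mean_impute(A):
    """Unconditional mean imputation: fill each gap with its column's observed mean."""
    col_means = np.nanmean(A, axis=0)
    return np.where(np.isnan(A), col_means, A)

# Hypothetical simulation: two variable sets sharing one latent factor.
rng = np.random.default_rng(0)
n = 500
z = rng.normal(size=(n, 1))
X = z + rng.normal(size=(n, 3))
Y = z + rng.normal(size=(n, 2))

# Assumption: each entry is missing completely at random with probability 0.1.
X_miss = X.copy()
Y_miss = Y.copy()
X_miss[rng.random(X.shape) < 0.1] = np.nan
Y_miss[rng.random(Y.shape) < 0.1] = np.nan

rho_full = canonical_correlations(X, Y)[0]
rho_cc = canonical_correlations(*complete_cases(X_miss, Y_miss))[0]
rho_mean = canonical_correlations(mean_impute(X_miss), mean_impute(Y_miss))[0]

print(f"first canonical correlation, full data:       {rho_full:.3f}")
print(f"first canonical correlation, complete cases:  {rho_cc:.3f}")
print(f"first canonical correlation, mean imputation: {rho_mean:.3f}")
```

A single simulated data set like this says nothing about bias or test performance; comparing the methods properly would require repeated simulation across missingness mechanisms, which is the subject of the talk.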
