Stat Modelling. 2016 Apr;16(2):91-113. doi: 10.1177/1471082X15627004. Epub 2016 Mar 27.


Factor analysis is a commonly used method of modelling correlated multivariate exposure data. Typically, the measurement model is assumed to have constant factor loadings. However, from our preliminary analyses of the Environmental Protection Agency’s (EPA’s) PM2.5 fine speciation data, we have observed that the factor loadings for four constituents change considerably in stratified analyses. Since invariance of factor loadings is a prerequisite for valid comparison of the underlying latent variables, we propose a factor model that includes non-constant factor loadings that change over time and space using P-spline penalized with the generalized cross-validation (GCV) criterion. The model is implemented using the Expectation-Maximization (EM) algorithm and we select the multiple spline smoothing parameters by minimizing the GCV criterion with Newton’s method during each iteration of the EM algorithm. The algorithm is applied to a one-factor model that includes four constituents. Through bootstrap confidence bands, we find that the factor loading for total nitrate changes across seasons and geographic regions.

PMID:27528825 | PMC:PMC4982519 | DOI:10.1177/1471082X15627004