Inferential information in multivariate statistics: An integrated structural equation modeling approach

23 Apr 2024

Dr. Fei Gu

 

Principal component analysis (PCA; Hotelling, 1933), canonical correlation analysis (CCA; Hotelling, 1935, 1936), and redundancy analysis (RA; Van Den Wollenberg, 1977) are basic analytic techniques in multivariate statistics. In applications of these methods, researchers often rely on cutoff values to determine the dimensionality or to identify the variables that matter most for interpreting the components. However, the choice of a cutoff value is often arbitrary and lacks a statistical or theoretical basis. Consequently, most of the resulting decisions are subjective and ignore sampling fluctuations. Additionally, many substantive hypotheses cannot be tested easily without time-consuming and computationally intensive approaches such as the bootstrap or permutation tests. As a result, PCA, CCA, and RA are typically applied in exploratory rather than confirmatory data analysis. The fundamental reason is that these methods do not provide inferential information (e.g., standard errors and confidence intervals).
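To make the resampling route mentioned above concrete, here is a minimal sketch of a nonparametric bootstrap standard error for the largest eigenvalue of a sample correlation matrix. It illustrates the computational burden of the bootstrap only; it is not the SEM-based approach discussed in this article. The simulated data, the function names, and the choice of 200 bootstrap replications are all assumptions made for illustration.

```python
import random
import statistics


def correlation_matrix(rows):
    """Sample correlation matrix of a list of equal-length observation rows."""
    n, p = len(rows), len(rows[0])
    cols = [[row[j] for row in rows] for j in range(p)]
    means = [statistics.fmean(col) for col in cols]
    sds = [statistics.stdev(col) for col in cols]
    corr = [[0.0] * p for _ in range(p)]
    for i in range(p):
        for j in range(p):
            cov = sum(
                (cols[i][k] - means[i]) * (cols[j][k] - means[j]) for k in range(n)
            ) / (n - 1)
            corr[i][j] = cov / (sds[i] * sds[j])
    return corr


def largest_eigenvalue(matrix, n_iter=200):
    """Largest eigenvalue of a symmetric positive semidefinite matrix (power iteration)."""
    p = len(matrix)
    v = [1.0 / p ** 0.5] * p  # unit-length starting vector
    for _ in range(n_iter):
        w = [sum(matrix[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    # Rayleigh quotient v'Mv for the converged unit vector v
    mv = [sum(matrix[i][j] * v[j] for j in range(p)) for i in range(p)]
    return sum(v[i] * mv[i] for i in range(p))


def bootstrap_se(rows, n_boot=200, seed=0):
    """Bootstrap standard error of the largest eigenvalue (resampling rows)."""
    rng = random.Random(seed)
    n = len(rows)
    estimates = []
    for _ in range(n_boot):
        sample = [rows[rng.randrange(n)] for _ in range(n)]
        estimates.append(largest_eigenvalue(correlation_matrix(sample)))
    return statistics.stdev(estimates)


# Hypothetical data: 200 cases, 4 variables sharing one common source of variance.
data_rng = random.Random(1)
rows = []
for _ in range(200):
    common = data_rng.gauss(0, 1)
    rows.append([0.7 * common + data_rng.gauss(0, 1) for _ in range(4)])

first_eigenvalue = largest_eigenvalue(correlation_matrix(rows))
standard_error = bootstrap_se(rows)
print(f"largest eigenvalue: {first_eigenvalue:.3f}, bootstrap SE: {standard_error:.3f}")
```

Every bootstrap replication repeats the full eigen-analysis, which is why this route becomes expensive for larger problems and for hypotheses involving many parameters.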

 

Structural equation modeling (SEM), on the other hand, is a framework that provides comprehensive results, including not only point estimates but also inferential information. In addition, various estimators have been developed in SEM to accommodate both normal and non-normal data. Moreover, the SEM framework allows flexible extensions (e.g., multi-group models) that support a variety of hypothesis tests.

 

To aid applications of multivariate statistics, one natural idea is to integrate the multivariate methods into the SEM framework. Such an integration produces inferential information for these methods and thus promotes their application. Some of this work has already been done for PCA. Specifically, Dolan (1996) first developed the regular model to analyze unstandardized variables (or covariance matrices); Gu (2016) proposed the scale-invariant model to analyze standardized variables (or correlation matrices) and developed the multi-group scale-invariant model, which allows the invariance of parameters to be tested; and Ogasawara (2000) gave the standard errors for the rotated PCA estimates. Most recently, Gu and Cheung (2023) showed that inferential information in multivariate principal component regression can be obtained by extending the SEM framework for PCA. Similar work has been done for CCA. Specifically, Gu, Yung, and Cheung (2019) provided the regular model and the scale-invariant model to analyze unstandardized variables (or covariance matrices) and standardized variables (or correlation matrices), respectively; Gu and Wu (2018) developed the multi-group scale-invariant model to test the invariance of canonical loadings; and Gu, Wu, Yung, and Wilkins (2021) gave the standard errors for the rotated CCA estimates. For RA, Gu, Yung, Cheung, Joo, and Nimon (2023) developed the scale-invariant model only, because RA was originally defined for standardized variables.

 

 

Table 1. Existing work and future projects of integrating PCA, CCA, and RA into the SEM framework.

 

|  | PCA | CCA | RA |
| --- | --- | --- | --- |
| Regular model | Dolan (1996) | Gu, Yung, & Cheung (2019) | N/A |
| Scale-invariant model | Gu (2016) | Gu, Yung, & Cheung (2019) | Gu, Yung, Cheung, Joo, & Nimon (2023) |
| Multi-group model | Gu (2016) | Gu & Wu (2018) | Future project 2 |
| Standard errors for rotated estimates | Ogasawara (2000) | Gu, Wu, Yung, & Wilkins (2021) | Future project 3 |
| Component-based regression model | Gu & Cheung (2023) | Future project 1 | Future project 4 |

 

 

Table 1 summarizes the work that has been done to integrate PCA, CCA, and RA into the SEM framework, and it also shows the future projects needed to complete this ongoing integration. The first future project is to extend the SEM framework for CCA to canonical correlation regression (CCR). CCR is a commonly used method in chemometrics (Burnham, Viveros, & MacGregor, 1996), and integrating it into the SEM framework will produce standard errors for the regression coefficients in CCR. The second future project is to develop the multi-group regular or scale-invariant model for RA so that the invariance of RA parameters can be tested. The third future project is to obtain the standard errors for rotated RA estimates, which can facilitate the interpretation of the rotated redundancy variates. Finally, the fourth future project is to clarify the equivalence and differences between RA and reduced-rank regression (RRR), a closely related method that achieves the same mathematical goal but was developed independently by other statisticians (Izenman, 1975; Tso, 1981; Davies & Tso, 1982). Such a clarification will give a better understanding of both RA and RRR and promote their application in relevant studies.

 

Over the next two years, the author plans to work on these four future projects and complete the integration of CCA and RA into the SEM framework.

 

 

References

 

Burnham, A. J., Viveros, R., & MacGregor, J. F. (1996). Frameworks for latent variable multivariate regression. Journal of Chemometrics, 10, 31-45. DOI: 10.1002/(SICI)1099-128X(199601)10:1<31::aid-cem398>3.0.CO;2-1

 

Davies, P. T., & Tso, M. K.-S. (1982). Procedures for reduced-rank regression. Journal of the Royal Statistical Society. Series C (Applied Statistics), 31, 244-255. DOI: 10.2307/2347998

 

Dolan, C. (1996). Principal component analysis using LISREL 8. Structural Equation Modeling, 3, 307-322. DOI: 10.1080/10705519609540049

 

Gu, F. (2016). Analysis of correlation matrices using scale-invariant common principal component models and a hierarchy of relationships between correlation matrices. Structural Equation Modeling, 23, 819-826. DOI: 10.1080/10705511.2016.1207180

 

Gu, F., & Cheung, M. W.-L. (2023). A model-based approach to multivariate principal component regression: Selection of principal components and standard error estimates for unstandardized regression coefficients. British Journal of Mathematical and Statistical Psychology, 76, 605-622. DOI: 10.1111/bmsp.12301

 

Gu, F., & Wu, H. (2018). Simultaneous canonical correlation analysis with invariant canonical loadings. Behaviormetrika, 45, 111-132. DOI: 10.1007/s41237-017-0042-8

 

Gu, F., Wu, H., Yung, Y.-F., & Wilkins, J. L. M. (2021). Standard error estimates for rotated estimates of canonical correlation analysis: An implementation of the infinitesimal jackknife method. Behaviormetrika, 48, 143-168. DOI: 10.1007/s41237-020-00123-7

 

Gu, F., Yung, Y.-F., & Cheung, M. W.-L. (2019). Four covariance structure models for canonical correlation analysis: A COSAN modeling approach. Multivariate Behavioral Research, 54, 192-223. DOI: 10.1080/00273171.2018.1512847

 

Gu, F., Yung, Y.-F., Cheung, M. W.-L., Joo, B.-K., & Nimon, K. (2023). Statistical inference in redundancy analysis: A direct covariance structure modeling approach. Multivariate Behavioral Research, 58, 877-893. DOI: 10.1080/00273171.2022.2141675

 

Hotelling, H. (1933). Analysis of a complex of statistical variables into principal components. Journal of Educational Psychology, 24, 417-441. DOI: 10.1037/h0071325

 

Hotelling, H. (1935). The most predictable criterion. Journal of Educational Psychology, 26, 139-142. DOI: 10.1037/h0058165

 

Hotelling, H. (1936). Relations between two sets of variates. Biometrika, 28, 321-377. DOI: 10.2307/2333955

 

Izenman, A. J. (1975). Reduced-rank regression for the multivariate linear model. Journal of Multivariate Analysis, 5, 248-264. DOI: 10.1016/0047-259X(75)90042-1

 

Ogasawara, H. (2000). Standard errors of the principal component loadings for unstandardized and standardized variables. British Journal of Mathematical and Statistical Psychology, 53, 155-174. DOI: 10.1348/000711000159277

 

Tso, M. K.-S. (1981). Reduced-rank regression and canonical analysis. Journal of the Royal Statistical Society. Series B (Methodological), 43, 183-189. DOI: 10.1111/j.2517-6161.1981.tb01169.x

 

Van Den Wollenberg, A. (1977). Redundancy analysis: An alternative for canonical correlation analysis. Psychometrika, 42, 207-219. DOI: 10.1007/BF02294050

 

 

 


 

 

Author

Dr. Fei Gu

Lecturer in Quantitative Psychology Area

 
