medRxiv [Preprint]. 2024 Mar 4:2024.03.03.24303677. doi: 10.1101/2024.03.03.24303677.


Quantitative characterization of the health impacts associated with exposure to chemical mixtures has received considerable attention in current environmental and epidemiological studies. With many existing statistical methods and emerging approaches, it is important for practitioners to understand when each method is best suited for their inferential goals. In this study, we conduct a review and comparison of 11 analytical methods available for use in mixtures research, through extensive simulation studies for continuous and binary outcomes. These methods fall in three different classes: identifying important components of a mixture, identifying interactions and creating a summary score for risk stratification and prediction. We carry out an illustrative data analysis in the PROTECT birth cohort from Puerto Rico. Most importantly we develop an integrated package “CompMix” that provides a platform for mixtures analysis where the practitioner can implement a pipeline for several types of mixtures analysis. Our simulation results suggest that the choice of methods depends on the goal of analysis and there is no clear winner across the board. For selection of important toxicants in the mixture and for identifying interactions, Elastic net by Zou et al. (Enet), Lasso for Hierarchical Interactions by Bien et al (HierNet), Selection of nonlinear interactions by a forward stepwise algorithm by Narisetty et al. (SNIF) have the most stable performance across simulation settings. Additionally, the predictive performance of the Super Learner ensembling method by Van de Laan et al. and HierNet are found to be superior to the rest of the methods. For overall summary or a cumulative measure, we find that using the Super Learner to combine multiple Environmental Risk Scores can lead to improved risk stratification properties. We have developed an R package “CompMix: A comprehensive toolkit for environmental mixtures analysis”, allowing users to implement a variety of tasks under different settings and compare the findings. In summary, our study offers guidelines for selecting appropriate statistical methods for addressing specific scientific questions related to mixtures research. We identify critical gaps where new and better methods are needed.

PMID:38496435 | PMC:PMC10942527 | DOI:10.1101/2024.03.03.24303677