AUTHOR=Nkhoma Harrid, Mulaga Atupele Ngina, Kumwenda Save, Kamndaya Mphatso TITLE=Measuring the performance of LPA, LCGA, LGCM, and GMM in identifying the homogenous subgroups (latent classes) within the wider heterogeneous population of patients on DTG JOURNAL=Frontiers in Applied Mathematics and Statistics VOLUME=Volume 11 - 2025 YEAR=2025 URL=https://www.frontiersin.org/journals/applied-mathematics-and-statistics/articles/10.3389/fams.2025.1664415 DOI=10.3389/fams.2025.1664415 ISSN=2297-4687 ABSTRACT=Background: Identifying heterogeneity in longitudinal data is critical for understanding diverse trajectories in clinical and epidemiological research. Traditional analytical methods often fail to distinguish latent subpopulations. More advanced statistical models such as Latent Profile Analysis (LPA), Latent Class Growth Analysis (LCGA), Latent Growth Curve Modeling (LGCM), and Growth Mixture Modeling (GMM) provide a data-driven approach to uncovering distinct patterns. This study evaluated the performance of these models in classifying longitudinal weight gain trajectories. Methods: A retrospective longitudinal dataset of 3,525 HIV-positive individuals on a DTG-based regimen, with repeated weight measurements over 24 months, was analysed. Models were implemented using a stepwise approach: (1) LPA was applied to identify latent subgroups based on weight gain patterns without incorporating time, (2) LCGA and LGCM modelled individual trajectories assuming class-invariant and class-specific variances, respectively, and (3) GMM incorporated within-class variability to allow flexible trajectory shapes. Model performance was assessed using the Akaike Information Criterion (AIC), Bayesian Information Criterion (BIC), deviance statistics, and log-likelihood. Average Posterior Probability (AvePP) was used to evaluate classification certainty by measuring the mean probability of individuals being correctly classified into their assigned latent class.
Clinical interpretability was also considered to assess real-world applicability. Results: LCGA demonstrated the best model fit, with the lowest AIC (42,239.43) and BIC (42,301.1) and the highest log-likelihood (−21,109.71), and identified three distinct weight gain trajectories. GMM captured greater within-class variability but did not match LCGA on these fit statistics. Conclusion: LCGA and GMM were the most effective models for identifying distinct latent trajectories, with LCGA demonstrating the best overall fit for our data. These findings emphasize the importance of appropriate model selection in longitudinal data analysis, as different approaches differ in their capacity to detect meaningful subpopulations. Selecting an optimal model is essential for improving trajectory classification and supporting evidence-based decision-making in clinical and epidemiological research.
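The fit and classification measures named in the abstract can be illustrated with a minimal Python sketch. It uses the standard definitions AIC = 2k − 2·logL and BIC = k·ln(n) − 2·logL, and computes AvePP as the mean posterior probability of the assigned class among individuals assigned to it. The function names, the toy posterior matrix, and the parameter count k = 10 are assumptions made for illustration. The abstract does not report k; the value is back-solved from the reported log-likelihood and AIC. Using the 3,525 individuals as n in BIC is likewise an assumption. This is not the authors' code.

```python
import math


def aic(log_lik: float, n_params: int) -> float:
    """Akaike Information Criterion: 2k - 2*logL."""
    return 2 * n_params - 2 * log_lik


def bic(log_lik: float, n_params: int, n_obs: int) -> float:
    """Bayesian Information Criterion: k*ln(n) - 2*logL."""
    return n_params * math.log(n_obs) - 2 * log_lik


def average_posterior_probability(posteriors):
    """Per-class AvePP.

    posteriors: one row per individual, one column per latent class,
    each row holding that individual's class membership probabilities.
    Each individual is assigned to the most probable class, and the
    probability of that class is averaged within each assigned class.
    """
    n_classes = len(posteriors[0])
    sums = [0.0] * n_classes
    counts = [0] * n_classes
    for row in posteriors:
        assigned = max(range(n_classes), key=lambda c: row[c])
        sums[assigned] += row[assigned]
        counts[assigned] += 1
    # A class with no assigned individuals has an undefined AvePP.
    return [s / c if c else float("nan") for s, c in zip(sums, counts)]


# Values reported in the abstract for the LCGA model.
LOG_LIK = -21109.71
N_OBS = 3525      # assumption: n in BIC is the number of individuals
N_PARAMS = 10     # assumption: back-solved from AIC, not reported

lcga_aic = aic(LOG_LIK, N_PARAMS)
lcga_bic = bic(LOG_LIK, N_PARAMS, N_OBS)
print(f"AIC = {lcga_aic:.2f}, BIC = {lcga_bic:.2f}")

# Hypothetical posterior probabilities for three individuals, two classes.
toy_posteriors = [[0.9, 0.1], [0.8, 0.2], [0.3, 0.7]]
print(average_posterior_probability(toy_posteriors))
```

Under these assumptions the computed AIC and BIC agree with the reported 42,239.43 and 42,301.1 to within rounding of the log-likelihood. That agreement is consistent with, but does not confirm, a ten-parameter model.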