AUTHOR=Wang Jiawei, Qiao Shaojie, Xiang Dongsheng, Liao Yangcheng, Wang Chao TITLE=HFTC: a hierarchical fungal taxonomic classification model for ITS sequences using low-dimensional embedding features JOURNAL=Frontiers in Genetics VOLUME=Volume 16 - 2025 YEAR=2025 URL=https://www.frontiersin.org/journals/genetics/articles/10.3389/fgene.2025.1650244 DOI=10.3389/fgene.2025.1650244 ISSN=1664-8021 ABSTRACT=Introduction: Fungal identification through ITS sequencing is pivotal for biodiversity and ecological studies, yet existing methods often face challenges with high-dimensional features and inconsistent taxonomy predictions. Method: We proposed HFTC, a hierarchical fungal taxonomic classifier built upon a multi-level random forest (RF) architecture. Notably, HFTC incorporates a bidirectional k-mer strategy to capture contextual information from both sequence orientations. By leveraging Word2Vec embedding, it reduces feature dimensionality from 4^k to only 200, significantly improving computational efficiency while preserving rich sequence context. Result: Experimental results demonstrate that HFTC outperforms Mothur, RDP, Sintax, QIIME2, and CNN-Duong, achieving a Matthews correlation coefficient (MCC) of 95.31% despite uneven class distributions. Its overall accuracy (ACC) reaches 95.25%. At the species level, it attains a hierarchical accuracy (HA) of 95.10%, surpassing the best-performing deep learning baseline, CNN-Duong, by 3.2%. Moreover, HFTC exhibits the smallest discrepancy between ACC and HA (1.60%), in contrast to CNN-Duong, which shows the largest gap (35.00%), highlighting HFTC’s superior hierarchical consistency. Discussion: HFTC offers a scalable and accurate approach for fungal taxonomic classification. Its compact feature representation and hierarchical architecture make it particularly suitable for microbial diversity research. The source code and datasets are publicly accessible at https://github.com/wjjw0731/HFTC/tree/master.
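
The bidirectional k-mer idea and the dimensionality comparison in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes that "both sequence orientations" means the forward strand plus its reverse complement (the paper may define it differently, for example as the plainly reversed string), and the function names and the choice k = 6 are hypothetical. The token list it produces is the kind of "sentence" that a Word2Vec model could be trained on to obtain fixed-length embeddings.

```python
from typing import List

# Translation table for DNA base complements (assumes an A/C/G/T alphabet only).
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def kmers(seq: str, k: int) -> List[str]:
    """Return all overlapping k-mers of seq, using a stride of 1."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def reverse_complement(seq: str) -> str:
    """Complement every base, then reverse the string."""
    return seq.translate(COMPLEMENT)[::-1]


def bidirectional_kmers(seq: str, k: int) -> List[str]:
    """Tokens from the forward strand followed by tokens from the
    reverse complement (an assumed reading of 'bidirectional')."""
    seq = seq.upper()
    return kmers(seq, k) + kmers(reverse_complement(seq), k)


def kmer_count_dim(k: int) -> int:
    """Size of a k-mer count vector over {A, C, G, T}: 4^k."""
    return 4 ** k


tokens = bidirectional_kmers("ACGTAC", 3)
print(tokens)

# For an illustrative k = 6, a count vector has 4096 entries,
# compared with the 200-dimensional embedding reported in the abstract.
print(kmer_count_dim(6), "vs", 200)
```

The count-vector size grows exponentially with k, whereas an embedding of the tokens keeps a fixed length regardless of k, which is the efficiency argument the abstract makes.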