AUTHOR=Mölbert Carla, Haghverdi Laleh TITLE=Adjustments to the reference dataset design improve cell type label transfer JOURNAL=Frontiers in Bioinformatics VOLUME=Volume 3 - 2023 YEAR=2023 URL=https://www.frontiersin.org/journals/bioinformatics/articles/10.3389/fbinf.2023.1150099 DOI=10.3389/fbinf.2023.1150099 ISSN=2673-7647 ABSTRACT=The transfer of cell type labels from previously annotated (reference) data to newly collected data is an important task in single-cell data analysis. As the number of publicly available annotated datasets that can serve as a reference and the number of computational methods for cell type label transfer are both constantly growing, rationales are needed for deciding which reference design and which method to use for a particular query dataset. Here, we benchmark a set of five popular cell type annotation methods, study their performance on different cell types, and highlight the importance of the design of the reference data (number of cell samples for each cell type, inclusion of multiple datasets in one reference, gene set selection, etc.) for more reliable predictions.