AUTHOR=Goll Johannes B., Bosinger Steven E., Jensen Travis L., Walum Hasse, Grimes Tyler, Tharp Gregory K., Natrajan Muktha S., Blazevic Azra, Head Richard D., Gelber Casey E., Steenbergen Kristen J., Patel Nirav B., Sanz Patrick, Rouphael Nadine G., Anderson Evan J., Mulligan Mark J., Hoft Daniel F.
TITLE=The Vacc-SeqQC project: Benchmarking RNA-Seq for clinical vaccine studies
JOURNAL=Frontiers in Immunology
VOLUME=Volume 13 - 2022
YEAR=2023
URL=https://www.frontiersin.org/journals/immunology/articles/10.3389/fimmu.2022.1093242
DOI=10.3389/fimmu.2022.1093242
ISSN=1664-3224
ABSTRACT=Over the last decade, the field of systems vaccinology has emerged, in which high-throughput transcriptomics and other omics assays are used to probe changes in the innate and adaptive immune systems in response to vaccination. RNA-Seq technology has matured in recent years and is now widely deployed for transcriptional analysis of clinical specimens. The goal of this study was to benchmark technical parameters of RNA-Seq in the context of a multi-site, double-blind randomized clinical trial using primary patient samples. We collected longitudinal peripheral blood mononuclear cell (PBMC) samples from 10 subjects after vaccination with a live attenuated Francisella tularensis vaccine and performed RNA-Seq at two different sites using aliquots from the same sample to generate two large-scale replicate datasets. We evaluated the impact of (i) filtering lowly expressed genes, (ii) using external RNA controls, (iii) fold change and false discovery rate (FDR) filtering, (iv) read length, and (v) sequencing depth on the concordance of differentially expressed genes (DEGs) between replicate datasets. Using synthetic mRNA spike-ins, we developed a method for empirically establishing minimal read-count thresholds for maintaining fold change accuracy on a per-experiment basis; the CpmERCCutoff R package implements this method and is publicly available on CRAN.
We defined a reference PBMC transcriptome by pooling sequence data and established the impact of read depth and gene filtering on transcript representation. Lastly, we modeled statistical power to detect DEGs for a range of sample sizes, effect sizes, and coverage depths. The results from this study provide RNA sequencing benchmarks and guidelines for planning similar vaccine studies in the future.
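
To make two of the ideas in the abstract concrete (filtering lowly expressed genes by a counts-per-million threshold, and measuring DEG concordance between replicate datasets), the following is a minimal Python sketch. It is not the CpmERCCutoff method: that package derives its threshold empirically from spike-ins, whereas here `min_cpm` and `min_samples` are hypothetical fixed parameters. The use of a Jaccard index as the concordance measure is also an assumption made for illustration, as are all function names and the toy data.

```python
def counts_per_million(counts):
    """Scale raw read counts for one sample to counts per million (CPM)."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sample has no mapped reads")
    return {gene: n * 1_000_000 / total for gene, n in counts.items()}


def filter_low_expression(samples, min_cpm, min_samples):
    """Return genes whose CPM reaches min_cpm in at least min_samples samples.

    min_cpm and min_samples are hypothetical, user-chosen parameters.
    """
    cpms = [counts_per_million(s) for s in samples]
    genes = set().union(*(c.keys() for c in cpms))
    return {
        g for g in genes
        if sum(1 for c in cpms if c.get(g, 0.0) >= min_cpm) >= min_samples
    }


def jaccard(a, b):
    """Concordance of two DEG sets as intersection over union (an assumption)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0


# Toy data: two samples, two genes (illustrative only).
toy_samples = [{"A": 999_990, "B": 10}, {"A": 999_999, "B": 1}]
kept = filter_low_expression(toy_samples, min_cpm=5, min_samples=2)

# Toy DEG lists from two hypothetical replicate sites.
site1_degs = {"g1", "g2", "g3"}
site2_degs = {"g2", "g3", "g4"}
concordance = jaccard(site1_degs, site2_degs)
```

In this toy example, gene B reaches the threshold in only one of the two samples and is therefore dropped, and the two DEG lists share two of four distinct genes.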