<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD Journal Publishing DTD v2.3 20070202//EN" "journalpublishing.dtd">
<article article-type="research-article" dtd-version="2.3" xml:lang="EN" xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">
<front>
<journal-meta>
<journal-id journal-id-type="publisher-id">Front. Bioinform.</journal-id>
<journal-title>Frontiers in Bioinformatics</journal-title>
<abbrev-journal-title abbrev-type="pubmed">Front. Bioinform.</abbrev-journal-title>
<issn pub-type="epub">2673-7647</issn>
<publisher>
<publisher-name>Frontiers Media S.A.</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="publisher-id">786898</article-id>
<article-id pub-id-type="doi">10.3389/fbinf.2022.786898</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Bioinformatics</subject>
<subj-group>
<subject>Original Research</subject>
</subj-group>
</subj-group>
</article-categories>
<title-group>
<article-title>
<italic>DJExpress</italic>: An Integrated Application for Differential Splicing Analysis and Visualization</article-title>
<alt-title alt-title-type="left-running-head">Gallego-Paez and Mauer</alt-title>
<alt-title alt-title-type="right-running-head">Alternative Splicing Analysis With <italic>DJExpress</italic>
</alt-title>
</title-group>
<contrib-group>
<contrib contrib-type="author" corresp="yes">
<name>
<surname>Gallego-Paez</surname>
<given-names>Lina Marcela</given-names>
</name>
<xref ref-type="corresp" rid="c001">&#x2a;</xref>
<uri xlink:href="https://loop.frontiersin.org/people/1314937/overview"/>
</contrib>
<contrib contrib-type="author" corresp="yes">
<name>
<surname>Mauer</surname>
<given-names>Jan</given-names>
</name>
<xref ref-type="corresp" rid="c001">&#x2a;</xref>
<uri xlink:href="https://loop.frontiersin.org/people/1494747/overview"/>
</contrib>
</contrib-group>
<aff>
<institution>BioMed X Institute (GmbH)</institution>, <addr-line>Heidelberg</addr-line>, <country>Germany</country>
</aff>
<author-notes>
<fn fn-type="edited-by">
<p>
<bold>Edited by:</bold> <ext-link ext-link-type="uri" xlink:href="https://loop.frontiersin.org/people/1083578/overview">Sean O&#x2019;Donoghue</ext-link>, Garvan Institute of Medical Research, Australia</p>
</fn>
<fn fn-type="edited-by">
<p>
<bold>Reviewed by:</bold> <ext-link ext-link-type="uri" xlink:href="https://loop.frontiersin.org/people/929785/overview">Junfeng Xia</ext-link>, Anhui University, China</p>
<p>
<ext-link ext-link-type="uri" xlink:href="https://loop.frontiersin.org/people/1549342/overview">Yoseph Barash</ext-link>, University of Pennsylvania, United&#x20;States</p>
</fn>
<corresp id="c001">&#x2a;Correspondence: Lina Marcela Gallego-Paez, <email>linhiel@gmail.com</email>; Jan Mauer, <email>jan.mauer@gmail.com</email>
</corresp>
<fn fn-type="other">
<p>This article was submitted to Data Visualization, a section of the journal Frontiers in Bioinformatics</p>
</fn>
</author-notes>
<pub-date pub-type="epub">
<day>24</day>
<month>02</month>
<year>2022</year>
</pub-date>
<pub-date pub-type="collection">
<year>2022</year>
</pub-date>
<volume>2</volume>
<elocation-id>786898</elocation-id>
<history>
<date date-type="received">
<day>30</day>
<month>09</month>
<year>2021</year>
</date>
<date date-type="accepted">
<day>08</day>
<month>02</month>
<year>2022</year>
</date>
</history>
<permissions>
<copyright-statement>Copyright &#xa9; 2022 Gallego-Paez and Mauer.</copyright-statement>
<copyright-year>2022</copyright-year>
<copyright-holder>Gallego-Paez and Mauer</copyright-holder>
<license xlink:href="http://creativecommons.org/licenses/by/4.0/">
<p>This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these&#x20;terms.</p>
</license>
</permissions>
<abstract>
<p>RNA-seq analysis of alternative pre-mRNA splicing has facilitated an unprecedented understanding of transcriptome complexity in health and disease. However, despite the availability of numerous bioinformatic pipelines for transcriptome-wide splicing analysis, the use of these tools is often limited to expert bioinformaticians. The need for high computational power, combined with computational outputs that are complicated to visualize and interpret, presents an obstacle to the broader research community. Here we introduce <italic>DJExpress</italic>, an R package for differential expression analysis of transcriptomic features and expression-trait associations. To determine gene-level differential junction usage as well as associations between junction expression and molecular/clinical features, <italic>DJExpress</italic> uses raw splice junction counts as input data. Importantly, <italic>DJExpress</italic> runs on an average laptop computer and provides a set of interactive and intuitive visualization formats. In contrast to most existing pipelines, <italic>DJExpress</italic> can handle both annotated and <italic>de novo</italic> identified splice junctions, thereby allowing the quantification of novel splice events. Moreover, <italic>DJExpress</italic> offers a web-compatible graphical interface allowing the analysis of user-provided data as well as the visualization of splice events within our custom database of differential junction expression in cancer (DJEC DB). DJEC DB includes not only healthy and tumor tissue junction expression data from TCGA and GTEx repositories but also cancer cell line data from the DepMap project. The integration of DepMap functional genomics data sets allows association of junction expression with molecular features such as gene dependencies and drug response profiles. This facilitates identification of cancer cell models for specific splicing alterations that can then be used for functional characterization in the lab. 
Thus, <italic>DJExpress</italic> represents a powerful and user-friendly tool for exploration of alternative splicing alterations in RNA-seq data, including multi-level data integration of alternative splicing signatures in healthy tissue, tumors and cancer cell&#x20;lines.</p>
</abstract>
<kwd-group>
<kwd>alternative splicing</kwd>
<kwd>splicing aberrations</kwd>
<kwd>differential splicing analysis</kwd>
<kwd>cancer splicing</kwd>
<kwd>The Cancer Genome Atlas Program (TCGA)</kwd>
<kwd>GTEx database</kwd>
</kwd-group>
<contract-sponsor id="cn001">Merck KGaA<named-content content-type="fundref-id">10.13039/100009945</named-content>
</contract-sponsor>
</article-meta>
</front>
<body>
<sec id="s1">
<title>Introduction</title>
<p>Splicing of pre-mRNA is a crucial process in eukaryotic gene expression regulation. In addition to canonical splicing, which leads to the inclusion of constitutive exons into the mature mRNA, the transcriptome is subjected to alternative splicing. Alternative splicing can give rise to multiple protein-coding isoforms from a single pre-mRNA and thus represents a major determinant for proteome diversity. Approximately 92%&#x2013;94% of human genes generate alternatively spliced transcripts, often with tissue-specific regulation (<xref ref-type="bibr" rid="B74">Wang et&#x20;al., 2008</xref>; <xref ref-type="bibr" rid="B2">Barbosa-Morais et&#x20;al., 2012</xref>). Alternative splicing is involved in a variety of cellular processes, such as cell proliferation, differentiation, migration and survival (<xref ref-type="bibr" rid="B45">Paronetto et&#x20;al., 2016</xref>; <xref ref-type="bibr" rid="B15">Gallego-Paez et&#x20;al., 2017</xref>). Emerging data indicate that alternative splicing plays a critical role in the pathogenesis of many diseases, including several molecular subtypes of cancer (<xref ref-type="bibr" rid="B44">Oltean and Bates, 2014</xref>; <xref ref-type="bibr" rid="B57">Scotti and Swanson, 2016</xref>; <xref ref-type="bibr" rid="B21">Jiang and Chen, 2021</xref>). Interrogating such splicing abnormalities can facilitate identification of disease drivers, drug resistance mechanisms, and molecules capable of regulating pathological splicing events. Thus, exploration of alternative and aberrant splicing phenotypes promises to shed light on novel aspects of health and disease.</p>
<p>The recent release of transcriptome-wide RNA sequencing (RNA-seq) data repositories such as The Cancer Genome Atlas (TCGA) (<xref ref-type="bibr" rid="B68">Tomczak et&#x20;al., 2015</xref>) and the Genotype-Tissue Expression (GTEx) project (<xref ref-type="bibr" rid="B34">Lonsdale et&#x20;al., 2013</xref>) has lifted alternative splicing analysis opportunities to an unprecedented level. However, a unified and accessible analysis strategy for these data has largely been missing.</p>
<p>The gradual development of RNA-seq technologies and cost-effective alternative splicing studies at the transcriptome level has allowed the parallel evolution of bioinformatic tools for splicing quantification and visualization. Most of these tools rely on one of two main computational approaches: 1) quantification of the Percent Spliced-In (PSI) metric, which uses the ratio between exon-exon junction-spanning sequencing reads that provide evidence for the inclusion or exclusion of an alternatively spliced region [e.g., rMATS (<xref ref-type="bibr" rid="B61">Shen et&#x20;al., 2014</xref>), MISO (<xref ref-type="bibr" rid="B24">Katz et&#x20;al., 2010</xref>), SUPPA (<xref ref-type="bibr" rid="B1">Alamancos et&#x20;al., 2015</xref>), SplAdder (<xref ref-type="bibr" rid="B23">Kahles et&#x20;al., 2016</xref>), psichomics (<xref ref-type="bibr" rid="B56">Saraiva-Agostinho and Barbosa-Morais, 2019</xref>), AltAnalyze (<xref ref-type="bibr" rid="B14">Emig et&#x20;al., 2010</xref>), SpliceSeq (<xref ref-type="bibr" rid="B54">Ryan et&#x20;al., 2012</xref>), VAST-TOOLS (<xref ref-type="bibr" rid="B19">Irimia et&#x20;al., 2014</xref>), MAJIQ (<xref ref-type="bibr" rid="B71">Vaquero-Garcia et&#x20;al., 2016</xref>), LeafCutter (<xref ref-type="bibr" rid="B31">Li et&#x20;al., 2018</xref>) and Whippet (<xref ref-type="bibr" rid="B65">Sterne-Weiler et&#x20;al., 2018</xref>)], and 2) quantification and deconvolution of the entire set of reads aligned to the gene to estimate transcript isoform abundance [e.g., Cufflinks (<xref ref-type="bibr" rid="B70">Trapnell et&#x20;al., 2010</xref>), RSEM (<xref ref-type="bibr" rid="B28">Li and Dewey, 2011</xref>), Sailfish (<xref ref-type="bibr" rid="B47">Patro et&#x20;al., 2014</xref>), Salmon (<xref ref-type="bibr" rid="B46">Patro et&#x20;al., 2017</xref>) and Kallisto (<xref ref-type="bibr" rid="B5">Bray et&#x20;al., 2016</xref>)] (see <xref ref-type="table" rid="T1">Table&#x20;1</xref> for a comparison of these tools). 
Although these bioinformatic tools have propelled transcriptome-wide alternative splicing analysis forward, they suffer from significant limitations. These include the need for high computational resources and bash-based operation, restrictions of input file formats, and incomplete transcriptome annotation that leads to inaccurate transcript/PSI quantification. Furthermore, many of these tools produce complex static graphical outputs that are difficult to interpret, or lack the option to associate splicing phenotypes with clinical or molecular data. These caveats are obstacles to a straightforward interpretation of the biological and physiological relevance of alternative splicing in disease. Thus, despite the large variety of available tools, there is still a high demand for easy-to-use alternative splicing analysis strategies that can incorporate comprehensive data visualization and integration with external sample traits.</p>
<table-wrap id="T1" position="float">
<label>TABLE 1</label>
<caption>
<p>Feature comparison between <italic>DJExpress</italic> and other existing splicing analysis&#x20;tools.</p>
</caption>
<table>
<thead valign="top">
<tr>
<th align="left">Tool</th>
<th align="center">GUI</th>
<th align="center">User-selected alignment method</th>
<th align="center">Non-annotated junctions supported</th>
<th align="center">Splicing pattern visualization</th>
<th align="center">Downstream trait association</th>
</tr>
</thead>
<tbody valign="top">
<tr>
<td align="left">DJExpress</td>
<td align="left">Yes</td>
<td align="center">Yes</td>
<td align="center">Yes</td>
<td align="center">Yes</td>
<td align="center">Yes</td>
</tr>
<tr>
<td align="left">MAJIQ</td>
<td align="left">Yes</td>
<td align="center">Yes</td>
<td align="center">Yes</td>
<td align="center">Yes</td>
<td align="center">No</td>
</tr>
<tr>
<td align="left">psichomics</td>
<td align="left">Yes</td>
<td align="center">Yes</td>
<td align="center">No</td>
<td align="center">Yes</td>
<td align="center">Yes</td>
</tr>
<tr>
<td align="left">AltAnalyze</td>
<td align="left">Yes</td>
<td align="center">Yes</td>
<td align="center">No</td>
<td align="center">Yes</td>
<td align="center">Yes</td>
</tr>
<tr>
<td align="left">LeafCutter</td>
<td align="left">Yes</td>
<td align="center">No</td>
<td align="center">Yes</td>
<td align="center">Yes</td>
<td align="center">Yes</td>
</tr>
<tr>
<td align="left">SplAdder</td>
<td align="left">No</td>
<td align="center">Yes</td>
<td align="center">Yes</td>
<td align="center">Yes</td>
<td align="center">No</td>
</tr>
<tr>
<td align="left">rMATS</td>
<td align="left">No</td>
<td align="center">Yes</td>
<td align="center">Yes</td>
<td align="center">No</td>
<td align="center">No</td>
</tr>
<tr>
<td align="left">SpliceSeq</td>
<td align="left">Yes</td>
<td align="center">No</td>
<td align="center">No</td>
<td align="center">Yes</td>
<td align="center">No</td>
</tr>
<tr>
<td align="left">Whippet</td>
<td align="left">No</td>
<td align="center">No</td>
<td align="center">Yes</td>
<td align="center">Yes</td>
<td align="center">No</td>
</tr>
<tr>
<td align="left">JunctionSeq</td>
<td align="left">No</td>
<td align="center">No</td>
<td align="center">Yes</td>
<td align="center">Yes</td>
<td align="center">No</td>
</tr>
<tr>
<td align="left">MISO</td>
<td align="left">No</td>
<td align="center">No</td>
<td align="center">No</td>
<td align="center">Yes</td>
<td align="center">No</td>
</tr>
<tr>
<td align="left">SUPPA</td>
<td align="left">No</td>
<td align="center">Yes</td>
<td align="center">No</td>
<td align="center">No</td>
<td align="center">No</td>
</tr>
<tr>
<td align="left">Cufflinks</td>
<td align="left">No</td>
<td align="center">No</td>
<td align="center">Yes</td>
<td align="center">No</td>
<td align="center">No</td>
</tr>
<tr>
<td align="left">Salmon</td>
<td align="left">No</td>
<td align="center">Yes</td>
<td align="center">No</td>
<td align="center">No</td>
<td align="center">No</td>
</tr>
<tr>
<td align="left">RSEM</td>
<td align="left">No</td>
<td align="center">Yes</td>
<td align="center">No</td>
<td align="center">No</td>
<td align="center">No</td>
</tr>
<tr>
<td align="left">Sailfish</td>
<td align="left">No</td>
<td align="center">No</td>
<td align="center">No</td>
<td align="center">No</td>
<td align="center">No</td>
</tr>
<tr>
<td align="left">VAST-TOOLS</td>
<td align="left">No</td>
<td align="center">No</td>
<td align="center">No</td>
<td align="center">No</td>
<td align="center">No</td>
</tr>
<tr>
<td align="left">Kallisto</td>
<td align="left">No</td>
<td align="center">No</td>
<td align="center">No</td>
<td align="center">No</td>
<td align="center">No</td>
</tr>
</tbody>
</table>
</table-wrap>
<p>Here we introduce a novel differential junction expression analysis pipeline, <italic>DJExpress</italic>, which is an R package for analysis of transcriptomic features and expression-trait associations. <italic>DJExpress</italic> runs on an average laptop computer (<xref ref-type="sec" rid="s9">Supplementary Figure S1</xref>) and provides a set of interactive and intuitive visualization formats. <italic>DJExpress</italic> uses raw splice junction counts, derived from the STAR aligner (<xref ref-type="bibr" rid="B13">Dobin et&#x20;al., 2013</xref>) or other junction quantification algorithms, as input data to determine gene-level differential junction usage. The statistical approaches implemented by <italic>DJExpress</italic> include empirical Bayesian procedures to assess differential junction expression between experimental conditions and junction-level <italic>t</italic>-statistic tests to determine differences between each junction and all other junctions within the same&#x20;gene.</p>
<p>In contrast to the majority of existing pipelines, <italic>DJExpress</italic> can handle both annotated and <italic>de novo</italic> identified splice junctions, thereby allowing the characterization of novel splice events. Moreover, through gene-level differential junction usage calculation, <italic>DJExpress</italic> identifies associations between junction expression and molecular/clinical features using large matrix operations. An additional, more advanced feature of <italic>DJExpress</italic> is weighted junction co-expression network analysis (JCNA). JCNA-derived junction expression modules can be correlated with phenotypes of interest, thereby allowing differential splicing analysis on a systemic scale. For downstream processing, JCNA outputs can be exported in a format compatible with network visualization tools such as VisANT and Cytoscape (<xref ref-type="bibr" rid="B60">Shannon et&#x20;al., 2003</xref>; <xref ref-type="bibr" rid="B18">Hu et&#x20;al., 2004</xref>).</p>
<p>In addition to these locally accessible features, <italic>DJExpress</italic> offers a web-compatible graphical interface for the analysis of user-provided data as well as the visualization of DJEC DB, a custom database of cancer-specific splicing profiles and their association to external traits from tumor samples and cancer cell lines. DJEC DB includes not only TCGA and GTEx data, but also cancer cell line data from the Cancer Dependency Map (DepMap<xref ref-type="fn" rid="fn1">
<sup>1</sup>
</xref>) project. The integration of DepMap data allows association of junction expression with functional genomics features such as gene dependencies and drug response profiles. This facilitates identification of cancer cell models for specific splicing alterations that can then be used for functional characterization in the&#x20;lab.</p>
<p>Taken together, <italic>DJExpress</italic> represents a novel and versatile tool to analyze and explore alternative splicing phenotypes in health and disease.</p>
</sec>
<sec sec-type="methods" id="s2">
<title>Methods</title>
<sec id="s2-1">
<title>Differential Junction Expression Module</title>
<p>The data analysis workflow in the DJE module is depicted in <xref ref-type="fig" rid="F1">Figure&#x20;1</xref>. For differential junction expression (DJE) and junction co-expression network analysis (JCNA), <italic>DJExpress</italic> uses quantified raw reads aligned to exon-exon junction loci and the transcriptome annotation as the primary input. Mapped and quantified junction reads are typically generated from FASTQ or BAM files using common RNA-seq alignment/quantification tools [e.g., STAR (<xref ref-type="bibr" rid="B13">Dobin et&#x20;al., 2013</xref>), TopHat (<xref ref-type="bibr" rid="B69">Trapnell et&#x20;al., 2009</xref>), MapSplice (<xref ref-type="bibr" rid="B75">Wang et&#x20;al., 2010</xref>), Rsubread (<xref ref-type="bibr" rid="B32">Liao et&#x20;al., 2019</xref>)] (<xref ref-type="fig" rid="F2">Figure&#x20;2A</xref>). Following the statistical principles of the <italic>limma</italic> Bioconductor package (<xref ref-type="bibr" rid="B26">Law et&#x20;al., 2014</xref>; <xref ref-type="bibr" rid="B53">Ritchie et&#x20;al., 2015</xref>), <italic>DJExpress</italic> first tests for differential expression of genomic features (here splice junction regions) using an initial input matrix of read counts with junctions as rows and sample IDs as columns. Count data are then transformed to log2-counts per million (logCPM), and observation-level weights based on the mean-variance relationship are computed (using the <italic>voom</italic> function from <italic>limma</italic>). Users can decide at this point whether to keep the default expression threshold for filtering junctions prior to hypothesis testing (a minimum mean read count of 10 per junction) or to adjust the threshold based on the mean-variance trend. A linear model is then fitted per junction using the provided experimental design, and empirical Bayes moderated <italic>t</italic>-statistics are used to assess the significance level of the observed expression changes.</p>
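<p>The filtering and transformation steps described above can be sketched as follows. This is an illustrative Python sketch rather than the <italic>DJExpress</italic> R code: the function name and the dictionary-based input layout are hypothetical, library sizes are assumed to be recomputed from the retained junctions, and the observation-level weights computed by <italic>voom</italic> are omitted. The logCPM formula with offsets of 0.5 and 1 follows the <italic>voom</italic> transformation.</p>
<preformat><![CDATA[
```python
import math


def filter_and_log_cpm(counts, min_mean_count=10.0):
    """Filter lowly expressed junctions, then compute logCPM values.

    counts: dict mapping a junction id to a list of raw read counts,
            one value per sample (hypothetical input layout).
    min_mean_count: junctions whose mean raw count is below this value
            are dropped (10 is the default threshold stated in the text).
    """
    n_samples = len(next(iter(counts.values())))

    # Keep junctions whose mean read count reaches the threshold.
    kept = {
        junction: row
        for junction, row in counts.items()
        if sum(row) / n_samples >= min_mean_count
    }

    # Assumption: library size = total count of retained junctions per sample.
    lib_sizes = [sum(row[s] for row in kept.values()) for s in range(n_samples)]

    # voom-style logCPM: log2((count + 0.5) / (library size + 1) * 1e6).
    return {
        junction: [
            math.log2((count + 0.5) / (lib_sizes[s] + 1.0) * 1e6)
            for s, count in enumerate(row)
        ]
        for junction, row in kept.items()
    }
```
]]></preformat>
<p>In this sketch a junction with a mean count below the threshold is removed before library sizes are computed, so that it does not contribute to the normalization of the remaining junctions.</p>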
<fig id="F1" position="float">
<label>FIGURE 1</label>
<caption>
<p>General workflow of the DJE analysis module in <italic>DJExpress</italic>. Junction quantification files (e.g., SJ.out.tab files from the STAR aligner) and transcriptome annotation files (GTF file format) are provided by the user as input. Junctions are then annotated with their corresponding genes and filtered based on user-defined expression cutoffs. Differential junction expression is then calculated between experimental conditions. Significant differences in junction usage can be interactively visualized using the gene-wise PlotSplice graph. When external trait data are provided, the DJE module can identify significant junction-trait associations that can be further visualized using SpliceRadar plots.</p>
</caption>
<graphic xlink:href="fbinf-02-786898-g001.tif"/>
</fig>
<fig id="F2" position="float">
<label>FIGURE 2</label>
<caption>
<p>Calculation of differential junction expression using the DJE module. <bold>(A)</bold> After alignment and quantification of RNA-seq reads supporting exon&#x2013;exon junctions, differential junction expression is analyzed and depicted using the gene-wise splice plot visualization method. The schematic shows 8 junctions (J1-J8) in a hypothetical gene, where each junction is plotted along the <italic>x</italic>-axis and ordered by genomic coordinate position. Relative log-fold change values (logFC), which indicate the difference between the expression of the target junction and the average junction expression in the gene, are shown on the <italic>y</italic>-axis. Junctions with logFC values above a user-defined threshold (absolute logFC of 1.0 in the example) are considered differentially used and are colored blue or red in case of downregulation or upregulation, respectively. <bold>(B)</bold> <italic>DJExpress</italic> determines alternatively spliced transcript regions based on both alterations in their expression levels compared to the average expression of other junctions in the same gene (differential usage, based on relative logFC) and alterations in junction abundance between experimental conditions (differential expression, based on absolute logFC). Junctions are then classified into four main groups. Group 0 corresponds to junctions without differential expression or differential usage and is visually represented as grey points in the scatter plot. Group 1 (red box and red/blue points in the scatter plot) comprises junctions with similar values of absolute and relative logFCs, which reflects changes in splicing patterns between experimental conditions without confounding alterations in the total expression of the gene. 
Group 2 (green box and green points in the scatter plot) represents junctions with differential expression but no differential usage or vice versa, which indicates the presence of altered total gene expression levels between conditions that explain the observed differences. Group 3 (orange box and orange points in the scatter plot) designates junctions with significant but dissimilar levels of relative and absolute logFCs, indicating the presence of both total gene expression and local splicing changes. Relative vs absolute logFC plots are produced within the output of the DJE module, where junctions are classified into specific groups according to the significance of their logFC values and their position inside or outside of the distribution by &#x2265;2 standard deviations. Arrows indicate example target junctions. <bold>(C)</bold> When external sample trait data (e.g., clinical or molecular data) are provided by the user, <italic>DJExpress</italic> can identify significant junction-trait associations within a target experimental condition using either correlation analysis, ANOVA test or linear regression models. If correlation is selected by the user (as in the depicted example), the results are used to construct heatmap or SpliceRadar plots with target splice junctions [e.g., inclusion junctions (red) and exclusion junction (blue) in an exon skipping event]. In the case of SpliceRadars, positive correlation coefficients are located within the outer region (green) and negative correlation coefficients are found within the inner region (grey) of the radar chart, allowing the visual inspection of multivariate trait associations to user-selected alternative splicing events.</p>
</caption>
<graphic xlink:href="fbinf-02-786898-g002.tif"/>
</fig>
<p>The linear model framework of <italic>limma</italic> is also used in parallel to calculate differential junction usage, where significant differences in log-fold changes in the fitted model between junctions from the same gene are tested (using the <italic>diffSplice</italic> function from <italic>limma</italic>). <italic>DJExpress</italic> thereby identifies alternatively spliced regions in transcripts based on two main features of splice junction expression: 1) quantitative changes in the abundance of individual junctions between experimental groups, and 2) differences in their expression levels compared to the average expression of other junctions in the&#x20;gene.</p>
<p>Following these criteria, splice junctions are classified based on their absolute log-fold change (e.g., experimental condition A vs B) and their relative log-fold change (target junction vs all other junctions in the gene) into one of the following expression groups (<xref ref-type="fig" rid="F2">Figure&#x20;2B</xref>):<list list-type="simple">
<list-item>
<p>Group 0: Junctions without differential expression or differential&#x20;usage.</p>
</list-item>
<list-item>
<p>Group 1: Junctions with equal levels of differential expression and differential usage, reflecting changes in splicing patterns between experimental conditions (in this case, both absolute and relative log-fold change values are similar, if not the same).</p>
</list-item>
<list-item>
<p>Group 2: Junctions with differential expression but no differential usage or vice versa, implying the occurrence of generalized changes in expression across the gene, rather than the presence of a differentially spliced region (in this case, either the absolute or relative log-fold change value is not significant).</p>
</list-item>
<list-item>
<p>Group 3: Junctions with divergent levels of differential expression and differential usage, indicating concomitant changes in splicing and total gene expression (in this case, the absolute and relative log-fold change values can substantially vary from each other).</p>
</list-item>
</list>
</p>
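<p>The four-group scheme above can be written as a small decision rule. The following Python sketch is an illustration under stated assumptions and not the <italic>DJExpress</italic> implementation: the function name is hypothetical, significance is passed in as precomputed flags, and a fixed tolerance (<italic>similarity_tol</italic>, an assumed parameter) stands in for the criterion used to decide whether absolute and relative log-fold changes are similar.</p>
<preformat><![CDATA[
```python
def classify_junction(abs_logfc, rel_logfc, abs_significant, rel_significant,
                      similarity_tol=0.5):
    """Assign a junction to expression group 0, 1, 2 or 3.

    abs_logfc: log-fold change between conditions (differential expression).
    rel_logfc: log-fold change relative to the other junctions in the gene
               (differential usage).
    abs_significant, rel_significant: booleans from the respective tests.
    similarity_tol: hypothetical tolerance for calling the two logFCs similar.
    """
    if not abs_significant and not rel_significant:
        return 0  # neither differential expression nor differential usage
    if abs_significant != rel_significant:
        return 2  # only one of the two is significant: gene-level change
    if abs(abs_logfc - rel_logfc) <= similarity_tol:
        return 1  # both significant and similar: splicing change
    return 3      # both significant but divergent: splicing plus expression
```
]]></preformat>
<p>For example, a junction with significant absolute and relative log-fold changes of 2.0 and 1.9 would fall into Group 1 under this rule, whereas values of 3.0 and 1.2 would place it in Group 3.</p>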
<p>One of the main features of the DJE module&#x2019;s approach is the incorporation of an interactive gene-wise junction representation (<xref ref-type="fig" rid="F2">Figure&#x20;2A</xref>). This approach facilitates straightforward visual inspection of differential splicing across the gene and exploration of supplementary information about each junction&#x2019;s expression. This includes the above-mentioned classification based on absolute and relative log-fold change patterns, basic statistics on expression levels (e.g., mean and median expression in each experimental condition, number of samples expressing the junction, etc.) as well as the identification of non-annotated and condition-specific junctions. The latter are also called &#x201c;neojunctions&#x201d; in the <italic>DJExpress</italic> pipeline, referring to junctions that are detected in the tested condition but not found in the control condition.</p>
</sec>
<sec id="s2-2">
<title>Junction-Trait Association Module</title>
<p>Further exploration of the potential physiological relevance of alternative splicing is possible through the association of junction expression with external sample traits (e.g., clinical or molecular data). Significant junction-trait linkages are determined by large matrix operations including correlation analysis, ANOVA test or linear regression models [using <italic>cor</italic> and <italic>bicor</italic> from <italic>WGCNA</italic> (<xref ref-type="bibr" rid="B25">Langfelder and Horvath, 2008</xref>) and <italic>Matrix_eQTL_engine</italic> from <italic>MatrixEQTL</italic> (<xref ref-type="bibr" rid="B59">Shabalin, 2012</xref>)]. The top significant associations can be visualized through heatmap plots or, alternatively, using the SpliceRadar plot format (<xref ref-type="fig" rid="F2">Figure&#x20;2C</xref>), where the coefficient of top-ranked correlations is used to map each junction-trait association within a radar chart. This graphical concept allows users to simultaneously visualize relevant associations between the expression of selected junctions (e.g., the most differentially expressed junctions or a subset of junctions within a target gene) and external traits, as well as to elucidate expression-trait patterns shared among junctions of interest with potential biological relevance.</p>
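<p>When correlation is the chosen association method, the ranking of junction-trait pairs can be illustrated with the following Python sketch. It is a simplified stand-in for the matrix-based <italic>WGCNA</italic> and <italic>MatrixEQTL</italic> routines named above: the function names and input layout are hypothetical, only Pearson correlation is shown, and no p-values or multiple-testing correction are computed.</p>
<preformat><![CDATA[
```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences.

    Assumes both sequences have non-zero variance.
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def top_trait_associations(junction_expr, traits, top_n=3):
    """Rank junction-trait pairs by absolute correlation coefficient.

    junction_expr: dict junction id -> expression values across samples.
    traits: dict trait name -> trait values for the same samples.
    Returns a list of (junction, trait, coefficient) tuples.
    """
    results = [
        (junction, trait, pearson(expr, values))
        for junction, expr in junction_expr.items()
        for trait, values in traits.items()
    ]
    results.sort(key=lambda item: abs(item[2]), reverse=True)
    return results[:top_n]
```
]]></preformat>
<p>The signed coefficients returned by such a ranking are the kind of values that a SpliceRadar-style chart would place in the outer (positive) or inner (negative) region.</p>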
</sec>
<sec id="s2-3">
<title>Junction Co-Expression Network Analysis Module</title>
<p>A widely used approach for describing correlation networks in systems biology is weighted gene co-expression network analysis (WGCNA, <xref ref-type="bibr" rid="B25">Langfelder and Horvath, 2008</xref>). WGCNA is a screening method based on pairwise correlations between features in gene expression data. This approach allows the identification of clusters (or modules) of highly correlated genes, intramodular hub genes and representative module eigengenes (MEs). These can be used in the estimation of module membership values for each gene as well as in association analyses between modules and external sample traits. This technique has been frequently implemented for the assessment of gene-network signatures and for the identification of functional pathways and candidate molecular biomarkers, integrating gene expression and clinical/molecular data from physiological and disease conditions (<xref ref-type="bibr" rid="B43">Oldham et&#x20;al., 2008</xref>; <xref ref-type="bibr" rid="B48">Presson et&#x20;al., 2008</xref>; <xref ref-type="bibr" rid="B36">Ma et&#x20;al., 2017</xref>; <xref ref-type="bibr" rid="B73">Vieira et&#x20;al., 2019</xref>).</p>
<p>The weighted junction co-expression network analysis module (JCNA) in <italic>DJExpress</italic> provides an implementation of <italic>WGCNA</italic> algorithms (version 1.70.3, <xref ref-type="bibr" rid="B25">Langfelder and Horvath, 2008</xref>) in the context of splice junction expression when a sufficient sample size is provided (&#x2265;15 samples within single experimental conditions as suggested in the <italic>WGCNA</italic> guidelines) (<xref ref-type="fig" rid="F3">Figure&#x20;3A</xref>). JCNA begins with a data pre-processing step where outlier samples (clustered using the average linkage method) and lowly expressed junctions are removed to ensure high-confidence network construction. Correlation matrices (e.g., using Pearson, Spearman or the default biweight midcorrelation) (<xref ref-type="bibr" rid="B85">Wilcox, 2012</xref>) are then built for all junction pairs. The full network is subsequently specified by a weighted adjacency matrix calculated with an appropriate soft-thresholding power (<xref ref-type="bibr" rid="B80">Zhang and Horvath, 2005</xref>). Summary plots of a network topology analysis are produced by JCNA (following <italic>WGCNA</italic> guidelines) to aid users in the selection of the soft-thresholding power around which scale-free topology in the junction network is achieved.</p>
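<p>The soft-thresholding step can be illustrated with a short Python sketch. It assumes an unsigned network, in which the adjacency between two junctions is the absolute correlation raised to the soft-thresholding power; the function names are hypothetical, and the power of 6 is only an illustrative default, since the appropriate value should be chosen from the network topology plots.</p>
<preformat><![CDATA[
```python
def soft_threshold_adjacency(cor_matrix, power=6):
    """Unsigned weighted adjacency: a_ij = |cor_ij| ** power.

    cor_matrix: square list of lists of pairwise junction correlations.
    power: soft-thresholding power (illustrative default, an assumption).
    """
    return [[abs(r) ** power for r in row] for row in cor_matrix]


def connectivity(adjacency):
    """Per-junction connectivity: sum of adjacencies to all other junctions."""
    return [
        sum(value for j, value in enumerate(row) if j != i)
        for i, row in enumerate(adjacency)
    ]
```
]]></preformat>
<p>Raising correlations to a power above 1 shrinks weak correlations toward zero much more strongly than strong ones, which is why the choice of power shapes the resulting network topology.</p>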
<fig id="F3" position="float">
<label>FIGURE 3</label>
<caption>
<p>General workflow of the JCNA module in <italic>DJExpress</italic>. <bold>(A)</bold> For the <italic>DJExpress</italic> JCNA module, the user needs to provide junction read counts (or the output of the <italic>DJEanalyze</italic> function) and a transcriptome annotation file. After removing outlier samples and lowly expressed junctions, a first round of co-expression analysis is performed where junction modules and module/junction vs trait associations are calculated. The user can continue into a second round of network construction, where co-expression analysis and trait association are performed using gene expression data. This information is used to identify and remove junction-trait correlations from the network that reflect gene expression-based associations. The remaining junction set is used to re-construct junction co-expression modules and module-trait correlations. <bold>(B)</bold> Dendrogram schematic of clustered junctions with assigned modules based on a dissimilarity measure (1-TOM) as described for WGCNA (<xref ref-type="bibr" rid="B25">Langfelder and Horvath, 2008</xref>). <bold>(C)</bold> Heatmap schematic of correlations between junction module eigengenes (MEs) and different sample traits. <bold>(D)</bold> Schematic representation of interaction networks of junctions within a co-expression module that can be produced using Cytoscape or VisANT visualization tools. Junctions belonging to the same gene are indicated by the same&#x20;color.</p>
</caption>
<graphic xlink:href="fbinf-02-786898-g003.tif"/>
</fig>
<p>Additional parameters such as minimum module size, module detection sensitivity or cut height of the hierarchical clustering dendrogram for module definition can be introduced for junction module identification (<xref ref-type="fig" rid="F3">Figure&#x20;3B</xref>). Calculation of MEs is also possible, where expression patterns of all junctions in a module are summarized into a single expression profile. This measure is then used in the correlation analysis with sample traits. Notably, ME calculation reduces the computational burden of multiple testing, which otherwise can be exceedingly high since junction quantification datasets usually comprise millions of expression features.</p>
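<p>The idea of summarizing a module by its eigengene can be sketched as follows. The sketch assumes that the ME is the first principal component, across samples, of the standardized junction profiles, and obtains it by power iteration on the sample-by-sample Gram matrix. The function names and example values are hypothetical, and the code illustrates the concept rather than reproducing the <italic>WGCNA</italic> or <italic>DJExpress</italic> implementation.</p>
<preformat>
```python
import math


def standardize(values):
    """Center a profile on its mean and scale it to unit sample variance."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    return [(v - mean) / sd for v in values]


def module_eigengene(profiles, iterations=200):
    """First principal component (across samples) of standardized profiles."""
    z = [standardize(p) for p in profiles]
    n = len(z[0])
    # Sample-by-sample Gram matrix; its leading eigenvector is the eigengene.
    gram = [[sum(row[a] * row[b] for row in z) for b in range(n)]
            for a in range(n)]
    vec = [float(i + 1) for i in range(n)]  # arbitrary non-constant start
    for _ in range(iterations):
        nxt = [sum(gram[a][b] * vec[b] for b in range(n)) for a in range(n)]
        norm = math.sqrt(sum(v * v for v in nxt))
        vec = [v / norm for v in nxt]
    # Orient the eigengene along the average standardized profile.
    average = [sum(row[a] for row in z) / len(z) for a in range(n)]
    if 0 > sum(v * m for v, m in zip(vec, average)):
        vec = [-v for v in vec]
    return vec


# Hypothetical module of three co-expressed junctions across five samples.
module = [
    [1.0, 2.0, 3.0, 4.0, 5.0],
    [2.0, 4.0, 5.0, 8.0, 11.0],
    [0.5, 1.5, 2.0, 3.5, 4.0],
]
eigengene = module_eigengene(module)
print([round(v, 3) for v in eigengene])
```
</preformat>
<p>A single eigengene per module then replaces all member junctions in the trait correlation, so that one test is performed per module instead of one per junction.</p>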
<p>Users can either keep the output of a 1-pass JCNA or can continue into a second round of network construction. During this 2-pass JCNA, the gene expression-specific effect within junction modules is subtracted. This is particularly relevant in the context of junction-trait associations, since a considerable number of co-expressing junctions are expected to cluster into single modules as a result of intrinsic associations at the gene expression level. Here, 2-pass JCNA improves the identification of true co-splicing signatures, since junctions from the same gene or from highly correlated genes tend to cluster without any specific association to splicing.</p>
<p>For 2-pass JCNA, gene expression-based networks including correlations with a user-selected sample trait are calculated (<xref ref-type="fig" rid="F3">Figure&#x20;3C</xref>). The absolute value of junction significance, which represents the correlation coefficient between a given junction and the selected trait, is plotted as a function of the corresponding gene significance. Junctions lying &#x2265;2 standard deviations outside of the distribution (i.e., showing no correlation between junction and gene significance for the trait) are kept for network re-construction. Thus, the 2-pass JCNA strategy allows the user to further explore associations between molecular/clinical traits and modules of co-expressed splicing events that can be defined once gene expression-related junction co-expression is identified and removed from the network.</p>
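<p>One possible reading of this filtering step is sketched below. It assumes that the distance of a junction from the distribution is measured as the difference between the absolute junction significance and the absolute gene significance, and that junctions deviating from the mean difference by at least two standard deviations are retained. The exact criterion used by <italic>DJExpress</italic> may differ; the function name, junction labels and values are hypothetical.</p>
<preformat>
```python
import statistics


def junctions_to_keep(junction_sig, gene_sig, n_sd=2.0):
    """Return junctions whose trait correlation departs from that of their gene.

    junction_sig and gene_sig map junction labels to trait correlations of
    the junction and of its host gene, respectively.
    """
    diffs = {j: abs(junction_sig[j]) - abs(gene_sig[j]) for j in junction_sig}
    values = list(diffs.values())
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return sorted(j for j, d in diffs.items() if abs(d - mean) >= n_sd * sd)


# Hypothetical significances: only jxn10 is decoupled from its host gene.
junction_sig = {
    "jxn01": 0.50, "jxn02": 0.42, "jxn03": -0.28, "jxn04": 0.61,
    "jxn05": 0.19, "jxn06": -0.48, "jxn07": 0.32, "jxn08": 0.55,
    "jxn09": 0.25, "jxn10": 0.80,
}
gene_sig = {
    "jxn01": 0.50, "jxn02": 0.40, "jxn03": -0.30, "jxn04": 0.60,
    "jxn05": 0.20, "jxn06": -0.45, "jxn07": 0.35, "jxn08": 0.55,
    "jxn09": 0.25, "jxn10": 0.05,
}
kept = junctions_to_keep(junction_sig, gene_sig)
print(kept)
```
</preformat>
<p>In this toy data set, nine junctions track the trait correlation of their host gene and are discarded, while the one junction whose trait association is not explained by gene expression is kept for network re-construction.</p>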
<p>Furthermore, as with the <italic>WGCNA</italic> pipeline, the resulting junction modules from JCNA can also be exported to network visualization tools such as Cytoscape or VisANT for further visual exploration and customization (<xref ref-type="fig" rid="F3">Figure&#x20;3D</xref>).</p>
</sec>
<sec id="s2-4">
<title>Run Time and Memory Benchmarks</title>
<p>For run time and memory consumption benchmarks of the functions within the DJE module (<italic>DJEimport</italic>, <italic>DJEannotate</italic>, <italic>DJEprepare</italic> and <italic>DJEanalyze</italic>), we used STAR-derived junction quantification files from the TCGA COADREAD tumor sample cohort. The <italic>DJExpress</italic> pipeline was applied 10&#x20;times on two cores of a macOS 11.6.1 system with a 2.3&#xa0;GHz Quad-Core Intel Core i5 processor and 16&#xa0;GB of memory, using RStudio Desktop 1.4.1106 and R 4.0.5. Each run was performed on datasets with an increasing number of samples (e.g., 10, 20, 40, 60, 80, 100, 200, 400, 600, 800, 1000) and 100,000 randomly retrieved splice junctions. For the differential junction expression analysis using <italic>DJEanalyze</italic>, samples were randomly divided into two groups using Bernoulli distributed values with a 50% probability of success (<xref ref-type="sec" rid="s9">Supplementary Figure&#x20;S1</xref>).</p>
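<p>The random group assignment used for the benchmark can be sketched as follows. Each sample is assigned independently by a Bernoulli draw with a success probability of 0.5. The function name, group labels, sample IDs and seed are hypothetical and only serve to make the example reproducible.</p>
<preformat>
```python
import random


def bernoulli_groups(sample_ids, p=0.5, seed=1):
    """Assign each sample to one of two groups by an independent Bernoulli draw."""
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    return {
        sample: ("group1" if p > rng.random() else "group2")
        for sample in sample_ids
    }


# Hypothetical sample identifiers.
samples = ["sample_%03d" % i for i in range(1, 21)]
groups = bernoulli_groups(samples)
sizes = {g: list(groups.values()).count(g) for g in ("group1", "group2")}
print(sizes)
```
</preformat>
<p>Because each sample is drawn independently, the two groups are not guaranteed to be of equal size, which is worth keeping in mind for the smallest benchmark data sets.</p>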
</sec>
<sec id="s2-5">
<title>Data Collection for Differential Junction Expression in Cancer Database</title>
<p>Using the pipelines described for the DJE and JCNA modules, we generated DJEC DB, a custom database of cancer-specific splicing profiles and their association to external traits from tumor samples and cancer cell lines (<xref ref-type="fig" rid="F4">Figure&#x20;4</xref>). DJEC DB can be accessed through a graphical interface based on the <italic>shiny</italic> package (version 1.6.0) and includes healthy and tumor tissue data for 9,842 human samples across 32 different tumor types from TCGA, 3,235 normal post-mortem tissue samples from GTEx and 1,019 cancer cell lines from the DepMap Project.</p>
<fig id="F4" position="float">
<label>FIGURE 4</label>
<caption>
<p>Schematic representation of DJEC DB data generation. DJEC DB takes STAR-based junction quantification across cancer tissue types and normal tissue extracted from The Cancer Genome Atlas (TCGA) and the Genotype-Tissue Expression (GTEx) database, respectively. Significant differences in junction usage between tumor and normal tissues were identified following the DJE module pipeline. Cancer type-specific DJE with supplementary information (e.g., statistics summary, absolute vs relative logFC group, etc.) as well as gene-wise splice graphs and domain-annotated gene models with the position of user-selected junctions can also be visualized. Differentially expressed junctions in COADREAD were used as example data for junction co-expression network analysis (JCNA). Associations between DJE and TCGA-associated trait data including microsatellite instability (MSI), mutations (MUT), genomic alterations (GA) and pathway alterations (PA) can be explored within the &#x201c;JT association&#x201d; section.
Junction quantification data from cell lines within the DepMap repository was also introduced in the &#x201c;CCLE junctions&#x201d; section, allowing the user to identify cancer cell models carrying specific splicing alterations and splicing-trait associations for functional characterization in the lab (TCGA tumor type abbreviation codes are as follows: ACC, adrenocortical carcinoma; BLCA, bladder urothelial carcinoma; BRCA, breast invasive carcinoma; CESC, cervical squamous cell carcinoma and endocervical adenocarcinoma; CHOL, cholangiocarcinoma; COAD, colon adenocarcinoma; DLBC, diffuse large B-cell lymphoma; ESCA, esophageal carcinoma; GBM, glioblastoma multiforme; HNSC, head and neck squamous cell carcinoma; KICH, chromophobe renal cell carcinoma; KIRC, clear cell renal cell carcinoma; KIRP, papillary renal cell carcinoma; LAML, acute myeloid leukemia; LGG, lower-grade glioma; LIHC, hepatocellular carcinoma; LUAD, lung adenocarcinoma; LUSC, lung squamous cell carcinoma; MESO, mesothelioma; OV, ovarian serous adenocarcinoma; PAAD, pancreatic adenocarcinoma; PCPG, phaeochromocytoma and paraganglioma; PRAD, prostate adenocarcinoma; READ, rectal adenocarcinoma; SARC, adult soft tissue sarcoma; SKCM, cutaneous melanoma; STAD, stomach adenocarcinoma; TGCT, testicular germ cell tumor; THCA, thyroid carcinoma; THYM, thymoma; UCEC, uterine corpus endometrial carcinoma; UCS, uterine carcinosarcoma; UVM, uveal melanoma).</p>
</caption>
<graphic xlink:href="fbinf-02-786898-g004.tif"/>
</fig>
<p>Alignment of GTEx and TCGA RNA-seq data sets to the GRCh37 reference genome, subsequent splice junction quantification and removal of low-quality tissue samples were previously performed (<xref ref-type="bibr" rid="B22">Kahles et&#x20;al., 2018</xref>). The STAR aligner was run with the following arguments:</p>
<p>STAR --genomeDir GENOME --readFilesIn READ1 READ2 --runThreadN 4 --outFilterMultimapScoreRange 1 --outFilterMultimapNmax 20 --outFilterMismatchNmax 10 --alignIntronMax 500000 --alignMatesGapMax 1000000&#x20;--sjdbScore 2 --alignSJDBoverhangMin 1 --genomeLoad NoSharedMemory --limitBAMsortRAM 70000000000 --readFilesCommand cat --outFilterMatchNminOverLread 0.33 --outFilterScoreMinOverLread 0.33 --sjdbOverhang 100 --outSAMstrandField intronMotif --outSAMattributes NH HI NM MD AS XS --sjdbGTFfile GENCODE_ANNOTATION --limitSjdbInsertNsj 2000000 --outSAMunmapped None --outSAMtype BAM SortedByCoordinate --outSAMheaderHD @HD VN:1.4 --outSAMattrRGline ID::&#x3c;ID&#x3e; --twopassMode Basic --outSAMmultNmax 1</p>
<p>We used the raw junction counts from this study as the basis for DJEC DB. For this, differential junction expression analysis was performed by comparing junction abundance between each TCGA cancer type and all GTEx normal tissues. Cancer-specific changes in junction expression can be accessed through the DJE Module section in the DJEC DB web application (<xref ref-type="sec" rid="s9">Supplementary Figure S2</xref>). Here, users can select target junctions to visually explore interactive splice plots and differentially expressed junctions in the context of protein domains and post-translational modifications annotated within the Prot2HG database of protein domains mapped to the human genome (<xref ref-type="bibr" rid="B64">Stanek et&#x20;al., 2020</xref>).</p>
<p>In addition to RNA-seq data, the TCGA repository contains an extensive molecular and clinical annotation for tumor samples, including additional omics data (genotyping, DNA methylation, etc.) as well as multiple tumor classifications and clinical records of the patient. This data collection allows comprehensive correlation analyses between junction expression and tumor/patient traits. The junction-trait (JT) module section of DJEC DB (<xref ref-type="sec" rid="s9">Supplementary Figure S3</xref>) contains significant linkages found between differentially expressed junctions and microsatellite instability (MSI) or altered oncogenic signaling pathways based on mutations, copy-number changes (CNV), mRNA expression, gene fusions and DNA methylation (<xref ref-type="bibr" rid="B55">Sanchez-Vega et&#x20;al., 2018</xref>). This approach is an adaptation of the Matrix eQTL method (<xref ref-type="bibr" rid="B59">Shabalin, 2012</xref>), which uses large matrix operations of linear and ANOVA models containing covariates to account for external factors such as tumor grade or age of the patient.</p>
<p>Moreover, an exemplary co-expression network analysis can also be found within the JCNA section, where users can interactively explore junction expression modules as well as the results of junction-trait associations in TCGA colorectal (COADREAD) tumors (<xref ref-type="sec" rid="s9">Supplementary Figure S4</xref>). This implementation of WGCNA algorithms included the removal of junctions with excessive missing values and of sample outliers&#x20;after sample hierarchical clustering using the <italic>goodSamplesGenes</italic> function (<xref ref-type="bibr" rid="B25">Langfelder and Horvath, 2008</xref>). The subsequent soft-thresholding procedure ensures a scale-free network, which emphasizes strong correlations between junctions and penalizes weak correlations. The scale-free network was constructed using the <italic>blockwiseModules</italic> function, which converts the correlation matrix into a strengthened adjacency matrix that summarizes the association between all junctions.</p>
<p>Gene-trait correlation matrices were also calculated and used to identify and remove junctions whose correlation to external traits was gene expression-dependent. Junction co-expression modules were identified by dividing the junction expression dendrogram into branches using a dynamic tree cutting algorithm with medium sensitivity for cluster splitting (deepSplit &#x3d; 2). Different colors were then assigned to the modules for subsequent visualization. ME significance values and correlations between MEs and clinical traits were also calculated. The same was done for individual junction-to-trait correlations.</p>
<p>To implement cancer cell line junction expression data into DJEC DB, we downloaded fastq files from CCLE (available through the Sequence Read Archive (SRA) under accession number PRJNA523380) and carried out alignment and junction quantification with the same strategy that was previously used for TCGA and GTEx data (<xref ref-type="bibr" rid="B22">Kahles et&#x20;al., 2018</xref>). This data was then integrated with DepMap functional genomics data in the CCLE DJE and CCLE SpliceRadar sections of DJEC DB (<xref ref-type="sec" rid="s9">Supplementary Figure S5</xref>). CCLE DJE comprises the results of DJE analysis in cancer cell lines within the same tissue of origin versus fibroblasts used as &#x201c;healthy&#x201d; control cell lines. Significant correlations between differentially expressed junctions and gene expression, CRISPR gene effect or drug response values (<xref ref-type="bibr" rid="B12">DepMap 21Q3 Public, 2021</xref>) are found within CCLE SpliceRadar. Here, users can plot SpliceRadar charts with selected junction-trait associations. These database components aim to facilitate the identification of cancer cell models for specific splicing alterations and junction-trait associations that can be further studied for functional characterization in the&#x20;lab.</p>
</sec>
</sec>
<sec sec-type="results" id="s3">
<title>Results</title>
<p>The <italic>DJExpress</italic> toolbox incorporates both an R package (containing the DJE and JCNA modules) and a user-friendly Shiny-based web application for visual exploration of DJEC DB as well as custom DJE analysis of user-provided junction quantification data. Input files can either be STAR aligner-derived &#x201c;SJ.out.tab&#x201d; files (containing splice junction counts per sample in tab-delimited format) or any other junction quantification files, as long as they contain junction IDs in the first column following the format chr:start:end:strand (e.g., chr1:123:456:1, where the positive and negative strands are coded as 1 and 2, respectively). In the following paragraphs, we describe the use of <italic>DJExpress</italic> and DJEC DB in detail and use case studies to demonstrate how <italic>DJExpress</italic> and DJEC DB can be utilized to identify and computationally explore alternative splice events across cell lines and patient samples.</p>
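<p>The junction ID format described above can be parsed with a few lines of code. The following sketch splits an ID of the form chr:start:end:strand and translates the strand codes 1 and 2 into plus and minus. The type and function names are hypothetical, and the rejection of any other strand code is an assumption made for this illustration rather than documented <italic>DJExpress</italic> behavior.</p>
<preformat>
```python
from typing import NamedTuple


class Junction(NamedTuple):
    chrom: str
    start: int
    end: int
    strand: str


# Strand coding as described in the text: 1 = positive, 2 = negative.
STRAND_CODES = {"1": "+", "2": "-"}


def parse_junction_id(junction_id):
    """Parse a junction ID of the form chr:start:end:strand."""
    fields = junction_id.split(":")
    if len(fields) != 4:
        raise ValueError("expected chr:start:end:strand, got %r" % junction_id)
    chrom, start, end, strand_code = fields
    if strand_code not in STRAND_CODES:
        raise ValueError("unsupported strand code %r" % strand_code)
    return Junction(chrom, int(start), int(end), STRAND_CODES[strand_code])


print(parse_junction_id("chr1:123:456:1"))
```
</preformat>
<p>Validating IDs in this way before an analysis makes it easier to spot quantification files whose first column does not follow the expected format.</p>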
<sec id="s3-1">
<title>Differential Junction Expression and Junction-Trait Association Analyses in Cancer Cell Lines</title>
<p>To demonstrate the workflow of <italic>DJExpress</italic>, we analyzed cancer cell lines from the DepMap repository, comprising 13 tissue types that contain &#x2265;30 individual cell lines per tissue (brain, breast, colon/colorectal, gastric, head and neck, kidney, leukemia, lung, lymphoma, myeloma, ovarian, pancreatic and skin cancer). <xref ref-type="table" rid="T2">Table&#x20;2</xref> summarizes the results of DJE analysis module per tissue, using junction expression in fibroblasts as normal control condition. Users can explore this data in the DJE-CCLE section of DJEC&#x20;DB.</p>
<table-wrap id="T2" position="float">
<label>TABLE 2</label>
<caption>
<p>Summary of DJE module junction statistics in CCLE.</p>
</caption>
<table>
<thead valign="top">
<tr>
<th align="left">CCLE tissue</th>
<th align="center">Quantified junctions</th>
<th align="center">DE junctions</th>
<th align="center">DE junctions in Group 1</th>
<th align="center">DE junctions in Group 2</th>
<th align="center">DE junctions in Group 3</th>
<th align="center">Novel junctions</th>
<th align="center">Neojunctions</th>
</tr>
</thead>
<tbody valign="top">
<tr>
<td align="left">Brain</td>
<td align="center">120,611</td>
<td align="center">846</td>
<td align="center">74</td>
<td align="center">73</td>
<td align="center">14</td>
<td align="center">3,456</td>
<td align="center">110</td>
</tr>
<tr>
<td align="left">Breast</td>
<td align="center">123,349</td>
<td align="center">2,153</td>
<td align="center">499</td>
<td align="center">431</td>
<td align="center">247</td>
<td align="center">3,426</td>
<td align="center">255</td>
</tr>
<tr>
<td align="left">Colon</td>
<td align="center">122,639</td>
<td align="center">3,363</td>
<td align="center">663</td>
<td align="center">722</td>
<td align="center">409</td>
<td align="center">3,400</td>
<td align="center">336</td>
</tr>
<tr>
<td align="left">Gastric</td>
<td align="center">126,487</td>
<td align="center">2,335</td>
<td align="center">540</td>
<td align="center">486</td>
<td align="center">293</td>
<td align="center">3,806</td>
<td align="center">320</td>
</tr>
<tr>
<td align="left">Head-Neck</td>
<td align="center">119,194</td>
<td align="center">2,398</td>
<td align="center">440</td>
<td align="center">391</td>
<td align="center">144</td>
<td align="center">3,573</td>
<td align="center">316</td>
</tr>
<tr>
<td align="left">Kidney</td>
<td align="center">117,989</td>
<td align="center">1,231</td>
<td align="center">185</td>
<td align="center">143</td>
<td align="center">119</td>
<td align="center">3,574</td>
<td align="center">164</td>
</tr>
<tr>
<td align="left">Leukemia</td>
<td align="center">123,295</td>
<td align="center">3,668</td>
<td align="center">631</td>
<td align="center">1,060</td>
<td align="center">511</td>
<td align="center">3,563</td>
<td align="center">514</td>
</tr>
<tr>
<td align="left">Lung</td>
<td align="center">130,297</td>
<td align="center">2,327</td>
<td align="center">386</td>
<td align="center">549</td>
<td align="center">154</td>
<td align="center">3,403</td>
<td align="center">368</td>
</tr>
<tr>
<td align="left">Lymphoma</td>
<td align="center">122,911</td>
<td align="center">3,795</td>
<td align="center">689</td>
<td align="center">1,012</td>
<td align="center">524</td>
<td align="center">3,772</td>
<td align="center">354</td>
</tr>
<tr>
<td align="left">Myeloma</td>
<td align="center">119,528</td>
<td align="center">3,307</td>
<td align="center">727</td>
<td align="center">678</td>
<td align="center">420</td>
<td align="center">3,734</td>
<td align="center">398</td>
</tr>
<tr>
<td align="left">Ovarian</td>
<td align="center">122,251</td>
<td align="center">1,603</td>
<td align="center">295</td>
<td align="center">283</td>
<td align="center">238</td>
<td align="center">3,512</td>
<td align="center">241</td>
</tr>
<tr>
<td align="left">Pancreatic</td>
<td align="center">121,817</td>
<td align="center">2,528</td>
<td align="center">448</td>
<td align="center">418</td>
<td align="center">308</td>
<td align="center">3,614</td>
<td align="center">220</td>
</tr>
<tr>
<td align="left">Skin</td>
<td align="center">120,200</td>
<td align="center">2,036</td>
<td align="center">186</td>
<td align="center">357</td>
<td align="center">247</td>
<td align="center">3,498</td>
<td align="center">197</td>
</tr>
</tbody>
</table>
</table-wrap>
<p>
<italic>DJExpress</italic> identified an average of 1,918 differentially used junctions (FDR &#x3c; 0.05 and &#x7c;logFC&#x7c; &#x3e; 1), including previously described alternative splicing events in cancer, such as the downregulation of <italic>ACTN1</italic> exon 19b (<xref ref-type="bibr" rid="B16">Gardina et&#x20;al., 2006</xref>; <xref ref-type="bibr" rid="B67">Thorsen et&#x20;al., 2008</xref>; <xref ref-type="bibr" rid="B4">Bielli et&#x20;al., 2018</xref>), <italic>VCL</italic> exon 19 (<xref ref-type="bibr" rid="B16">Gardina et&#x20;al., 2006</xref>; <xref ref-type="bibr" rid="B67">Thorsen et&#x20;al., 2008</xref>), the upregulation of <italic>NUMB</italic> exon 12 (<xref ref-type="bibr" rid="B39">Misquitta-Ali et&#x20;al., 2011</xref>; <xref ref-type="bibr" rid="B3">Bechara et&#x20;al., 2013</xref>; <xref ref-type="bibr" rid="B82">Zhang et&#x20;al., 2014</xref>; <xref ref-type="bibr" rid="B84">Zong et&#x20;al., 2014</xref>), <italic>MAP3K7</italic> exon 12 (<xref ref-type="bibr" rid="B40">Munkley et&#x20;al., 2019</xref>; <xref ref-type="bibr" rid="B49">Qiu et&#x20;al., 2020</xref>; <xref ref-type="bibr" rid="B42">Oh et&#x20;al., 2021</xref>), <italic>CTNND1</italic> exon 20 (<xref ref-type="bibr" rid="B79">Yanagisawa et&#x20;al., 2008</xref>; <xref ref-type="bibr" rid="B58">Sebestyen et&#x20;al., 2015</xref>; <xref ref-type="bibr" rid="B76">Wang et&#x20;al., 2020</xref>), and <italic>EXOC1</italic> exon 11 (<xref ref-type="bibr" rid="B51">Ray et&#x20;al., 2020</xref>; <xref ref-type="bibr" rid="B81">Zhang et&#x20;al., 2020</xref>), as well as of exons contained within the variant domain in <italic>CD44</italic> (<xref ref-type="bibr" rid="B62">Shirure et&#x20;al., 2015</xref>; <xref ref-type="bibr" rid="B8">Chen et&#x20;al., 2018</xref>; <xref ref-type="bibr" rid="B78">Wang et&#x20;al., 2018</xref>; <xref ref-type="bibr" rid="B10">Chen et&#x20;al., 2020</xref>) (<xref ref-type="fig" rid="F5">Figure&#x20;5</xref>; <xref ref-type="sec" rid="s9">Supplementary 
Figure S6</xref>). Moreover, the gene-wise visualization of differential junction expression allowed the identification of complex alternative splicing patterns and isoform switches in cancer, such as the case of the co-regulated inclusion of exon 11 and exclusion of exon 40 in <italic>MYO18A</italic> in lymphoma and myeloma, the complex local event involving exons 15&#x2013;18 in <italic>MARK3</italic> in leukemia, lymphoma, myeloma, breast, colon, gastric, lung and pancreatic cancer, or the isoform switches in <italic>RGS3</italic> in breast, colon, gastric, lung, ovarian and pancreatic cancers, and <italic>INPP5B</italic> in pancreatic cancer cell lines (<xref ref-type="fig" rid="F6">Figure&#x20;6</xref>; <xref ref-type="sec" rid="s9">Supplementary Figures S7, S8</xref>). These data demonstrate that <italic>DJExpress</italic> can not only reliably identify previously described alternative splicing events but can also facilitate the discovery and visualization of complex splice events within annotated splice regions.</p>
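<p>The significance criteria used above (FDR below 0.05 and absolute logFC above 1) can be expressed as a simple classification rule. The sketch below labels each junction as upregulated, downregulated or not significant. The function name, junction IDs and statistics are hypothetical and do not correspond to actual <italic>DJExpress</italic> output.</p>
<preformat>
```python
def classify_junction(log_fc, fdr, min_abs_log_fc=1.0, max_fdr=0.05):
    """Label a junction as 'up', 'down' or 'ns' (not significant)."""
    if abs(log_fc) > min_abs_log_fc and max_fdr > fdr:
        return "up" if log_fc > 0 else "down"
    return "ns"


# Hypothetical (logFC, FDR) pairs for four junctions.
results = {
    "chr14:1000:2000:2": (4.0, 0.001),   # large effect, significant
    "chr10:3000:4000:1": (-2.5, 0.01),   # large negative effect, significant
    "chr6:5000:6000:1": (1.8, 0.20),     # large effect, not significant
    "chr3:7000:8000:2": (0.4, 0.0001),   # significant, effect too small
}
labels = {j: classify_junction(lfc, fdr) for j, (lfc, fdr) in results.items()}
print(labels)
```
</preformat>
<p>Both conditions have to hold at the same time, so a junction with a large fold change but a non-significant FDR, or with a significant FDR but a small fold change, is not counted as differentially used.</p>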
<fig id="F5" position="float">
<label>FIGURE 5</label>
<caption>
<p>Expression profile and gene context of known alternative splicing events in cancers detected by <italic>DJExpress</italic> using cancer cell line data. Examples of known cancer-specific splice events are shown as gene-wise splice plots with relative logFC values (upper panels) and gene model plots with exon-to-protein domain annotation (lower panels). <bold>(A,B)</bold> show gene-wise splice plots of exon inclusion events in <italic>NUMB</italic> and <italic>ACTN1</italic> mRNA in breast and lung cancer cell lines, respectively. <bold>(C,D)</bold>&#x20;show gene-wise splice plots of exon skipping events in <italic>MAP3K7</italic> and <italic>VCL</italic> mRNA in gastric and breast cancer cell lines, respectively (Numbers on the <italic>x</italic>-axis in the upper panels indicate the first, last and differentially used junctions in the respective gene. Grey areas indicate the threshold for significance (&#x7c;logFC&#x7c; &#x3e; 1.0). Downregulated and upregulated junctions with &#x7c;logFC&#x7c; above the threshold and significant FDR (&#x3c;0.05) are shown in blue and red, respectively. These same junctions are indicated within the gene model plots as dashed arcs connecting upstream and downstream exons. Colors within exonic regions indicate the presence of protein domains and/or post-translational modifications (PTMs) annotated within the Prot2HG protein domain database. Arrows below gene model plots indicate the direction of transcription. Coding and UTR exons are illustrated as long and short exons, respectively. Junctions with both absolute and relative logFC above the threshold (&#x7c;logFC&#x7c; &#x3e; 1.0) but no significant FDR (&#x3e;0.05) for at least one of them are shown in black).</p>
</caption>
<graphic xlink:href="fbinf-02-786898-g005.tif"/>
</fig>
<fig id="F6" position="float">
<label>FIGURE 6</label>
<caption>
<p>Co-regulated splicing events within the <italic>MYO18A</italic> transcript in blood cancer. Differentially used junctions as depicted in the gene-wise splice plot in <italic>MYO18A</italic> indicate the concomitant inclusion of exon 11 and exclusion of exon 40 in myeloma and lymphoma cell lines. The gene model plot with Prot2HG-based domain annotation suggests that these co-regulated splicing events involve exonic regions containing known <italic>MYO18A</italic> phosphorylation sites (brown), as well as regions comprising the core myosin-like ATPase motor domain, MYSc_Myo18 (orange). The <italic>MYO18A</italic> gene-wise splice plot in lymphoma is used as an example (Numbers on the <italic>x</italic>-axis in the upper panels indicate the first, last and differentially used junctions in the respective gene. Grey areas indicate the threshold for significance (&#x7c;logFC&#x7c; &#x3e; 1.0). Downregulated and upregulated junctions with &#x7c;logFC&#x7c; above the threshold and significant FDR (&#x3c;0.05) are shown in blue and red, respectively. These same junctions are indicated within the gene model plots as dashed arcs connecting upstream and downstream exons. Colors within exonic regions indicate the presence of protein domains and/or post-translational modifications (PTMs) annotated within the Prot2HG protein domain database. Arrows below gene model plots indicate the direction of transcription. Coding and UTR exons are illustrated as long and short exons, respectively. Junctions with both absolute and relative logFC above the threshold (&#x7c;logFC&#x7c; &#x3e; 1.0) but no significant FDR (&#x3e;0.05) for at least one of them are shown in black).</p>
</caption>
<graphic xlink:href="fbinf-02-786898-g006.tif"/>
</fig>
<p>Notably, an average of 3,563&#x20;non-annotated splice junctions per tissue and 292 neojunctions (defined as junctions not detected in control fibroblast cell lines) were also discovered by the DJE analysis module (<xref ref-type="table" rid="T2">Table&#x20;2</xref>). Here, the visualization of non-annotated junctions within the gene-wise DJE plots allowed us to identify previously unknown splicing events, including exon skipping, alternative 3&#x2032; splice sites, alternative 5&#x2032; splice sites and alternative first and last exons (<xref ref-type="sec" rid="s9">Supplementary Figure S9</xref>). Moreover, DJE plots also revealed novel splice junctions with genomic coordinates that suggest the presence of exons so far not described in the human transcriptome annotation (<xref ref-type="fig" rid="F7">Figure&#x20;7</xref>; <xref ref-type="sec" rid="s9">Supplementary Figure S10</xref>). These newly identified splicing events are potentially linked to cancer physiology and their functional characterization could be the subject of future studies. Nevertheless, to further illustrate the capabilities of <italic>DJExpress</italic> and DJEC DB, we next focused on a well-described alternative splicing switch in <italic>NUMB</italic>&#x20;mRNA.</p>
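<p>The per-tissue averages quoted above can be recomputed directly from the last two columns of Table&#x20;2, as in the following sketch. The counts are copied from the table. The helper that derives neojunctions as a set difference is a hypothetical illustration of the definition given in the text, not the <italic>DJExpress</italic> implementation.</p>
<preformat>
```python
# (novel junctions, neojunctions) per CCLE tissue, copied from Table 2.
table2 = {
    "Brain": (3456, 110),
    "Breast": (3426, 255),
    "Colon": (3400, 336),
    "Gastric": (3806, 320),
    "Head-Neck": (3573, 316),
    "Kidney": (3574, 164),
    "Leukemia": (3563, 514),
    "Lung": (3403, 368),
    "Lymphoma": (3772, 354),
    "Myeloma": (3734, 398),
    "Ovarian": (3512, 241),
    "Pancreatic": (3614, 220),
    "Skin": (3498, 197),
}

mean_novel = sum(novel for novel, _ in table2.values()) / len(table2)
mean_neo = sum(neo for _, neo in table2.values()) / len(table2)
print(round(mean_novel, 1), round(mean_neo, 1))


def neojunctions(tumor_junctions, control_junctions):
    """Junctions detected in tumor cell lines but absent from the controls."""
    return set(tumor_junctions) - set(control_junctions)
```
</preformat>
<p>The computed means are approximately 3,563.9 novel junctions and 291.8 neojunctions per tissue, in line with the values reported in the text.</p>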
<fig id="F7" position="float">
<label>FIGURE 7</label>
<caption>
<p>DJE analysis suggests the presence of differentially spliced non-annotated exons in cancer cell lines. Gene-wise splicing as well as gene model plots show non-annotated splice junctions whose gene location indicates the presence of exons not described in the human transcriptome annotation. <bold>(A)</bold> Differentially expressed non-annotated junctions between exon 37 and 38 located in the vicinity of the CNH (dark green) and PH (orange) domains in <italic>CIT</italic>. <bold>(B)</bold> Differentially expressed non-annotated junctions between exon 12 and 13 in <italic>SPIRE1,</italic> which contain the Spir-box domain (pink) involved in the interaction between SPIRE1 and formin (FMN)-type actin nucleators, as well as protein phosphorylation sites (yellow). <bold>(C)</bold> Differentially expressed non-annotated junctions between exon 13 and 14 in <italic>HSP90B1</italic> occurring within the HSP90 chaperone domain (green). For <italic>CIT</italic> and <italic>SPIRE1</italic> gene-wise splice plots, breast cancer is used as an example. For <italic>HSP90B1</italic>, lung cancer is used as an example (Numbers on the <italic>x</italic>-axis in the upper panels indicate the first, last and differentially used junctions in the respective gene. Grey areas indicate the threshold for significance (&#x7c;logFC&#x7c; &#x3e; 1.0). Downregulated and upregulated junctions with &#x7c;logFC&#x7c; above the threshold and significant FDR (&#x3c;0.05) are shown in blue and red, respectively. These same junctions are indicated within the gene model plots as dashed arcs connecting upstream and downstream exons. Colors within exonic regions indicate the presence of protein domains and/or post-translational modifications (PTMs) annotated within the Prot2HG protein domain database. Arrows below gene model plots indicate the direction of transcription. Coding and UTR exons are illustrated as long and short exons, respectively. 
Junctions with both absolute and relative logFC above the threshold (&#x7c;logFC&#x7c; &#x3e; 1.0) but no significant FDR (&#x3e;0.05) for at least one of them are shown in black).</p>
</caption>
<graphic xlink:href="fbinf-02-786898-g007.tif"/>
</fig>
</sec>
<sec id="s3-2">
<title>Case Study 1: SpliceRadar-Based Identification of <italic>NUMB</italic> Alternative Splicing Regulators</title>
<p>
<italic>NUMB</italic> encodes a key determinant of cell fate that regulates the trafficking of surface proteins such as Notch, integrins and E-cadherin and can undergo alternative splicing (<xref ref-type="bibr" rid="B41">Nishimura and Kaibuchi, 2007</xref>; <xref ref-type="bibr" rid="B37">McGill et&#x20;al., 2009</xref>; <xref ref-type="bibr" rid="B66">Teckchandani et&#x20;al., 2009</xref>; <xref ref-type="bibr" rid="B77">Wang et&#x20;al., 2009</xref>). Inclusion of <italic>NUMB</italic> exon 12 is frequently observed in different types of cancer, leading to a 48 amino acid extension of the proline-rich region (PRR) of the NUMB protein (<xref ref-type="bibr" rid="B9">Chen et&#x20;al., 2009</xref>; <xref ref-type="bibr" rid="B82">Zhang et&#x20;al., 2014</xref>; <xref ref-type="bibr" rid="B35">Lu et&#x20;al., 2015</xref>; <xref ref-type="bibr" rid="B50">Rajendran et&#x20;al., 2016</xref>). This longer NUMB isoform (Numb-L) was found to promote proliferation, whereas the shorter isoform (Numb-S) promotes differentiation of cancer cells (<xref ref-type="bibr" rid="B72">Verdi et&#x20;al., 1999</xref>). In lung cancer, the splicing factor <italic>QKI</italic> represses the inclusion of the <italic>NUMB</italic> alternative exon by competing with the core splicing factor SF1, thereby inhibiting proliferation and Notch signaling (<xref ref-type="bibr" rid="B84">Zong et&#x20;al., 2014</xref>).</p>
<p>This well-documented <italic>NUMB</italic> isoform switch was also detected with <italic>DJExpress</italic>, which showed a &#x223c;16-fold (log<sub>2</sub> fold change &#x223c;4) upregulation of <italic>NUMB</italic> exon 12 inclusion junctions in breast cancer cell lines compared to fibroblasts (<xref ref-type="fig" rid="F5">Figure&#x20;5A</xref>). A similar <italic>NUMB</italic> splice pattern was observed across other cancer types (data not shown). Furthermore, using the <italic>DJExpress</italic> JT module, we corroborated the positive correlation between <italic>QKI</italic> gene expression and <italic>NUMB</italic> exon 12 exclusion (<xref ref-type="fig" rid="F8">Figure&#x20;8A</xref>). Moreover, SpliceRadar-based visualization identified additional positively and negatively correlated splicing regulators, including <italic>SRPK2</italic> and <italic>RBFOX2</italic>, which have both previously been implicated in the regulation of <italic>NUMB</italic> alternative splicing (<xref ref-type="bibr" rid="B35">Lu et&#x20;al., 2015</xref>). Thus, our data suggest that the control of <italic>NUMB</italic> alternative splicing in cancer may involve a more complex regulatory network than previously thought. These data demonstrate that <italic>DJExpress</italic> can not only validate known associations with splice events but can also, through the SpliceRadar tool, identify additional regulatory networks that may be altered in cancer.</p>
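<p>The relation between the fold change and the log<sub>2</sub> fold change quoted above is a simple conversion, shown here as a short sketch with hypothetical helper names. A log<sub>2</sub> fold change of 4 corresponds to a 16-fold change, and the significance threshold of 1 corresponds to a 2-fold change.</p>
<preformat>
```python
import math


def fold_change_from_log2(log2_fc):
    """Convert a log2 fold change into a linear fold change."""
    return 2.0 ** log2_fc


def log2_from_fold_change(fold_change):
    """Convert a linear fold change into a log2 fold change."""
    return math.log2(fold_change)


print(fold_change_from_log2(4.0))   # 16.0
print(log2_from_fold_change(16.0))  # 4.0
print(fold_change_from_log2(1.0))   # 2.0
```
</preformat>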
<fig id="F8" position="float">
<label>FIGURE 8</label>
<caption>
<p>SpliceRadar plots of top trait associations to <italic>NUMB</italic> alternative splicing in lung cancer. <bold>(A)</bold> Expression of splice junctions supporting exon 12 inclusion in <italic>NUMB</italic> mRNA was correlated to the expression of a panel of manually curated splicing regulators in lung cancer cell lines. The top-ranked correlation coefficients (FDR &#x3c; 0.05 and &#x7c;rho&#x7c; &#x3e; 0.2) were used to construct the SpliceRadar chart with splicing factors depicted along the spokes, revealing a general trend of anti-correlation patterns to splicing factor expression between inclusion (red and dark red) and exclusion (blue) junctions. Previously known associations to <italic>NUMB</italic> splicing were corroborated (e.g., <italic>QKI</italic>, <italic>RBFOX2</italic> and <italic>SRPK2</italic>), and novel associations with similar correlation levels were identified, suggesting a more complex regulatory network of <italic>NUMB</italic> alternative splicing than previously described. <bold>(B)</bold> SpliceRadar plot showing top-ranked correlations (FDR &#x3c; 0.05 and &#x7c;rho&#x7c; &#x3e; 0.2) between exon inclusion junction expression in <italic>NUMB</italic> and gene dependencies (defined as gene loss effect on cell survival) using DepMap CRISPR screen data. Anti-correlation patterns of dependency values and expression of inclusion and exclusion junctions are also observed as in the case of panel <bold>(A)</bold>. <bold>(C)</bold> KEGG pathway enrichment analysis using gene names of significantly associated dependencies ranked by correlation coefficient. The enrichment plot shows top over-represented pathways within <italic>NUMB</italic> splicing-correlated gene dependencies (Dot size represents the number of genes in each KEGG pathway, color gradient indicates significance level of adjusted <italic>p</italic>-values).</p>
</caption>
<graphic xlink:href="fbinf-02-786898-g008.tif"/>
</fig>
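<p>The selection rule quoted in the Figure 8 caption (FDR &#x3c; 0.05 and &#x7c;rho&#x7c; &#x3e; 0.2) can be illustrated with a minimal Python sketch. It is not taken from the <italic>DJExpress</italic> code base: the function names, the toy values and the choice of the Benjamini-Hochberg procedure for the FDR are assumptions made for illustration only, since the multiple-testing method is not stated in this section.</p>
<preformat preformat-type="code">
```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so the adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def select_associations(results, fdr_cutoff=0.05, min_abs_rho=0.2):
    """Keep (trait, rho) pairs with FDR below the cutoff and |rho| above the minimum."""
    fdr = benjamini_hochberg([p for _, _, p in results])
    kept = [
        (trait, rho)
        for (trait, rho, _), q in zip(results, fdr)
        if fdr_cutoff > q and abs(rho) > min_abs_rho
    ]
    # Rank by absolute correlation, strongest first.
    return sorted(kept, key=lambda item: abs(item[1]), reverse=True)


# Hypothetical (trait, correlation coefficient, raw p-value) triples, for illustration only.
toy_results = [
    ("factor_A", -0.45, 0.0004),
    ("factor_B", 0.31, 0.004),
    ("factor_C", 0.12, 0.01),
    ("factor_D", 0.25, 0.2),
]
top = select_associations(toy_results)
```
</preformat>
<p>In this toy example, only the first two hypothetical factors pass both filters: the third is significant but weakly correlated, and the fourth is sufficiently correlated but not significant after adjustment.</p>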
<p>DJEC DB incorporates gene dependencies and drug response data from the DepMap repository. We thus expanded the landscape of phenotypic associations to <italic>NUMB</italic> alternative splicing in lung cancer cell lines (<xref ref-type="fig" rid="F8">Figure&#x20;8B</xref>). Pathway enrichment analysis of significantly associated gene dependencies revealed enrichment of components within the mTOR and insulin signaling pathways (<xref ref-type="fig" rid="F8">Figure&#x20;8C</xref>). This is consistent with previous studies, which suggested that activated ERK signaling is a common mechanism that regulates <italic>NUMB</italic> isoform expression in breast and lung cancer cells (<xref ref-type="bibr" rid="B50">Rajendran et&#x20;al., 2016</xref>). Similarly, SpliceRadar plots using top correlations with drug response values also revealed associations between the expression of exon-inclusion junctions in <italic>NUMB</italic> and cell survival rates after treatment with several compounds targeting PI3K/mTOR and ERK MAPK signaling (<xref ref-type="sec" rid="s9">Supplementary Figure S11</xref>). These data reinforce the notion of a functional connection between <italic>NUMB</italic> exon 12 inclusion and pro-proliferative signaling cascades.</p>
<p>Taken together, these results illustrate the potential of the <italic>DJExpress</italic> pipeline to identify <italic>bona fide</italic> differentially expressed splice junctions and reveal physiologically relevant associations between junction expression and various external traits. Thus, <italic>DJExpress</italic> can be used to support and generate hypotheses regarding the potential molecular mechanisms involved in the regulation and physiological consequences of alternative splicing.</p>
</sec>
<sec id="s3-3">
<title>DJEC DB Data Summary</title>
<p>The TCGA project is a large-scale oncology study that has allowed the comprehensive characterization of multiple cancer types using a catalogue of clinical and molecular data, including RNA sequencing from thousands of patients across multiple tumor types. This resource offers an excellent opportunity for cancer researchers and clinicians to explore and define tumor-specific transcriptomic signatures, and to integrate them with additional external traits such as mutations, copy number variations (CNV) or microsatellite instability (MSI). These features of TCGA can facilitate identification of novel therapeutic or diagnostic biomarkers. However, TCGA alternative splicing analyses, particularly the association of splice events with clinical and molecular traits, are currently not available in an accessible&#x20;way.</p>
<p>To fill this gap, we generated DJEC DB, a platform that provides an integration of differential junction expression analysis with TCGA molecular and clinical data. For this, we used splice junction quantification from a recently published study (<xref ref-type="bibr" rid="B22">Kahles et&#x20;al., 2018</xref>) where TCGA and GTEx RNA-seq samples were re-analyzed using 2-pass STAR alignment, thereby allowing identification of annotated and <italic>de novo</italic> splice events. Additionally, we quantified junction expression in cancer cell lines from CCLE fastq files and integrated this data with functional genomics data sets from the DepMap repository.</p>
<p>DJEC DB comprises four main sections: 1) Differential Junction Expression (DJE) in TCGA vs GTEx tissue, 2) Junction-Trait (JT) associations using external clinical and molecular sample data, 3) Junction Co-expression Network Analysis (JCNA) using junction expression in colorectal (COADREAD) tissue samples as an example dataset, and 4) Differential Junction Expression in cancer cell lines and association with DepMap functional genomics data (DJE-CCLE).</p>
<p>The DJE section comprises summary statistics and visualization options for an average of 6,345 differentially expressed junctions across the 32 tumor tissue types analyzed (FDR &#x3c;0.05 and &#x7c;logFC&#x7c; &#x3e; 2, <xref ref-type="table" rid="T3">Table&#x20;3</xref>). In the JT section, an average of 674 statistically significant associations are shown between differentially expressed junctions and altered oncogenic signaling pathways determined by the presence of mutations, CNVs, altered gene expression, gene fusions, DNA methylation and MSI (in the case of COADREAD tumors).</p>
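<p>The cutoff rule used for the DJE summary (FDR &#x3c; 0.05 and &#x7c;logFC&#x7c; &#x3e; 2) amounts to a simple per-junction classification. The Python sketch below is illustrative only: the function name, the labels and the junction records are hypothetical and are not part of DJEC DB or the <italic>DJExpress</italic> package.</p>
<preformat preformat-type="code">
```python
from collections import Counter


def classify_junction(log_fc, fdr, min_abs_log_fc=2.0, fdr_cutoff=0.05):
    """Label one junction as 'up', 'down' or 'ns' (not significant)."""
    significant = fdr_cutoff > fdr and abs(log_fc) > min_abs_log_fc
    if not significant:
        return "ns"
    return "up" if log_fc > 0 else "down"


# Hypothetical junction records: (junction id, logFC, FDR). Values are made up.
toy_junctions = [
    ("chr1:100:200", 4.1, 0.001),
    ("chr1:100:350", -2.6, 0.010),
    ("chr2:500:900", 1.2, 0.0005),
    ("chr3:40:75", 3.0, 0.300),
]

calls = {jid: classify_junction(lfc, fdr) for jid, lfc, fdr in toy_junctions}
summary = Counter(calls.values())
```
</preformat>
<p>Both conditions must hold at once: the third toy junction has a very small FDR but a modest effect size, and the fourth has a large effect size but does not reach significance, so neither is counted as differentially expressed.</p>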
<table-wrap id="T3" position="float">
<label>TABLE 3</label>
<caption>
<p>Summary of DJE and JT junction statistics in DJEC DB.</p>
</caption>
<table>
<thead valign="top">
<tr>
<th align="left">TCGA tissue cohort</th>
<th align="center">Sample size</th>
<th align="center">Quantified junctions</th>
<th align="center">DE junctions</th>
<th align="center">Associations to genomic alterations</th>
<th align="center">Associations to mutations</th>
<th align="center">Associations to pathway alterations</th>
</tr>
</thead>
<tbody valign="top">
<tr>
<td align="left">ACC</td>
<td align="center">79</td>
<td align="center">13,827,029</td>
<td align="center">2,335</td>
<td align="center">1</td>
<td align="center">2</td>
<td align="center">&#x2014;</td>
</tr>
<tr>
<td align="left">BLCA</td>
<td align="center">408</td>
<td align="center">14,369,479</td>
<td align="center">2,935</td>
<td align="center">215</td>
<td align="center">274</td>
<td align="center">&#x2014;</td>
</tr>
<tr>
<td align="left">BRCA</td>
<td align="center">1,083</td>
<td align="center">15,445,200</td>
<td align="center">3,740</td>
<td align="center">334</td>
<td align="center">306</td>
<td align="center">15</td>
</tr>
<tr>
<td align="left">CESC</td>
<td align="center">304</td>
<td align="center">14,260,819</td>
<td align="center">4,808</td>
<td align="center">14</td>
<td align="center">20</td>
<td align="center">&#x2014;</td>
</tr>
<tr>
<td align="left">CHOL</td>
<td align="center">36</td>
<td align="center">13,786,637</td>
<td align="center">8,446</td>
<td align="center">10</td>
<td align="center">10</td>
<td align="center">&#x2014;</td>
</tr>
<tr>
<td align="left">COADREAD</td>
<td align="center">372</td>
<td align="center">14,315,224</td>
<td align="center">5,534</td>
<td align="center">49</td>
<td align="center">44</td>
<td align="center">&#x2014;</td>
</tr>
<tr>
<td align="left">DLBC</td>
<td align="center">48</td>
<td align="center">13,822,896</td>
<td align="center">6,150</td>
<td align="center">9</td>
<td align="center">5</td>
<td align="center">&#x2014;</td>
</tr>
<tr>
<td align="left">GBM</td>
<td align="center">165</td>
<td align="center">13,995,214</td>
<td align="center">12,781</td>
<td align="center">2</td>
<td align="center">4</td>
<td align="center">&#x2014;</td>
</tr>
<tr>
<td align="left">HNSC</td>
<td align="center">500</td>
<td align="center">14,592,967</td>
<td align="center">5,745</td>
<td align="center">49</td>
<td align="center">117</td>
<td align="center">2</td>
</tr>
<tr>
<td align="left">KIPAN</td>
<td align="center">738</td>
<td align="center">14,965,143</td>
<td align="center">2,836</td>
<td align="center">92</td>
<td align="center">93</td>
<td align="center">1</td>
</tr>
<tr>
<td align="left">LGG</td>
<td align="center">526</td>
<td align="center">14,536,867</td>
<td align="center">6,771</td>
<td align="center">6,708</td>
<td align="center">6,061</td>
<td align="center">404</td>
</tr>
<tr>
<td align="left">LIHC</td>
<td align="center">372</td>
<td align="center">855,905</td>
<td align="center">4,996</td>
<td align="center">97</td>
<td align="center">99</td>
<td align="center">&#x2014;</td>
</tr>
<tr>
<td align="left">LUAD</td>
<td align="center">516</td>
<td align="center">14,681,817</td>
<td align="center">3,931</td>
<td align="center">153</td>
<td align="center">149</td>
<td align="center">&#x2014;</td>
</tr>
<tr>
<td align="left">LUSC</td>
<td align="center">500</td>
<td align="center">14,804,638</td>
<td align="center">4,721</td>
<td align="center">107</td>
<td align="center">114</td>
<td align="center">10</td>
</tr>
<tr>
<td align="left">MESO</td>
<td align="center">82</td>
<td align="center">13,866,293</td>
<td align="center">4,078</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
</tr>
<tr>
<td align="left">OV</td>
<td align="center">199</td>
<td align="center">16,204,728</td>
<td align="center">8,509</td>
<td align="center">9</td>
<td align="center">10</td>
<td align="center">&#x2014;</td>
</tr>
<tr>
<td align="left">PAAD</td>
<td align="center">178</td>
<td align="center">13,981,645</td>
<td align="center">4,942</td>
<td align="center">26</td>
<td align="center">26</td>
<td align="center">&#x2014;</td>
</tr>
<tr>
<td align="left">PCPG</td>
<td align="center">183</td>
<td align="center">14,428,362</td>
<td align="center">8,973</td>
<td align="center">228</td>
<td align="center">228</td>
<td align="center">&#x2014;</td>
</tr>
<tr>
<td align="left">PRAD</td>
<td align="center">497</td>
<td align="center">1,166,561</td>
<td align="center">4,097</td>
<td align="center">85</td>
<td align="center">94</td>
<td align="center">&#x2014;</td>
</tr>
<tr>
<td align="left">SARC</td>
<td align="center">257</td>
<td align="center">14,106,882</td>
<td align="center">1,810</td>
<td align="center">12</td>
<td align="center">50</td>
<td align="center">&#x2014;</td>
</tr>
<tr>
<td align="left">SKCM</td>
<td align="center">471</td>
<td align="center">14,106,882</td>
<td align="center">3,436</td>
<td align="center">16</td>
<td align="center">11</td>
<td align="center">&#x2014;</td>
</tr>
<tr>
<td align="left">STES</td>
<td align="center">535</td>
<td align="center">18,214,111</td>
<td align="center">7,155</td>
<td align="center">418</td>
<td align="center">330</td>
<td align="center">&#x2014;</td>
</tr>
<tr>
<td align="left">TGCT</td>
<td align="center">156</td>
<td align="center">14,050,087</td>
<td align="center">9,684</td>
<td align="center">14</td>
<td align="center">14</td>
<td align="center">&#x2014;</td>
</tr>
<tr>
<td align="left">THCA</td>
<td align="center">500</td>
<td align="center">14,437,693</td>
<td align="center">4,885</td>
<td align="center">699</td>
<td align="center">714</td>
<td align="center">37</td>
</tr>
<tr>
<td align="left">THYM</td>
<td align="center">118</td>
<td align="center">13,939,486</td>
<td align="center">3,860</td>
<td align="center">30</td>
<td align="center">31</td>
<td align="center">&#x2014;</td>
</tr>
<tr>
<td align="left">UCEC</td>
<td align="center">179</td>
<td align="center">14,038,958</td>
<td align="center">9,241</td>
<td align="center">114</td>
<td align="center">99</td>
<td align="center">&#x2014;</td>
</tr>
<tr>
<td align="left">UCS</td>
<td align="center">56</td>
<td align="center">13,829,412</td>
<td align="center">9,091</td>
<td align="center">6</td>
<td align="center">5</td>
<td align="center">&#x2014;</td>
</tr>
<tr>
<td align="left">UVM</td>
<td align="center">80</td>
<td align="center">13,809,902</td>
<td align="center">9,285</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
<td align="center">&#x2014;</td>
</tr>
</tbody>
</table>
</table-wrap>
<p>To exemplify the use of the JCNA approach, we selected the 372 samples from the TCGA COADREAD tumor cohort to construct a junction co-expression network (see Methods for details). For this, we used a minimum module size of 20 junctions and an unsigned network type, meaning that the weight of the connection between nodes (junctions) is calculated irrespective of the direction of the association, so modules can contain both positively and negatively correlated junctions (<xref ref-type="sec" rid="s9">Supplementary Figure&#x20;S4</xref>).</p>
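<p>The practical consequence of an unsigned network can be shown with a small Python sketch. It assumes the adjacency definitions commonly used in weighted co-expression network analysis, namely the absolute correlation raised to a soft-threshold power for unsigned networks; the power of 6 and the toy expression vectors are assumptions for illustration and are not taken from this study.</p>
<preformat preformat-type="code">
```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def unsigned_adjacency(x, y, power=6):
    """Unsigned weight: depends only on the magnitude of the correlation."""
    return abs(pearson(x, y)) ** power


def signed_adjacency(x, y, power=6):
    """Signed weight, shown for contrast: negative correlations map towards 0."""
    return ((1 + pearson(x, y)) / 2) ** power


# Hypothetical expression of three junctions across five samples.
junction_a = [1.0, 2.0, 3.0, 4.0, 5.0]
junction_b = [2.0, 4.0, 6.0, 8.0, 10.0]  # perfectly correlated with junction_a
junction_c = [5.0, 4.0, 3.0, 2.0, 1.0]   # perfectly anti-correlated with junction_a

same_direction = unsigned_adjacency(junction_a, junction_b)
opposite_direction = unsigned_adjacency(junction_a, junction_c)
```
</preformat>
<p>In the unsigned case, the correlated and the anti-correlated junction pairs receive the same connection weight, which is why inclusion and exclusion junctions of one splice event can fall into the same module. Under the signed definition, the anti-correlated pair would receive a weight close to zero.</p>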
<p>From a total of 7,404 junctions filtered by their gene expression-independent association to sample traits, 36 expression modules were found for this tumor type, with an average of 206 junctions per module. Module-trait associations were also determined through the correlation between ME expression values and tumor stage, MSI, mutations in TP53, EGFR, KRAS and BRAF genes, as well as expression across six splicing factor gene modules previously calculated from gene expression&#x20;data.</p>
<p>Finally, the DJE-CCLE section contains the results of the differential junction expression analysis of normal fibroblast cells vs cancer cell lines clustered by tissue of origin, as described above. Significant correlations between junction expression and functional genomics data obtained from the DepMap repository are displayed in a summary table and selected association patterns can be visualized using SpliceRadar&#x20;plots.</p>
</sec>
<sec id="s3-4">
<title>Search and Browse DJEC DB</title>
<p>Within the DJE section, users can first define the target tumor tissue type as well as the logFC and FDR cutoffs for the significance in differential expression (<xref ref-type="sec" rid="s9">Supplementary Figure S2</xref>). A table with the summary statistics is displayed, and users can select specific target genes or junctions to display gene-wise splice plots as well as zoomable gene model plots with exon-to-protein domain annotation. In addition, junction-trait associations in TCGA can be explored within the JT section following user-defined tumor tissue type and external molecular trait options (<xref ref-type="sec" rid="s9">Supplementary Figure&#x20;S3</xref>).</p>
<p>For the JCNA section using the TCGA COADREAD sample cohort, a junction dendrogram with expression module assignment, as well as a module-trait association heatmap are displayed (<xref ref-type="sec" rid="s9">Supplementary Figure S4</xref>). For intramodular analysis, users can select specific modules and traits to visualize module-to-trait significance plots, as well as module networks in interactive format. Both are helpful in identifying centrally located intramodular hub junctions with high module membership as well as high significance for selected traits. This allows the user to generate testable hypotheses about junction module expression, regulation and association to cancer phenotypes that can be implemented in validation experiments.</p>
<p>Similar interactive visualization can also be found within the DJE-CCLE section. Here, users can select the tissue of origin, the significance cutoff for differential expression, as well as target genes/junctions and junction-trait associations to be displayed in gene-wise splice and SpliceRadar plots (<xref ref-type="sec" rid="s9">Supplementary Figure&#x20;S5</xref>).</p>
</sec>
<sec id="s3-5">
<title>Case Study 2: Cancer Cell Line DJE Signature Is Recapitulated by Tumor Tissue Analysis in DJEC DB</title>
<p>One of the central features of DJEC DB is the possibility to interrogate the presence of alternative splicing patterns observed in cancer cell lines in the context of tumor tissues. <italic>NUMB</italic>, <italic>VCL</italic>, <italic>MAP3K7</italic> and <italic>EXOC1</italic> exon skipping events are examples of known splicing events that can also be observed in tumor tissue (<xref ref-type="sec" rid="s9">Supplementary Figures S12&#x2013;S15</xref>). Notably, the presence of a differentially expressed non-annotated exon between exons 12 and 13 in <italic>SPIRE1</italic>, which we detected in cancer cell lines (<xref ref-type="fig" rid="F7">Figure&#x20;7B</xref>), was also identified in the BRCA, LUAD, KIPAN, PRAD, and THCA cohorts in DJEC DB using gene-wise splicing visualization (<xref ref-type="fig" rid="F9">Figure&#x20;9</xref>). This suggests that the alternative inclusion of this previously unknown region in the <italic>SPIRE1</italic> transcript may be a common feature across different cancer types <italic>in&#x20;vitro</italic> and <italic>in vivo</italic>. These data demonstrate the applicability of DJEC DB in identifying and cross-validating potentially oncogenic alternative splicing patterns both in cancer cell lines and tumor tissue.</p>
<fig id="F9" position="float">
<label>FIGURE 9</label>
<caption>
<p>Differentially expressed non-annotated junctions in <italic>SPIRE1</italic> are also found in the context of primary tumor tissue. Differentially expressed junctions suggesting the presence of a non-annotated exon in <italic>SPIRE1</italic> mRNA were not only identified in cancer cell lines (see <xref ref-type="fig" rid="F7">Figure&#x20;7B</xref>) but are also found in the BRCA, LUAD, KIPAN, PRAD, and THCA TCGA cohorts. A screen capture of the DJEC DB DJE analysis in KIPAN is shown as an example. The exon inclusion event can be found by filtering for differentially expressed junctions following cutoff criteria of &#x3c;0.05 for FDR and &#x7c;logFC&#x7c;&#x3e;1.0 (Panel 1) and then selecting either of the two inclusion junctions based on their genomic coordinates (Panel 2). DJEC DB displays gene-wise splice plots (Panel 3) as well as domain-annotated gene model plots (Panel 4).</p>
</caption>
<graphic xlink:href="fbinf-02-786898-g009.tif"/>
</fig>
<p>The JT module in DJEC DB provides a workflow to associate junction expression with user-provided molecular or clinical traits. In the case of the <italic>CTNND1</italic> splicing event, we found significant associations between the expression of exon 20 inclusion junctions and <italic>TP53</italic> mutation status in BRCA, as well as with amplification of the <italic>CCND1</italic> gene and epigenetic silencing of <italic>CDKN2A</italic> in STES (<xref ref-type="sec" rid="s9">Supplementary Figure S16</xref>). This is consistent with previous studies indicating that <italic>CTNND1</italic> isoform expression regulates cell proliferation and cell cycle progression by controlling the levels of cyclin proteins in cancer cells (<xref ref-type="bibr" rid="B7">Chartier et&#x20;al., 2007</xref>; <xref ref-type="bibr" rid="B20">Jiang et&#x20;al., 2012</xref>; <xref ref-type="bibr" rid="B33">Liu et&#x20;al., 2014</xref>).</p>
<p>Taken together, these data corroborate DJEC DB as a valuable bioinformatics resource for the exploration and visualization of differential junction expression, as well as for the interrogation of physiologically relevant junction-trait associations in the context of global splicing analysis in cancer cell lines and tumor tissue.</p>
</sec>
</sec>
<sec sec-type="discussion" id="s4">
<title>Discussion</title>
<p>With the increasing availability of NGS data sets, transcriptome-wide alternative splicing analysis has become common rather than exceptional in disease research. Nevertheless, computational analysis pipelines that allow the broad research community to effortlessly interrogate alternative splicing phenotypes are largely missing.</p>
<p>Our custom pipeline, <italic>DJExpress</italic>, aims to address this issue. With <italic>DJExpress</italic>, we have incorporated multiple existing algorithms in a novel computational approach for differential splicing analysis, which is suitable for analysis of small-scale as well as large-scale splice junction datasets. Moreover, <italic>DJExpress</italic> allows the analysis of millions of exon-exon boundaries per sample, using <italic>limma&#x2019;s</italic> statistical framework. <italic>Limma&#x2019;s</italic> algorithm has been shown to be highly accurate for gene expression analysis (<xref ref-type="bibr" rid="B26">Law et&#x20;al., 2014</xref>; <xref ref-type="bibr" rid="B11">Corchete et&#x20;al., 2020</xref>; <xref ref-type="bibr" rid="B17">Gerard, 2020</xref>), although a comprehensive analysis of accuracy for splicing is beyond the scope of this work and remains a future direction. Nevertheless, the implementation of <italic>limma</italic> methodology proved to be highly flexible. This is not only the case in terms of model specification (any contrast in a linear model including the use of continuous as well as categorical predictors can be related to differential junction expression) but also for the various parameters introduced into the fit model, including posterior variance estimators, observation weights and variance modelling. These features, together with <italic>limma&#x2019;s</italic> additional data pre-processing methods such as variance stabilization, all help to improve inference of differential junction expression.</p>
<p>Importantly, and similar to gene expression studies (<xref ref-type="bibr" rid="B86">Peixoto et&#x20;al., 2015</xref>), removing or accounting for both known and unknown confounding factors (e.g., technical biases such as batch effects, or population structure such as molecular or clinical subtypes) is crucial when analyzing alternative splicing phenotypes in RNA-Seq data sets (<xref ref-type="bibr" rid="B63">Slaff et&#x20;al., 2021</xref>). Confounding factors can greatly increase the numbers of false positives and negatives, which ultimately will affect interpretation of potential biological relationships. Thus, users should test for potential known confounder effects in their data, for example by using PCA or UMAP plots, and use dedicated tools such as limma, ComBat, RUV, SVA and MOCCASIN to correct for confounders (<xref ref-type="bibr" rid="B27">Leek, 2014</xref>; <xref ref-type="bibr" rid="B52">Risso et&#x20;al., 2014</xref>; <xref ref-type="bibr" rid="B83">Zhang et&#x20;al., 2020</xref>; <xref ref-type="bibr" rid="B63">Slaff et&#x20;al., 2021</xref>).</p>
<p>Apart from these statistical aspects, <italic>DJExpress</italic> provides a comprehensive framework to graphically summarize differential splicing. The adapted <italic>limma</italic>-based visualization approach allows inspection of alternative splicing not only at the level of individual junction loci, but also in the presence of more complex splicing patterns. These can involve simultaneous changes in the expression of multiple junctions across the entire gene. This is particularly advantageous, considering that existing splicing analysis tools are either focused on the definition of local alternative splicing events which can be both simple (exon skipping, alternative 3&#x2032; or 5&#x2032; splice sites, etc.) or complex (simultaneous occurrence of multiple splice events in a given mRNA), or only allow detection of known transcript isoforms. Thus, most previous tools disregard the simultaneous visual representation of the full spectrum of up- and down-regulated splicing patterns in a gene that is retrieved through junction quantification. Broadly used exceptions are LeafCutter (<xref ref-type="bibr" rid="B31">Li et&#x20;al., 2018</xref>) and MAJIQ (<xref ref-type="bibr" rid="B71">Vaquero-Garcia et&#x20;al., 2016</xref>), which can both also represent complex splicing changes across the entire&#x20;mRNA.</p>
<p>Notably, the differential junction usage analysis by <italic>DJExpress</italic> does not allow a direct assessment of intron retention events, which require intron and intron-exon junction read counts for their quantification. Nevertheless, dedicated tools such as MAJIQ (<xref ref-type="bibr" rid="B71">Vaquero-Garcia et&#x20;al., 2016</xref>), IRFinder (<xref ref-type="bibr" rid="B38">Middleton et&#x20;al., 2017</xref>), iREAD (<xref ref-type="bibr" rid="B29">Li et&#x20;al., 2020</xref>) or S-IRFinder (<xref ref-type="bibr" rid="B6">Broseus and Ritchie, 2020</xref>) are specifically designed for quantification of intron retention events and are thus well-suited for this specific type of analysis.</p>
<p>Recently, RNA-seq data from TCGA and GTEx were integrated within a large transcriptomic profiling workflow, including splicing quantification of more than 20,000 human normal and tumor tissue samples (<xref ref-type="bibr" rid="B22">Kahles et&#x20;al., 2018</xref>). Although this study provided unified splicing data across healthy and tumor tissue, the analysis is based on the construction of complex splicing graphs across thousands of samples and genes which are difficult to access and interpret. Furthermore, approaches to explore the data in a graphically visualized format were not within the scope of this previous study. This limited the availability and accessibility of these data for the general research community as well as the feasibility of splicing-trait association analyses using genomic, epigenetic, and clinical records available within the TCGA repository. These points are addressed by <italic>DJExpress</italic> and DJEC DB, which facilitate easy access, analysis and visualization of cancer splicing data. Moreover, by providing a simple analysis workflow for custom data sets, our pipeline is not restricted to cancer researchers but can be used to pursue a broad variety of alternative splicing-related scientific questions.</p>
<p>In conjunction with the usability of <italic>DJExpress</italic> for differential splicing analysis and visualization using custom RNA-Seq data, the multidimensional integration of cancer data within DJEC DB represents a comprehensive resource of cancer-specific splicing signatures and junction-trait associations. We demonstrated that our pipeline has the potential to unveil novel splicing-related molecular signatures, which may contribute to improved patient stratification and more effective cancer treatment strategies. Moreover, the integration of DepMap data allows association of junction expression with molecular features such as gene dependencies and drug response profiles. This will help researchers to identify cancer cell models for specific splicing alterations that can then be used for functional characterization in the&#x20;lab.</p>
<p>Another recently established cancer splicing repository, RJunBase (<xref ref-type="bibr" rid="B30">Li et&#x20;al., 2021</xref>), follows a splicing analysis strategy similar to that of DJEC DB. While focusing on back-splice and fusion junctions, RJunBase provides splicing patterns at junction level and median junction expression information in GTEx and TCGA samples. However, it lacks differential junction expression analyses between cancer and healthy tissue and does not include association of splice events with molecular or clinical data. Thus, compared to RJunBase, DJEC DB not only includes differential junction expression analyses but also provides functional associations of splicing changes with phenotypic traits. These features make DJEC DB a comprehensive database that can facilitate the discovery of novel cancer-related aberrant splicing patterns with potential phenotypic consequences.</p>
<p>Taken together, <italic>DJExpress</italic> provides researchers with a comprehensive toolbox for exploration of alternative splicing phenotypes in health and disease, and, with DJEC DB, includes multi-level data of alternative splicing signatures in healthy tissue, tumors and cancer cell&#x20;lines.</p>
</sec>
</body>
<back>
<sec id="s5">
<title>Data Availability Statement</title>
<p>GTEx and TCGA raw junction counts were provided by Dr. Andre Kahles (Biomedical Informatics Group, Department of Computer Science, ETH Z&#xfc;rich). All TCGA molecular and clinical data sets used in this study are publicly available and can be found here: <ext-link ext-link-type="uri" xlink:href="https://portal.gdc.cancer.gov/">https://portal.gdc.cancer.gov/</ext-link>. All cell line functional genomics data used in this study are publicly available and can be found here: <ext-link ext-link-type="uri" xlink:href="https://depmap.org/portal/download/">https://depmap.org/portal/download/</ext-link>. All raw RNA-Seq data files of cell lines from CCLE are available through the Sequence Read Archive under accession number PRJNA523380. All additional data and code are available from the authors upon reasonable request. The <italic>DJExpress</italic> R package is available at <ext-link ext-link-type="uri" xlink:href="https://github.com/MauerLab/DJExpress">https://github.com/MauerLab/DJExpress</ext-link>. The DJEC DB database is available at <ext-link ext-link-type="uri" xlink:href="https://gitlab.com/mauerlabrsc/djecdb">https://gitlab.com/mauerlabrsc/djecdb</ext-link>.</p>
</sec>
<sec id="s6">
<title>Author Contributions</title>
<p>JM conceived the study; LMG-P wrote the code and ran the <italic>in-silico</italic> analyses; LMG-P and JM wrote the manuscript.</p>
</sec>
<sec id="s7">
<title>Funding</title>
<p>This work was supported by Merck KGaA, Darmstadt, Germany (CrossRef Funder ID: 10.13039/100009945).</p>
</sec>
<sec sec-type="COI-statement" id="s8">
<title>Conflict of Interest</title>
<p>LMG-P and JM are employees of BioMed X Institute (GmbH), Heidelberg, Germany. Merck KGaA had no part in the study design or in the collection, analysis, and interpretation of the results but provided feedback regarding the general research strategy.</p>
</sec>
<sec sec-type="disclaimer" id="s9">
<title>Publisher&#x2019;s Note</title>
<p>All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.</p>
</sec>
<ack>
<p>We thank all members of the Mauer laboratory for support. We thank Arne Knudsen for testing the <italic>DJExpress</italic> package and for critical feedback. We also would like to thank Edith Ross, Juliane Braun and Christina Esdar (Merck KGaA) for constructive feedback and helpful discussion. <xref ref-type="fig" rid="F4">Figure&#x20;4</xref> was created using images from iStock (<ext-link ext-link-type="uri" xlink:href="https://www.istockphoto.com">https://www.istockphoto.com</ext-link>) under standard license.</p>
</ack>
<sec id="s10">
<title>Supplementary Material</title>
<p>The Supplementary Material for this article can be found online at: <ext-link ext-link-type="uri" xlink:href="https://www.frontiersin.org/articles/10.3389/fbinf.2022.786898/full#supplementary-material">https://www.frontiersin.org/articles/10.3389/fbinf.2022.786898/full&#x23;supplementary-material</ext-link>
</p>
<supplementary-material>
<label>Supplementary Figure&#x20;1</label>
<caption>
<p>Performance evaluation of DJE module. Median <bold>(A)</bold> and log2 median <bold>(B)</bold> process time following 10 repetitions of data import (<italic>DJEimport</italic>), junction annotation (<italic>DJEannotate</italic>), expression filtering (<italic>DJEprepare</italic>), normalization and differential junction expression analysis (<italic>DJEanalyze</italic>) within the DJE module of <italic>DJExpress.</italic> <bold>(C)</bold> Median memory consumption (in bytes) of the entire DJE module. Error bars represent standard deviations. Default settings with increasing sample size and random relative group sizes are used in the analysis.</p>
</caption>
</supplementary-material>
<supplementary-material>
<label>Supplementary Figure&#x20;2</label>
<caption>
<p>Interactive DJE visualization in tumors using DJEC DB. <bold>(A)</bold> Start interface of the DJE section in DJEC DB. Panel 1 highlights the selection option section. Users can define the TCGA tumor type and the significance cutoff for differential junction usage based on minimal &#x7c;logFC&#x7c; and FDR values. Panel 2 shows the downloadable summary statistics table for junctions passing the selected cutoff. Here, users can filter junctions by browsing specific gene IDs, junction IDs or genomic coordinates. After a target junction is selected by clicking on it in the table, the gene-wise splice plot and the junction within its domain-annotated gene model context (Panels 3 and 4, respectively) can be interactively visualized. Hovering over each junction in the gene-wise splice plot displays a box with summarized DJE information, including relative and absolute logFC values, FDR values and expression group of the selected junction. Colors within exonic regions in the gene model plot indicate the presence of protein domains and/or post-translational modifications (PTMs). The position of the selected junction within the gene model plot is indicated by a dashed arc whose color corresponds to the type of differential expression (blue for downregulation and red for upregulation). Specific regions within the gene model plot (e.g., position of the selected junction) can be further explored by cursor selection, which displays a zoomed view of the selected gene region. <bold>(B)</bold> A KIF13A exon inclusion event in the BRCA TCGA cohort is used as an example. The significance cutoff was set to &#x7c;logFC&#x7c; &#x3e; 2.0 and an FDR cutoff of 0.05. The two exon inclusion junctions are shown in red within the gene-wise splice plot, and the gene model plot indicates the position of the selected junction, which lies close to an annotated phosphorylation site of the protein.</p>
</caption>
</supplementary-material>
<supplementary-material>
<label>Supplementary Figure&#x20;3</label>
<caption>
<p>Visualization of the JT section within DJEC DB. This section contains the results of the junction-trait association analyses using ANOVA and linear models from <italic>Matrix eQTL</italic> methods (<xref ref-type="bibr" rid="B59">Shabalin, 2012</xref>). Differentially expressed junctions within each TCGA tumor type were associated with microsatellite instability (MSI) or altered oncogenic signaling pathways based on mutations, copy-number changes (CNV), mRNA expression, gene fusions and DNA methylation (<xref ref-type="bibr" rid="B55">Sanchez-Vega et&#x20;al., 2018</xref>). Users can select the tissue of interest, as well as the trait with which junction expression is associated (Panel 1). A downloadable summary statistics table is displayed (Panel 2), where specific genes, junctions, genomic coordinates or traits can be browsed. When a specific association is selected from the table, interactive junction-trait association boxplots are displayed (Panel 3), and hovering over them shows summarized statistics of the analysis. The image shows the example of the association between a differentially expressed junction in the transcript of S100 Calcium Binding Protein A14 (<italic>S100A14</italic>) and MSI, with high levels of MSI (MSI-H) in tumors (violet) being associated with significantly higher inclusion levels of the junction than low levels of MSI (MSI-L) (red) and microsatellite stable (MSS) (blue) colorectal tumors.</p>
</caption>
</supplementary-material>
<supplementary-material>
<label>Supplementary Figure&#x20;4</label>
<caption>
<p>Junction Co-expression Network Analysis (JCNA) of TCGA COADREAD in DJEC DB. <bold>(A)</bold> The JCNA section comprises the results of the junction co-expression analysis across the 372 samples from the TCGA COADREAD tumor type. A total of 7,404 junctions were clustered into 36 expression modules. The dendrogram of clustered junctions is displayed (panel 2), where each branch in the figure represents one junction, and every color below represents one co-expression module. The heatmap of module-trait associations (panel 3) based on correlation coefficients between junction modules and traits is also shown (blue and red indicate positive and negative correlations respectively). Traits are on the x-axis and junction modules with their respective assigned letter and color are on the y-axis. Traits analyzed include microsatellite instability (MSI), BRAF, KRAS, EGFR and TP53 mutation status, tumor stage and 6&#x20;co-expression modules of splicing factors calculated for COADREAD samples (SFG1-6). <bold>(B)</bold> An interactive scatter diagram of module membership vs. junction significance is shown when users select specific traits and modules within the selection options section (panel 1). <bold>(C)</bold> For the selected module, an interactive junction network is also displayed. Each node in the network represents a single junction. Junctions are colored based on gene ID. Users can select target genes within the network to highlight their respective junctions (e.g., EDEM2 junctions in the zoomed image).</p>
</caption>
</supplementary-material>
<supplementary-material>
<label>Supplementary Figure&#x20;5</label>
<caption>
<p>Visualization of junction-trait associations using DepMap gene dependencies within the JT-CCLE section in DJEC DB. This section contains the results of the junction-trait correlation analyses using junction expression and genome-wide gene dependency screens in cancer cell lines. Users can select the tissue of interest, as well as the absolute correlation coefficient cutoff to be used for SpliceRadar visualization (panel 1). A downloadable correlation matrix is displayed (panel 2), where specific genes, junctions, genomic coordinates or traits can be browsed. When specific junctions are selected (maximum 3) from the table, interactive SplicePlots with the top 50&#x20;junction-dependency correlations are displayed (panel 3). An example of significant associations between <italic>MYO18A</italic> exon 40 expression and gene dependencies in lymphoma cell lines is&#x20;shown.</p>
</caption>
</supplementary-material>
<supplementary-material>
<label>Supplementary Figure&#x20;6</label>
<caption>
<p>Illustration of known alternative splicing in cancer using DJEC DB. <bold>(A)</bold> Cancer-specific inclusion of exon 11 in <italic>EXOC1</italic> involving differentially used junctions 11, 12 and 13. The alternative splicing event occurs within the C-terminal Sec3_C domain (pink) and adjacent to several phosphorylation sites (brown) as depicted by the domain-annotated gene model plot. <bold>(B)</bold> Exon 20 inclusion event in <italic>CTNND1</italic>, involving junctions 20 and 23. This exon localizes at the C-terminal domain of <italic>CTNND1</italic> and in the vicinity of several phosphorylation sites as indicated in the gene model plot. <bold>(C)</bold> Differentially used junctions are depicted within the gene-wise splice plot in <italic>CD44</italic> (downregulated junction indicating the exclusion of the variable region and upregulated junctions indicating the inclusion of exons 7&#x2013;14 within the variable region). The gene model plot with Prot2HG-based domain annotation indicates that the variable region in <italic>CD44</italic> corresponds to the proteolytically cleavable extracellular Stem domain (dark gold) as previously described. For differential junction expression in <italic>EXOC1</italic>, <italic>CTNND1</italic> and <italic>CD44</italic>, colon, pancreatic and breast cancer cell lines are shown as examples, respectively. (Numbers on the x-axis in the upper panels indicate the first, last and differentially used junctions in the respective gene. The grey area indicates the threshold for significance (&#x7c;logFC&#x7c; &#x3e; 1.0). Downregulated and upregulated junctions with &#x7c;logFC&#x7c; above threshold and significant FDR (&#x3c; 0.05) are shown in blue and red, respectively. These same junctions are indicated within the gene model plots as dashed arcs connecting upstream and downstream exons. 
Colors within exonic regions indicate the presence of protein domains and/or post-translational modifications (PTMs) annotated within the Prot2HG protein domain database. Arrows below gene model plots indicate direction of transcription. Coding and UTR exons are illustrated as long and short exons respectively. Junctions with both absolute and relative logFC above the threshold (&#x7c;logFC&#x7c; &#x3e; 1.0) but no significant FDR (&#x3e; 0.05) for at least one of them are shown in black. Junctions with either relative or absolute logFC below the indicated threshold are shown in grey).</p>
</caption>
</supplementary-material>
<supplementary-material>
<label>Supplementary Figure&#x20;7</label>
<caption>
<p>Example of a local complex event in the <italic>MARK3</italic> transcript in several cancer types. <bold>(A)</bold> Differentially used junctions as depicted in the gene-wise splice plot and gene model plot in <italic>MARK3</italic> indicate the presence of a splicing event involving several co-regulated junctions between exons 15&#x2013;18 (the event corresponds to a double exon skipping event, where several exon-exon junctions, including an alternative 3&#x2032; splice site event, are downregulated). The CCLE breast cancer vs. fibroblast cell line analysis is used as an example. (Numbers on the x-axis in the upper panels indicate the first, last and differentially used junctions in the respective gene. The grey area indicates the threshold for significance (&#x7c;logFC&#x7c; &#x3e; 1.0). Downregulated and upregulated junctions with &#x7c;logFC&#x7c; above threshold and significant FDR (&#x3c;0.05) are shown in blue and red, respectively. These same junctions are indicated within the gene model plots as dashed arcs connecting upstream and downstream exons. Colors within exonic regions indicate the presence of protein domains and/or post-translational modifications (PTMs) annotated within the Prot2HG protein domain database. Arrows below gene model plots indicate direction of transcription. Coding and UTR exons are illustrated as long and short exons respectively. Junctions with both absolute and relative logFC above the threshold (&#x7c;logFC&#x7c; &#x3e; 1.0) but no significant FDR (&#x3e;0.05) for at least one of them are shown in black). <bold>(B)</bold> The <italic>DJEplotSplice</italic> function in <italic>DJExpress</italic> allows the alternative interactive visualization of all junctions found for a target gene within the original junction quantification data, including those removed after coverage filtering. The full gene-wise plot of <italic>MARK3</italic> reveals the presence of 1084 junctions detected across all analyzed samples. 
Junctions filtered out for differential analysis based on user-defined expression cutoffs are shown in light grey. The <italic>DJEplotSplice</italic> output offers additional read coverage information across the gene using the loess fit of median junction read count (blue line) as readout. Numbers on the x-axis of the read coverage plot indicate genomic coordinates of the <italic>MARK3</italic> gene structure.</p>
</caption>
</supplementary-material>
<supplementary-material>
<label>Supplementary Figure&#x20;8</label>
<caption>
<p>Examples of isoform switches detected by <italic>DJExpress</italic> in cancer cell lines. Visualization of differentially used junctions within gene-wise splice plots and gene model plots reveals cases of upregulation and downregulation of specific transcript isoforms. <bold>(A)</bold> The <italic>INPP5B</italic> gene-wise splice plot in pancreatic cancer cell lines indicates the presence of one upregulated junction and a series of consecutive downregulated junctions at the 5&#x2032; region of the gene. When compared to the transcript isoform annotation for <italic>INPP5B</italic>, this pattern is indicative of downregulation of the long <italic>INPP5B</italic> isoform (bottom right) containing five additional exons at the 5&#x2032; region that correspond to the Type II inositol 1,4,5-trisphosphate 5-phosphatase PH protein domain (INPP5B_PH) (green), while the short isoform (top right) containing an alternative first exon downstream of the INPP5B_PH domain appears upregulated. <bold>(B)</bold> An <italic>RGS3</italic> isoform switch is also observed in breast, colon, gastric, lung, ovarian and pancreatic cancers. The series of upregulated junctions belongs to a long isoform version of <italic>RGS3</italic>, while downregulated junctions correspond to a shorter transcript variant with an alternative downstream promoter. This short isoform shares its second and third exon with the long isoform but differs in four downstream exons containing the Regulator of G protein Signaling (RGS_RGS3) (brown) protein domain. The <italic>RGS3</italic> gene-wise splice plot in gastric cell lines is shown as an example (Numbers on the <italic>x</italic>-axis in the upper panels indicate the first, last and differentially used junctions in the respective gene. The grey area indicates the threshold for significance (&#x7c;logFC&#x7c; &#x3e; 1.0). Downregulated and upregulated junctions with &#x7c;logFC&#x7c; above threshold and significant FDR (&#x3c;0.05) are shown in blue and red, respectively. 
These same junctions are indicated within the gene model plots as dashed arcs connecting upstream and downstream exons. Colors within exonic regions indicate the presence of protein domains and/or post-translational modifications (PTMs) annotated within the Prot2HG protein domain database. Arrows below gene model plots indicate direction of transcription. Coding and UTR exons are illustrated as long and short exons respectively. Junctions with both absolute and relative logFC above the threshold (&#x7c;logFC&#x7c; &#x3e; 1.0) but no significant FDR (&#x3e;0.05) for at least one of them are shown in black. Junctions with either relative or absolute logFC below the indicated threshold are shown in grey).</p>
</caption>
</supplementary-material>
<supplementary-material>
<label>Supplementary Figure&#x20;9</label>
<caption>
<p>Examples of alternative splicing event types identified by <italic>DJExpress</italic>. Differentially used non-annotated junctions are representative of different types of alternative splicing events. <bold>(A)</bold> The <italic>XRCC6</italic> gene-wise splice plot in breast cancer cell lines indicates the presence of an alternative 3&#x2032; splice site (A3&#x2032;SS) in exon 6. This event occurs within the Von Willebrand factor type A protein domain (vWA_ku) (pink) known to be involved in protein-protein interactions. <bold>(B)</bold> An alternative first exon (AFE) event is detected in <italic>BIN1</italic> in lymphoma cell lines. The downregulated first exon is known to contain a region required for interaction with <italic>BIN2</italic> (orange). <bold>(C)</bold> Detection of an alternative 5&#x2032; splice site (A5&#x2032;SS) involving the first exon of <italic>LDLRAP1</italic> in myeloma. <bold>(D)</bold> The upregulated junction in <italic>C11orf58</italic> in brain cancer cell lines indicates the presence of both an alternative 5&#x2032; splice site (A5&#x2032;SS) and an alternative 3&#x2032; splice site (A3&#x2032;SS) in exons 2 and 3, respectively, which occur inside the region corresponding to the Small acidic protein family (SAMP) domain (pink) (Numbers on the <italic>x</italic>-axis in the upper panels indicate the first, last and differentially used junctions in the respective gene. The grey area indicates the threshold for significance (&#x7c;logFC&#x7c; &#x3e; 1.0). Downregulated and upregulated junctions with &#x7c;logFC&#x7c; above threshold and significant FDR (&#x3c;0.05) are shown in blue and red, respectively. These same junctions are indicated within the gene model plots as dashed arcs connecting upstream and downstream exons. Colors within exonic regions indicate the presence of protein domains and/or post-translational modifications (PTMs) annotated within the Prot2HG protein domain database. 
Arrows below gene model plots indicate direction of transcription. Coding and UTR exons are illustrated as long and short exons respectively. Junctions with both absolute and relative logFC above the threshold (&#x7c;logFC&#x7c; &#x3e; 1.0) but no significant FDR (&#x3e;0.05) for at least one of them are shown in black).</p>
</caption>
</supplementary-material>
<supplementary-material>
<label>Supplementary Figure&#x20;10</label>
<caption>
<p>Example of a differentially spliced non-annotated exon in cancer cell lines. Differentially expressed non-annotated junctions indicate the presence of an exon inclusion event (junctions 18&#x2013;20) between exons 17 and 18 involving the actin-binding module (I_LWEQ) (violet) in <italic>TLN1</italic> as observed in the domain-annotated gene model plot. <italic>TLN1</italic> plots in breast cancer cell lines are used as an example (Numbers on the <italic>x</italic>-axis in the upper panels indicate the first, last and differentially used junctions in the respective gene. The grey area indicates the threshold for significance (&#x7c;logFC&#x7c; &#x3e; 1.0). Downregulated and upregulated junctions with &#x7c;logFC&#x7c; above threshold and significant FDR (&#x3c;0.05) are shown in blue and red, respectively. These same junctions are indicated within the gene model plots as dashed arcs connecting upstream and downstream exons. Colors within exonic regions indicate the presence of protein domains and/or post-translational modifications (PTMs) annotated within the Prot2HG protein domain database. Arrows below gene model plots indicate direction of transcription. Coding and UTR exons are illustrated as long and short exons respectively. Junctions with both absolute and relative logFC above the threshold (&#x7c;logFC&#x7c; &#x3e; 1.0) but no significant FDR (&#x3e;0.05) for at least one of them are shown in black).</p>
</caption>
</supplementary-material>
<supplementary-material>
<label>Supplementary Figure&#x20;11</label>
<caption>
<p>SpliceRadar plot of top associations between <italic>NUMB</italic> alternative splicing and drug treatment response in lung cancer. Expression of splice junctions involved in the exon inclusion event of <italic>NUMB</italic> was correlated with cell survival rates after drug treatment using DepMap drug screen data in lung cancer cell lines. The top-ranked correlation coefficients (FDR &#x3c; 0.05 and &#x7c;rho&#x7c; &#x3e; 0.2) were used to construct the SpliceRadar plot. A general trend of anti-correlation between inclusion (red and dark red) and exclusion (blue) junctions is observed. Boxes indicate drugs targeting PI3K/mTOR and ERK MAPK signaling.</p>
</caption>
</supplementary-material>
<supplementary-material>
<label>Supplementary Figure&#x20;12</label>
<caption>
<p>DJE section of DJEC DB showing the summary statistics table, gene-wise splice plot and gene model plot of <italic>NUMB</italic> in TCGA BRCA. The two upregulated junctions indicating the inclusion of exon 12 in <italic>NUMB</italic> are shown in red within the gene-wise splice plot and the selected junction in the summary statistics table is also highlighted within the gene model plot (Panel 1 highlights the selection option section. Panel 2 contains the summary statistics table. Panels 3 and 4 show the gene-wise splice plot and the domain-annotated gene model plot, respectively).</p>
</caption>
</supplementary-material>
<supplementary-material>
<label>Supplementary Figure&#x20;13</label>
<caption>
<p>Downregulation of exon 19 in <italic>VCL</italic> illustrated by the DJE section in DJEC DB. Exon inclusion junctions are shown in blue within the gene-wise splice plot and the selected downregulated junction in the summary statistics table is also shown within the gene model plot. CESC TCGA results are shown as an example (Panel 1 highlights the selection option section. Panel 2 contains the summary statistics table. Panels 3 and 4 show the gene-wise splice plot and the domain-annotated gene model plot, respectively).</p>
</caption>
</supplementary-material>
<supplementary-material>
<label>Supplementary Figure&#x20;14</label>
<caption>
<p>Cancer-specific upregulation of exon 12 in <italic>MAP3K7</italic> as shown in DJEC DB. Exon inclusion and exclusion junctions are highlighted in red and blue respectively within the gene-wise splice plot. The selected upregulated junction in the summary statistics table is illustrated within the gene model plot. COADREAD TCGA results are shown as an example (Panel 1 highlights the selection option section. Panel 2 contains the summary statistics table. Panels 3 and 4 show the gene-wise splice plot and the domain-annotated gene model plot, respectively).</p>
</caption>
</supplementary-material>
<supplementary-material>
<label>Supplementary Figure&#x20;15</label>
<caption>
<p>Cancer-specific alternative splicing in <italic>EXOC1</italic> as shown in DJEC DB. Junctions indicating the upregulation of exon 11 in <italic>EXOC1</italic> are shown in red within the gene-wise splice plot. The selected upregulated junction in the summary statistics table is illustrated within the gene model plot. LUAD TCGA results are shown as an example (Panel 1 highlights the selection option section. Panel 2 contains the summary statistics table. Panels 3 and 4 show the gene-wise splice plot and the domain-annotated gene model plot, respectively).</p>
</caption>
</supplementary-material>
<supplementary-material>
<label>Supplementary Figure&#x20;16</label>
<caption>
<p>Significant associations, identified using <italic>Matrix eQTL</italic> methods, between the <italic>CTNND1</italic> exon 20 inclusion event and genomic alterations in TCGA are shown within the JT section of DJEC DB. Selecting &#x201c;Associations with Genomic Alterations&#x201d; and the &#x201c;BRCA&#x201d; tumor type within the selection panel (Panel 1), followed by browsing for the &#x201c;CTNND1&#x201d; gene ID within the summary statistics table (Panel 2), displays the significant association with <italic>TP53</italic> mutation. Box plots show decreased exon junction expression in the presence of <italic>TP53</italic> mutation (MUT), compared to wild-type (WT) tumor samples (Panel 3). Amplification of the <italic>CCND1</italic> gene and epigenetic silencing of <italic>CDKN2A</italic> are also significantly associated with the <italic>CTNND1</italic> alternative splicing event in TCGA STES (Panel&#x20;4).</p>
</caption>
</supplementary-material>
<supplementary-material xlink:href="DataSheet1.zip" id="SM1" mimetype="application/zip" xmlns:xlink="http://www.w3.org/1999/xlink"/>
</sec>
<fn-group>
<fn id="fn1">
<label>1</label>
<p>
<ext-link ext-link-type="uri" xlink:href="https://depmap.org/">https://depmap.org/</ext-link>.
</p>
</fn>
</fn-group>
<ref-list>
<title>References</title>
<ref id="B1">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Alamancos</surname>
<given-names>G. P.</given-names>
</name>
<name>
<surname>Pag&#xe8;s</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Trincado</surname>
<given-names>J.&#x20;L.</given-names>
</name>
<name>
<surname>Bellora</surname>
<given-names>N.</given-names>
</name>
<name>
<surname>Eyras</surname>
<given-names>E.</given-names>
</name>
</person-group> (<year>2015</year>). <article-title>Leveraging Transcript Quantification for Fast Computation of Alternative Splicing Profiles</article-title>. <source>RNA</source> <volume>21</volume>, <fpage>1521</fpage>&#x2013;<lpage>1531</lpage>. <pub-id pub-id-type="doi">10.1261/rna.051557.115</pub-id> </citation>
</ref>
<ref id="B2">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Barbosa-Morais</surname>
<given-names>N. L.</given-names>
</name>
<name>
<surname>Irimia</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Pan</surname>
<given-names>Q.</given-names>
</name>
<name>
<surname>Xiong</surname>
<given-names>H. Y.</given-names>
</name>
<name>
<surname>Gueroussov</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Lee</surname>
<given-names>L. J.</given-names>
</name>
<etal/>
</person-group> (<year>2012</year>). <article-title>The Evolutionary Landscape of Alternative Splicing in Vertebrate Species</article-title>. <source>Science</source> <volume>338</volume>, <fpage>1587</fpage>&#x2013;<lpage>1593</lpage>. <pub-id pub-id-type="doi">10.1126/science.1230612</pub-id> </citation>
</ref>
<ref id="B3">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Bechara</surname>
<given-names>E. G.</given-names>
</name>
<name>
<surname>Sebesty&#xe9;n</surname>
<given-names>E.</given-names>
</name>
<name>
<surname>Bernardis</surname>
<given-names>I.</given-names>
</name>
<name>
<surname>Eyras</surname>
<given-names>E.</given-names>
</name>
<name>
<surname>Valc&#xe1;rcel</surname>
<given-names>J.</given-names>
</name>
</person-group> (<year>2013</year>). <article-title>RBM5, 6, and 10 Differentially Regulate NUMB Alternative Splicing to Control Cancer Cell Proliferation</article-title>. <source>Mol. Cell</source> <volume>52</volume>, <fpage>720</fpage>&#x2013;<lpage>733</lpage>. <pub-id pub-id-type="doi">10.1016/j.molcel.2013.11.010</pub-id> </citation>
</ref>
<ref id="B4">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Bielli</surname>
<given-names>P.</given-names>
</name>
<name>
<surname>Panzeri</surname>
<given-names>V.</given-names>
</name>
<name>
<surname>Lattanzio</surname>
<given-names>R.</given-names>
</name>
<name>
<surname>Mutascio</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Pieraccioli</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Volpe</surname>
<given-names>E.</given-names>
</name>
<etal/>
</person-group> (<year>2018</year>). <article-title>The Splicing Factor PTBP1 Promotes Expression of Oncogenic Splice Variants and Predicts Poor Prognosis in Patients with Non-muscle-invasive Bladder Cancer</article-title>. <source>Clin. Cancer Res.</source> <volume>24</volume>, <fpage>5422</fpage>&#x2013;<lpage>5432</lpage>. <pub-id pub-id-type="doi">10.1158/1078-0432.CCR-17-3850</pub-id> </citation>
</ref>
<ref id="B5">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Bray</surname>
<given-names>N. L.</given-names>
</name>
<name>
<surname>Pimentel</surname>
<given-names>H.</given-names>
</name>
<name>
<surname>Melsted</surname>
<given-names>P.</given-names>
</name>
<name>
<surname>Pachter</surname>
<given-names>L.</given-names>
</name>
</person-group> (<year>2016</year>). <article-title>Erratum: Near-Optimal Probabilistic RNA-Seq Quantification</article-title>. <source>Nat. Biotechnol.</source> <volume>34</volume>, <fpage>888</fpage>. <pub-id pub-id-type="doi">10.1038/nbt0816-888d</pub-id> </citation>
</ref>
<ref id="B6">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Broseus</surname>
<given-names>L.</given-names>
</name>
<name>
<surname>Ritchie</surname>
<given-names>W.</given-names>
</name>
</person-group> (<year>2020</year>). <article-title>S-IRFindeR: Stable and Accurate Measurement of Intron Retention</article-title>. <source>bioRxiv</source> <volume>0625</volume>, <fpage>164699</fpage>. <pub-id pub-id-type="doi">10.1101/2020.06.25.164699</pub-id> </citation>
</ref>
<ref id="B7">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Chartier</surname>
<given-names>N. T.</given-names>
</name>
<name>
<surname>Oddou</surname>
<given-names>C. I.</given-names>
</name>
<name>
<surname>Lain&#xe9;</surname>
<given-names>M. G.</given-names>
</name>
<name>
<surname>Ducarouge</surname>
<given-names>B.</given-names>
</name>
<name>
<surname>Marie</surname>
<given-names>C. A.</given-names>
</name>
<name>
<surname>Block</surname>
<given-names>M. R.</given-names>
</name>
<etal/>
</person-group> (<year>2007</year>). <article-title>Cyclin-dependent Kinase 2/cyclin E Complex Is Involved in P120 Catenin (P120ctn)-dependent Cell Growth Control: A New Role for P120ctn in Cancer</article-title>. <source>Cancer Res.</source> <volume>67</volume>, <fpage>9781</fpage>&#x2013;<lpage>9790</lpage>. <pub-id pub-id-type="doi">10.1158/0008-5472.CAN-07-0233</pub-id> </citation>
</ref>
<ref id="B8">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Chen</surname>
<given-names>C.</given-names>
</name>
<name>
<surname>Zhao</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Karnad</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Freeman</surname>
<given-names>J.&#x20;W.</given-names>
</name>
</person-group> (<year>2018</year>). <article-title>The Biology and Role of CD44 in Cancer Progression: Therapeutic Implications</article-title>. <source>J.&#x20;Hematol. Oncol.</source> <volume>11</volume>, <fpage>64</fpage>. <pub-id pub-id-type="doi">10.1186/s13045-018-0605-5</pub-id> </citation>
</ref>
<ref id="B9">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Chen</surname>
<given-names>H.</given-names>
</name>
<name>
<surname>Chen</surname>
<given-names>X.</given-names>
</name>
<name>
<surname>Ye</surname>
<given-names>F.</given-names>
</name>
<name>
<surname>Lu</surname>
<given-names>W.</given-names>
</name>
<name>
<surname>Xie</surname>
<given-names>X.</given-names>
</name>
</person-group> (<year>2009</year>). <article-title>Symmetric Division and Expression of its Regulatory Gene Numb in Human Cervical Squamous Carcinoma Cells</article-title>. <source>Pathobiology</source> <volume>76</volume>, <fpage>149</fpage>&#x2013;<lpage>154</lpage>. <pub-id pub-id-type="doi">10.1159/000209393</pub-id> </citation>
</ref>
<ref id="B10">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Chen</surname>
<given-names>K. L.</given-names>
</name>
<name>
<surname>Li</surname>
<given-names>D.</given-names>
</name>
<name>
<surname>Lu</surname>
<given-names>T. X.</given-names>
</name>
<name>
<surname>Chang</surname>
<given-names>S. W.</given-names>
</name>
</person-group> (<year>2020</year>). <article-title>Structural Characterization of the CD44 Stem Region for Standard and Cancer-Associated Isoforms</article-title>. <source>Int. J.&#x20;Mol. Sci.</source> <volume>21</volume>. <pub-id pub-id-type="doi">10.3390/ijms21010336</pub-id> </citation>
</ref>
<ref id="B11">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Corchete</surname>
<given-names>L. A.</given-names>
</name>
<name>
<surname>Rojas</surname>
<given-names>E. A.</given-names>
</name>
<name>
<surname>Alonso-L&#xf3;pez</surname>
<given-names>D.</given-names>
</name>
<name>
<surname>De Las Rivas</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Guti&#xe9;rrez</surname>
<given-names>N. C.</given-names>
</name>
<name>
<surname>Burguillo</surname>
<given-names>F. J.</given-names>
</name>
</person-group> (<year>2020</year>). <article-title>Systematic Comparison and Assessment of RNA-Seq Procedures for Gene Expression Quantitative Analysis</article-title>. <source>Sci. Rep.</source> <volume>10</volume>, <fpage>19737</fpage>. <pub-id pub-id-type="doi">10.1038/s41598-020-76881-x</pub-id> </citation>
</ref>
<ref id="B12">
<citation citation-type="web">
<person-group person-group-type="author">
<name>
<surname>DepMap 21Q3 Public</surname>
</name>
</person-group> (<year>2021</year>). <article-title>DepMap 21Q3 Public</article-title>. <comment>Available at: <ext-link ext-link-type="uri" xlink:href="https://figshare.com/articles/dataset/DepMap_21Q3_Public/15160110/2">https://figshare.com/articles/dataset/DepMap_21Q3_Public/15160110/2</ext-link> (Accessed August 18, 2021)</comment>. </citation>
</ref>
<ref id="B13">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Dobin</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Davis</surname>
<given-names>C. A.</given-names>
</name>
<name>
<surname>Schlesinger</surname>
<given-names>F.</given-names>
</name>
<name>
<surname>Drenkow</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Zaleski</surname>
<given-names>C.</given-names>
</name>
<name>
<surname>Jha</surname>
<given-names>S.</given-names>
</name>
<etal/>
</person-group> (<year>2013</year>). <article-title>STAR: Ultrafast Universal RNA-Seq Aligner</article-title>. <source>Bioinformatics</source> <volume>29</volume>, <fpage>15</fpage>&#x2013;<lpage>21</lpage>. <pub-id pub-id-type="doi">10.1093/bioinformatics/bts635</pub-id> </citation>
</ref>
<ref id="B14">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Emig</surname>
<given-names>D.</given-names>
</name>
<name>
<surname>Salomonis</surname>
<given-names>N.</given-names>
</name>
<name>
<surname>Baumbach</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Lengauer</surname>
<given-names>T.</given-names>
</name>
<name>
<surname>Conklin</surname>
<given-names>B. R.</given-names>
</name>
<name>
<surname>Albrecht</surname>
<given-names>M.</given-names>
</name>
</person-group> (<year>2010</year>). <article-title>AltAnalyze and DomainGraph: Analyzing and Visualizing Exon Expression Data</article-title>. <source>Nucleic Acids Res.</source> <volume>38</volume>, <fpage>W755</fpage>&#x2013;<lpage>W762</lpage>. <pub-id pub-id-type="doi">10.1093/nar/gkq405</pub-id> </citation>
</ref>
<ref id="B15">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Gallego-Paez</surname>
<given-names>L. M.</given-names>
</name>
<name>
<surname>Bordone</surname>
<given-names>M. C.</given-names>
</name>
<name>
<surname>Leote</surname>
<given-names>A. C.</given-names>
</name>
<name>
<surname>Saraiva-Agostinho</surname>
<given-names>N.</given-names>
</name>
<name>
<surname>Ascens&#xe3;o-Ferreira</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Barbosa-Morais</surname>
<given-names>N. L.</given-names>
</name>
</person-group> (<year>2017</year>). <article-title>Alternative Splicing: the Pledge, the Turn, and the Prestige: The Key Role of Alternative Splicing in Human Biological Systems</article-title>. <source>Hum. Genet.</source> <volume>136</volume>, <fpage>1015</fpage>&#x2013;<lpage>1042</lpage>. <pub-id pub-id-type="doi">10.1007/s00439-017-1790-y</pub-id> </citation>
</ref>
<ref id="B16">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Gardina</surname>
<given-names>P. J.</given-names>
</name>
<name>
<surname>Clark</surname>
<given-names>T. A.</given-names>
</name>
<name>
<surname>Shimada</surname>
<given-names>B.</given-names>
</name>
<name>
<surname>Staples</surname>
<given-names>M. K.</given-names>
</name>
<name>
<surname>Yang</surname>
<given-names>Q.</given-names>
</name>
<name>
<surname>Veitch</surname>
<given-names>J.</given-names>
</name>
<etal/>
</person-group> (<year>2006</year>). <article-title>Alternative Splicing and Differential Gene Expression in Colon Cancer Detected by a Whole Genome Exon Array</article-title>. <source>BMC Genomics</source> <volume>7</volume>, <fpage>325</fpage>. <pub-id pub-id-type="doi">10.1186/1471-2164-7-325</pub-id> </citation>
</ref>
<ref id="B17">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Gerard</surname>
<given-names>D.</given-names>
</name>
</person-group> (<year>2020</year>). <article-title>Data-based RNA-Seq Simulations by Binomial Thinning</article-title>. <source>BMC Bioinformatics</source> <volume>21</volume>, <fpage>206</fpage>. <pub-id pub-id-type="doi">10.1186/S12859-020-3450-9</pub-id> </citation>
</ref>
<ref id="B18">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Hu</surname>
<given-names>Z.</given-names>
</name>
<name>
<surname>Mellor</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Wu</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>DeLisi</surname>
<given-names>C.</given-names>
</name>
</person-group> (<year>2004</year>). <article-title>VisANT: An Online Visualization and Analysis Tool for Biological Interaction Data</article-title>. <source>BMC Bioinformatics</source> <volume>5</volume>, <fpage>17</fpage>&#x2013;<lpage>18</lpage>. <pub-id pub-id-type="doi">10.1186/1471-2105-5-17</pub-id> </citation>
</ref>
<ref id="B19">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Irimia</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Weatheritt</surname>
<given-names>R. J.</given-names>
</name>
<name>
<surname>Ellis</surname>
<given-names>J.&#x20;D.</given-names>
</name>
<name>
<surname>Parikshak</surname>
<given-names>N. N.</given-names>
</name>
<name>
<surname>Gonatopoulos-Pournatzis</surname>
<given-names>T.</given-names>
</name>
<name>
<surname>Babor</surname>
<given-names>M.</given-names>
</name>
<etal/>
</person-group> (<year>2014</year>). <article-title>A Highly Conserved Program of Neuronal Microexons Is Misregulated in Autistic Brains</article-title>. <source>Cell</source> <volume>159</volume>, <fpage>1511</fpage>&#x2013;<lpage>1523</lpage>. <pub-id pub-id-type="doi">10.1016/j.cell.2014.11.035</pub-id> </citation>
</ref>
<ref id="B20">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Jiang</surname>
<given-names>G.</given-names>
</name>
<name>
<surname>Wang</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Dai</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Liu</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Stoecker</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Wang</surname>
<given-names>E.</given-names>
</name>
<etal/>
</person-group> (<year>2012</year>). <article-title>P120-catenin Isoforms 1 and 3 Regulate Proliferation and Cell Cycle of Lung Cancer Cells via &#x3b2;-catenin and Kaiso Respectively</article-title>. <source>PLoS One</source> <volume>7</volume>, <fpage>e30303</fpage>. <pub-id pub-id-type="doi">10.1371/journal.pone.0030303</pub-id> </citation>
</ref>
<ref id="B21">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Jiang</surname>
<given-names>W.</given-names>
</name>
<name>
<surname>Chen</surname>
<given-names>L.</given-names>
</name>
</person-group> (<year>2021</year>). <article-title>Alternative Splicing: Human Disease and Quantitative Analysis from High-Throughput Sequencing</article-title>. <source>Comput. Struct. Biotechnol. J.</source> <volume>19</volume>, <fpage>183</fpage>&#x2013;<lpage>195</lpage>. <pub-id pub-id-type="doi">10.1016/j.csbj.2020.12.009</pub-id> </citation>
</ref>
<ref id="B22">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Kahles</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Lehmann</surname>
<given-names>K. V.</given-names>
</name>
<name>
<surname>Toussaint</surname>
<given-names>N. C.</given-names>
</name>
<name>
<surname>H&#xfc;ser</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Stark</surname>
<given-names>S. G.</given-names>
</name>
<name>
<surname>Sachsenberg</surname>
<given-names>T.</given-names>
</name>
<etal/>
</person-group> (<year>2018</year>). <article-title>Comprehensive Analysis of Alternative Splicing across Tumors from 8,705 Patients</article-title>. <source>Cancer Cell</source> <volume>34</volume>, <fpage>211</fpage>&#x2013;<lpage>e6</lpage>. <pub-id pub-id-type="doi">10.1016/j.ccell.2018.07.001</pub-id> </citation>
</ref>
<ref id="B23">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Kahles</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Ong</surname>
<given-names>C. S.</given-names>
</name>
<name>
<surname>Zhong</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>R&#xe4;tsch</surname>
<given-names>G.</given-names>
</name>
</person-group> (<year>2016</year>). <article-title>SplAdder: Identification, Quantification and Testing of Alternative Splicing Events from RNA-Seq Data</article-title>. <source>Bioinformatics</source> <volume>32</volume>, <fpage>1840</fpage>&#x2013;<lpage>1847</lpage>. <pub-id pub-id-type="doi">10.1093/bioinformatics/btw076</pub-id> </citation>
</ref>
<ref id="B24">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Katz</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Wang</surname>
<given-names>E. T.</given-names>
</name>
<name>
<surname>Airoldi</surname>
<given-names>E. M.</given-names>
</name>
<name>
<surname>Burge</surname>
<given-names>C. B.</given-names>
</name>
</person-group> (<year>2010</year>). <article-title>Analysis and Design of RNA Sequencing Experiments for Identifying Isoform Regulation</article-title>. <source>Nat. Methods</source> <volume>7</volume>, <fpage>1009</fpage>&#x2013;<lpage>1015</lpage>. <pub-id pub-id-type="doi">10.1038/nmeth.1528</pub-id> </citation>
</ref>
<ref id="B25">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Langfelder</surname>
<given-names>P.</given-names>
</name>
<name>
<surname>Horvath</surname>
<given-names>S.</given-names>
</name>
</person-group> (<year>2008</year>). <article-title>WGCNA: An R Package for Weighted Correlation Network Analysis</article-title>. <source>BMC Bioinformatics</source> <volume>9</volume>, <fpage>559</fpage>. <pub-id pub-id-type="doi">10.1186/1471-2105-9-559</pub-id> </citation>
</ref>
<ref id="B26">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Law</surname>
<given-names>C. W.</given-names>
</name>
<name>
<surname>Chen</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Shi</surname>
<given-names>W.</given-names>
</name>
<name>
<surname>Smyth</surname>
<given-names>G. K.</given-names>
</name>
</person-group> (<year>2014</year>). <article-title>Voom: Precision Weights Unlock Linear Model Analysis Tools for RNA-Seq Read Counts</article-title>. <source>Genome Biol.</source> <volume>15</volume>, <fpage>R29</fpage>&#x2013;<lpage>R17</lpage>. <pub-id pub-id-type="doi">10.1186/gb-2014-15-2-r29</pub-id> </citation>
</ref>
<ref id="B27">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Leek</surname>
<given-names>J.&#x20;T.</given-names>
</name>
</person-group> (<year>2014</year>). <article-title>Svaseq: Removing Batch Effects and Other Unwanted Noise from Sequencing Data</article-title>. <source>Nucleic Acids Res.</source> <volume>42</volume>, <fpage>e161</fpage>. <pub-id pub-id-type="doi">10.1093/NAR/GKU864</pub-id> </citation>
</ref>
<ref id="B28">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Li</surname>
<given-names>B.</given-names>
</name>
<name>
<surname>Dewey</surname>
<given-names>C. N.</given-names>
</name>
</person-group> (<year>2011</year>). <article-title>RSEM: Accurate Transcript Quantification from RNA-Seq Data with or without a Reference Genome</article-title>. <source>BMC Bioinformatics</source> <volume>12</volume>, <fpage>323</fpage>. <pub-id pub-id-type="doi">10.1186/1471-2105-12-323</pub-id> </citation>
</ref>
<ref id="B29">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Li</surname>
<given-names>H. D.</given-names>
</name>
<name>
<surname>Funk</surname>
<given-names>C. C.</given-names>
</name>
<name>
<surname>Price</surname>
<given-names>N. D.</given-names>
</name>
</person-group> (<year>2020</year>). <article-title>IREAD: A Tool for Intron Retention Detection from RNA-Seq Data</article-title>. <source>BMC Genomics</source> <volume>21</volume>, <fpage>128</fpage>. <pub-id pub-id-type="doi">10.1186/s12864-020-6541-0</pub-id> </citation>
</ref>
<ref id="B30">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Li</surname>
<given-names>Q.</given-names>
</name>
<name>
<surname>Lai</surname>
<given-names>H.</given-names>
</name>
<name>
<surname>Li</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Chen</surname>
<given-names>B.</given-names>
</name>
<name>
<surname>Chen</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Li</surname>
<given-names>Y.</given-names>
</name>
<etal/>
</person-group> (<year>2021</year>). <article-title>RJunBase: A Database of RNA Splice Junctions in Human Normal and Cancerous Tissues</article-title>. <source>Nucleic Acids Res.</source> <volume>49</volume>, <fpage>D201</fpage>&#x2013;<lpage>D211</lpage>. <pub-id pub-id-type="doi">10.1093/nar/gkaa1056</pub-id> </citation>
</ref>
<ref id="B31">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Li</surname>
<given-names>Y. I.</given-names>
</name>
<name>
<surname>Knowles</surname>
<given-names>D. A.</given-names>
</name>
<name>
<surname>Humphrey</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Barbeira</surname>
<given-names>A. N.</given-names>
</name>
<name>
<surname>Dickinson</surname>
<given-names>S. P.</given-names>
</name>
<name>
<surname>Im</surname>
<given-names>H. K.</given-names>
</name>
<etal/>
</person-group> (<year>2018</year>). <article-title>Annotation-free Quantification of RNA Splicing Using LeafCutter</article-title>. <source>Nat. Genet.</source> <volume>50</volume>, <fpage>151</fpage>&#x2013;<lpage>158</lpage>. <pub-id pub-id-type="doi">10.1038/s41588-017-0004-9</pub-id> </citation>
</ref>
<ref id="B32">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Liao</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Smyth</surname>
<given-names>G. K.</given-names>
</name>
<name>
<surname>Shi</surname>
<given-names>W.</given-names>
</name>
</person-group> (<year>2019</year>). <article-title>The R Package Rsubread Is Easier, Faster, Cheaper and Better for Alignment and Quantification of RNA Sequencing Reads</article-title>. <source>Nucleic Acids Res.</source> <volume>47</volume>, <fpage>e47</fpage>. <pub-id pub-id-type="doi">10.1093/nar/gkz114</pub-id> </citation>
</ref>
<ref id="B33">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Liu</surname>
<given-names>X.</given-names>
</name>
<name>
<surname>Caffrey</surname>
<given-names>T. C.</given-names>
</name>
<name>
<surname>Steele</surname>
<given-names>M. M.</given-names>
</name>
<name>
<surname>Mohr</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Singh</surname>
<given-names>P. K.</given-names>
</name>
<name>
<surname>Radhakrishnan</surname>
<given-names>P.</given-names>
</name>
<etal/>
</person-group> (<year>2014</year>). <article-title>MUC1 Regulates Cyclin D1 Gene Expression through P120 Catenin and &#x3b2;-catenin</article-title>. <source>Oncogenesis</source> <volume>3</volume> (<issue>3</issue>), <fpage>e107</fpage>. <pub-id pub-id-type="doi">10.1038/oncsis.2014.19</pub-id> </citation>
</ref>
<ref id="B34">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Lonsdale</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Thomas</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Salvatore</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Phillips</surname>
<given-names>R.</given-names>
</name>
<name>
<surname>Lo</surname>
<given-names>E.</given-names>
</name>
<name>
<surname>Shad</surname>
<given-names>S.</given-names>
</name>
<etal/>
</person-group> (<year>2013</year>). <article-title>The Genotype-Tissue Expression (GTEx) Project</article-title>. <source>Nat. Genet.</source> <volume>45</volume>, <fpage>580</fpage>&#x2013;<lpage>585</lpage>. <pub-id pub-id-type="doi">10.1038/ng.2653</pub-id> </citation>
</ref>
<ref id="B35">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Lu</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Xu</surname>
<given-names>W.</given-names>
</name>
<name>
<surname>Ji</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Feng</surname>
<given-names>D.</given-names>
</name>
<name>
<surname>Sourbier</surname>
<given-names>C.</given-names>
</name>
<name>
<surname>Yang</surname>
<given-names>Y.</given-names>
</name>
<etal/>
</person-group> (<year>2015</year>). <article-title>Alternative Splicing of the Cell Fate Determinant Numb in Hepatocellular Carcinoma</article-title>. <source>Hepatology</source> <volume>62</volume>, <fpage>1122</fpage>&#x2013;<lpage>1131</lpage>. <pub-id pub-id-type="doi">10.1002/hep.27923</pub-id> </citation>
</ref>
<ref id="B36">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Ma</surname>
<given-names>C.</given-names>
</name>
<name>
<surname>Lv</surname>
<given-names>Q.</given-names>
</name>
<name>
<surname>Teng</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Yu</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Niu</surname>
<given-names>K.</given-names>
</name>
<name>
<surname>Yi</surname>
<given-names>C.</given-names>
</name>
</person-group> (<year>2017</year>). <article-title>Identifying Key Genes in Rheumatoid Arthritis by Weighted Gene Co-expression Network Analysis</article-title>. <source>Int. J.&#x20;Rheum. Dis.</source> <volume>20</volume>, <fpage>971</fpage>&#x2013;<lpage>979</lpage>. <pub-id pub-id-type="doi">10.1111/1756-185X.13063</pub-id> </citation>
</ref>
<ref id="B37">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>McGill</surname>
<given-names>M. A.</given-names>
</name>
<name>
<surname>Dho</surname>
<given-names>S. E.</given-names>
</name>
<name>
<surname>Weinmaster</surname>
<given-names>G.</given-names>
</name>
<name>
<surname>McGlade</surname>
<given-names>C. J.</given-names>
</name>
</person-group> (<year>2009</year>). <article-title>Numb Regulates Post-endocytic Trafficking and Degradation of Notch1</article-title>. <source>J.&#x20;Biol. Chem.</source> <volume>284</volume>, <fpage>26427</fpage>&#x2013;<lpage>26438</lpage>. <pub-id pub-id-type="doi">10.1074/jbc.M109.014845</pub-id> </citation>
</ref>
<ref id="B38">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Middleton</surname>
<given-names>R.</given-names>
</name>
<name>
<surname>Gao</surname>
<given-names>D.</given-names>
</name>
<name>
<surname>Thomas</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Singh</surname>
<given-names>B.</given-names>
</name>
<name>
<surname>Au</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Wong</surname>
<given-names>J.&#x20;J.</given-names>
</name>
<etal/>
</person-group> (<year>2017</year>). <article-title>IRFinder: Assessing the Impact of Intron Retention on Mammalian Gene Expression</article-title>. <source>Genome Biol.</source> <volume>18</volume>, <fpage>51</fpage>&#x2013;<lpage>11</lpage>. <pub-id pub-id-type="doi">10.1186/S13059-017-1184-4</pub-id> </citation>
</ref>
<ref id="B39">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Misquitta-Ali</surname>
<given-names>C. M.</given-names>
</name>
<name>
<surname>Cheng</surname>
<given-names>E.</given-names>
</name>
<name>
<surname>O&#x27;Hanlon</surname>
<given-names>D.</given-names>
</name>
<name>
<surname>Liu</surname>
<given-names>N.</given-names>
</name>
<name>
<surname>McGlade</surname>
<given-names>C. J.</given-names>
</name>
<name>
<surname>Tsao</surname>
<given-names>M. S.</given-names>
</name>
<etal/>
</person-group> (<year>2011</year>). <article-title>Global Profiling and Molecular Characterization of Alternative Splicing Events Misregulated in Lung Cancer</article-title>. <source>Mol. Cell. Biol.</source> <volume>31</volume>, <fpage>138</fpage>&#x2013;<lpage>150</lpage>. <pub-id pub-id-type="doi">10.1128/mcb.00709-10</pub-id> </citation>
</ref>
<ref id="B40">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Munkley</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Li</surname>
<given-names>L.</given-names>
</name>
<name>
<surname>Krishnan</surname>
<given-names>S. R. G.</given-names>
</name>
<name>
<surname>Hysenaj</surname>
<given-names>G.</given-names>
</name>
<name>
<surname>Scott</surname>
<given-names>E.</given-names>
</name>
<name>
<surname>Dalgliesh</surname>
<given-names>C.</given-names>
</name>
<etal/>
</person-group> (<year>2019</year>). <article-title>Androgen-regulated Transcription of ESRP2 Drives Alternative Splicing Patterns in Prostate Cancer</article-title>. <source>Elife</source> <volume>8</volume>. <pub-id pub-id-type="doi">10.7554/eLife.47678.001</pub-id> </citation>
</ref>
<ref id="B41">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Nishimura</surname>
<given-names>T.</given-names>
</name>
<name>
<surname>Kaibuchi</surname>
<given-names>K.</given-names>
</name>
</person-group> (<year>2007</year>). <article-title>Numb Controls Integrin Endocytosis for Directional Cell Migration with aPKC and PAR-3</article-title>. <source>Dev. Cell</source> <volume>13</volume>, <fpage>15</fpage>&#x2013;<lpage>28</lpage>. <pub-id pub-id-type="doi">10.1016/j.devcel.2007.05.003</pub-id> </citation>
</ref>
<ref id="B42">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Oh</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Pradella</surname>
<given-names>D.</given-names>
</name>
<name>
<surname>Kim</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Shao</surname>
<given-names>C.</given-names>
</name>
<name>
<surname>Li</surname>
<given-names>H.</given-names>
</name>
<name>
<surname>Choi</surname>
<given-names>N.</given-names>
</name>
<etal/>
</person-group> (<year>2021</year>). <article-title>Global Alternative Splicing Defects in Human Breast Cancer Cells</article-title>. <source>Cancers (Basel)</source> <volume>13</volume>, <fpage>3071</fpage>. <pub-id pub-id-type="doi">10.3390/cancers13123071</pub-id> </citation>
</ref>
<ref id="B43">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Oldham</surname>
<given-names>M. C.</given-names>
</name>
<name>
<surname>Konopka</surname>
<given-names>G.</given-names>
</name>
<name>
<surname>Iwamoto</surname>
<given-names>K.</given-names>
</name>
<name>
<surname>Langfelder</surname>
<given-names>P.</given-names>
</name>
<name>
<surname>Kato</surname>
<given-names>T.</given-names>
</name>
<name>
<surname>Horvath</surname>
<given-names>S.</given-names>
</name>
<etal/>
</person-group> (<year>2008</year>). <article-title>Functional Organization of the Transcriptome in Human Brain</article-title>. <source>Nat. Neurosci.</source> <volume>11</volume>, <fpage>1271</fpage>&#x2013;<lpage>1282</lpage>. <pub-id pub-id-type="doi">10.1038/nn.2207</pub-id> </citation>
</ref>
<ref id="B44">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Oltean</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Bates</surname>
<given-names>D. O.</given-names>
</name>
</person-group> (<year>2014</year>). <article-title>Hallmarks of Alternative Splicing in Cancer</article-title>. <source>Oncogene</source> <volume>33</volume>, <fpage>5311</fpage>&#x2013;<lpage>5318</lpage>. <pub-id pub-id-type="doi">10.1038/onc.2013.533</pub-id> </citation>
</ref>
<ref id="B45">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Paronetto</surname>
<given-names>M. P.</given-names>
</name>
<name>
<surname>Passacantilli</surname>
<given-names>I.</given-names>
</name>
<name>
<surname>Sette</surname>
<given-names>C.</given-names>
</name>
</person-group> (<year>2016</year>). <article-title>Alternative Splicing and Cell Survival: From Tissue Homeostasis to Disease</article-title>. <source>Cell Death Differ</source> <volume>23</volume>, <fpage>1919</fpage>&#x2013;<lpage>1929</lpage>. <pub-id pub-id-type="doi">10.1038/cdd.2016.91</pub-id> </citation>
</ref>
<ref id="B46">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Patro</surname>
<given-names>R.</given-names>
</name>
<name>
<surname>Duggal</surname>
<given-names>G.</given-names>
</name>
<name>
<surname>Love</surname>
<given-names>M. I.</given-names>
</name>
<name>
<surname>Irizarry</surname>
<given-names>R. A.</given-names>
</name>
<name>
<surname>Kingsford</surname>
<given-names>C.</given-names>
</name>
</person-group> (<year>2017</year>). <article-title>Salmon Provides Fast and Bias-Aware Quantification of Transcript Expression</article-title>. <source>Nat. Methods</source> <volume>14</volume>, <fpage>417</fpage>&#x2013;<lpage>419</lpage>. <pub-id pub-id-type="doi">10.1038/nmeth.4197</pub-id> </citation>
</ref>
<ref id="B47">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Patro</surname>
<given-names>R.</given-names>
</name>
<name>
<surname>Mount</surname>
<given-names>S. M.</given-names>
</name>
<name>
<surname>Kingsford</surname>
<given-names>C.</given-names>
</name>
</person-group> (<year>2014</year>). <article-title>Sailfish Enables Alignment-free Isoform Quantification from RNA-Seq Reads Using Lightweight Algorithms</article-title>. <source>Nat. Biotechnol.</source> <volume>32</volume>, <fpage>462</fpage>&#x2013;<lpage>464</lpage>. <pub-id pub-id-type="doi">10.1038/nbt.2862</pub-id> </citation>
</ref>
<ref id="B86">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Peixoto</surname>
<given-names>L.</given-names>
</name>
<name>
<surname>Risso</surname>
<given-names>D.</given-names>
</name>
<name>
<surname>Poplawski</surname>
<given-names>S. G.</given-names>
</name>
<name>
<surname>Wimmer</surname>
<given-names>M. E.</given-names>
</name>
<name>
<surname>Speed</surname>
<given-names>T. P.</given-names>
</name>
<name>
<surname>Wood</surname>
<given-names>M. A.</given-names>
</name>
<etal/>
</person-group> (<year>2015</year>). <article-title>How Data Analysis Affects Power, Reproducibility and Biological Insight of RNA-seq Studies in Complex Datasets</article-title>. <source>Nucleic Acids Res.</source> <volume>43</volume>, <fpage>7664</fpage>&#x2013;<lpage>7674</lpage>. <pub-id pub-id-type="doi">10.1093/NAR/GKV736</pub-id> </citation>
</ref>
<ref id="B48">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Presson</surname>
<given-names>A. P.</given-names>
</name>
<name>
<surname>Sobel</surname>
<given-names>E. M.</given-names>
</name>
<name>
<surname>Papp</surname>
<given-names>J.&#x20;C.</given-names>
</name>
<name>
<surname>Suarez</surname>
<given-names>C. J.</given-names>
</name>
<name>
<surname>Whistler</surname>
<given-names>T.</given-names>
</name>
<name>
<surname>Rajeevan</surname>
<given-names>M. S.</given-names>
</name>
<etal/>
</person-group> (<year>2008</year>). <article-title>Integrated Weighted Gene Co-expression Network Analysis with an Application to Chronic Fatigue Syndrome</article-title>. <source>BMC Syst. Biol.</source> <volume>2</volume>, <fpage>95</fpage>&#x2013;<lpage>21</lpage>. <pub-id pub-id-type="doi">10.1186/1752-0509-2-95</pub-id> </citation>
</ref>
<ref id="B49">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Qiu</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Lyu</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Dunlap</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Harvey</surname>
<given-names>S. E.</given-names>
</name>
<name>
<surname>Cheng</surname>
<given-names>C.</given-names>
</name>
</person-group> (<year>2020</year>). <article-title>A Combinatorially Regulated RNA Splicing Signature Predicts Breast Cancer EMT States and Patient Survival</article-title>. <source>RNA</source> <volume>26</volume>, <fpage>1257</fpage>&#x2013;<lpage>1267</lpage>. <pub-id pub-id-type="doi">10.1261/RNA.074187.119</pub-id> </citation>
</ref>
<ref id="B50">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Rajendran</surname>
<given-names>D.</given-names>
</name>
<name>
<surname>Zhang</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Berry</surname>
<given-names>D. M.</given-names>
</name>
<name>
<surname>McGlade</surname>
<given-names>C. J.</given-names>
</name>
</person-group> (<year>2016</year>). <article-title>Regulation of Numb Isoform Expression by Activated ERK Signaling</article-title>. <source>Oncogene</source> <volume>35</volume>, <fpage>5202</fpage>&#x2013;<lpage>5213</lpage>. <pub-id pub-id-type="doi">10.1038/onc.2016.69</pub-id> </citation>
</ref>
<ref id="B51">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Ray</surname>
<given-names>D.</given-names>
</name>
<name>
<surname>Yun</surname>
<given-names>Y. C.</given-names>
</name>
<name>
<surname>Idris</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Cheng</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Boot</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Iain</surname>
<given-names>T. B. H.</given-names>
</name>
<etal/>
</person-group> (<year>2020</year>). <article-title>A Tumor-Associated Splice-Isoform of MAP2K7 Drives Dedifferentiation in MBNL1-Low Cancers via JNK Activation</article-title>. <source>Proc. Natl. Acad. Sci. U. S. A.</source> <volume>117</volume>, <fpage>16391</fpage>&#x2013;<lpage>16400</lpage>. <pub-id pub-id-type="doi">10.1073/pnas.2002499117</pub-id> </citation>
</ref>
<ref id="B52">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Risso</surname>
<given-names>D.</given-names>
</name>
<name>
<surname>Ngai</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Speed</surname>
<given-names>T. P.</given-names>
</name>
<name>
<surname>Dudoit</surname>
<given-names>S.</given-names>
</name>
</person-group> (<year>2014</year>). <article-title>Normalization of RNA-Seq Data Using Factor Analysis of Control Genes or Samples</article-title>. <source>Nat. Biotechnol.</source> <volume>32</volume>, <fpage>896</fpage>&#x2013;<lpage>902</lpage>. <pub-id pub-id-type="doi">10.1038/NBT.2931</pub-id> </citation>
</ref>
<ref id="B53">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Ritchie</surname>
<given-names>M. E.</given-names>
</name>
<name>
<surname>Phipson</surname>
<given-names>B.</given-names>
</name>
<name>
<surname>Wu</surname>
<given-names>D.</given-names>
</name>
<name>
<surname>Hu</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Law</surname>
<given-names>C. W.</given-names>
</name>
<name>
<surname>Shi</surname>
<given-names>W.</given-names>
</name>
<etal/>
</person-group> (<year>2015</year>). <article-title>Limma Powers Differential Expression Analyses for RNA-Sequencing and Microarray Studies</article-title>. <source>Nucleic Acids Res.</source> <volume>43</volume>, <fpage>e47</fpage>. <pub-id pub-id-type="doi">10.1093/nar/gkv007</pub-id> </citation>
</ref>
<ref id="B54">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Ryan</surname>
<given-names>M. C.</given-names>
</name>
<name>
<surname>Cleland</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Kim</surname>
<given-names>R.</given-names>
</name>
<name>
<surname>Wong</surname>
<given-names>W. C.</given-names>
</name>
<name>
<surname>Weinstein</surname>
<given-names>J.&#x20;N.</given-names>
</name>
</person-group> (<year>2012</year>). <article-title>SpliceSeq: A Resource for Analysis and Visualization of RNA-Seq Data on Alternative Splicing and its Functional Impacts</article-title>. <source>Bioinformatics</source> <volume>28</volume>, <fpage>2385</fpage>&#x2013;<lpage>2387</lpage>. <pub-id pub-id-type="doi">10.1093/bioinformatics/bts452</pub-id> </citation>
</ref>
<ref id="B55">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Sanchez-Vega</surname>
<given-names>F.</given-names>
</name>
<name>
<surname>Mina</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Armenia</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Chatila</surname>
<given-names>W. K.</given-names>
</name>
<name>
<surname>Luna</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>La</surname>
<given-names>K. C.</given-names>
</name>
<etal/>
</person-group> (<year>2018</year>). <article-title>Oncogenic Signaling Pathways in the Cancer Genome Atlas</article-title>. <source>Cell</source> <volume>173</volume>, <fpage>321</fpage>&#x2013;<lpage>337.e10</lpage>. <pub-id pub-id-type="doi">10.1016/j.cell.2018.03.035</pub-id> </citation>
</ref>
<ref id="B56">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Saraiva-Agostinho</surname>
<given-names>N.</given-names>
</name>
<name>
<surname>Barbosa-Morais</surname>
<given-names>N. L.</given-names>
</name>
</person-group> (<year>2019</year>). <article-title>Psichomics: Graphical Application for Alternative Splicing Quantification and Analysis</article-title>. <source>Nucleic Acids Res.</source> <volume>47</volume>, <fpage>e7</fpage>. <pub-id pub-id-type="doi">10.1093/nar/gky888</pub-id> </citation>
</ref>
<ref id="B57">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Scotti</surname>
<given-names>M. M.</given-names>
</name>
<name>
<surname>Swanson</surname>
<given-names>M. S.</given-names>
</name>
</person-group> (<year>2016</year>). <article-title>RNA Mis-Splicing in Disease</article-title>. <source>Nat. Rev. Genet.</source> <volume>17</volume>, <fpage>19</fpage>&#x2013;<lpage>32</lpage>. <pub-id pub-id-type="doi">10.1038/nrg.2015.3</pub-id> </citation>
</ref>
<ref id="B58">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Sebestyen</surname>
<given-names>E.</given-names>
</name>
<name>
<surname>Zawisza</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Eyras</surname>
<given-names>E.</given-names>
</name>
</person-group> (<year>2015</year>). <article-title>Detection of Recurrent Alternative Splicing Switches in Tumor Samples Reveals Novel Signatures of Cancer</article-title>. <source>Nucleic Acids Res.</source> </citation>
</ref>
<ref id="B59">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Shabalin</surname>
<given-names>A. A.</given-names>
</name>
</person-group> (<year>2012</year>). <article-title>Matrix eQTL: Ultra Fast eQTL Analysis via Large Matrix Operations</article-title>. <source>Bioinformatics</source> <volume>28</volume>, <fpage>1353</fpage>&#x2013;<lpage>1358</lpage>. <pub-id pub-id-type="doi">10.1093/bioinformatics/bts163</pub-id> </citation>
</ref>
<ref id="B60">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Shannon</surname>
<given-names>P.</given-names>
</name>
<name>
<surname>Markiel</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Ozier</surname>
<given-names>O.</given-names>
</name>
<name>
<surname>Baliga</surname>
<given-names>N. S.</given-names>
</name>
<name>
<surname>Wang</surname>
<given-names>J.&#x20;T.</given-names>
</name>
<name>
<surname>Ramage</surname>
<given-names>D.</given-names>
</name>
<etal/>
</person-group> (<year>2003</year>). <article-title>Cytoscape: A Software Environment for Integrated Models of Biomolecular Interaction Networks</article-title>. <source>Genome Res.</source> <volume>13</volume>, <fpage>2498</fpage>&#x2013;<lpage>2504</lpage>. <pub-id pub-id-type="doi">10.1101/gr.1239303</pub-id> </citation>
</ref>
<ref id="B61">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Shen</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Park</surname>
<given-names>J.&#x20;W.</given-names>
</name>
<name>
<surname>Lu</surname>
<given-names>Z. X.</given-names>
</name>
<name>
<surname>Lin</surname>
<given-names>L.</given-names>
</name>
<name>
<surname>Henry</surname>
<given-names>M. D.</given-names>
</name>
<name>
<surname>Wu</surname>
<given-names>Y. N.</given-names>
</name>
<etal/>
</person-group> (<year>2014</year>). <article-title>rMATS: Robust and Flexible Detection of Differential Alternative Splicing from Replicate RNA-Seq Data</article-title>. <source>Proc. Natl. Acad. Sci. U. S. A.</source> <volume>111</volume>, <fpage>E5593</fpage>&#x2013;<lpage>E5601</lpage>. <pub-id pub-id-type="doi">10.1073/pnas.1419161111</pub-id> </citation>
</ref>
<ref id="B62">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Shirure</surname>
<given-names>V. S.</given-names>
</name>
<name>
<surname>Liu</surname>
<given-names>T.</given-names>
</name>
<name>
<surname>Delgadillo</surname>
<given-names>L. F.</given-names>
</name>
<name>
<surname>Cuckler</surname>
<given-names>C. M.</given-names>
</name>
<name>
<surname>Tees</surname>
<given-names>D. F.</given-names>
</name>
<name>
<surname>Benencia</surname>
<given-names>F.</given-names>
</name>
<etal/>
</person-group> (<year>2015</year>). <article-title>CD44 Variant Isoforms Expressed by Breast Cancer Cells Are Functional E-Selectin Ligands under Flow Conditions</article-title>. <source>Am. J.&#x20;Physiol. Cell Physiol.</source> <volume>308</volume>, <fpage>C68</fpage>&#x2013;<lpage>C78</lpage>. <pub-id pub-id-type="doi">10.1152/ajpcell.00094.2014</pub-id> </citation>
</ref>
<ref id="B63">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Slaff</surname>
<given-names>B.</given-names>
</name>
<name>
<surname>Radens</surname>
<given-names>C. M.</given-names>
</name>
<name>
<surname>Jewell</surname>
<given-names>P.</given-names>
</name>
<name>
<surname>Jha</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Lahens</surname>
<given-names>N. F.</given-names>
</name>
<name>
<surname>Grant</surname>
<given-names>G. R.</given-names>
</name>
<etal/>
</person-group> (<year>2021</year>). <article-title>MOCCASIN: A Method for Correcting for Known and Unknown Confounders in RNA Splicing Analysis</article-title>. <source>Nat. Commun.</source> <volume>12</volume>, <fpage>1</fpage>&#x2013;<lpage>9</lpage>. <pub-id pub-id-type="doi">10.1038/s41467-021-23608-9</pub-id> </citation>
</ref>
<ref id="B64">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Stanek</surname>
<given-names>D.</given-names>
</name>
<name>
<surname>Bis-Brewer</surname>
<given-names>D. M.</given-names>
</name>
<name>
<surname>Saghira</surname>
<given-names>C.</given-names>
</name>
<name>
<surname>Danzi</surname>
<given-names>M. C.</given-names>
</name>
<name>
<surname>Seeman</surname>
<given-names>P.</given-names>
</name>
<name>
<surname>Lassuthova</surname>
<given-names>P.</given-names>
</name>
<etal/>
</person-group> (<year>2020</year>). <article-title>Prot2HG: A Database of Protein Domains Mapped to the Human Genome</article-title>. <source>Database (Oxford)</source> <volume>2020</volume>, <fpage>161</fpage>. <pub-id pub-id-type="doi">10.1093/database/baz161</pub-id> </citation>
</ref>
<ref id="B65">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Sterne-Weiler</surname>
<given-names>T.</given-names>
</name>
<name>
<surname>Weatheritt</surname>
<given-names>R. J.</given-names>
</name>
<name>
<surname>Best</surname>
<given-names>A. J.</given-names>
</name>
<name>
<surname>Ha</surname>
<given-names>K. C. H.</given-names>
</name>
<name>
<surname>Blencowe</surname>
<given-names>B. J.</given-names>
</name>
</person-group> (<year>2018</year>). <article-title>Efficient and Accurate Quantitative Profiling of Alternative Splicing Patterns of Any Complexity on a Laptop</article-title>. <source>Mol. Cell</source> <volume>72</volume>, <fpage>187</fpage>&#x2013;<lpage>200.e6</lpage>. <pub-id pub-id-type="doi">10.1016/j.molcel.2018.08.018</pub-id> </citation>
</ref>
<ref id="B66">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Teckchandani</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Toida</surname>
<given-names>N.</given-names>
</name>
<name>
<surname>Goodchild</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Henderson</surname>
<given-names>C.</given-names>
</name>
<name>
<surname>Watts</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Wollscheid</surname>
<given-names>B.</given-names>
</name>
<etal/>
</person-group> (<year>2009</year>). <article-title>Quantitative Proteomics Identifies a Dab2/integrin Module Regulating Cell Migration</article-title>. <source>J.&#x20;Cell Biol.</source> <volume>186</volume>, <fpage>99</fpage>&#x2013;<lpage>111</lpage>. <pub-id pub-id-type="doi">10.1083/jcb.200812160</pub-id> </citation>
</ref>
<ref id="B67">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Thorsen</surname>
<given-names>K.</given-names>
</name>
<name>
<surname>S&#xf8;rensen</surname>
<given-names>K. D.</given-names>
</name>
<name>
<surname>Brems-Eskildsen</surname>
<given-names>A. S.</given-names>
</name>
<name>
<surname>Modin</surname>
<given-names>C.</given-names>
</name>
<name>
<surname>Gaustadnes</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Hein</surname>
<given-names>A. M.</given-names>
</name>
<etal/>
</person-group> (<year>2008</year>). <article-title>Alternative Splicing in Colon, Bladder, and Prostate Cancer Identified by Exon Array Analysis</article-title>. <source>Mol. Cell. Proteomics</source> <volume>7</volume>, <fpage>1214</fpage>&#x2013;<lpage>1224</lpage>. <pub-id pub-id-type="doi">10.1074/mcp.M700590-MCP200</pub-id> </citation>
</ref>
<ref id="B68">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Tomczak</surname>
<given-names>K.</given-names>
</name>
<name>
<surname>Czerwi&#x144;ska</surname>
<given-names>P.</given-names>
</name>
<name>
<surname>Wiznerowicz</surname>
<given-names>M.</given-names>
</name>
</person-group> (<year>2015</year>). <article-title>The Cancer Genome Atlas (TCGA): An Immeasurable Source of Knowledge</article-title>. <source>Contemp. Oncol. (Pozn)</source> <volume>19</volume>, <fpage>A68</fpage>&#x2013;<lpage>A77</lpage>. <pub-id pub-id-type="doi">10.5114/wo.2014.47136</pub-id> </citation>
</ref>
<ref id="B69">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Trapnell</surname>
<given-names>C.</given-names>
</name>
<name>
<surname>Pachter</surname>
<given-names>L.</given-names>
</name>
<name>
<surname>Salzberg</surname>
<given-names>S. L.</given-names>
</name>
</person-group> (<year>2009</year>). <article-title>TopHat: Discovering Splice Junctions with RNA-Seq</article-title>. <source>Bioinformatics</source> <volume>25</volume>, <fpage>1105</fpage>&#x2013;<lpage>1111</lpage>. <pub-id pub-id-type="doi">10.1093/bioinformatics/btp120</pub-id> </citation>
</ref>
<ref id="B70">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Trapnell</surname>
<given-names>C.</given-names>
</name>
<name>
<surname>Williams</surname>
<given-names>B. A.</given-names>
</name>
<name>
<surname>Pertea</surname>
<given-names>G.</given-names>
</name>
<name>
<surname>Mortazavi</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Kwan</surname>
<given-names>G.</given-names>
</name>
<name>
<surname>Van Baren</surname>
<given-names>M. J.</given-names>
</name>
<etal/>
</person-group> (<year>2010</year>). <article-title>Transcript Assembly and Quantification by RNA-Seq Reveals Unannotated Transcripts and Isoform Switching during Cell Differentiation</article-title>. <source>Nat. Biotechnol.</source> <volume>28</volume>, <fpage>511</fpage>&#x2013;<lpage>515</lpage>. <pub-id pub-id-type="doi">10.1038/nbt.1621</pub-id> </citation>
</ref>
<ref id="B71">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Vaquero-Garcia</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Barrera</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Gazzara</surname>
<given-names>M. R.</given-names>
</name>
<name>
<surname>Gonz&#xe1;lez-Vallinas</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Lahens</surname>
<given-names>N. F.</given-names>
</name>
<name>
<surname>Hogenesch</surname>
<given-names>J.&#x20;B.</given-names>
</name>
<etal/>
</person-group> (<year>2016</year>). <article-title>A New View of Transcriptome Complexity and Regulation through the Lens of Local Splicing Variations</article-title>. <source>eLife</source> <volume>5</volume>, <fpage>e11752</fpage>. <pub-id pub-id-type="doi">10.7554/eLife.11752</pub-id> </citation>
</ref>
<ref id="B72">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Verdi</surname>
<given-names>J.&#x20;M.</given-names>
</name>
<name>
<surname>Bashirullah</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Goldhawk</surname>
<given-names>D. E.</given-names>
</name>
<name>
<surname>Kubu</surname>
<given-names>C. J.</given-names>
</name>
<name>
<surname>Jamali</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Meakin</surname>
<given-names>S. O.</given-names>
</name>
<etal/>
</person-group> (<year>1999</year>). <article-title>Distinct Human NUMB Isoforms Regulate Differentiation vs. Proliferation in the Neuronal Lineage</article-title>. <source>Proc. Natl. Acad. Sci. U. S. A.</source> <volume>96</volume>, <fpage>10472</fpage>&#x2013;<lpage>10476</lpage>. <pub-id pub-id-type="doi">10.1073/pnas.96.18.10472</pub-id> </citation>
</ref>
<ref id="B73">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Vieira</surname>
<given-names>S. E.</given-names>
</name>
<name>
<surname>Bando</surname>
<given-names>S. Y.</given-names>
</name>
<name>
<surname>De Paulis</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Oliveira</surname>
<given-names>D. B. L.</given-names>
</name>
<name>
<surname>Thomazelli</surname>
<given-names>L. M.</given-names>
</name>
<name>
<surname>Durigon</surname>
<given-names>E. L.</given-names>
</name>
<etal/>
</person-group> (<year>2019</year>). <article-title>Distinct Transcriptional Modules in the Peripheral Blood Mononuclear Cells Response to Human Respiratory Syncytial Virus or to Human Rhinovirus in Hospitalized Infants with Bronchiolitis</article-title>. <source>PLoS One</source> <volume>14</volume>, <fpage>e0213501</fpage>. <pub-id pub-id-type="doi">10.1371/journal.pone.0213501</pub-id> </citation>
</ref>
<ref id="B74">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Wang</surname>
<given-names>E. T.</given-names>
</name>
<name>
<surname>Sandberg</surname>
<given-names>R.</given-names>
</name>
<name>
<surname>Luo</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Khrebtukova</surname>
<given-names>I.</given-names>
</name>
<name>
<surname>Zhang</surname>
<given-names>L.</given-names>
</name>
<name>
<surname>Mayr</surname>
<given-names>C.</given-names>
</name>
<etal/>
</person-group> (<year>2008</year>). <article-title>Alternative Isoform Regulation in Human Tissue Transcriptomes</article-title>. <source>Nature</source> <volume>456</volume>, <fpage>470</fpage>&#x2013;<lpage>476</lpage>. <pub-id pub-id-type="doi">10.1038/nature07509</pub-id> </citation>
</ref>
<ref id="B75">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Wang</surname>
<given-names>K.</given-names>
</name>
<name>
<surname>Singh</surname>
<given-names>D.</given-names>
</name>
<name>
<surname>Zeng</surname>
<given-names>Z.</given-names>
</name>
<name>
<surname>Coleman</surname>
<given-names>S. J.</given-names>
</name>
<name>
<surname>Huang</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Savich</surname>
<given-names>G. L.</given-names>
</name>
<etal/>
</person-group> (<year>2010</year>). <article-title>MapSplice: Accurate Mapping of RNA-Seq Reads for Splice Junction Discovery</article-title>. <source>Nucleic Acids Res.</source> <volume>38</volume>, <fpage>e178</fpage>. <pub-id pub-id-type="doi">10.1093/nar/gkq622</pub-id> </citation>
</ref>
<ref id="B76">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Wang</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Chen</surname>
<given-names>S. X.</given-names>
</name>
<name>
<surname>Rao</surname>
<given-names>X.</given-names>
</name>
<name>
<surname>Liu</surname>
<given-names>Y.</given-names>
</name>
</person-group> (<year>2020</year>). <article-title>Modulator-Dependent RBPs Changes Alternative Splicing Outcomes in Kidney Cancer</article-title>. <source>Front. Genet.</source> <volume>11</volume>, <fpage>265</fpage>. <pub-id pub-id-type="doi">10.3389/fgene.2020.00265</pub-id> </citation>
</ref>
<ref id="B77">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Wang</surname>
<given-names>Z.</given-names>
</name>
<name>
<surname>Sandiford</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Wu</surname>
<given-names>C.</given-names>
</name>
<name>
<surname>Li</surname>
<given-names>S. S.</given-names>
</name>
</person-group> (<year>2009</year>). <article-title>Numb Regulates Cell-Cell Adhesion and Polarity in Response to Tyrosine Kinase Signalling</article-title>. <source>EMBO J.</source> <volume>28</volume>, <fpage>2360</fpage>&#x2013;<lpage>2373</lpage>. <pub-id pub-id-type="doi">10.1038/emboj.2009.190</pub-id> </citation>
</ref>
<ref id="B78">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Wang</surname>
<given-names>Z.</given-names>
</name>
<name>
<surname>Zhao</surname>
<given-names>K.</given-names>
</name>
<name>
<surname>Hackert</surname>
<given-names>T.</given-names>
</name>
<name>
<surname>Z&#xf6;ller</surname>
<given-names>M.</given-names>
</name>
</person-group> (<year>2018</year>). <article-title>CD44/CD44v6 a Reliable Companion in Cancer-Initiating Cell Maintenance and Tumor Progression</article-title>. <source>Front. Cell Dev. Biol.</source> <volume>6</volume>, <fpage>97</fpage>. <pub-id pub-id-type="doi">10.3389/fcell.2018.00097</pub-id> </citation>
</ref>
<ref id="B85">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Wilcox</surname>
<given-names>R. R.</given-names>
</name>
</person-group> (<year>2012</year>). <source>Introduction to Robust Estimation and Hypothesis Testing</source>. <pub-id pub-id-type="doi">10.1016/C2010-0-67044-1</pub-id> </citation>
</ref>
<ref id="B79">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Yanagisawa</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Huveldt</surname>
<given-names>D.</given-names>
</name>
<name>
<surname>Kreinest</surname>
<given-names>P.</given-names>
</name>
<name>
<surname>Lohse</surname>
<given-names>C. M.</given-names>
</name>
<name>
<surname>Cheville</surname>
<given-names>J.&#x20;C.</given-names>
</name>
<name>
<surname>Parker</surname>
<given-names>A. S.</given-names>
</name>
<etal/>
</person-group> (<year>2008</year>). <article-title>A P120 Catenin Isoform Switch Affects Rho Activity, Induces Tumor Cell Invasion, and Predicts Metastatic Disease</article-title>. <source>J.&#x20;Biol. Chem.</source> <volume>283</volume>, <fpage>18344</fpage>&#x2013;<lpage>18354</lpage>. <pub-id pub-id-type="doi">10.1074/jbc.M801192200</pub-id> </citation>
</ref>
<ref id="B80">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Zhang</surname>
<given-names>B.</given-names>
</name>
<name>
<surname>Horvath</surname>
<given-names>S.</given-names>
</name>
</person-group> (<year>2005</year>). <article-title>A General Framework for Weighted Gene Co-expression Network Analysis</article-title>. <source>Stat. Appl. Genet. Mol. Biol.</source> <volume>4</volume>, <fpage>Article17</fpage>. <pub-id pub-id-type="doi">10.2202/1544-6115.1128</pub-id> </citation>
</ref>
<ref id="B81">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Zhang</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Bao</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Shen</surname>
<given-names>X.</given-names>
</name>
<name>
<surname>Pan</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Sun</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Xiao</surname>
<given-names>M.</given-names>
</name>
<etal/>
</person-group> (<year>2020a</year>). <article-title>RNA Binding Motif Protein 10 Suppresses Lung Cancer Progression by Controlling Alternative Splicing of Eukaryotic Translation Initiation Factor 4H</article-title>. <source>EBioMedicine</source> <volume>61</volume>, <fpage>103067</fpage>. <pub-id pub-id-type="doi">10.1016/j.ebiom.2020.103067</pub-id> </citation>
</ref>
<ref id="B82">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Zhang</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Liu</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Liu</surname>
<given-names>Z.</given-names>
</name>
<name>
<surname>Zhang</surname>
<given-names>C.</given-names>
</name>
<name>
<surname>Cao</surname>
<given-names>H.</given-names>
</name>
<name>
<surname>Ye</surname>
<given-names>Y.</given-names>
</name>
<etal/>
</person-group> (<year>2014</year>). <article-title>Transcriptome Profiling of a Multiple Recurrent Muscle-Invasive Urothelial Carcinoma of the Bladder by Deep Sequencing</article-title>. <source>PLoS One</source> <volume>9</volume>, <fpage>e91466</fpage>. <pub-id pub-id-type="doi">10.1371/journal.pone.0091466</pub-id> </citation>
</ref>
<ref id="B83">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Zhang</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Parmigiani</surname>
<given-names>G.</given-names>
</name>
<name>
<surname>Johnson</surname>
<given-names>W. E.</given-names>
</name>
</person-group> (<year>2020b</year>). <article-title>ComBat-seq: Batch Effect Adjustment for RNA-Seq Count Data</article-title>. <source>NAR Genom. Bioinform.</source> <volume>2</volume>, <fpage>lqaa078</fpage>. <pub-id pub-id-type="doi">10.1093/nargab/lqaa078</pub-id> </citation>
</ref>
<ref id="B84">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Zong</surname>
<given-names>F. Y.</given-names>
</name>
<name>
<surname>Fu</surname>
<given-names>X.</given-names>
</name>
<name>
<surname>Wei</surname>
<given-names>W. J.</given-names>
</name>
<name>
<surname>Luo</surname>
<given-names>Y. G.</given-names>
</name>
<name>
<surname>Heiner</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Cao</surname>
<given-names>L. J.</given-names>
</name>
<etal/>
</person-group> (<year>2014</year>). <article-title>The RNA-Binding Protein QKI Suppresses Cancer-Associated Aberrant Splicing</article-title>. <source>PLoS Genet.</source> <volume>10</volume>, <fpage>e1004289</fpage>. <pub-id pub-id-type="doi">10.1371/journal.pgen.1004289</pub-id> </citation>
</ref>
</ref-list>
</back>
</article>