Abstract

Deep mRNA sequencing (mRNAseq) is the state of the art for whole-transcriptome measurements. A key step is creating a library of cDNA sequencing fragments from RNA. This is generally done by random priming, which creates multiple sequencing fragments along the length of each transcript. A 3’ end-focused library approach cannot detect differential splicing, but it potentially offers higher throughput at lower cost (~10-fold lower), along with the ability to improve quantification by counting transcript molecules with unique molecular identifiers (UMIs) to correct for PCR bias. Here, we compare an implementation of such a 3’-digital gene expression (3’-DGE) approach with “conventional” random-primed mRNAseq, a comparison that has not previously been reported. We find that while conventional mRNAseq detects ~15% more genes, the resulting lists of differentially expressed genes, and therefore the biological conclusions and gene signatures, are highly concordant between the two techniques. We also find good quantitative agreement between the two techniques at the level of individual genes, in terms of both read counts and fold change between two conditions. We conclude that for high-throughput applications, the potential cost savings associated with the 3’-DGE approach are a very reasonable tradeoff for a modest reduction in sensitivity and the inability to observe alternative splicing, and should enable much larger-scale studies focused not only on differential expression analysis but also on quantitative transcriptome profiling. The computational scripts and programs, along with the experimental standard operating procedures used in the pipeline presented here, are freely available on our website (www.dtoxs.org).
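
The idea behind molecule counting with UMIs can be illustrated with a minimal Python sketch. It is not the pipeline distributed on the website: the function name `count_molecules`, the `(gene, umi)` input format, and the toy reads are assumptions made for illustration, and the sketch ignores sequencing errors within UMIs. Reads that share both a gene and a UMI are treated as PCR copies of one original molecule and are counted once.

```python
from collections import defaultdict


def count_molecules(reads):
    """Count distinct UMIs per gene.

    `reads` is an iterable of (gene, umi) pairs, one per aligned read
    (a hypothetical input format chosen for this illustration).
    """
    umis_per_gene = defaultdict(set)
    for gene, umi in reads:
        # A set keeps each UMI once, so PCR duplicates collapse.
        umis_per_gene[gene].add(umi)
    return {gene: len(umis) for gene, umis in umis_per_gene.items()}


# Toy data (made up): the second GAPDH read repeats the first read's UMI.
reads = [
    ("GAPDH", "AACGT"),
    ("GAPDH", "AACGT"),  # same gene and UMI: treated as a PCR duplicate
    ("GAPDH", "TTGCA"),
    ("ACTB", "CCGTA"),
]
umi_counts = count_molecules(reads)
print(umi_counts)
```

Here the raw read count for GAPDH is 3, while the UMI-based molecule count is 2, which is the kind of correction for amplification bias that the 3’-DGE approach makes possible.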

Read more at bioRxiv