MonashBioinformaticsPlatform/RSeQC


RSeQC: An RNA-seq Quality Control Package

This is a fork of the original RSeQC package from SourceForge. I am substantially rearranging the package to make it easier to follow, and I am also making changes to the code base. At this stage only the read_dist (read_distribution) and bam_stats (bam_stat) modules have been incorporated; both can now be accessed from the main executable, scripts/rseqc.

Original message

RSeQC package provides a number of useful modules that can comprehensively evaluate high throughput sequence data especially RNA-seq data. Some basic modules quickly inspect sequence quality, nucleotide composition bias, PCR bias and GC bias, while RNA-seq specific modules evaluate sequencing saturation, mapped reads distribution, coverage uniformity, strand specificity, transcript level RNA integrity etc.

Quick start

Once installed, use the main executable, rseqc, to run any of the sub-commands (modules).

For example:

rseqc read_dist --input_file yourBamFile.bam --gene_models yourGTFfile.gtf

OR

rseqc read_dist --input_file yourBamFile.bam --gene_models yourBED12file.bed --file_type bed
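If you want to drive the two invocations above from a script, something like the following might work. It is only a sketch: the helper names `build_read_dist_cmd` and `run_read_dist` are hypothetical and not part of this package, and picking the file type from the file extension is my assumption, not something rseqc does for you. The flags are the ones shown in the commands above.

```python
import os
import subprocess


def build_read_dist_cmd(bam_path, gene_models_path):
    """Return the argument list for `rseqc read_dist` (hypothetical helper)."""
    cmd = [
        "rseqc", "read_dist",
        "--input_file", bam_path,
        "--gene_models", gene_models_path,
    ]
    # Assumption: a .bed extension means BED12; anything else is passed
    # without --file_type, as in the GTF example above.
    ext = os.path.splitext(gene_models_path)[1].lower()
    if ext == ".bed":
        cmd += ["--file_type", "bed"]
    return cmd


def run_read_dist(bam_path, gene_models_path):
    """Run the command and return its exit code; needs rseqc on PATH."""
    return subprocess.call(build_read_dist_cmd(bam_path, gene_models_path))
```

Building the argument list separately from running it makes it easy to print or log the exact command before anything is executed.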

Installation

You will need either sudo or a virtualenv (which is my preferred method). If you are going to use sudo, prefix both python setup.py install and pip install numpy with sudo.

git clone --branch fresh https://github.com/MonashBioinformaticsPlatform/RSeQC.git
cd RSeQC
python setup.py install
rseqc --help

I haven't figured out why, but numpy needs to be installed separately. It doesn't get pulled in correctly from the dependencies list in setup.py.

pip install numpy

Input format

  • A BED file is a tab-separated, 12-column, plain text file that represents gene models.
  • A GTF file also represents gene models. It is an alternative to BED12.
  • A SAM/BAM file holds information about read alignments to the reference genome.
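To make the 12-column BED layout concrete, here is a minimal sketch that parses one BED12 line and derives exon coordinates from its block columns. It is not code from this package: `Bed12`, `parse_bed12_line` and `exons` are hypothetical names used for illustration, and the column names follow the standard BED12 convention.

```python
from collections import namedtuple

# The 12 standard BED12 columns, in order.
Bed12 = namedtuple("Bed12", [
    "chrom", "chrom_start", "chrom_end", "name", "score", "strand",
    "thick_start", "thick_end", "item_rgb",
    "block_count", "block_sizes", "block_starts",
])


def _int_list(field):
    """Parse a comma-separated list that may carry a trailing comma."""
    return [int(x) for x in field.rstrip(",").split(",")]


def parse_bed12_line(line):
    """Parse one tab-separated BED12 line into a Bed12 record."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 12:
        raise ValueError("expected 12 columns, got %d" % len(fields))
    block_count = int(fields[9])
    block_sizes = _int_list(fields[10])
    block_starts = _int_list(fields[11])
    if len(block_sizes) != block_count or len(block_starts) != block_count:
        raise ValueError("block lists do not match blockCount")
    return Bed12(
        fields[0], int(fields[1]), int(fields[2]), fields[3], fields[4],
        fields[5], int(fields[6]), int(fields[7]), fields[8],
        block_count, block_sizes, block_starts,
    )


def exons(record):
    """Return (start, end) pairs; block starts are relative to chrom_start."""
    return [
        (record.chrom_start + start, record.chrom_start + start + size)
        for start, size in zip(record.block_starts, record.block_sizes)
    ]
```

A quick check like this can catch a gene model file with the wrong number of columns before it is passed to --gene_models.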

Contact

Reference