MonashBioinformaticsPlatform/RSeQC


.. toctree::
   :maxdepth: 2

RSeQC: An RNA-seq Quality Control Package

The RSeQC package provides a number of modules that comprehensively evaluate high-throughput sequencing data, especially RNA-seq data. Basic modules quickly inspect sequence quality, nucleotide composition bias, PCR bias, and GC bias, while RNA-seq-specific modules evaluate sequencing saturation, mapped-read distribution, coverage uniformity, strand specificity, transcript-level RNA integrity, and more.

Installation

Use pip to install RSeQC

pip install RSeQC

Install RSeQC from source code (not recommended)

Prerequisites: gcc, Python 2.7, numpy, R

Install RSeQC (Example):

tar zxf RSeQC-VERSION.tar.gz

cd RSeQC-VERSION

# Type 'python setup.py install --help' to see options
python setup.py install        # Note: this requires root privilege
# or
python setup.py install --root=/home/user/XXX/         # Install RSeQC to a user-specified location; does NOT require root privilege

#This is only an example. Change path according to your system configuration
export PYTHONPATH=/home/user/lib/python2.7/site-packages:$PYTHONPATH

#This is only an example. Change path according to your system configuration
export PATH=/home/user/bin:$PATH

Finally, type: python -c 'from qcmodule import SAM'. If no error message is printed, the RSeQC modules have been installed successfully.
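The same check can be scripted, which is convenient after changing PYTHONPATH. The following is a minimal sketch, not part of RSeQC: the helper name `module_is_importable` is hypothetical, and the only thing taken from the text above is the package name `qcmodule`.

```python
import importlib


def module_is_importable(name):
    """Return True if `name` can be imported by the current interpreter."""
    try:
        importlib.import_module(name)
    except ImportError:
        return False
    return True


if __name__ == "__main__":
    # 'qcmodule' is the package imported in the one-line check above.
    if module_is_importable("qcmodule"):
        print("qcmodule found: the RSeQC modules appear to be installed")
    else:
        print("qcmodule not found: check PYTHONPATH and the install location")
```

A "not found" result usually means the site-packages directory used during installation is not on PYTHONPATH for the interpreter being run.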

Input format

RSeQC accepts four file formats as input:

  • A BED file is a tab-separated, 12-column, plain-text file that represents a gene model. Here is an example.
  • SAM or BAM files store read alignments. A SAM file is a human-readable plain-text file, while BAM is the binary version of SAM: a compact, indexable representation of read alignments. Here is an example.
  • A chromosome size file is a two-column, plain-text file. Here is an example for the human hg19 assembly. Use this script to download chromosome size files for other genomes.
  • A FASTA file.
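To make the 12-column BED layout concrete, here is a minimal parsing sketch. It is not RSeQC code: the function name `parse_bed12_line` is hypothetical, the column names follow the UCSC BED12 convention, and the sample record is illustrative only.

```python
# Column names of the UCSC 12-column BED format, in order.
BED12_FIELDS = (
    "chrom", "chromStart", "chromEnd", "name", "score", "strand",
    "thickStart", "thickEnd", "itemRgb", "blockCount", "blockSizes", "blockStarts",
)


def parse_bed12_line(line):
    """Split one tab-separated BED12 line into a dict keyed by field name."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 12:
        raise ValueError("expected 12 tab-separated columns, got %d" % len(fields))
    record = dict(zip(BED12_FIELDS, fields))
    # Coordinates and the block (exon) count are integers.
    for key in ("chromStart", "chromEnd", "thickStart", "thickEnd", "blockCount"):
        record[key] = int(record[key])
    # Block sizes and starts are comma-separated lists, often with a trailing comma.
    for key in ("blockSizes", "blockStarts"):
        record[key] = [int(x) for x in record[key].split(",") if x]
    return record


# Illustrative record: a three-exon transcript on the plus strand.
example = "chr1\t11873\t14409\tNR_046018\t0\t+\t14409\t14409\t0\t3\t354,109,1189,\t0,739,1347,"
record = parse_bed12_line(example)
print(record["name"], record["blockCount"], record["blockSizes"])
```

Rejecting lines that do not have exactly 12 columns is a design choice for this sketch: it surfaces shorter BED variants early instead of letting them fail later.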

NOTE: If your gene models are in GFF/GTF format, this Perl script may be useful for converting them to BED.

Fetch chromosome size file from UCSC

Download this script and save it as 'fetchChromSizes':

# Make sure it's executable
chmod +x fetchChromSizes

# Run the script from the current directory (use the './' prefix unless it is on your PATH)
./fetchChromSizes hg19 > hg19.chrom.sizes

./fetchChromSizes danRer7 > zebrafish.chrom.sizes
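Once downloaded, a chromosome size file can be loaded into a dictionary for quick lookups. This is a minimal sketch that assumes the two-column layout described under "Input format"; the function name `read_chrom_sizes` is hypothetical, and the sample lines stand in for the contents of a file such as hg19.chrom.sizes.

```python
def read_chrom_sizes(lines):
    """Build {chromosome_name: length} from two-column chromosome size lines."""
    sizes = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        name, length = line.split()[:2]
        sizes[name] = int(length)
    return sizes


# Illustrative input; with a real file, pass an open file handle instead:
#     with open("hg19.chrom.sizes") as handle:
#         sizes = read_chrom_sizes(handle)
sample_lines = ["chr1\t249250621\n", "chr2\t243199373\n", "\n"]
sizes = read_chrom_sizes(sample_lines)
print(sizes["chr1"])
```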

Contact

Reference