This repository has been archived by the owner on Dec 18, 2022. It is now read-only.

orionzhou/rnaseq

RNA-Seq expression analysis code and data

This repository serves as a collection of publicly available RNA-Seq datasets (mostly in maize) and provides code and pipelines to analyze them systematically, as well as QC results and summarized data output. The focus is on large-scale (multiple tissues / developmental stages, multiple inbred / hybrid lines) Illumina RNA-Seq experiments, but it also includes experiments done with other sequencing technologies (3' RNA-Seq, etc.).

Raw sequencing reads were downloaded from the NCBI Sequence Read Archive (SRA), trimmed using Trim Galore / fastp, and mapped to the maize B73 AGP_v4 genome using Hisat2 / STAR. Uniquely mapped reads were assigned to, and counted for, the 46,117 reference gene models (Ensembl Plants v37) using featureCounts. Raw read counts were then normalized using the TMM normalization approach to give CPM (Counts Per Million reads) values, which were further normalized by gene CDS length to give FPKM (Fragments Per Kilobase of exon per Million reads) values. Hierarchical clustering and principal component analysis were used to explore sample clustering patterns.
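
To make the last normalization step concrete, here is a minimal Python sketch of the arithmetic that turns raw counts into CPM and then FPKM. It is an illustration under stated assumptions, not the code used in this repository: the function names `cpm` and `fpkm` and the toy numbers are hypothetical, and the TMM scaling factors are taken as an input (they would normally come from a dedicated tool, for example `calcNormFactors` in edgeR) rather than being computed here.

```python
def cpm(counts, norm_factors=None):
    """Counts per million for a gene-by-sample matrix (list of rows).

    norm_factors: optional per-sample scaling factors (e.g. TMM factors
    computed elsewhere). Defaults to 1.0, i.e. plain library-size scaling.
    """
    n_samples = len(counts[0])
    # Library size = total assigned reads in each sample (column sum).
    lib_sizes = [sum(row[j] for row in counts) for j in range(n_samples)]
    if norm_factors is None:
        norm_factors = [1.0] * n_samples
    # Effective library size = library size x normalization factor.
    eff_sizes = [lib_sizes[j] * norm_factors[j] for j in range(n_samples)]
    return [[row[j] / eff_sizes[j] * 1e6 for j in range(n_samples)]
            for row in counts]


def fpkm(cpm_values, lengths_bp):
    """Divide each gene's CPM by its length in kilobases."""
    return [[value / (length / 1000.0) for value in row]
            for row, length in zip(cpm_values, lengths_bp)]


# Toy data (hypothetical): 3 genes x 2 samples, with CDS lengths in bp.
raw_counts = [[10, 20], [30, 60], [60, 120]]
cds_lengths = [1000, 2000, 500]

cpm_values = cpm(raw_counts)            # no TMM factors supplied here
fpkm_values = fpkm(cpm_values, cds_lengths)
```

In this toy example the second sample has exactly twice the depth of the first, so both samples end up with the same CPM and FPKM values for every gene. Passing non-unit `norm_factors` rescales each sample's effective library size before the per-million conversion.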

See this table for a list of collected datasets.

Check here for a walk through of output files.
