RNA-Seq expression analysis code and data

This repository collects publicly available RNA-Seq datasets (mostly in maize) and provides the code / pipeline used to analyze them systematically, along with QC and summarized data output. The focus is on large-scale (multiple tissues / developmental stages, multiple inbred / hybrid lines) Illumina RNA-Seq experiments, but it also includes experiments done with other sequencing technologies (3' RNA-Seq, etc.).

Raw sequencing reads were downloaded from the NCBI Sequence Read Archive (SRA), trimmed using Trim Galore / fastp, and mapped to the maize B73 AGP_v4 genome using Hisat2 / STAR. Uniquely mapped reads were assigned to the 46,117 reference gene models (Ensembl Plants v37) and counted using featureCounts. Raw read counts were normalized using the TMM approach to give CPM (Counts Per Million reads) values, which were then further normalized by gene CDS length to give FPKM (Fragments Per Kilobase of exon per Million reads) values. Hierarchical clustering and principal component analysis were used to explore sample clustering patterns.
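The count-to-CPM-to-FPKM step described above can be illustrated with a minimal Python sketch. This is not the repository's own code: the function names, gene names, counts and lengths are hypothetical, and the TMM normalization factors are assumed to have been computed elsewhere (for example by a tool such as edgeR) and passed in as per-sample scaling factors. With all factors set to 1, the sketch reduces to plain library-size scaling.

```python
def cpm(counts, norm_factors=None):
    """Convert raw counts to counts per million (CPM).

    counts: dict mapping gene ID -> list of raw read counts, one per sample.
    norm_factors: optional per-sample scaling factors (e.g. TMM factors
        computed elsewhere); defaults to 1.0 for every sample.
    """
    genes = list(counts)
    n_samples = len(counts[genes[0]])
    if norm_factors is None:
        norm_factors = [1.0] * n_samples
    # Library size = total assigned reads per sample.
    lib_sizes = [sum(counts[g][j] for g in genes) for j in range(n_samples)]
    # Effective library size = library size scaled by the normalization factor.
    eff_sizes = [lib_sizes[j] * norm_factors[j] for j in range(n_samples)]
    return {
        g: [counts[g][j] / eff_sizes[j] * 1e6 for j in range(n_samples)]
        for g in genes
    }


def fpkm(cpm_values, lengths_bp):
    """Divide CPM by gene length in kilobases to obtain FPKM.

    lengths_bp: dict mapping gene ID -> length in base pairs
        (the text above uses CDS length).
    """
    return {
        g: [v / (lengths_bp[g] / 1000.0) for v in vals]
        for g, vals in cpm_values.items()
    }


# Hypothetical toy data: three genes, two samples.
raw_counts = {"geneA": [100, 200], "geneB": [300, 600], "geneC": [600, 200]}
gene_lengths = {"geneA": 2000, "geneB": 1000, "geneC": 500}

cpm_table = cpm(raw_counts)  # no TMM factors supplied: plain CPM
fpkm_table = fpkm(cpm_table, gene_lengths)
```

Passing the factors into `cpm` rather than computing them inside keeps the sketch independent of any particular TMM implementation.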

See this table for a list of collected datasets.

Check here for a walkthrough of the output files.
