small output bug in starsolo 2.7.10a #1513

Closed
chubukov opened this issue Mar 24, 2022 · 3 comments
@chubukov

It looks like the wrong field is being written to Solo.out/Gene/Summary.csv.

Same input.
STAR-2.7.9a:

Reads Mapped to Gene: Unique+Multipe Gene,0.484805
Reads Mapped to Gene: Unique Gene,0.453594

STAR-2.7.10a_alpha_220314

Reads Mapped to Gene: Unique+Multiple Gene,0.453594
Reads Mapped to Gene: Unique Gene,0.453594

@alexdobin alexdobin added the issue: code Likely to be an issue with STAR code label Mar 29, 2022
@alexdobin
Owner

Hi Victor,

thanks for catching it, looks like a bug introduced in 2.7.10a.
What parameters did you run with?

Thanks,
Alex

@alexdobin alexdobin added this to the 2.7.11 milestone Mar 29, 2022
@chubukov
Author

Hi Alex,

Here's a typical run. One thing that might be relevant is that I built the genome with 2.7.9. Do I need to rebuild it?

    /home/vchubukov/STAR-2.7.10a_alpha_220314/source/STAR \
      --genomeDir /net/seq/data2/projects/prime_seq/genome \
      --readFilesIn /net/seq/data2/sequencers/211222_A01347_0078_AHMTNTDRXY/fastq/CGTACTAG_S2_L002_R2_001.fastq.gz /net/seq/data2/sequencers/211222_A01347_0078_AHMTNTDRXY/fastq/CGTACTAG_S2_L002_R1_001.fastq.gz \
      --soloType CB_UMI_Simple \
      --soloCellReadStats Standard \
      --clip3pAdapterSeq AAAAAAAAAA \
      --clip3pAdapterMMp 0.1 \
      --soloCBstart 1 \
      --soloCBlen 12 \
      --soloUMIstart 13 \
      --soloUMIlen 16 \
      --soloCBwhitelist /net/seq/data2/projects/prime_seq/barcodes-combined.txt \
      --soloCellFilter EmptyDrops_CR 72 .99 10 45000 90000 100000 0.01 20000 0.01 10000 \
      --quantMode "TranscriptomeSAM" \
      --outSAMunmapped Within \
      --outFilterType BySJout \
      --outSAMattributes NH HI AS NM MD CR CY UR UY GX GN CB UB \
      --outFilterMultimapNmax 20 \
      --outFilterMismatchNmax 999 \
      --outFilterMismatchNoverReadLmax 0.04 \
      --alignIntronMin 20 \
      --alignIntronMax 1000000 \
      --alignMatesGapMax 1000000 \
      --alignSJoverhangMin 8 \
      --alignSJDBoverhangMin 1 \
      --sjdbScore 1 \
      --readFilesCommand zcat \
      --runThreadN 10 \
      --limitBAMsortRAM 30000000000 \
      --outSAMtype BAM SortedByCoordinate \
      --outSAMheaderCommentFile commentslong.txt \
      --outSAMheaderHD '@HD' 'VN:1.4' 'SO:coordinate'

@alexdobin
Owner

Hi Victor,

after checking the code I realized this is correct behavior, as multi-gene reads are only counted if you specify --soloMultiMappers options. Without these options there is no difference between Unique and Unique+Multiple Gene counts.
To avoid confusion, in the next release I will replace the Unique+Multiple value with the "NoMulti" string.

Thanks again for pointing it out.
Alex

alexdobin added a commit that referenced this issue May 6, 2022
…Issue #1513: If --soloMultiMappers options are not requested, output 'NoMulti' in the 'Reads Mapped to Gene: Unique+Multiple Gene' line of the Summary.csv file.
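
For anyone comparing summaries across versions, here is a minimal sketch of how the "Unique+Multiple Gene" line could be interpreted in a script. It assumes the summary is a two-column `name,value` CSV as in the excerpts quoted in this thread, and that the placeholder is the literal string `NoMulti` mentioned in the commit message. The function names `read_summary` and `multi_gene_status` are hypothetical helpers, not part of STAR.

```python
import csv
import io

UNIQUE_KEY = "Reads Mapped to Gene: Unique Gene"
# Prefix match so both spellings seen in this thread are accepted:
# "Unique+Multipe Gene" (2.7.9a) and "Unique+Multiple Gene" (2.7.10a).
MULTI_PREFIX = "Reads Mapped to Gene: Unique+Multip"


def read_summary(text):
    """Parse two-column summary text (name,value) into a dict of strings."""
    rows = csv.reader(io.StringIO(text))
    return {row[0]: row[1] for row in rows if len(row) >= 2}


def multi_gene_status(summary):
    """Classify the Unique+Multiple line relative to the Unique line.

    Returns "NoMulti" if the placeholder string is present,
    "same" if both values are numerically equal, and "differs" otherwise.
    """
    unique = float(summary[UNIQUE_KEY])
    multi_raw = next(v for k, v in summary.items() if k.startswith(MULTI_PREFIX))
    if multi_raw.strip() == "NoMulti":
        return "NoMulti"
    return "same" if float(multi_raw) == unique else "differs"


if __name__ == "__main__":
    # Values copied from the 2.7.10a_alpha_220314 excerpt in this issue.
    example = (
        "Reads Mapped to Gene: Unique+Multiple Gene,0.453594\n"
        "Reads Mapped to Gene: Unique Gene,0.453594\n"
    )
    print(multi_gene_status(read_summary(example)))
```

A result of "same" or "NoMulti" is consistent with a run where multi-gene reads were not counted; "differs" suggests they were included in the Unique+Multiple value.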