Name	Name	Last commit message	Last commit date
Latest commit History 153 Commits
Accession_lists	Accession_lists
database-building	database-building
diamond	diamond
diamond_analysis	diamond_analysis
metadata	metadata
setup_scripts	setup_scripts
README.md	README.md
environment.yml	environment.yml
gene_hits.tsv	gene_hits.tsv
get_data.sbatch	get_data.sbatch

Nicotine-Microbiome

IQ Biology Rotation Project

Abstract

Nicotine Metabolism Genes in the Oral Microbiomes of Nicotine Users and Non-Users

John Sterrett 1,2, Noah Fierer 3,4

Department of Integrative Physiology, University of Colorado, Boulder, CO
Interdisciplinary Quantitative Biology, University of Colorado, Boulder, CO
Department of Ecology and Evolutionary Biology, University of Colorado, Boulder, CO
Cooperative Institute for Research in Environmental Sciences, Boulder, CO

Individuals who smoke cigarettes or use e-cigarettes inhale nicotine (among other compounds) frequently throughout the day, and such drug use increases risks of cancer, lung disease, and cardiovascular disease. Many studies have investigated the metabolism of nicotine within human cells and tissues, which occurs at a rate that is largely genetically determined, with this rate affecting the likelihood of successful smoking cessation. What has not been investigated is whether those bacteria living in the oral cavity are also capable of nicotine metabolism. Given that bacteria capable of nicotine metabolism have been found in other environments and given the well-characterized impacts of nicotine use on the taxonomic structure of oral microbiomes, it is possible that frequent nicotine use could select for nicotine-metabolizing bacteria in the oral microbiome. However, no research has specifically investigated whether genes for nicotine degradation are even present in the oral microbiome and, if so, whether the oral microbiomes of smokers and e-cigarette users have higher relative abundances of genes involved in nicotine catabolism.

To address this knowledge gap, I compiled a database of sequences for all known bacterial genes involved in nicotine degradation. I then analyzed a publicly available oral microbiome shotgun metagenomic dataset from a study of smokers, e-cigarette users, and non-users. The raw reads were quality filtered and merged, and bacterial nicotine degradation genes were identified. Nicotine gene counts were then normalized based on the number of single-copy bacterial marker genes per metagenome. I was able to detect genes encoding 16 of the 27 enzymes in the pyridine and pyrrolidine pathways for nicotine degradation, demonstrating that genes for nicotine degradation are indeed present in the oral microbiome. However, there were no significant increases in the relative abundances of nicotine degradation genes in smokers or e-cigarette users compared to non-users. The dataset, though, was limited by large variation in sequencing depth, and our analysis was constrained to relative abundances of such genes. The detection of nicotine degradation genes supports the application of our pipeline to other existing metagenomic datasets or the development of qPCR assays to quantify absolute abundances of these genes. Furthermore, the constructed database and pipeline could be used with reference genome databases to more identify which bacterial taxa, beyond those already described, are likely capable of nicotine catabolism. This work was supported in part by the Interdisciplinary Quantitative Biology (IQ Biology) program at the BioFrontiers Institute, University of Colorado, Boulder.

Goal

Assess number of nicotine-degrading genes in the oral microbiomes of smokers/vapers vs non-smokers/vapers

Main files of interest

diamond/ contains the Snakefile for the Snakemake pipeline
diamond_analysis/ contains the Jupyter notebook used to analyze the Diamond output
database-building/ contains diamond formatted databases for nicotine-degrading genes as well as the raw .fasta files for the database (used uniprot-fastas/uniprot-nicotine.fasta for the fasta)

Methods

Pull shotgun metagenomic data from this paper
- Project IDs: PRJNA548383, PRJNA544061, and PRJNA508385
Create database of sequences for genes involved in nicotine degradation, based on Mu et al., 2020
- MetaCyc degradation pathways tsvs added
- Pull fastas from GenBank
  - conda install -y -c conda-forge -c bioconda -c defaults entrez-direct
  - esearch -db protein -query 'Protein name' | efetch -format fasta
  - But some of the genes only have names, not accession IDs, so they return sequences for unwanted genes...
- Start with a few key genes to get the pipeline running
- Pull from UniProt and/or GenBank
  - UniProt API querying
- Then add others once things are working
- Check for these genes in their hosts in IMG as a sanity check
Merge reads using PEAR
Filter short reads using
Use Diamond to find nicotine-degrading genes in the oral microbiomes
Assess difference in nicotine-degrading genes in the oral microbiomes of smokes/vapers vs controls
Normalize based on total reads and bacterial single-copy marker genes (DIAMOND as well) from here

Database

Get MetaCyc gene details
Search E.C. IDs in UniProt
Download fastas for all matching sequences
- For 3.5.1.3, filtered only bacteria because lots of human and mouse sequences were coming up
- Skipped 1.3.99.- and 1.5.99.- because too many unrelated genes were coming up
- E.C. for nctB was sourced from Uniprot info on (S)-6-hydroxynicotine oxidase
- Skipped 1.5.8.M1 because it wasn't pulling up any results, and searching the name pulled up kdhA/B/C, which I already had

Diamond

Please cite http://dx.doi.org/10.1038/s41592-021-01101-x Nature Methods (2021)
tested diamond with diamond blastx -q PRJNA548383/SRR9641933/SRR9641933_1.fastq -d database-building/uniprot-nicotine-db.dmnd -o test.tsv
- Got a couple hundred hits

Files/details

setup_scripts contains scripts with code to install sra toolkit. I haven't tried these in script format (since I only needed to install it once), but I imagine I will need this code later if using fastq-dump on fiji.

If having issues see the documentation or config guidelines.
Note for future John can try the following instead of vdb-config -i:
- vdb-config --restore-defaults
- vdb-config -s "/repository/user/main/public/root=."
- vdb-config -s "cache-enabled=true"
- vdb-config --report-cloud-identity yes

get_data.sh is a script to loop through the SRA accession values for the projects and download the fastq files in directories named by their run IDs.

If having issues with this, consider something similar to the following:

while read line; do
wget http://trace.ncbi.nlm.nih.gov/Traces/sra/sra.cgi?cmd=dload&run_list=${line}&format=fastq;
done<list_of_ids

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Nicotine-Microbiome

Abstract

Goal

Main files of interest

Methods

Database

Diamond

Files/details

About

Releases

Packages

Languages

sterrettJD/Nicotine-Microbiome

Folders and files

Latest commit

History

Repository files navigation

Nicotine-Microbiome

Abstract

Goal

Main files of interest

Methods

Database

Diamond

Files/details

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages