jordangumm/omics_16s

Omics 16S

Automated workflow for OTU classification using shotgun data from Illumina MiSeq FASTQ files. It follows the Mothur MiSeq SOP except for the initial quality control steps. The length and abundance of high-quality reads have been shown to be primary factors in avoiding spurious or inflated OTU classification, so the workflow places special emphasis on those steps. It also attempts to refine freshwater OTU classifications using TaxAss.

It leverages third-party tools and databases:

  • MeFit 2016: a merging and filtering tool for Illumina paired-end reads, designed specifically for 16S rRNA amplicon sequencing data
  • Mothur 2009: software for describing and comparing microbial communities
  • Silva reference files v128 2016: 16S rRNA seed database and sequence/taxonomy references
  • TaxAss 2017: fine-scale taxonomic assignment for freshwater datasets (by default)

Setup

Building CASPER requires the g++ compiler and the Boost libraries, so make sure both are installed before you start.
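A quick pre-flight check can confirm these prerequisites before the build is attempted. The sketch below is not part of the repository: `missing_tools`, `find_boost_headers`, and the include directories it searches are assumptions chosen for illustration, and Boost may be installed somewhere else on your system.

```python
import shutil
from pathlib import Path

# Assumed locations for Boost headers; adjust for your system.
DEFAULT_INCLUDE_DIRS = ("/usr/include", "/usr/local/include", "/opt/homebrew/include")


def missing_tools(tools):
    """Return the tools from `tools` that are not found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


def find_boost_headers(include_dirs=DEFAULT_INCLUDE_DIRS):
    """Return the first directory containing boost/version.hpp, or None."""
    for directory in include_dirs:
        if (Path(directory) / "boost" / "version.hpp").is_file():
            return Path(directory)
    return None


if __name__ == "__main__":
    for tool in missing_tools(["g++"]):
        print(f"missing compiler: {tool}")
    if find_boost_headers() is None:
        print("Boost headers not found in the assumed include directories")
```

If the header search comes back empty, that only means Boost is not in one of the assumed directories; pass your own list to `find_boost_headers` before concluding it is missing.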

  1. Clone the repository

$ git clone https://github.com/jordangumm/omics_16s.git

  2. Build the local conda environment with dependencies

$ cd omics_16s
$ ./build.sh

  3. Activate the root environment

$ source <path_to_project>/dependencies/miniconda/bin/activate

Run Analysis

Run the runner script against a sequencing run directory containing FASTQ files:

$ python runner.py <run_dp>
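It can help to confirm that the run directory holds complete read pairs before launching the workflow. The sketch below is not how `runner.py` discovers its input. It assumes standard Illumina file names with `_R1_` and `_R2_` tags and a `.fastq` or `.fastq.gz` extension, and `find_read_pairs` is a hypothetical helper.

```python
from pathlib import Path


def find_read_pairs(run_dp):
    """Pair forward and reverse FASTQ files in `run_dp` by file name.

    Assumes Illumina-style names such as sample_S1_L001_R1_001.fastq.gz.
    Returns (pairs, unpaired): a sorted list of (R1, R2) paths, and a
    sorted list of R1 or R2 files whose mate is missing. Files without an
    _R1_ or _R2_ tag are ignored.
    """
    fastqs = sorted(
        p for p in Path(run_dp).iterdir()
        if p.name.endswith((".fastq", ".fastq.gz"))
    )
    names = {p.name for p in fastqs}
    pairs, unpaired = [], []
    for path in fastqs:
        if "_R1_" in path.name:
            mate = path.name.replace("_R1_", "_R2_", 1)
            if mate in names:
                pairs.append((path, path.with_name(mate)))
            else:
                unpaired.append(path)
        elif "_R2_" in path.name:
            if path.name.replace("_R2_", "_R1_", 1) not in names:
                unpaired.append(path)
    return pairs, unpaired
```

A non-empty `unpaired` list usually points to an incomplete transfer or a renamed file, which is cheaper to catch here than partway through a run.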

You can also run on Flux:

$ python runner.py -q <queue> --flux <run_dp> --account <flux_account_name>
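When the workflow is launched from another script, the two invocations above can be assembled programmatically. This minimal sketch uses only the options shown in this README (`-q`, `--flux`, `--account`) and keeps them in the same order. `build_runner_command` and `run_workflow` are hypothetical wrappers, and the sketch assumes the queue and account are always supplied together, as in the example.

```python
import subprocess
import sys


def build_runner_command(run_dp, queue=None, account=None):
    """Build the argument list for runner.py.

    With neither `queue` nor `account` given, this is the local invocation;
    with both, it mirrors the Flux invocation shown in this README.
    """
    if (queue is None) != (account is None):
        # Assumption of this sketch, not a documented rule of runner.py.
        raise ValueError("queue and account must be given together")
    # sys.executable is the current interpreter, so activate the project
    # environment before calling this.
    command = [sys.executable, "runner.py"]
    if queue is not None:
        command += ["-q", queue, "--flux"]
    command.append(run_dp)
    if account is not None:
        command += ["--account", account]
    return command


def run_workflow(run_dp, queue=None, account=None):
    """Run runner.py from the current directory and raise if it fails."""
    subprocess.run(build_runner_command(run_dp, queue, account), check=True)
```

Passing the arguments as a list rather than a single shell string avoids quoting problems when a run directory path contains spaces.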

About

Base workflow for setting up OTU analysis from Illumina fastq(s)
