Microbiome Helper 2 Amplicon SOP (qiime2‐amplicon‐2024.10)
Authors:
Modifications by: NA
Based on initial versions by: Gavin Douglas (Amplicon SOP v2 (qiime2 2018.6)) and André Comeau (PacBio CCS Amplicon SOP v1 (qiime2))
You should run the rest of the workflow in a conda environment, which ensures that the correct versions of the Python packages required by QIIME 2 are used. You can activate this conda environment with the following command (you may need to swap in `source` for `conda` if you get an error):
conda activate qiime2-amplicon-2024.10
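Before continuing, it can help to confirm that the environment is actually active. The snippet below is a minimal sketch, assuming that conda exports the name of the active environment in the `CONDA_DEFAULT_ENV` variable; the `env_is_active` helper is a hypothetical name introduced here for illustration and is not part of conda or QIIME 2.

```shell
# Hypothetical helper: succeeds only if the named conda environment is active.
# Assumes conda sets CONDA_DEFAULT_ENV to the active environment's name.
env_is_active() {
    [ "${CONDA_DEFAULT_ENV:-}" = "$1" ]
}

EXPECTED_ENV="qiime2-amplicon-2024.10"

if env_is_active "$EXPECTED_ENV"; then
    echo "Environment $EXPECTED_ENV is active."
else
    echo "Warning: $EXPECTED_ENV is not active; run: conda activate $EXPECTED_ENV" >&2
fi
```

If the warning appears, activate the environment and run the check again before moving on.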
Several commands throughout this workflow can run on multiple cores in parallel. The number of cores to use in these cases is stored in the `NCORES` variable defined below. We set it to 1 here, but you can change it to however many cores you would like to use.
NCORES=1
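If you are unsure how many cores your machine has, you could detect the count instead of hard-coding it. This is a sketch rather than part of the original SOP: `detect_cores` is a hypothetical helper, and it assumes that either `nproc` (GNU coreutils) or `getconf _NPROCESSORS_ONLN` is available, falling back to 1 otherwise.

```shell
# Hypothetical helper: print the number of online cores, or 1 if unknown.
detect_cores() {
    if command -v nproc >/dev/null 2>&1; then
        nproc
    elif getconf _NPROCESSORS_ONLN >/dev/null 2>&1; then
        getconf _NPROCESSORS_ONLN
    else
        echo 1
    fi
}

NCORES=$(detect_cores)
echo "Using $NCORES core(s)"
```

On a shared server you will usually want a smaller value than the detected total, so that other users' jobs are not starved.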
Visualize sequence quality across raw reads. This is important as a sanity check that your reads are of reasonable quality and to determine how your reads should be trimmed in downstream steps. QIIME 2 comes with a plugin for visualizing read quality, which we will use at a later step. However, when dealing with raw reads, the easiest method is a combination of [FASTQC][15] and [MultiQC][16]. Note that these tools are not packaged with QIIME 2, so you will need to install them separately.
This is an important step for identifying outlier samples with especially low quality, unusual read lengths, low read depth, or other problematic metrics.
You can run FASTQC with this command (after creating the output directory).
mkdir fastqc_out
fastqc -t $NCORES raw_data/*.fastq.gz -o fastqc_out
If you receive the error `Value "FASTQ" invalid for option threads (number expected)` (where "FASTQ" is an input filename), make sure you have defined the `NCORES` variable correctly and re-run the command.
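That error typically arises when `NCORES` is empty, so that `-t` takes the first filename as its value. One way to catch this early is to validate the variable before running FASTQC. The following is a minimal sketch; `check_ncores` is a hypothetical helper introduced for illustration, not a FASTQC or QIIME 2 feature.

```shell
# Hypothetical helper: succeeds only if NCORES is set to a positive integer.
check_ncores() {
    case "${NCORES:-}" in
        ''|*[!0-9]*)
            echo "NCORES must be a positive integer, got: '${NCORES:-}'" >&2
            return 1
            ;;
    esac
    if [ "$NCORES" -lt 1 ]; then
        echo "NCORES must be at least 1, got: '$NCORES'" >&2
        return 1
    fi
}

NCORES=1
if check_ncores; then
    echo "NCORES=$NCORES is valid"
    # It is now safe to run, for example:
    # fastqc -t "$NCORES" raw_data/*.fastq.gz -o fastqc_out
fi
```

Quoting `"$NCORES"` in the FASTQC call does not fix an empty value, but it does make the resulting error message easier to interpret.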
FASTQC generates a report for each individual file. To aggregate the summary files into a single report, we can run MultiQC with this command:
multiqc fastqc_out --filename multiqc_report.html
The full report is found within `multiqc_report.html`. You can view this report in a web browser on your local computer. The most important reason to visualize this report is to ensure that your samples are of high quality (based largely on whether the per-base quality is >30 across most of the reads) and that there are no outlier samples.
Note that this can be skipped if not using
Here you should pick just one option!!
Links to next tutorials including whole analysis workflow and basic visualisation and statistics in QIIME2
- Please feel free to post a question on the Microbiome Helper Google group if you have any issues.
- General comments or inquiries about Microbiome Helper can be sent to [email protected].