
Microbiome Helper 2 Amplicon SOP (qiime2‐amplicon‐2024.10)

Monica Alvaro Fuss edited this page Jan 22, 2025 · 10 revisions

Authors:
Modifications by: NA
Based on initial versions by: Gavin Douglas (Amplicon SOP v2 (qiime2 2018.6)) and André Comeau (PacBio CCS Amplicon SOP v1 (qiime2))

1. First steps

1.1 Activate QIIME2 environment

You should run the rest of the workflow in a conda environment, which ensures that the correct versions of the Python packages required by QIIME 2 are used. You can activate this conda environment with the command below (if you get an error, you may need to use source in place of conda):

conda activate qiime2-amplicon-2024.10
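To confirm that the activation worked before moving on, you can check which environment is active and whether the qiime executable is on your PATH. This is a minimal sketch rather than part of the original SOP; it relies on the CONDA_DEFAULT_ENV variable that conda sets on activation, and the qiime_path variable name is only illustrative.

```shell
# CONDA_DEFAULT_ENV is set by "conda activate"; print "none" if it is empty.
echo "Active conda environment: ${CONDA_DEFAULT_ENV:-none}"

# Look for the qiime executable on the PATH (qiime_path is an illustrative name).
if qiime_path=$(command -v qiime 2>/dev/null); then
    echo "qiime found at: $qiime_path"
else
    qiime_path=""
    echo "qiime not found on PATH; is the environment activated?" >&2
fi
```

If qiime is not found, re-run the activation command above before continuing.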

1.2 Set number of cores

Several commands throughout this workflow can run on multiple cores in parallel. The number of cores to use in these cases is stored in the NCORES variable defined below. We set it to 1 here, but you can change it to however many cores you would like to use.

NCORES=1
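If you would rather not hard-code the value, the number of available cores can be detected instead. The following is a minimal sketch, not part of the original SOP; it assumes that nproc (GNU coreutils) or getconf is available and falls back to 1 otherwise.

```shell
# Detect the number of available cores; fall back to 1 if detection fails.
NCORES=$(nproc 2>/dev/null || getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1)
echo "Using $NCORES core(s)"
```

On a shared server you may prefer to set a smaller number by hand so that other users are not starved of resources.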

1.3 Inspect read quality

Visualize sequence quality across raw reads. This is an important sanity check that your reads are of reasonable quality, and it helps determine how your reads should be trimmed in downstream steps. QIIME 2 comes with a plugin for visualizing read quality, which we will use at a later step. However, when dealing with raw reads the easiest method is to use a combination of FASTQC and MultiQC. Note that these tools are not packaged with QIIME 2, so you will need to install them separately.

This step is also important for identifying outlier samples with especially low quality, read length, read depth, or other metrics.

You can run FASTQC with this command (after creating the output directory).

mkdir fastqc_out
fastqc -t $NCORES raw_data/*.fastq.gz -o fastqc_out

If you receive the error Value "FASTQ" invalid for option threads (number expected) (where "FASTQ" is an input filename) then make sure you have defined the NCORES variable correctly and re-run the command.
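This error arises because an unset or empty NCORES expands to nothing, so the -t option takes the first input filename as its value. You can guard against this by checking the variable before running the command. The sketch below is only an illustration: is_positive_int is a hypothetical helper, not part of FASTQC or QIIME 2.

```shell
# Hypothetical helper: succeeds only if its argument is a positive integer.
is_positive_int() {
    case "$1" in
        ''|*[!0-9]*) return 1 ;;   # empty, or contains a non-digit character
        *) [ "$1" -gt 0 ] ;;       # digits only: require a value above zero
    esac
}

if is_positive_int "${NCORES:-}"; then
    echo "NCORES=$NCORES looks fine"
else
    echo "NCORES is unset or not a positive integer; define it first (e.g. NCORES=1)" >&2
fi
```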

FASTQC generates a report for each individual file. To aggregate the summary files into a single report we can run MultiQC with this command:

multiqc fastqc_out --filename multiqc_report.html

The full report is found within multiqc_report.html. You can view this report in a web browser on your local computer. The most important reason to visualize this report is to ensure that your samples are of high quality (based largely on whether the per-base quality score is above 30 across most of the reads) and that there are no outlier samples.
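Because a sample that is missing from the aggregated report is easy to overlook, it can help to confirm that FASTQC wrote a report for every input file. This is a minimal sketch, assuming FASTQC's default output naming (an input called name.fastq.gz yields name_fastqc.html) and the directory names used in this section; the function name is hypothetical.

```shell
# Hypothetical helper: print inputs in directory $1 that lack a FASTQC HTML report in directory $2.
missing_fastqc_reports() {
    in_dir=$1
    out_dir=$2
    for fq in "$in_dir"/*.fastq.gz; do
        [ -e "$fq" ] || continue                  # the glob matched no files
        base=$(basename "$fq" .fastq.gz)          # strip the path and the extension
        [ -f "$out_dir/${base}_fastqc.html" ] || echo "$base"
    done
}

# Run the check only if the directories from this section exist.
if [ -d raw_data ] && [ -d fastqc_out ]; then
    missing_fastqc_reports raw_data fastqc_out
fi
```

No output means that every input file has a matching report.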

1.4 Format metadata table

Note that this can be skipped if not using

2. Import, trim primers and denoise

Pick just one of the options below!

2.1 Illumina import, trim primers and denoise

2.1.1 Import FASTQs as QIIME 2 artifact

2.1.2 Summarize raw FASTQs

2.1.3 Trim primers with cutadapt

16S V4/V5 universal

16S V6/V8 bacteria-specific

16S V6/V8 archaea-specific

18S V4

ITS2

Other

2.1.4 Summarize trimmed FASTQs

2.1.5 Denoise with Deblur (recommended)

Join paired-end reads

Filter out low-quality reads

Running deblur

Summarizing deblur output

Copy denoised output to be same as from DADA2 below

2.2 PacBio import and denoise

2.2.1 Import FASTQs as QIIME 2 artifact

2.2.2 Summarize raw FASTQs

2.2.3 Denoise with DADA2 (recommended)

16S bacteria-specific

16S archaea-specific

18S

ITS

2.2.4 Summarizing DADA2 output

2.2.5 Copy denoised output to be same as from Deblur

3. Assign taxonomy to ASVs

3.1 Build or acquire taxonomic classifier

16S Greengenes2 (recommended)

16S/18S SILVA v138

16S HOMD

16S GTDB

ITS2

Building your own classifier

3.2 Run taxonomic classification

3.3 Assess subset of taxonomic assignments with BLAST

4. Filtering resultant table

4.1 Filter out rare ASVs

4.2 Filter out contaminant and unclassified ASVs

16S

Other

4.3 Exclude low-depth samples

4.4 Subset and summarize filtered table

5. Build tree

5.1 Insertion with SEPP into tree (16S)

5.2 Building a tree

Ghost tree for ITS2 or others?

Building a de novo tree - would ssu-align or similar do a better job than mafft?

6. Exporting the final abundance, profile and sequence files

Other resources

Links to next tutorials including whole analysis workflow and basic visualisation and statistics in QIIME2
