Assembling SARS-CoV-2 Genomes
Analysis pipeline for assembling SARS-CoV-2 sequences from Illumina data, including quality trimming, removal of amplicon primers, mapping to a reference, SNP calling and consensus generation.
Automated Workflow for Geneious Prime 2021.1.2 and above
The analysis pipeline described here has been combined into a Geneious Workflow, attached below.
To import the workflow into Geneious, drag and drop it onto the Geneious window. Then go to Manage Workflows, select the workflow and click View/Edit. Open each step of the workflow by clicking on the step and then clicking View/Edit options. This will update the options for your copy of Geneious and enable you to select the primers and reference sequence from your own database.
To run the workflow, close the Manage Workflows window and select the files of raw Illumina reads you wish to assemble. Then click the Workflows button and select the workflow from the list.
NCBI has a new BLAST database for betacoronaviruses. To add this database to NCBI BLAST in Geneious, go to Tools -> Add/Remove Databases -> Set up BLAST Services. Under the NCBI tab, click Edit Databases, and then Add+. Enter the details as in the screenshot – ensure the database name is written exactly as shown here.
2. Artic Network Primers for Tiled Amplicon Sequencing
SARS-CoV-2 V3 primers for tiled amplicon sequencing from the Artic Network
Further information on importing other amplicon primer sets – Quick guide to importing primer sets from TSV/CSV files
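Before importing a primer set from a TSV file, it can help to check that the file parses cleanly. The following is a minimal Python sketch for that check, done outside Geneious. It assumes a tab-separated file with a header row containing columns called name and seq; those column names, the function read_primers, and the placeholder primer names and sequences in the example are illustrative assumptions, not part of any official primer file.

```python
import csv
import io

# IUPAC nucleotide codes accepted in a primer sequence.
IUPAC = set("ACGTRYSWKMBDHVN")


def read_primers(handle, name_col="name", seq_col="seq"):
    """Read primer names and sequences from a tab-separated file handle.

    The column names are assumptions; adjust them to match your file.
    Returns a dict mapping primer name to upper-case sequence.
    """
    primers = {}
    reader = csv.DictReader(handle, delimiter="\t")
    for row in reader:
        name = row[name_col].strip()
        seq = row[seq_col].strip().upper()
        if not seq or set(seq) - IUPAC:
            raise ValueError(f"Invalid sequence for primer {name!r}: {seq!r}")
        if name in primers:
            raise ValueError(f"Duplicate primer name: {name!r}")
        primers[name] = seq
    return primers


# Example with placeholder data (not real primer sequences).
example_tsv = (
    "name\tpool\tseq\n"
    "example_1_LEFT\t1\tacgtacgtac\n"
    "example_1_RIGHT\t1\tTTGGCCAATT\n"
)
example_primers = read_primers(io.StringIO(example_tsv))
for primer_name, primer_seq in example_primers.items():
    print(primer_name, primer_seq)
```

For a file on disk, pass an open file handle instead of the StringIO object, for example open("primers.tsv", newline="").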
4. SARS-CoV-2 Genomes from NCBI Viruses
(All nucleotide sequences available on the NCBI SARS-CoV-2 data hub as at 12th May 2020)
To update this list with new sequences as they are released, we recommend setting up an Agent. To do this, go to View -> Agents, then click Create. Select Nucleotide under Search Database, and search by Organism contains “Severe acute respiratory syndrome coronavirus 2”, as shown in the screenshot below.
Note that there is a delay of a few days between a sequence appearing on the NCBI SARS-CoV-2 data hub and being available for searching in NCBI Nucleotide.
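A comparable search can be run against NCBI Nucleotide from a script, using the NCBI E-utilities esearch endpoint. The sketch below only builds the request URL for the same organism query. It is not how the Geneious Agent works internally, and the function name build_sarscov2_search_url and its defaults are illustrative assumptions.

```python
from urllib.parse import urlencode

# NCBI E-utilities search endpoint.
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_sarscov2_search_url(min_date=None, max_date=None, retmax=100):
    """Build an esearch URL for SARS-CoV-2 records in NCBI Nucleotide.

    Dates are strings in YYYY/MM/DD format. E-utilities expects mindate
    and maxdate together, so the date filter is added only if both are given.
    """
    params = {
        "db": "nuccore",  # the NCBI Nucleotide database
        "term": '"Severe acute respiratory syndrome coronavirus 2"[Organism]',
        "retmax": str(retmax),
        "retmode": "json",
    }
    if min_date and max_date:
        params["datetype"] = "pdat"  # filter on publication date
        params["mindate"] = min_date
        params["maxdate"] = max_date
    return ESEARCH_URL + "?" + urlencode(params)


# Example: records published on or after 12 May 2020, up to an assumed end date.
url = build_sarscov2_search_url(min_date="2020/05/12", max_date="2020/06/12")
print(url)
```

Fetching the URL returns a list of matching record identifiers. Because of the delay noted above, recently released sequences may not appear in the results for a few days.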