Creating Automated Workflows Tutorial

Learn how to use Workflows to automate your analyses. In this tutorial you will set up a multi-step workflow that starts from raw Sanger sequences and ends with a phylogenetic tree.

Introduction

Creating a Workflow in Geneious Prime

The goal of this tutorial is to show you how to create a workflow in Geneious Prime.

What are workflows?

Workflows let you automate an analysis instead of running each step as a separate operation. With Workflows you can group Operations together, minimizing the number of steps required to perform an often-used combination of analyses.

All options for each Operation may be preconfigured, so everything is preset before running. Alternatively, some or all options can be exposed to the user for configuration when the Workflow is run.

Geneious Prime provides more than 20 example Workflows that you can try, covering pipelines such as: Align DNA and then build tree, Apply Variants to Reference Sequence, Map reads then find variation/SNPs, and Randomly Sample Sequences.

Workflows can be run or managed via the Tools menu or the Geneious Toolbar.

Workflows can also be easily shared with other people, either by exporting and importing them or, if you are connected to a shared database, by ticking the option to share them.

If you have programming knowledge you can add customized code in Java. For more detailed information about Workflows you can check Section 16 of the Geneious manual. In this tutorial we will show you how to create a simple workflow with example data.

INSTRUCTIONS
To complete the tutorial yourself with included sequence data, download the tutorial and install it by dragging and dropping the zip file into Geneious Prime. Do not unzip the tutorial.

How to create automated workflows

Exercise 1: Automate Sanger data analysis with a workflow

In this example we are going to combine the following steps for analyzing Sanger data into a single Workflow:
– Trim the ends of the Sanger sequences where the error probability exceeds 1%
– De novo assemble the overlapping forward and reverse reads of each sample, grouped by file name, and save the contigs
– Extract the consensus sequences
– Multiple-align the consensus sequences
– Build a phylogenetic tree

The input sequences come from a barcoding project on two shark species: the Longnose spurdog (Squalus blainville) and the Shortnose spurdog (Squalus megalops). The two are very difficult to tell apart based on morphology, but are easily barcoded by Sanger sequencing. The chromatogram data was downloaded from the BOLD database (Accession BOLD:AAA1550). Each sample was sequenced with forward and reverse primers.

Let’s build a workflow for the analysis of these sequences.

Create a new workflow

Click Workflows in the Toolbar, select Manage Workflows, and then choose New Workflow in the dialog.

Name it Sanger analysis and add a description, e.g. “Sanger sequences trimming, de novo assembly, multiple alignment and tree building”. Then select the Chromatogram as the Icon.

Now we are ready to add the steps to our workflow.

Add trimming operation to the workflow

Add the first operation by pressing Add step → Add operation.

The first step we want to run trims low-quality bases from the 3’ and 5’ ends of the sequences. Start typing the name of the operation you would like to run to automatically filter the available operations. Select Trim Ends and click OK.

The Trim Ends operation will be added to the workflow.

Double-click the operation to check or modify its options. In this example we are not going to expose any option to the user, and we leave all the options at their defaults except for decreasing the error probability limit to 0.01, as shown below. Then click OK.
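As background on what an error probability limit of 0.01 means, here is a minimal Python sketch. It converts Phred quality scores to error probabilities (p = 10^(-Q/10)) and keeps the contiguous region that maximizes the sum of (limit - p). This is one common way to do quality trimming and is shown as an assumption for illustration only; the function names and the example quality values are hypothetical, and the Trim Ends operation in Geneious Prime may work differently.

```python
def phred_to_error_prob(q):
    """Convert a Phred quality score to a base-call error probability."""
    return 10 ** (-q / 10)


def trim_region(qualities, limit=0.01):
    """Return (start, end) of the region to keep, with end exclusive.

    Keeps the contiguous run of bases that maximizes sum(limit - p_error),
    found with a maximum-subarray scan. Returns (0, 0) when no base has an
    error probability below the limit.
    """
    best_sum, best_start, best_end = 0.0, 0, 0
    run_sum, run_start = 0.0, 0
    for i, q in enumerate(qualities):
        run_sum += limit - phred_to_error_prob(q)
        if run_sum <= 0:
            # Running score fell to zero or below: restart after this base.
            run_sum, run_start = 0.0, i + 1
        elif run_sum > best_sum:
            best_sum, best_start, best_end = run_sum, run_start, i + 1
    return best_start, best_end


# Hypothetical read: noisy ends (Q5 to Q10) around a clean core (Q30 to Q40).
quals = [5, 8, 10, 30, 35, 40, 40, 38, 30, 10, 6]
start, end = trim_region(quals, limit=0.01)
print(f"keep bases {start}..{end - 1}")
```

With a limit of 0.01, a base needs a quality above Q20 to contribute positively, so a lower limit trims more aggressively than a higher one.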

Add de novo assembly operation to the workflow

Now we can add a second step, following the same procedure as above: Add step → Add operation and select Align/Assemble → De novo assembly.

Also for this operation, we are not going to expose any of the options. Double click on the new step and modify only the options highlighted in the following screenshot:

The assembly will take the file naming into account and assemble all the chromatograms that share the same name prior to the dash (hyphen) into one contig. In this example we want to assemble the forward and reverse reads for each of the samples sequenced. As the result of this operation we would like to save the contigs.
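The name-based grouping described above can be sketched in a few lines of Python. The file names below are hypothetical (the tutorial's actual names may differ), and splitting at the first hyphen is an assumption made for this illustration, not a statement about how Geneious Prime parses names.

```python
from collections import defaultdict


def group_by_sample(names, separator="-"):
    """Group read names by the text before the first separator."""
    groups = defaultdict(list)
    for name in names:
        sample = name.split(separator, 1)[0]
        groups[sample].append(name)
    return dict(groups)


# Hypothetical chromatogram names: one forward and one reverse read per sample.
reads = ["SharkA-F.ab1", "SharkA-R.ab1", "SharkB-F.ab1", "SharkB-R.ab1"]
for sample, members in group_by_sample(reads).items():
    print(sample, members)
```

Each group would then be assembled into its own contig, which is why consistent file naming matters before running the workflow.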

Note that if you intend to return more than one output at this stage (e.g. contigs and consensus sequences) and then add another step, you will need to specify which output to pass to the next step, for example with a filtering step (Add step → Filter documents).

Add more operations to the workflow

Next, add the following two operations, in this order:

– Generate Consensus Sequence
– Alignment → MUSCLE Alignment

Leave both with their default options. The Workflow at this stage should look like the following:

The next and last step we would like to add to the workflow is:

– TreeBuilding → Geneious Tree Builder (nucleotide)

In this case, however, we would like to expose the option to change the Tree building method (Neighbor-Joining or UPGMA). To do that, select Expose some options and then find the option corresponding to the Tree building method as shown below:

Finalize the workflow

Workflows return only the result of the last step. If you would like to save the output of intermediate steps, add the step called Save Documents / Branch from the Add Step menu.
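The idea that a pipeline returns only its final output unless an intermediate result is explicitly saved can be illustrated with a small Python sketch. Everything here is hypothetical: run_workflow and the toy steps are stand-ins written for this illustration and are not part of Geneious Prime.

```python
def run_workflow(documents, steps):
    """Run steps in order, feeding each step's output into the next.

    Each step is a (name, function, save) tuple. Only the last step's
    output is returned as the result; outputs of steps flagged with
    save=True are kept as well, mimicking a save step in the pipeline.
    """
    saved = {}
    current = documents
    for name, func, save in steps:
        current = func(current)
        if save:
            saved[name] = current
    return current, saved


# Toy stand-ins for real operations (hypothetical, for illustration only).
steps = [
    ("trim", lambda docs: [d.strip("n") for d in docs], False),
    ("assemble", lambda docs: ["".join(docs)], True),
    ("length", lambda docs: [len(d) for d in docs], False),
]
result, saved = run_workflow(["nnACGTn", "nTTGAnn"], steps)
print(result)
print(saved)
```

Without the save flag on the "assemble" step, its output would be consumed by the next step and never reach the user.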

In this example, we would like to save the contigs generated by the de novo assembly step. Add the step called Save Documents / Branch with the following options:

And drag and drop this step right after the de novo assembly step:

Congratulations! Your workflow is now set up. Click OK; in the next step you will test it on the example data.

Run the workflow

Now select all the example chromatograms provided:

Then run the Sanger analysis Workflow you just designed via the Tools menu or the Toolbar.

Select Neighbor-Joining as the Tree building method.
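Neighbor-Joining and UPGMA are both distance-based methods: they build the tree from a matrix of pairwise distances between the aligned sequences. The Python sketch below shows how such a matrix could be computed using the Jukes-Cantor correction. The choice of Jukes-Cantor, the function names, and the toy alignment are assumptions made for illustration; the distance model actually used by the Geneious Tree Builder depends on its settings.

```python
import math


def p_distance(a, b):
    """Proportion of differing sites between two aligned sequences,
    ignoring columns where either sequence has a gap."""
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(x != y for x, y in pairs) / len(pairs)


def jukes_cantor(p):
    """Jukes-Cantor corrected distance; only defined for p < 0.75."""
    if p == 0:
        return 0.0
    if p >= 0.75:
        raise ValueError("p-distance too large for Jukes-Cantor correction")
    return -0.75 * math.log(1 - 4 * p / 3)


def distance_matrix(aligned):
    """Map each ordered pair of sequence names to its corrected distance."""
    return {
        (m, n): jukes_cantor(p_distance(aligned[m], aligned[n]))
        for m in aligned
        for n in aligned
    }


# Toy alignment with hypothetical sample names.
alignment = {
    "sample1": "ACGTACGTAC",
    "sample2": "ACGTACGTAT",
    "sample3": "ACGAACTTAT",
}
for pair, dist in distance_matrix(alignment).items():
    print(pair, round(dist, 4))
```

A tree builder then clusters the samples from this matrix; the two methods differ in how they join clusters, not in the kind of input they need.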

As a result you will get an assembly document per sample as well as a phylogenetic tree including all the samples.

Under the tree view option Show Tip Labels you can select the Species or the Common name of each sample (as in the screenshot). Notice how it is possible to tell apart the two morphologically similar species with one single barcoding PCR.

Congratulations! You’ve successfully designed and run your first workflow!

Debugging and Quick access

As a debugging aid, we advise testing each step on your data before adding new steps, to make sure everything runs as expected.

For quick access to your newly built workflow, you can add a shortcut to the Toolbar: right-click the Toolbar, select customize, and then tick your workflow in the list.

Note that buttons can also be added to the Toolbar with the Show Toolbar Shortcut option, which you will find by clicking the gear button in the lower left corner of any Geneious operation.