Analyzing Microsatellite Traces
Learn how to use the Microsatellite plugin for Geneious to fit a ladder, call peaks, bin alleles and produce a table of genotypes.
Complete the tutorial yourself with included sequence data. Download the tutorial then install by dragging and dropping the zip file into Geneious Prime. Do not unzip the tutorial.
The Microsatellite Plugin for Geneious Prime provides a user-friendly tool that imports ABI fragment analysis files and allows you to visualize traces, fit ladders, call peaks, predict bins, display alleles in tabular format and export your data for further analysis.
To install the plugin, go to Tools → Plugins, choose Microsatellite Plugin from the list of available plugins and click Install. You will then need to restart Geneious Prime.
For background information on microsatellites, you can refer to this Wikipedia page.
This tutorial covers all the core features of the Geneious plugin.
Exercise 1 – Ladders
Before any alleles can be called, you need to check that the ladders have been called correctly and edit any that haven’t. Geneious will automatically fit the correct ladder once all the ladder peaks are called correctly.
Select all of the provided documents and open the Traces tab. Set spacing to 80, uncheck Allow Vertical Overlap and Scale X Axes, and set the sizing method to 3rd Order Least Squares. Make sure Loci says New Loci so you’re starting fresh. Uncheck all dyes except the ladder (LIZ in this case), and turn on show traces, show peak calls, and show peak labels. Peak calls are shown as vertical lines below the trace. Depending on the width of your screen, not all of the peak labels (at the bottom of the peak calls) may show up. Mouse Co-ordinates, Document Names, and Y Axes Scale can be turned on too. You should end up with the view looking like this:
Examine the selected files, looking for any without a ladder or where the ladder hasn’t fitted correctly. The first two to deal with are A01 and A03: select them in the document table and delete them.
Once you’ve deleted them, select all again and carry on checking for others that don’t look good. Check the ladder starts at 60 and finishes at 600. Some may finish at less than 600.
Notice how A06 has ladder peaks at the end which look wrong. Select only the A06 sequence and check the Ladder tab – it should be obvious that the peaks have diverged from the ideal:
Return to the Traces tab. Select A05, A06 and A07 all at once so the ladders roughly line up. A06 appears to have some missing peak calls. You can either call these peaks manually or just discard the peaks after 480. In this case there aren’t any alleles after 480 (as you’ll see if you turn on the other dyes and turn off LIZ), so we can remove the peaks by selecting that region and hitting Remove Peak. If you don’t remove these incorrect peaks, the sizing algorithm may not work correctly.
Save your changes and check the Ladder tab again to see that the remaining peaks fit the ideal properly now. Return to the Traces tab and select all again.
A10 is over-trimmed so you should select that document and move the trimmed region. Drag it left until the first peak called in LIZ is 60. This ladder should be recognized as GeneScan 600. Save that document and then select all again.
With all the trims and ladders correct, you can turn off the ladder (LIZ) and just see the other dyes. Hit the scale X button to draw in base coordinates scaled according to the ladder instead of the electropherogram machine coordinates.
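To make the idea of ladder-based sizing concrete, here is a minimal sketch of a third-order least-squares fit that maps machine scan positions to base-pair sizes. It is not the plugin's own code: the function names (fit_polynomial, make_sizer) and the ladder values are hypothetical, and the fit is done with plain normal equations purely for illustration.

```python
def fit_polynomial(xs, ys, order=3):
    """Least-squares polynomial fit; returns coefficients, lowest power first."""
    n = order + 1
    # Normal equations A c = b with A[i][j] = sum(x^(i+j)), b[i] = sum(y * x^i).
    a = [[sum(x ** (i + j) for x in xs) for j in range(n)] for i in range(n)]
    b = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(n)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for row in range(col + 1, n):
            factor = a[row][col] / a[col][col]
            for k in range(col, n):
                a[row][k] -= factor * a[col][k]
            b[row] -= factor * b[col]
    # Back substitution.
    coeffs = [0.0] * n
    for i in reversed(range(n)):
        known = sum(a[i][j] * coeffs[j] for j in range(i + 1, n))
        coeffs[i] = (b[i] - known) / a[i][i]
    return coeffs


def make_sizer(ladder_scans, ladder_sizes, order=3):
    """Return a function mapping a scan position to an estimated size in bp."""
    scale = float(max(ladder_scans))  # rescale x to (0, 1] for numerical stability
    coeffs = fit_polynomial([s / scale for s in ladder_scans], ladder_sizes, order)

    def size_of(scan):
        x = scan / scale
        return sum(c * x ** power for power, c in enumerate(coeffs))

    return size_of


# Hypothetical ladder: scan positions paired with known fragment sizes (bp).
ladder_scans = [1000, 1250, 1500, 1750, 2000, 2250, 2500, 2750]
ladder_sizes = [60, 80, 100, 120, 140, 160, 180, 200]
size_of = make_sizer(ladder_scans, ladder_sizes)
print(round(size_of(1625), 2))  # estimated size of a peak at scan 1625
```

A wrongly called ladder peak pairs a scan position with the wrong known size, which distorts the fitted curve for every allele. This is why the ladder is cleaned up before sizing.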
Now that you’ve turned the ladder (LIZ) off, notice that A10 appears to be a null data set too as there are only PET peaks and no others. If you increase the Y scale you will see peaks appearing but these are perfectly aligned with the ladder so are likely to be pull-up artefacts. Delete A10 and reselect all remaining traces.
You’re now ready to move on and set the loci.
Exercise 2 – Setting the Loci
Select all remaining traces. To set the loci, click the Locus Info button to bring up a dialogue like this:
Work through all the dyes setting them as per this table. These settings are based on the known characteristics of the microsatellite loci we are using.
|Trace||Expected # of peaks||Repeat Unit||Range Start||Range Stop|
Click OK, then save the loci as a new file by clicking the Save button. This will bring up the following dialogue allowing you to choose a suitable name.
Again, check your data and fix any peaks that appear to be in the right range but haven’t been called. To do this, uncheck all but one dye; Geneious will then draw the size range for each locus so you can check that all your peaks fall within it. If they don’t, you can edit the values again and resave.
Here’s what the locus region for PET should look like:
Known data artefacts such as plus-A, minus-A, stutter (slippage) peaks and pull-up could exist. Here you’ll see plus-A, minus-A and stutter peaks as well as the main peak which should have been automatically called. Plus-A or minus-A (caused by incomplete addition of terminal A nucleotides in the PCR) will be the smaller peak that is one bp away, and stutter peaks are minor peaks that are 1-4 repeat units away from the actual peak.
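The distances described above can be turned into a simple rule of thumb. The sketch below labels a minor peak by its offset from a called main peak. It is only an illustration, not how the plugin calls peaks: classify_peak is a hypothetical helper, and the repeat unit of 4 bp and the 0.4 bp tolerance are assumed values.

```python
def classify_peak(size, main_size, repeat_unit, tolerance=0.4):
    """Label a peak by its distance (in bp) from the called main peak."""
    offset = abs(size - main_size)
    if offset <= tolerance:
        return "main"
    # Plus-A / minus-A artefacts sit one base pair from the main peak.
    if abs(offset - 1) <= tolerance:
        return "plus/minus-A"
    # Stutter peaks sit a whole number of repeat units (1 to 4) away.
    repeats = round(offset / repeat_unit)
    if 1 <= repeats <= 4 and abs(offset - repeats * repeat_unit) <= tolerance:
        return "stutter"
    return "other"


# Assumed example: main peak at 404 bp, tetranucleotide repeat (4 bp).
for candidate in (404.0, 403.1, 400.2, 396.0, 383.0):
    print(candidate, classify_peak(candidate, main_size=404.0, repeat_unit=4))
```

Peak height is ignored here. In practice you would also expect artefact peaks to be smaller than the main peak, so treat the labels as a prompt for visual inspection rather than a verdict.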
In particular, look at A06 and A09, because both appear to have similar peaks that haven’t been called for the PET trace. First, look at the called peaks and note that they include stutter products: remove the call on the shorter peak of each pair, and add a call wherever a real peak hasn’t been called. The correct peaks should be at about 404 and 462 in both sequences. Similarly, in A05 and A08 there are stutter peaks at 211 that need to be removed as well. Save your edited documents – they should look like this:
Now that you’ve set the loci and corrected the incorrect peak calls you can move on and perform the binning step.
Exercise 3 – Binning
Select all remaining traces. Click the Predict Bins button, choose the first dye, and just click OK in the dialogue that pops up. Repeat for the other three dyes (not LIZ).
This creates bins based on the observed peaks and their size using the current sizing algorithm. Save now.
Go to the Alleles table, turn on all the dyes and see if there are any unbinned peaks (displayed as red warning boxes labelled Unbinned Peak). If you have unbinned peaks, go to the affected trace document and turn on just the dye that has the unbinned peak. Select the bin the peak should be in, then either choose ‘Edit Bin’ and extend the range, or just drag the bin to include the peak.
Go back to the Alleles table to see that the peaks are now binned. Warnings about no peaks can be ignored in this case, as this means that no alleles were amplified in those samples.
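As a rough picture of what binning does, the sketch below groups observed peak sizes into bins and reports any peak that falls outside every bin. This is a simple gap-based grouping chosen for illustration, not the plugin's actual bin prediction: the function names, the max_gap and padding values, and the peak sizes are all assumptions.

```python
def predict_bins(peak_sizes, max_gap=1.0, padding=0.5):
    """Group peak sizes into (low, high) bins; a gap above max_gap starts a new bin."""
    groups = []
    for size in sorted(peak_sizes):
        if groups and size - groups[-1][-1] <= max_gap:
            groups[-1].append(size)
        else:
            groups.append([size])
    # Pad each bin slightly so peaks near the edge are still included.
    return [(group[0] - padding, group[-1] + padding) for group in groups]


def find_bin(size, bins):
    """Return the index of the bin containing size, or None if it is unbinned."""
    for index, (low, high) in enumerate(bins):
        if low <= size <= high:
            return index
    return None


# Hypothetical observed peak sizes (bp) for one dye.
observed = [211.1, 211.4, 215.2, 215.0, 219.3]
bins = predict_bins(observed)
print(bins)
print(find_bin(215.1, bins))  # falls inside the second bin
print(find_bin(213.0, bins))  # outside every bin, i.e. unbinned
```

Extending a bin's range, as described above, corresponds to widening the (low, high) pair until the stray peak falls inside it.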
Also, have a look at the Allele Size Distribution tab to see a graph showing the range of sizes. Typically, you would only look at one microsatellite dye and see a range of steps indicating allele sizes. Since this is such a small data set the steps aren’t obvious, but if you look at the LIZ dye you can see what the steps would look like for a larger microsatellite data set:
The final step is to export the Alleles table to a CSV file, which can be opened in a spreadsheet program (such as Excel). You can export with or without warnings. If you export without warnings, those fields are left blank; otherwise they contain text such as No peaks.
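If you prefer to process the exported file in a script rather than a spreadsheet, something like the following sketch may be a starting point. The column layout, sample names and values are invented for illustration, so check them against your own export. The only behaviour taken from this tutorial is that warning cells are either blank or hold text such as No peaks.

```python
import csv
import io

# Hypothetical export contents; the real header and layout may differ.
exported = """Name,Locus1 - 1,Locus1 - 2
sample1,150.2,158.1
sample2,150.3,
sample3,No peaks,
"""


def load_genotypes(handle):
    """Return {sample name: [allele sizes]}, skipping blanks and warning text."""
    genotypes = {}
    reader = csv.reader(handle)
    next(reader)  # skip the header row
    for row in reader:
        if not row:
            continue
        alleles = []
        for cell in row[1:]:
            try:
                alleles.append(float(cell))
            except ValueError:
                pass  # blank cell or a warning such as "No peaks"
        genotypes[row[0]] = alleles
    return genotypes


genotypes = load_genotypes(io.StringIO(exported))
for name, alleles in genotypes.items():
    print(name, alleles)
```

To read a real file, replace the io.StringIO wrapper with open("your_export.csv", newline=""), where the file name is whatever you chose when exporting.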
You’re now done calling alleles and ready to move on to the next step of your microsatellite analysis.