Geneious User Forum
Geneious Bugs: when importing GenBank file, 5-digit base numbers are considered "invalid bases"
Jan 28th 2010
I encountered an error when trying to import a plasmid sequence in GenBank format into Geneious. A dialog came up saying that "invalid bases" were found, but there was nothing besides A, C, T, or G in the sequence. It turns out that the line numbers that show the bp position at the start of each line of sequence were being interpreted as bases. This happens only for the line numbers greater than 10,000 bp. It handles the 4-digit bp numbers fine, but once there are 5 digits it ends up counting the numerical digits as bases and turning them into N's in my imported sequence.
LOCUS pBAC-lacZ Sequence 11151 bp DNA SYN 08-Jun-2009
DEFINITION pBAC-lacZ Sequence
...
...
...
9481 GTTCATTAGG TTGTTCTGTC CATTGCTGAC ATAATCCGCT CCACTTCAAC GTAACACCGC
9541 ACGAAGATTT CTATTGTTCC TGAAGGCATA TTCAAATCGT TTTCGTTACC GCTTGCAGGC
9601 ATCATGACAG AACACTACTT CCTATAAACG CTACACAGGC TCCTGAGATT AATAATGCGG
9661 ATCTCTACGA TAATGGGAGA TTTTCCCGAC TGTTTCGTTC GCTTCTCAGT GGATAACAGC
9721 CAGCTTCTCT GTTTAACAGA CAAAAACAGC ATATCCACTC AGTTCCACAT TTCCATATAA
9781 AGGCCAAGGC ATTTATTCTC AGGATAATTG TTTCAGCATC GCAACCGCAT CAGACTCCGG
9841 CATCGCAAAC TGCACCCGGT GCCGGGCAGC CACATCCAGC GCAAAAACCT TCGTGTAGAC
9901 TTCCGTTGAA CTGATGGACT TATGTCCCAT CAGGCTTTGC AGAACTTTCA GCGGTATACC
9961 GGCATACAGC ATGTGCATCG CATAGGAATG GCGGAACGTA TGTGGTGTGA CCGGAACAGA
problem begins here --> 10021 GAACGTCACA CCGTCAGCAG CAGCGGCGGC AACCGCCTCC CCAATCCAGG TCCTGACCGT
10081 TCTGTCCGTC ACTTCCCAGA TCCGCGCTTT CTCTGTCCTT CCTGTGCGAC GGTTACGCCG
10141 CTCCATGAGC TTATCGCGAA TAAATACCTG TGACGGAAGA TCACTTCGCA GAATAAATAA
10201 ATCCTGGTGT CCCTGTTGAT ACCGGGAAGC CCTGGGCCAA CTTTTGGCGA AAATGAGACG
10261 TTGATCGGCA CGTAAGAGGT TCCAACTTTC ACCATAATGA AATAAGATCA CTACCGGGCG
10321 TATTTTTTGA GTTATCGAGA TTTTCAGGAG CTAAGGAAGC TAAAATGGAG AAAAAAATCA
10381 CTGGATATAC CACCGTTGAT ATATCCCAAT GGCATCGTAA AGAACATTTT GAGGCATTTC
10441 AGTCAGTTGC TCAATGTACC TATAACCAGA CCGTTCAGCT GGATATTACG GCCTTTTTAA
10501 AGACCGTAAA GAAAAATAAG CACAAGTTTT ATCCGGCCTT TATTCACATT CTTGCCCGCC
10561 TGATGAATGC TCATCCGGAA TTTCGTATGG CAATGAAAGA CGGTGAGCTG GTGATATGGG
10621 ATAGTGTTCA CCCTTGTTAC ACCGTTTTCC ATGAGCAAAC TGAAACGTTT TCATCGCTCT
10681 GGAGTGAATA CCACGACGAT TTCCGGCAGT TTCTACACAT ATATTCGCAA GATGTGGCGT
10741 GTTACGGTGA AAACCTGGCC TATTTCCCTA AAGGGTTTAT TGAGAATATG TTTTTCGTCT
10801 CAGCCAATCC CTGGGTGAGT TTCACCAGTT TTGATTTAAA CGTGGCCAAT ATGGACAACT
10861 TCTTCGCCCC CGTTTTCACC ATGGGCAAAT ATTATACGCA AGGCGACAAG GTGCTGATGC
10921 CGCTGGCGAT TCAGGTTCAT CATGCCGTTT GTGATGGCTT CCATGTCGGC AGAATGCTTA
10981 ATGAATTACA ACAGTACTGC GATGAGTGGC AGGGCGGGGC GTAATTTTTT TAAGGCAGTT
11041 ATTGGTGCCC TTAAACGCCT GGTTGCTACG CCTGAATAAG TGATAATAAG CGGATGAATG
11101 GCAGAAATTC GATGATAAGC TGTCAAACAT GAGAATTGGT CGACGGCCCG G
//
Jan 28th 2010
Hi, generally speaking the GenBank importer can cope with files of more than 10,000 bases, so there must be something else in your file setting it off. Could you please send the whole file to amy@biomatters.com?
Jan 28th 2010
OK, I have fixed Geneious to import that file. The fix will be included in the next release of Geneious.
The issue was that the GenBank format specifies that the base numbers are right-justified ending 10 bases into the line, but in this file they ended 5 bases into the line. Our importer relied on the bases not going all the way to the start of the line. So to import this particular file in the current version, you can just add spaces at the start of those last 20 or so sequence lines (Geneious won't care that they're not right justified any more).
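For anyone who would rather not add the spaces by hand, here is a rough Python sketch of that workaround. It is not part of Geneious; the function name `realign_sequence_lines` and the `number_width=9` default are my own assumptions (9 columns is what the standard GenBank flat-file layout uses for the base number, as far as I know), and it assumes the sequence block sits between an `ORIGIN` line and the closing `//`.

```python
import re

# One sequence line: optional indent, a base number, then
# whitespace-separated blocks of bases.
SEQ_LINE = re.compile(r"^\s*(\d+)((?:\s+[A-Za-z]+)+)\s*$")


def realign_sequence_lines(text, number_width=9):
    """Right-justify the base numbers of GenBank sequence lines.

    Hypothetical helper: only lines between ORIGIN and // are touched,
    and number_width=9 is an assumed column width, not a Geneious setting.
    """
    fixed = []
    in_origin = False
    for line in text.splitlines():
        if line.startswith("ORIGIN"):
            in_origin = True
        elif line.startswith("//"):
            in_origin = False
        elif in_origin:
            match = SEQ_LINE.match(line)
            if match:
                number = match.group(1)
                blocks = match.group(2).split()
                # Pad the number on the left, then rejoin the base blocks.
                line = number.rjust(number_width) + " " + " ".join(blocks)
        fixed.append(line)
    return "\n".join(fixed) + "\n"
```

You would read the file into a string, pass it through `realign_sequence_lines`, and write the result to a new file before importing. Lines outside the sequence block (LOCUS, DEFINITION, features) are left untouched.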
Could you tell me where you got this plasmid sequence from?
Jan 28th 2010
cool, thanks for getting to the bottom of it
I got the sequence from addgene
http://www.addgene.org/pgvec1?cmd=viewseq&f=c&plasmidid=13422&submit=Analyze+Sequence