« 05.22 Trying EMBOSS, Part 4 | Here | 05.25 Trying EMBOSS, Part 6 »
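May 24, 2009
Trying EMBOSS, Part 5
In Part 4 last time, I ran pairwise alignments that compare sequences one-to-one. This time it is multiple alignment, which handles several sequences at once. There are various programs for this, but rather than emma, which is described as a wrapper for ClustalW, I will use the edialign command.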
- FJ981612 FJ981612.1 Influenza A virus (A/Texas/04/2009(H1N1)) segment 4 hemagglutinin (HA) gene, complete cds.
- GQ117112 GQ117112.1 Influenza A virus (A/Michigan/02/2009(H1N1)) segment 4 hemagglutinin (HA) gene, complete cds.
- GQ131023 GQ131023.1 Influenza A virus (A/Korea/01/2009(H1N1)) segment 4 hemagglutinin (HA) gene, complete cds.
- AB013810 AB013810.1 Influenza A virus (A/Tokyo/1567/98(H3N2)) gene for hemagglutinin, partial cds.
- AB043499 AB043499.1 Influenza A virus (A/Yokohama/24/2000(H1N1)) HA gene for hemagglutinin, partial cds.
- AF144305 AF144305.1 Influenza A virus (A/Goose/Guangdong/1/96(H5N1)) hemagglutinin (HA) gene, complete cds.
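For this post I prepared the sequences listed above. From the top, the first three are swine flu (H1N1), followed by the Hong Kong type (H3N2), the Soviet type (H1N1), and avian flu (H5N1). All of them are genes for the HA protein. The labels H1, H3, and H5 distinguish the subtypes of this HA protein. Let's compare their homology with a local alignment.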
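As a rough sketch of how this run could be scripted, the Python below builds the same call that the output reports as `program call: edialign in.seq stdout`. The helper names are hypothetical, `in.seq` is assumed to be a multi-FASTA file holding the six sequences, and `-auto` is the general EMBOSS qualifier that turns off interactive prompts. The command is only executed if `edialign` is found on the PATH.

```python
import shutil
import subprocess


def build_edialign_command(seq_file="in.seq", out_file="stdout", auto=True):
    """Return the argument list for an edialign run (hypothetical helper)."""
    command = ["edialign", seq_file, out_file]
    if auto:
        # -auto is the general EMBOSS qualifier that turns off prompting.
        command.append("-auto")
    return command


def run_edialign(seq_file="in.seq", out_file="stdout"):
    """Run edialign and return its standard output, or None if not installed."""
    if shutil.which("edialign") is None:
        return None
    result = subprocess.run(
        build_edialign_command(seq_file, out_file),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


if __name__ == "__main__":
    # Only show the command here; call run_edialign() to actually execute it.
    print(" ".join(build_edialign_command()))
```

Keeping the command construction separate from the execution makes it easy to inspect the call before running it on real data.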
DIALIGN 2.2.1
*************
Program code written by Burkhard Morgenstern and Said Abdeddaim
e-mail contact: dialign (at) gobics (dot) de
Published research assisted by DIALIGN 2 should cite:
Burkhard Morgenstern (1999).
DIALIGN 2: improvement of the segment-to-segment
approach to multiple sequence alignment.
Bioinformatics 15, 211 - 218.
For more information, please visit the DIALIGN home page at
http://bibiserv.techfak.uni-bielefeld.de/dialign/
************************************************************
It appears that a program called DIALIGN is used under the hood.
program call: edialign in.seq stdout
Aligned sequences: length:
================== =======
1) FJ981612 1701
2) GQ117112 1701
3) GQ131023 1701
4) AB013810 710
5) AB043499 1032
6) AF144305 1760
Average seq. length: 1434.2
Please note that only upper-case letters are considered to be aligned.
Alignment (DIALIGN format):
===========================
FJ981612 1 ---------- ---------- ---------- ---------- ----------
GQ117112 1 ---------- ---------- ---------- ---------- ----------
GQ131023 1 ---------- ---------- ---------- ---------- ----------
AB013810 1 ttgttgaacg cagcaaagct tacagcaact gttaccctta tgatgtgccg
AB043499 1 ---------- ---------- ---------- ---------- ----------
AF144305 1 gca------- ---------- ---------- ---------- ----------
0000000000 0000000000 0000000000 0000000000 0000000000
FJ981612 1 ---------- ---------- ---------- ---------- ----------
GQ117112 1 ---------- ---------- ---------- ---------- ----------
GQ131023 1 ---------- ---------- ---------- ---------- ----------
AB013810 51 gattatgcct cccttaggtc actagttgcc tcatccggca ccctggagtt
AB043499 1 ---------- ---------- ---------- ---------- ----------
AF144305 4 ---------- ---------- ---------- ---------- ----------
0000000000 0000000000 0000000000 0000000000 0000000000
FJ981612 1 -----ATGAA GGCAATACTA GTAGTTCTGC TATATAC--- ----------
GQ117112 1 -----ATGAA GGCAATACTA GTAGTTCTGC TATATAC--- ----------
GQ131023 1 -----ATGAA GGCAATACTA GTAGTTCTGC TATATAC--- ----------
AB013810 101 taacaATGAA AGCttcaatt GGACTGGAGT CGCTCAGAAT GGAACAAGCT
AB043499 1 -----ATGAA AGCAAAACTA CTAGTTCTGT TGTGTGC--- ----------
AF144305 4 ---------- ---------- GGGGTATAAT CTGTCAAAAT GGAGAAAATA
0000055555 5555555555 5555555555 5555555000 0000000000
FJ981612 33 ---------- ---------- ATTTGCAACC GCAAATGCAG ACACATTATG
GQ117112 33 ---------- ---------- ATTTGCAACC GCAAATGCAG ACACATTATG
GQ131023 33 ---------- ---------- ATTTGCAACC GCAAATGCAG ACACATTATG
AB013810 151 TTGCTTgcaa aaggagatct ATTAAAAGTT TCTTTAGTAG ATTGAATTGG
AB043499 33 ---------- ---------- ATTTACAGCT ACATATGCAG ACACAATATG
AF144305 34 GTGCTTcttc ttgcaatagt cagtcttgtc aaaagtgatc ag---ATTTG
0000000000 0000000000 5555555444 4777777777 7777777788
FJ981612 63 TATAGGTTAT CATGCGAACA ATTCAACAGA CACTGTAGAC ACAGTACTAG
GQ117112 63 TATAGGTTAT CATGCGAACA ATTCAACAGA CACTGTAGAC ACAGTACTAG
GQ131023 63 TATAGGTTAT CATGCGAACA ATTCAACAGA CACTGTAGAC ACAGTACTAG
AB013810 201 Ttgcaccaat taaaat---- ---------- --------AC AAATATCCAG
AB043499 63 TATAGGCTAC CATGCGAACA ACTCAACTGA CACTGTTGAC ACAGTACTTG
AF144305 81 CATTGGTTAC CATGCAAACA ACTCGACAGA gcagGTTGAC ACAATAATGG
8888888888 8888888888 8577777777 6666888888 8888888888
FJ981612 113 AAAAGAATGT AACAGTAACA CACTCTGTTA ACCTTCTAGA AGACAAGCAT
GQ117112 113 AAAAGAATGT AACAGTAACA CACTCTGTTA ACCTTCTAGA AGACAAGCAT
GQ131023 113 AAAAGAATGT AACAGTAACA CACTCTGTTA ACCTTCTAGA AGACAAGCAT
AB013810 229 CACTGAACGT GACTATGCCA AACAATGACA A--------- ----------
AB043499 113 AGAAGAACGT GACAGTGACA CACTCTGTCA ACCTACTTGA GGACAGTCAC
AF144305 131 AAAAGAACGT TACTGTTACA CAtgcccaag ACATACTGGA AAAGACACAC
8888888677 7777777777 7766666666 6666666666 6666655555
FJ981612 163 AACGGGAAAC TATGCAAACT AAGAGGGGTA GCCCCATTGC ATTTGGGTAA
GQ117112 163 AACGGGAAAC TATGCAAACT AAGAGGGGTA GCCCCATTGC ATTTGGGTAA
GQ131023 163 AACGGGAAAC TATGCAAACT AAGAGGGGTA GCCCCATTGC ATTTGGGTAA
AB013810 260 ---------- ---------- ---------- ---------- ----------
AB043499 163 AACGGAAAAC TATGCCGACT AAAAGGaacA GCCCCACTAC AATTGGGTAA
AF144305 181 AATGGGAAGC TCTGCGATCT AAAtggagtg aagcctctca ttttgagagA
5555555555 5555555555 5555554445 5555555555 5555555555
FJ981612 213 ATGTAACATT GCTGGCTGGA TCCTGGGAAA TCCAGAGTGT GAATCACTCT
GQ117112 213 ATGTAACATT GCTGGCTGGA TCCTGGGAAA TCCAGAGTGT GAATCACTCT
GQ131023 213 ATGTAACATT GCTGGCTGGA TCCTGGGAAA TCCAGAGTGT GAATCACTCT
AB013810 260 ---------- ---------- ---------- ---------- ----------
AB043499 213 TTGCAGCATT GCCGGATGGA TCTTAGGAAA TCCAGAATGC GAATCACTgt
AF144305 231 TTGTAGTGTA GCTGGATGGC TCCTCGGAAA CCCTATGTGT Gacgaattca
5666666666 6666656666 6666666666 6666666666 6666666644
FJ981612 263 CCACAGCAAG CTCATGGTCC TACATTGTGG AAACATCTAG TTCAGACAAT
GQ117112 263 CCACAGCAAG CTCATGGTCC TACATTGTGG AAACATCTAG TTCAGACAAT
GQ131023 263 CCACAGCAAG CTCATGGTCC TACATTGTGG AAACATCTAG TTCAGACAAT
AB013810 260 ---------- ---------- ---------- ---------- ----------
AB043499 263 tttctaagga aTCATGGTCT TACATTGCAG AAACAccaaa ccctaaaAAT
AF144305 281 tcaatgtgcc ggaATGGTCT TACATAGTGG AGAAGGCCAG TCCAGCCAAT
4444444444 4556666666 6666666666 6666655555 5555555777
FJ981612 313 GGAACGTGTT ACCCAGGAGA TTTCATCGAT TATGAGGAGC TAAGAGAGCA
GQ117112 313 GGAACGTGTT ACCCAGGAGA TTTCATCGAT TATGAGGAGC TAAGAGAGCA
GQ131023 313 GGAACGTGTT ACCCAGGAGA TTTCATCGAT TATGAGGAGC TAAGAGAGCA
AB013810 260 ---------- ---------- ---------- ---------- ----------
AB043499 313 GGAACATGTT ACCCAGGGTA TTTCGCCGAC TATGAGGAAC TGAGGGAGCA
AF144305 331 GacctcTGTT ACCCAGGGGA TTTCAACGAC TATGAAGAAC TGAaacacct
7666668888 8888888888 8888888888 8888888668 8776666666
FJ981612 363 ATTGAGCTCA GTGTCATCAT TTGAAAGGTT TGAGATATTC CCCAAGACAA
GQ117112 363 ATTGAGCTCA GTGTCATCAT TTGAAAGGTT TGAGATATTC CCCAAGACAA
GQ131023 363 ATTGAGCTCA GTGTCATCAT TTGAAAGGTT TGAGATATTC CCCAAGACAA
AB013810 260 ---------- ---------- -------ATT TGACAAATTg taca------
AB043499 363 ATTGAGCTCA GTATCATCAT TCGAGAGATT TGAAATATTC CCCAAGGATA
AF144305 381 attgagcaga acaaaccatT TTGAGAAAAT TCAGATCATC CCCAA---AA
6666666666 6666666666 6666666466 6666666666 6666666666
FJ981612 413 GTTCATGGCC CAATCATGAC TCGAACAAAG GTGTAACGGC AGCATGTCCT
GQ117112 413 GTTCATGGCC CAATCATGAC TCGAACAAAG GTGTAACGGC AGCATGTCCT
GQ131023 413 GTTCATGGCC CAATCATGAC TCGAACAAAG GTGTAACGGC AGCATGTCCT
AB013810 277 ---------- ---------- ---------- ---------- ----------
AB043499 413 GCTCATGGCC CAACCAcact gtaACCAAAG GAGTGACGGC ATCATGCTCC
AF144305 428 GTTCTTGGTC CAATCATGAt gcctcatcAG GGGTGAGCTC AGCATGTCCA
6666666666 6666665554 4445555555 5555555555 5555555555
FJ981612 463 CATGCTGGAG CAAAAAGCTT CTACAAAAAT TTAATATGGC TAGTTAAAAA
GQ117112 463 CATGCTGGAG CAAAAAGCTT CTACAAAAAT TTAATATGGC TAGTTAAAAA
GQ131023 463 CATGCTGGAG CAAAAAGCTT CTACAAAAAT TTAATATGGC TAGTTAAAAA
AB013810 277 ---------- ---------- ---------- ---------- ----------
AB043499 463 CATAATGGGA aaagcAGCTT TTACAAAAAT TTGCTATGGC TGACGGAGAA
AF144305 478 TACCATGGGA ggtcctcCTT TTTCAGAAAT GTGGTATGGC TTATCAAAAA
5555555544 4444455666 6666666666 6666666666 6555555554
FJ981612 513 AGGAAATTCA TACCCAAAGC TCAGCAAATC CTACATTAAT GATAAAGGGA
GQ117112 513 AGGAAATTCA TACCCAAAGC TCAGCAAATC CTACATTAAT GATAAAGGGA
GQ131023 513 AGGAAATTCA TACCCAAAGC TCAGCAAATC CTACATTAAT GATAAAGGGA
AB013810 277 ---------- ---------- ---------- ---------- ----------
AB043499 513 GAAtggcttg TACCCAAATC TGAGCAAGTC CTATGTAAAC AAAAAGGGaA
AF144305 528 GAAcAGTGCA TACCCAACAA TAAAGAGGAG CTACAATAAT accaaccaag
4444444444 5555555555 5555555555 5555555555 5555555546
FJ981612 563 AAGAAGTCCT CGTGCTATGG GGCATTCACC ATCCATCTAC TAGTGCTGAC
GQ117112 563 AAGAAGTCCT CGTGCTATGG GGCATTCACC ATCCATCTAC TAGTGCTGAC
GQ131023 563 AAGAAGTCCT CGTGCTATGG GGCATTCACC ATCCATCTAC TAGTGCTGAC
AB013810 277 ---------- -----TTTGG GGGGTTCACC ACCCGAGTAC GGACAGTGAC
AB043499 563 AAGAAGTCCT TGTGCTATGG GGTGTTCATC ACCCGTCTAa catgggggac
AF144305 578 AAGATCTTTT AGTACTGTGG GGGATTCACC ATCCtaatga tgcggcagag
7777777777 7777777777 7777777777 7777766664 4444444444
FJ981612 613 CAA------- ---------- ---------- -----CAAAG TCTCTATCAG
GQ117112 613 CAA------- ---------- ---------- -----CAAAG TCTCTATCAG
GQ131023 613 CAA------- ---------- ---------- -----CAAAG TCTCTATCAG
AB013810 312 CAAaccagcc tatatgctca agcatcaggg agagtCACAG TCTCTACCAA
AB043499 613 caacgggcca ---------- ---------- ---------- --TCTATCAT
AF144305 628 cagacaaag- ---------- ---------- ---------- -CTCTATCAA
4440000000 0000000000 0000000000 0000044444 4455555555
FJ981612 631 AATGCAGATG CATATGTTTT TGTGGGGTCA TCAAGATACA GCAAGAAGTT
GQ117112 631 AATGCAGATG CATATGTTTT TGTGGGGTCA TCAAGATACA GCAAGAAGTT
GQ131023 631 AATGCAGATG CATATGTTTT TGTGGGGTCA TCAAGATACA GCAAGAAGTT
AB013810 362 AAgaagccaa caaactg--- ---------- ---------- ---------T
AB043499 631 AAAGAAAATG CTTATGTTTC TGTATTGTCT TCacattata gcagaAGATT
AF144305 646 AACCCAACCA CTTACATTTC CGTTGGAACA Tcaacactga accagAGATT
5555555555 5555555555 5555555555 5544444444 4444444466
FJ981612 681 CAAGCCGGAA ATAGCAATAA GACCCAAAGT GAGGGATCAA GAAGGGAGAA
GQ117112 681 CAAGCCGGAA ATAGCAATAA GACCCAAAGT GAGGGATCAA GAAGGGAGAA
GQ131023 681 CAAGCCGGAA ATAGCAATAA GACCCAAAGT GAGGGATCAA GAAGGGAGAA
AB013810 380 AATCCCGAAT ATCGGATCTA GACCCTGGGT AAGGGGTgtc tccaGCAGAA
AB043499 681 CACCCCAGAA ATAGCAAAAA GGCCCAAAGT AAGAGATCAA GAAGGGAGAA
AF144305 696 GGTTCCAGAA ATAGCTACTA GACCCAAAGT AAACGGgCAA AGTGGAAGAA
6666777777 7777777777 7777777777 6654666666 6667779999
FJ981612 731 TGAACTATTA CTGGACACTA GTAGAGCCGG GAGACAAAAT AACATTCGAA
GQ117112 731 TGAACTATTA CTGGACACTA GTAGAGCCGG GAGACAAAAT AACATTCGAA
GQ131023 731 TGAACTATTA CTGGACACTA GTAGAGCCGG GAGACAAAAT AACATTCGAA
AB013810 430 TAAGCATCTA TTGGACAATA GTAAAACCGG GAGACAtact tctgattaAC
AB043499 731 TTAACTACTA CTGGACTCTG CTGGAACCCG GGGACACAAT AATATTTGAG
AF144305 746 TGGAGTTCTT CTGGACAATT TTAAAGCCGa atgatgCCAT CAATTTCGAG
9999999999 9999998886 6666668877 7777776666 6666666666
FJ981612 781 GCAACTGGAA ATCTAGTGGT ACCGAGATAT GCATTCGCAA TGGAAAGAAA
GQ117112 781 GCAACTGGAA ATCTAGTGGT ACCGAGATAT GCATTCGCAA TGGAAAGAAA
GQ131023 781 GCAACTGGAA ATCTAGTGGT ACCGAGATAT GCATTCGCAA TGGAAAGAAA
AB013810 480 AGCACAGGGA ATCTAATTGC TCCTCGGGGT TACTTCAAAA Tacgaagtgg
AB043499 781 GCAAATGGAA ATCTAATAGC GCCGTGGTAC GCTTTCGCAC TGAGTAGAgg
AF144305 796 AGTAATGGAA ATTTCATTGC TCCAGAATAT GCATACAAAA Ttgtcaagaa
6666666666 6666644444 4444444444 4444444444 4444444444
FJ981612 831 TGCTGGATCT GGTATTATCA TTTCAGATAC ACCAGTCCAC GATTGCAATA
GQ117112 831 TGCTGGATCT GGTATTATCA TTTCAGATAC ACCAGTCCAC GATTGCAATA
GQ131023 831 TGCTGGATCT GGTATTATCA TTTCAGATAC ACCAGTCCAC GATTGCAATA
AB013810 530 gaaaagctc- --AATAATGA GGTCAGATGC ACCCATTGGC AAATGCAATT
AB043499 831 cttTGGGTCA GGAATCATCA TCTCAAACGC ATCAATGGGT GAATGTGACG
AF144305 846 aggggactca gcaattatga aaagtgaatt ggaataTGGT AACTGCAACA
4444444444 4445555555 5555555555 5555554444 4445555555
FJ981612 881 CAACTTGTCA GACACCCAAG GGTGCTATAA ACACCAGCCT CCCATTTCAG
GQ117112 881 CAACTTGTCA GACACCCAAG GGTGCTATAA ACACCAGCCT CCCATTTCAG
GQ131023 881 CAACTTGTCA GACACCCAAG GGTGCTATAA ACACCAGCCT CCCATTTCAG
AB013810 577 CTGAATGCAT CACTCCAAAT GGaagcattc ccaatgaaaa aCCATTTCAA
AB043499 881 CTAAGTGTCA AACACCCCAA GGAGCTATAA ACAGTAGTCT CCCCTTCCAG
AF144305 896 CCAAGTGTCA AACTCCAATG GGGGCGATAA ACTCTAGTAT GCCATTCCAC
5555566666 6666666666 6666666666 6654466666 6777777777
FJ981612 931 AATATACATC CGATCACAAT TGGAAAATGT CCAAAATATG TAAAAAGCAC
GQ117112 931 AATATACATC CGATCACAAT TGGAAAATGT CCAAAATATG TAAAAAGCAC
GQ131023 931 AATATACATC CGATCACAAT TGGAAAATGT CCAAAATATG TAAAAAGCAC
AB013810 627 AATGTAAACA GGATCACATA TGGGGCCTGT CCCAGATATG TTAAGCAAAA
AB043499 931 AATGTACACC CAGTCACAAT AGGAGAGTGT CCAAAGTATG TCAGGAGTAC
AF144305 946 AACATACACC CCCTCACCAT CGGGGAATGC CCCAAATATG TGAAATCAAA
7777777777 7777777666 6666555666 6666666666 6666655555
FJ981612 981 AAAATTGAGA CTGGCCACAG GATTGAGGAA TGTCCCGTCT ATTCAATCT-
GQ117112 981 AAAATTGAGA CTGGCCACAG GATTGAGGAA TGTCCCGTCT ATTCAATCT-
GQ131023 981 AAAATTGAGA CTGGCCACAG GATTGAGGAA TGTCCCGTCT ATTCAATCT-
AB013810 677 CACtcTGAAA TTGGCAACAG GGATGCGGAA TGTa------ ----------
AB043499 981 AAAATTAAGG ATGGTTACAG GACTAAGGAA CGTCCCATCC ATTCAATCC-
AF144305 996 CAGATTAGTC CTTGCGACTG GACTCAGAAA TACCCCtcag agagagagaa
5555566666 6666556666 6666666666 6655555555 5555555550
FJ981612 1030 ---------- -AGAGGCCTA TTTGGGGCCA TTGCCGGTTT CATTGAAGGG
GQ117112 1030 ---------- -AGAGGCCTA TTTGGGGCCA TTGCCGGTTT CATTGAAGGG
GQ131023 1030 ---------- -AGAGGCCTA TTTGGGGCCA TTGCCGGTTT CATTGAAGGG
AB013810 711 ---------- ---------- ---------- ---------- ----------
AB043499 1030 ---------- -AGA------ ---------- ---------- ----------
AF144305 1046 gaagaaaaaa gAGAGGACTA TTTGGAGCTA TAGCAGGTTT TATAGAGGGa
0000000000 0777555555 5555555555 5555555555 5555555554
FJ981612 1069 GGGTGGACAG GGATGGTAGA TGGATGGTAC GGTTATCACC ATCAAAATGA
GQ117112 1069 GGGTGGACAG GGATGGTAGA TGGATGGTAC GGTTATCACC ATCAAAATGA
GQ131023 1069 GGGTGGACAG GGATGGTAGA TGGATGGTAC GGTTATCACC ATCAAAATGA
AB013810 711 ---------- ---------- ---------- ---------- ----------
AB043499 1033 ---------- ---------- ---------- ---------- ----------
AF144305 1096 ggatggcagG GAATGGTAGA TGGTTGGTAT GGGTACCACC ATagcAATGA
4444444445 5555555555 5555555555 5555555555 5544455555
FJ981612 1119 GCAGGGGTCA GGATATGCAG CCGACCTGAA GAGCACACAG AATGCCATTG
GQ117112 1119 GCAGGGGTCA GGATATGCAG CCGACCTGAA GAGCACACAG AATGCCATTG
GQ131023 1119 GCAGGGGTCA GGATATGCAG CCGACCTGAA GAGCACACAG AATGCCATTG
AB013810 711 ---------- ---------- ---------- ---------- ----------
AB043499 1033 ---------- ---------- ---------- ---------- ----------
AF144305 1146 GCAGGGGAGT GGATACGCTG CAGACaaaga atcCACTCAA AAGGCAATAG
5555555555 5555555555 5555544444 4444444444 4444444444
FJ981612 1169 ACGAAATTAC TAACAAAGTA AATTCTGTTA TTGAAAAGAT GAATACACAG
GQ117112 1169 ACGAGATTAC TAACAAAGTA AATTCTGTTA TTGAAAAGAT GAATACACAG
GQ131023 1169 ACGAGATTAC TAACAAAGTA AATTCTGTTA TTGAAAAGAT GAATACACAG
AB013810 711 ---------- ---------- ---------- ---------- ----------
AB043499 1033 ---------- ---------- ---------- ---------- ----------
AF144305 1196 ATGGAGTCAC CAATAAGGTC AACTCGATCA TTGACAAAAT GAACACTCAG
4444444444 4444444444 4444444444 4455555555 5555555555
FJ981612 1219 TTCACAGCAG TAGGTAAAGA GTTCAACCAC CTGGAAAAAA GAATAGAGAA
GQ117112 1219 TTCACAGCAG TAGGTAAAGA GTTCAACCAC CTGGAAAAAA GAATAGAGAA
GQ131023 1219 TTCACAGCAG TAGGTAAAGA GTTCAACCAC CTGGAAAAAA GAATAGAGAA
AB013810 711 ---------- ---------- ---------- ---------- ----------
AB043499 1033 ---------- ---------- ---------- ---------- ----------
AF144305 1246 TTTGAGGCCG TTGGaaggGA ATTTAATAAC TTGGAAAGGA GGATAGAGAA
5444444444 4444444455 5555555555 5555555555 5555555555
FJ981612 1269 TTTAAATAAA AAAGTTGATG ATGGTTTCCT GGACATTTGG ACTTACAATG
GQ117112 1269 TTTAAATAAA AAAGTTGATG ATGGTTTCCT GGACATTTGG ACTTACAATG
GQ131023 1269 TTTAAATAAA AAAGTTGATG ATGGTTTCCT GGACATTTGG ACTTACAATG
AB013810 711 ---------- ---------- ---------- ---------- ----------
AB043499 1033 ---------- ---------- ---------- ---------- ----------
AF144305 1296 TTTAAAcaag cagatggaaG ACGGATTCCT AGATGTCTGG ACTTATAATG
5555554444 4444444445 5555555555 5555555555 5555555555
FJ981612 1319 CCGAACTGTT GGTTCTATTG GAAAATGAAA GAACTTTGGA CTACCACGAT
GQ117112 1319 CCGAACTGTT GGTTCTATTG GAAAATGAAA GAACTTTGGA CTACCACGAT
GQ131023 1319 CCGAACTGTT GGTTCTATTG GAAAATGAAA GAACTTTGGA CTACCACGAT
AB013810 711 ---------- ---------- ---------- ---------- ----------
AB043499 1033 ---------- ---------- ---------- ---------- ----------
AF144305 1346 CTGAACTtcT GGTTCTCATG GAAAATGAGA GAACTCTAGA CTttCATGAC
5555555445 5555555555 5555555555 5555555555 5544555555
FJ981612 1369 TCAAATGTGA AGAACTTATA TGAAAAGGTA AGAAGCCAGC TAAAAAACAA
GQ117112 1369 TCAAATGTGA AGAACTTATA TGAAAAGGTA AGAAGCCAGT TAAAAAACAA
GQ131023 1369 TCAAATGTGA AGAACTTATA TGAAAAGGTA AGAAGCCAGC TAAAAAACAA
AB013810 711 ---------- ---------- ---------- ---------- ----------
AB043499 1033 ---------- ---------- ---------- ---------- ----------
AF144305 1396 TCAAATGTCA AGAACCTTTA TGACAAGGTc cgactaCAGC TTAGGGATAA
5555555555 5555555555 5555555554 4444444444 4444444444
FJ981612 1419 TGCCAAGGAA ATTGGAAACG GCTGCTTTGA ATTTTACCAC AAATGCGATA
GQ117112 1419 TGCCAAGGAA ATTGGAAACG GCTGCTTTGA ATTTTACCAC AAATGCGATA
GQ131023 1419 TGCCAAGGAA ATTGGAAACG GCTGCTTTGA ATTTTACCAC AAATGCGATA
AB013810 711 ---------- ---------- ---------- ---------- ----------
AB043499 1033 ---------- ---------- ---------- ---------- ----------
AF144305 1446 TGCAAAGGAg cTGGGTAATG GTTGTTTCGA GTTCTATCAC AAATGTGATA
4444444444 4555555555 5555555555 5555555555 5555555555
FJ981612 1469 ACACGTGCAT GGAAAGTGTC AAAAATGGGA CTTATGACTA CCCAAAATAC
GQ117112 1469 ACACGTGCAT GGAAAGTGTC AAAAATGGGA CTTATGACTA CCCAAAATAC
GQ131023 1469 ACACGTGCAT GGAAAGTGTC AAAAATGGGA CTTATGACTA CCCAAAATAC
AB013810 711 ---------- ---------- ---------- ---------- ----------
AB043499 1033 ---------- ---------- ---------- ---------- ----------
AF144305 1496 AtgaaTGTAT GGAAAGTGTA AAAAACGGAA CGTATGACTA CCCgcagTAT
5444466666 6666666666 6666666666 6666666666 6664444666
FJ981612 1519 TCAGAGGAAG CAAAATTAAA CAGAGAAGAA ATAGATGGGG TAAAACTGGA
GQ117112 1519 TCAGAGGAAG CAAAATTAAA CAGAGAAGAA ATAGATGGGG TAAAGCTGGA
GQ131023 1519 TCAGAGGAAG CAAAATTAAA CAGAGAAGAA ATAGATGGGG TAAAGCTGGA
AB013810 711 ---------- ---------- ---------- ---------- ----------
AB043499 1033 ---------- ---------- ---------- ---------- ----------
AF144305 1546 TCAGAAGAAG CAAGACTAAA CAGAGAGGAA ATAagTGGAG TAAAATTGGA
6666666666 6666666666 6666666666 6664455555 5522244444
FJ981612 1569 ATCAACAAGG ATTTACCAGA TTTTGGCGAT CTATTCAACT GTCGCCAGTT
GQ117112 1569 ATCAACAAGG ATTTACCAGA TTTTGGCGAT CTATTCAACT GTCGCCAGTT
GQ131023 1569 ATCAACAAGG ATTTACCAGA TTTTGGCGAT CTATTCAACT GTCGCCAGTT
AB013810 711 ---------- ---------- ---------- ---------- ----------
AB043499 1033 ---------- ---------- ---------- ---------- ----------
AF144305 1596 ATCAATGGGA ACTTACCAAA TACTGtcaAT TTATTCAACA GTGGCGAGTT
4444444444 4445555555 5555544455 5555555555 5555555555
FJ981612 1619 CATTGGTACT GGTAGTCTCC CTGGGGGCAA TCAGTTTCTG GATGTGCTCT
GQ117112 1619 CATTGGTACT GGTAGTCTCC CTGGGGGCAA TCAGTTTCTG GATGTGCTCT
GQ131023 1619 CATTGGTACT GGTAGTCTCC CTGGGGGCAA TCAGTTTCTG GATGTGCTCT
AB013810 711 ---------- ---------- ---------- ---------- ----------
AB043499 1033 ---------- ---------- ---------- ---------- ----------
AF144305 1646 CCCTAGCACT GGCAATCatg gtagctggtc tatcTTTATG GATGTGCTCC
5555555555 5555555444 4444444444 4444555555 5555555555
FJ981612 1669 AATGGGTCTC TACAGTGTAG AATATGTATT TAA------- ----------
GQ117112 1669 AATGGGTCTC TACAGTGTAG AATATGTATT TAA------- ----------
GQ131023 1669 AATGGGTCTC TACAGTGTAG GATATGTATT TAA------- ----------
AB013810 711 ---------- ---------- ---------- ---------- ----------
AB043499 1033 ---------- ---------- ---------- ---------- ----------
AF144305 1696 AATGGATCGT TACAATGCAG AATttgcatt taaatttgtg agttcagatt
5555555555 5544433333 3332222222 2220000000 0000000000
FJ981612 1702 ---------- -----
GQ117112 1702 ---------- -----
GQ131023 1702 ---------- -----
AB013810 711 ---------- -----
AB043499 1033 ---------- -----
AF144305 1746 gtagttaaaa acacc
0000000000 00000
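The report notes that only upper-case letters are considered aligned. A minimal Python sketch of how one could count them in a single row of the DIALIGN-format block follows. The function name is made up for illustration, and it assumes each row consists of a sequence name, a start position, and then whitespace-separated groups of residues.

```python
def count_row(row):
    """Split one DIALIGN-format row and count aligned/unaligned/gap symbols.

    Assumes the row is: name, start position, then groups of residues.
    """
    name, start, *groups = row.split()
    residues = "".join(groups)
    aligned = sum(1 for c in residues if c.isupper())    # upper case = aligned
    unaligned = sum(1 for c in residues if c.islower())  # lower case = not aligned
    gaps = residues.count("-")
    return name, int(start), aligned, unaligned, gaps


# Two rows copied from the third block of the alignment above.
rows = [
    "FJ981612 1 -----ATGAA GGCAATACTA GTAGTTCTGC TATATAC--- ----------",
    "AB013810 101 taacaATGAA AGCttcaatt GGACTGGAGT CGCTCAGAAT GGAACAAGCT",
]

for row in rows:
    name, start, aligned, unaligned, gaps = count_row(row)
    # start + residues consumed gives the start position of the next block.
    print(name, aligned, unaligned, gaps, start + aligned + unaligned)
```

For these two rows the computed next start positions are 33 and 151, which match the positions printed at the beginning of the following block.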
Along with the FASTA-format file that is written at the same time, data for building a phylogenetic tree is also output.
Sequence tree:
==============
Tree constructed using UPGMA based on DIALIGN fragment weight scores
((((FJ981612 :0.000585
(GQ117112 :0.000583
GQ131023 :0.000583):0.000002)
:0.002449
AB043499 :0.003033):0.001340
AF144305 :0.004373)
:0.032379
AB013810 :0.036752);
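According to this result, GQ117112 (swine), GQ131023 (swine), and FJ981612 (swine) form a single cluster, and the difference from that cluster grows in the order AB043499 (Soviet), AF144305 (avian), AB013810 (Hong Kong). It is a little surprising that the avian flu sequence comes out closer than the Hong Kong type. That may be because the Soviet-type and Hong Kong-type sequences are not full length.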
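To make that ordering concrete, here is a small Python sketch that sums branch lengths from the root to each leaf. The nested-tuple encoding is a hand transcription of the tree printed above (not something edialign emits), and the function name is hypothetical. Because a UPGMA tree is ultrametric, every leaf should end up at about the same distance from the root, and the length of a leaf's own branch is the height at which it joins the rest.

```python
# Each node is (payload, branch_length): payload is a leaf name (str)
# or a list of child nodes. Values are transcribed from the tree above.
swine = [
    ("FJ981612", 0.000585),
    ([("GQ117112", 0.000583), ("GQ131023", 0.000583)], 0.000002),
]
with_soviet = [(swine, 0.002449), ("AB043499", 0.003033)]
with_avian = [(with_soviet, 0.001340), ("AF144305", 0.004373)]
root = [(with_avian, 0.032379), ("AB013810", 0.036752)]


def leaf_depths(children, depth=0.0):
    """Return {leaf name: summed branch length from the root}."""
    depths = {}
    for payload, length in children:
        if isinstance(payload, str):
            depths[payload] = depth + length
        else:
            depths.update(leaf_depths(payload, depth + length))
    return depths


depths = leaf_depths(root)
for name, depth in sorted(depths.items()):
    print(f"{name}  {depth:.6f}")

# Branch lengths of the three non-swine leaves: the height at which each joins.
join_height = {"AB043499": 0.003033, "AF144305": 0.004373, "AB013810": 0.036752}
print(sorted(join_height, key=join_height.get))
```

Sorting by join height gives AB043499, AF144305, AB013810, the same order as described in the text, and the root-to-leaf sums agree to within the rounding of the printed branch lengths.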
By ただ at 23:53 | Category: Life Sciences