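May 26, 2009
Trying Out EMBOSS, Part 7
When working with nucleotide sequences, being able to produce a restriction (cleavage) map is handy for recombination work and for spotting mutations (RFLP). The remap command does exactly that. The one pity is that, on the command line, it cannot draw the map in circular form.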
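To make concrete what a restriction map records, here is a minimal sketch that locates a recognition site by plain string search. It is not how remap works internally: it handles only unambiguous sites (no IUPAC codes), looks only at the strand as given, and reports where the site starts rather than where the enzyme cuts. The helper name find_sites is made up for this illustration. The sequence is the first 60 bases of GQ169382 as they appear in the remap output later in this post, and ACTAGT (SpeI) and CTAG (MaeI) are the usual recognition sequences for those two enzymes.

```shell
# find_sites SEQ SITE
# Print the 1-based start position of every occurrence of SITE in SEQ,
# one per line. Overlapping occurrences are reported too.
find_sites() {
  awk -v seq="$1" -v site="$2" 'BEGIN {
    seq = tolower(seq); site = tolower(site)
    start = 1
    # index() returns the offset of the first match in the remaining text, or 0
    while ((i = index(substr(seq, start), site)) > 0) {
      print start + i - 1
      start = start + i      # resume one base after the match start
    }
  }'
}

# First 60 bases of GQ169382, copied from the remap output
dna=gcaaaagcaggggaaaacaaaagcacaaaatgaaggcaatactagtagttctgctatata

find_sites "$dna" actagt   # SpeI site, prints 41
find_sites "$dna" ctag     # MaeI site, prints 42
```

The two positions sit right next to each other, which matches the SpeI and MaeI labels sharing a spot just past position 40 in the first block of the map.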
$ remap genbank:GQ149630
Display restriction enzyme binding sites in a nucleotide sequence
Comma separated enzyme list [all]:
Minimum recognition site length [4]:
Output file [gq149630.remap]: stdout
GQ149630
Influenza A virus (A/Mexico/4603/2009(H1N1)) segment 4 hemagglutinin
(HA) gene, complete cds.
EMBOSS An error in remap.c at line 238:
Cannot locate enzyme file. Run REBASEEXTRACT
I tried it right away and got an error. Apparently the enzyme data has to be loaded first. As also noted elsewhere, download "proto.<number>" and "withrefm.<number>" from http://rebase.neb.com/rebase/rebase.html. Fetching them over FTP works too.
$ wget ftp://ftp.neb.com/pub/rebase/proto.905
$ wget ftp://ftp.neb.com/pub/rebase/withrefm.905
$ rebaseextract
Process the REBASE database for use by restriction enzyme applications
REBASE database withrefm file: withrefm.905
REBASE database proto file: proto.905
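Both the rebaseextract step above and the remap run that follows answer interactive prompts. As a sketch, the same answers can be given on the command line so the steps can be scripted. The qualifier names used here (-infile, -protofile, -sequence, -enzymes, -sitelen, -outfile) are the ones I understand these two EMBOSS programs to accept; check them against `rebaseextract -help` and `remap -help` on your installation. The function names load_rebase and print_map are made up for this example.

```shell
# Register the downloaded REBASE files with EMBOSS, no prompts.
# $1 = withrefm file, $2 = proto file
load_rebase() {
  rebaseextract -infile "$1" -protofile "$2"
}

# Print the restriction map of one sequence to the terminal,
# using the same answers as the interactive session (all enzymes, site length 4).
# $1 = sequence, e.g. genbank:GQ169382
print_map() {
  remap -sequence "$1" -enzymes all -sitelen 4 -outfile stdout
}

# Only run the steps when EMBOSS is actually installed.
if command -v remap >/dev/null 2>&1; then
  load_rebase withrefm.905 proto.905
  print_map genbank:GQ169382
fi
```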
$ remap
Display restriction enzyme binding sites in a nucleotide sequence
Input nucleotide sequence(s): genbank:GQ169382
Comma separated enzyme list [all]:
Minimum recognition site length [4]:
Output file [gq169382.remap]: stdout
GQ169382
Influenza A virus (A/Thailand/104/2009(H1N1)) segment 4 hemagglutinin
(HA) gene, complete cds.

SpeI
|MaeI FaiI
Hin4II || TspDTI | FaiI
\ \\ \ \ \
gcaaaagcaggggaaaacaaaagcacaaaatgaaggcaatactagtagttctgctatata
10 20 30 40 50 60
----:----|----:----|----:----|----:----|----:----|----:----|
cgttttcgtccccttttgttttcgtgttttacttccgttatgatcatcaagacgatatat
/ /// / /
Hin4II ||SpeI | FaiI
|MaeI FaiI
TspDTI
A K A G E N K S T K * R Q Y * * F C Y I
Q K Q G K T K A Q N E G N T S S S A I Y
K S R G K Q K H K M K A I L V V L L Y T
----:----|----:----|----:----|----:----|----:----|----:----|
A F A P S F L L V F H L C Y * Y N Q * I
X L L L P F C F C L I F A I S T T R S Y
C F C P F V F A C F S P L V L L E A I Y

SetI
| FatI
| |CviAII
| ||FaiI
| ||| NlaIII
CviRI FaiI | ||| | TspEI
| AciI CviRI | FaiI | ||| | | AgsI
\ \ \ \ \ \ \\\ \ \ \
catttgcaaccgcaaatgcagacacattatgtataggttatcatgcgaacaattcaacag
70 80 90 100 110 120
----:----|----:----|----:----|----:----|----:----|----:----|
gtaaacgttggcgtttacgtctgtgtaatacatatccaatagtacgcttgttaagttgtc
/ / / / // / /// / //
CviRI AciI CviRI | |SetI | ||FatI | |AlwNI
| FaiI | |CviAII | TscAI
FaiI | FaiI TspEI
NlaIII AgsI
H L Q P Q M Q T H Y V * V I M R T I Q Q
I C N R K C R H I M Y R L S C E Q F N R
F A T A N A D T L C I G Y H A N N S T D
----:----|----:----|----:----|----:----|----:----|----:----|
C K C G C I C V C * T Y T I M R V I * C
V N A V A F A S V N H I P * * A F L E V
M Q L R L H L C M I Y L N D H S C N L L

MseI
|HpaI
AlwNI TatI |MjaIV
|SfeI Tsp4CI |HindII
||Tsp4CI |Csp6I || SetI
||| AccI ||RsaI MaeIII || |XbaI
||| |MjaIV ||ScaI | MaeIII || ||MaeI
||| |TscAI ||| MaeI | Tsp4CI || ||Hpy178III
\\\ \\ \\\ \ \ \ \\ \\\
acactgtagacacagtactagaaaagaatgtaacagtaacacactctgttaaccttctag
130 140 150 160 170 180
----:----|----:----|----:----|----:----|----:----|----:----|
tgtgacatctgtgtcatgatcttttcttacattgtcattgtgtgagacaattggaagatc
/ // / /// / / / // //
| || | ||| MaeI | MaeIII |MseI |XbaI
| || | ||TatI MaeIII |SetI Hpy178III
| || | |Csp6I Tsp4CI HindII MaeI
| || | ScaI MjaIV
| || | RsaI HpaI
| || Tsp4CI
| |AccI
| MjaIV
| SfeI
Tsp4CI
T L * T Q Y * K R M * Q * H T L L T F *
H C R H S T R K E C N S N T L C * P S R
T V D T V L E K N V T V T H S V N L L E
----:----|----:----|----:----|----:----|----:----|----:----|
V S Y V C Y * F L I Y C Y C V R N V K *
S V T S V T S S F F T V T V C E T L R R
C Q L C L V L F S H L L L V S Q * G E L

Hin4II FaiI
| BbvII | CviRI CviJI
| | FaiI | | MnlI | BsrDI
| | | MboII | | | DdeI | | CviRI
\ \ \ \ \ \ \ \ \ \ \
aagacaagcataacgggaaactatgcaaactaagaggggtagccccattgcatttgggta
190 200 210 220 230 240
----:----|----:----|----:----|----:----|----:----|----:----|
ttctgttcgtattgccctttgatacgtttgattctccccatcggggtaacgtaaacccat
/ // / // / // /
Hin4II |MboII | |MnlI DdeI |BsrDI CviRI
|BbvII | CviRI CviJI
FaiI FaiI
K T S I T G N Y A N * E G * P H C I W V
R Q A * R E T M Q T K R G S P I A F G *
D K H N G K L C K L R G V A P L H L G K
----:----|----:----|----:----|----:----|----:----|----:----|
F V L M V P F * A F * S P Y G W Q M Q T
S S L C L P F S H L S L P T A G N C K P
L C A Y R S V I C V L L P L G M A N P Y

...

AlwNI SfeI
|Hpy178III |BsmAI
|| BseGI |Eco31I
|| | SduI || Tsp4CI FaiI
|| | HgiAI || | TscAI | TfiI
|| | | FokI || | | FaiI MseI | HinfI
\\ \ \ \ \\ \ \ \ \ \ \
gtttctggatgtgctctaatgggtctctacagtgtagaatatgtatttaaccataggatt
1690 1700 1710 1720 1730 1740
----:----|----:----|----:----|----:----|----:----|----:----|
caaagacctacacgagattacccagagatgtcacatcttatacataaattggtatcctaa
/ / / / / /// / / / /
AlwNI | BseGI | | ||Eco31I FaiI | FaiI HinfI
| HgiAI | | ||BsmAI MseI TfiI
| SduI | | |SfeI
Hpy178III | | Tsp4CI
| TscAI
FokI
V S G C A L M G L Y S V E Y V F N H R I
F L D V L * W V S T V * N M Y L T I G F
F W M C S N G S L Q C R I C I * P * D S
----:----|----:----|----:----|----:----|----:----|----:----|
T E P H A R I P R * L T S Y T N L W L I
L K Q I H E L P D R C H L I H I * G Y S
N R S T S * H T E V T Y F I Y K V M P N
c
-
g
X
-
E

# Enzymes that cut Frequency Isoschizomers
Acc65I 1 Asp718I
AccI 1 FblI,XmiI
AciI 1 BspACI,SsiI
AflIII 2
AgsI 10
AhaIII 1 DraI
AluBI 6 AluI
AlwNI 2 CaiI
ApoI 5 AcsI,XapI
AsuI 3 Cfr13I,PspPI,Sau96I,AspS9I
AsuII 1 Bpu14I,Bsp119I,BspT104I,BstBI,Csp45I,NspV,SfuI
AvaII 1 Bme18I,Eco47I,SinI,VpaK11BI
BaeI 1
BalI 1 MlsI,MluNI,MscI,Msp20I
BamHI 1
BbvI 3 BseXI,BstV1I,Lsp1109I
BbvII 1 BpiI,BpuAI,BstV2I,BbsI
BccI 8
BceAI 2
BinI 4 AlwI,BspPI,AclWI
BlsI 3
BmgT120I 3
BmsI 1 LweI,SfaNI
BsaAI 1 BstBAI,Ppu21I
BseBI 5 Bst2UI,BstNI,BstOI,MvaI
BseGI 5 BstF5I,BtsCI
BseMII 3
BseRI 1
BsiI 1 BssSI,Bst2BI,BauI
BsiYI 4 Bsc4I,BseLI,BslI,AfiI
BslFI 2 BsmFI,FaqI
BsmAI 5 Alw26I,BstMAI
BsmI 4 BsaMI,Mva1269I,PctI
BspCNI 3
BspHI 1 CciI,PagI,RcaI
BspQI 1 LguI,PciSI,SapI
BsrDI 4 BseMI,Bse3DI
BsrI 7 BseNI,Bse1I,BsrSI
BssKI 6 BstSCI,StyD4I
BstKTI 5
BstXI 1
Cac8I 2 BstC8I
CauII 1 BcnI,BpuMI,NciI,AsuC2I
Cfr10I 1 BsrFI,BssAI,Bse118I
CfrI 1 AcoI,EaeI
ClaI 1 Bsa29I,BseCI,BshVI,BspDI,BspXI,Bsu15I,BsuTUI,BanIII
Csp6I 4 CviQI,RsaNI
CviAII 7
CviJI 19 CviKI-1
CviRI 10 HpyCH4V
DdeI 5 BstDEI,HpyF3I
DpnI 5 MalI
Ecl136II 1 EcoICRI
Eco31I 1 Bso31I,BspTNI,BsaI
Eco57I 1 AcuI
Eco57MI 2
EcoP15I 1
EcoRII 5 AjnI,Psp6I,PspGI
EcoT22I 2 Mph1103I,NsiI,Zsp2I
EspI 1 Bpu1102I,Bsp1720I,CelII,BlpI
FaiI 29
FatI 7
Fnu4HI 3 BisI,Fsp4HI,GluI,ItaI,SatI
FokI 5
GsuI 1 BpmI
HaeIII 4 BsnI,BsuRI,BshFI,PhoI
HgiAI 3 Bbv12I,BsiHKAI,Alw21I
HgiCI 1 BanI,BshNI,BspT107I,AccB1I
HgiJII 1 Eco24I,EcoT38I,FriOI,BanII
Hin4I 3
Hin4II 4 HpyAV
HindII 1 HincII
HindIII 1
HinfI 5
HpaI 1 KspAI
HpaII 3 HapII,BsiSI,MspI
HphI 2 AsuHPI
Hpy178III 8 Hpy188III
Hpy188I 7
KpnI 1
Ksp632I 1 Eam1104I,EarI,Bst6I
MaeI 9 FspBI,BfaI,XspI
MaeII 2 HpyCH4IV
MaeIII 5
MboI 5 Bsp143I,BssMI,BstMBI,DpnII,Kzo9I,BfuCI,NdeII,Sau3AI
MboII 4
MfeI 2 MunI
MjaIV 5 Hpy8I,Hpy166II
MlyI 1 SchI
MnlI 9
MseI 8 SaqAI,Tru1I,Tru9I
MslI 2 RseI,SmiMI
MwoI 1 HpyF10VI,BstMWI
NdeI 1 FauNDI
NlaIII 7 Hin1II,Hsp92II,FaeI
NlaIV 3 BspLI,BmiI,PspN4I
NspI 1 BstNSI,XceI
PasI 1
PflMI 1 BasI,AccB7I,Van91I
PfoI 1
PleI 1 PpsI
PmaCI 1 BbrPI,Eco72I,AcvI,PmlI,PspCI
RsaI 4 AfaI
SacI 1 Psp124BI,SstI
ScaI 1 BmcAI,AssI,ZrmI
ScrFI 6 BmrFI,MspR9I,Bme1390I
SduI 3 MhlI,Bsp1286I
SecI 6 BseDI,BssECI,BsaJI
SetI 16
SfeI 2 BstSFI,SfcI,BfmI
SpeI 3 BcuI,AhlI
StuI 1 Eco147I,PceI,SseBI,AatI
StyI 2 Eco130I,EcoT14I,ErhI,BssT1I
SwaI 1 SmiI
TaiI 2
TaqI 3
TaqII 1
TatI 1
TfiI 4 PfeI
TscAI 3 TspRI
TseI 3 ApeKI
TsoI 1
Tsp4CI 8 HpyCH4III,TaaI,Bst4CI
TspDTI 10
TspEI 11 TasI,Tsp509I,Sse9I
TstI 1
VspI 1 PshBI,AseI
XbaI 2
XhoII 2 BstYI,MflI,PsuI,BstX2I

# Enzymes which cut less frequently than the MINCUTS criterion
# Enzymes < MINCUTS Frequency Isoschizomers

# Enzymes which cut more frequently than the MAXCUTS criterion
# Enzymes > MAXCUTS Frequency Isoschizomers

# Enzymes that do not cut
AanI AarI AatII AbsI AclI AcyI AflII AgeI
AjuI AlfI AloI ApaI ApaLI ArsI AscI AvaI
AvrII BaeGI BarI BbeI BbvCI Bce83I BcgI BciVI
BclI BdaI BetI BfiI BfoI BglI BglII BmeT110I
BmtI BplI Bpu10I BsaBI BsaXI BsePI BseSI BseYI
BsgI Bsp120I Bsp1407I BspFNI BspLU11I BspMI BspMII BspOI
BsrBI BssNAI Bst1107I BstAFI BstAPI BstEII BstSLI BstZ17I
BtgZI BtrI BtsI Cfr9I CsiI CspCI DinI DraII
DraIII DrdI DsaI Eam1105I EciI Eco47III EcoNI EcoRI
EcoRV EgeI EheI Esp3I FalI FauI FnuDII FseI
FspAI GlaI GsaI HaeII HgaI HhaI Hin6I HinP1I
Hpy99I HspAI KasI KflI MauBI McrI MluI MmeI
MroNI MstI NaeI NarI NcoI NgoMIV NheI NmeAIII
NotI NruI NspBII OliI PacI PmeI PpiI PpuMI
PshAI PsiI PspOMI PspXI PsrI PstI PteI PvuI
PvuII RigI RruI RsrII SacII SalI SanDI SauI
SexAI SfaAI SfiI SfoI SgfI SgrAI SgrDI SmaI
SmlI SnaBI SphI SplI SrfI Sse232I Sse8387I SspDI
SspI TauI Tsp45I TspGWI TspMI Tth111I XcmI XhoI
XmaCI XmaI XmaIII XmnI ZraI

# No. of cutting enzymes which do not match the
# SITELEN, BLUNT, STICKY, COMMERCIAL, AMBIGUOUS citeria
781
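
One thing the frequency table is handy for is picking out the enzymes that cut the sequence exactly once. The following is a small sketch for that, assuming the layout shown above: a "# Enzymes that cut" header line, then one enzyme per line with whitespace-separated name, frequency, and optional isoschizomers, ending at the next line that starts with "#". The function name single_cutters is made up for this example.

```shell
# single_cutters [FILE...]
# Read remap output (from FILE or standard input) and print the names of
# enzymes whose frequency in the "Enzymes that cut" table is exactly 1.
single_cutters() {
  awk '
    /^# Enzymes that cut/ { in_table = 1; next }  # start of the frequency table
    /^#/                  { in_table = 0 }        # any other "#" line ends it
    in_table && $2 == 1   { print $1 }            # column 2 is the cut count
  ' "$@"
}
```

If the map was saved under the default file name offered at the prompt, `single_cutters gq169382.remap` would list the single cutters; for the table above that list would begin with Acc65I, AccI, AciI.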
By ただ at 23:19 | Category: Life Sciences