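Biosynthetic genes of Blakeslea trispora β-carotene encoding lycopene cyclase/phytoene synthase (carRP) and phytoene dehydrogenase (carB)
Field of the invention
The present invention relates to two new genes of Blakeslea trispora, designated carRP and carB. The carRP gene encodes the bifunctional enzyme lycopene cyclase/phytoene synthase, and carB encodes the enzyme phytoene dehydrogenase. Both genes are involved in the β-carotene biosynthetic pathway of B. trispora (see Scheme).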
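Prior art
Carotenoids are isoprenoid pigments synthesized by certain bacteria, fungi and photosynthetic organisms. Because of their beneficial effects on health and their attractive colors, carotenoids are of considerable commercial value as pigments and food additives. β-Carotene is a carotenoid whose chemical synthesis has been known since 1956. It has a molecular weight of 536.9 and a molecule (C40H56) containing 11 conjugated double bonds. Its color is red-violet in the crystalline state, yellow-orange in oily solution and orange in aqueous dispersion. Synthetic β-carotene has the all-trans configuration, whereas β-carotene obtained from various natural sources occurs in several forms: mono-cis, di-cis and poly-cis.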
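The stated molecular weight can be checked directly from the formula C40H56. The following is a minimal sketch; the atomic weights are standard abridged values and are not taken from this document.

```python
# Standard abridged atomic weights (an assumption external to this document).
ATOMIC_WEIGHT = {"C": 12.011, "H": 1.008}

# Elemental composition of beta-carotene as given in the text (C40H56).
BETA_CAROTENE = {"C": 40, "H": 56}


def molecular_weight(formula):
    """Sum atomic weight times atom count over all elements in the formula."""
    return sum(ATOMIC_WEIGHT[element] * count for element, count in formula.items())


mw = molecular_weight(BETA_CAROTENE)
print(f"{mw:.1f}")  # 536.9
```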
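The production of carotenoids by microbial synthesis is a classic example of competition between chemical and biological processes. Biotechnological processes make it possible, among other things, to obtain in a simple way carotenoids of more complex structure, such as conformational isomers that exist only in natural form. The industrial biotechnological processes for β-carotene production that compete with chemical synthesis are based on the alga Dunaliella salina and the fungus Blakeslea trispora. The production process with B. trispora involves mixed fermentation of the (+) and (-) strains in order to obtain the maximum yield of β-carotene. This leads to the biosynthesis of trisporic acids, which induce β-carotene production. Both the (+) and the (-) strain produce β-carotene, which both metabolize to retinal and then to 4-dihydrotrisporol. The (+) strain uses 4-dihydrotrisporol as the substrate for forming 4-dihydrotrisporic acid and its methyl ester (methyl 4-dihydrotrisporate). For its part, the (-) strain metabolizes 4-dihydrotrisporol to trisporol. Finally, methyl 4-dihydrotrisporate is converted to trisporic acid by the (-) strain, and trisporol is converted to trisporic acid by the (+) strain. This description of trisporic acid biosynthesis is a simplification, since the process generates many co-metabolites, some of which are common to the (+) and (-) strains while others are specific to one of them. The relative amounts of these co-metabolites vary with the strain.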
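Despite the great biotechnological importance of the fungus B. trispora, scientific knowledge of its β-carotene biosynthesis is scarce because (i) basic methods for its genetic manipulation, namely the cloning of biosynthetic/regulatory genes and transformation, have not been described and (ii) information on the properties of the enzymes involved in β-carotene biosynthesis is lacking. However, the β-carotene biosynthetic pathway (see Scheme) has been described in phylogenetically related fungi such as Phycomyces blakesleeanus and Mucor circinelloides (Arrach N. et al. (2001) Proceedings of the National Academy of Sciences USA 98:1687-1692; Velayos A. et al. (2000) European Journal of Biochemistry 267:5509-5519). This biosynthesis requires at least three enzymes: (i) phytoene synthase, which joins two molecules of geranylgeranyl pyrophosphate to give phytoene; (ii) phytoene dehydrogenase, which introduces four double bonds into the phytoene molecule to synthesize lycopene; and (iii) lycopene cyclase, which uses lycopene as substrate and forms the rings located at both ends of the β-carotene molecule.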
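Analysis of B. trispora mutants has led to the conclusion that the β-carotene biosynthetic pathway in this fungus is similar to that described for P. blakesleeanus (Mehta B.J. and Cerdá-Olmedo E. (1995) Applied Microbiology and Biotechnology 42:836-838). In P. blakesleeanus, the yellow color of the mycelium can be altered by mutation, giving strains whose mycelium is red, white or various shades of yellow. Red mutants accumulate lycopene, whereas white mutants either lack carotenoid production or accumulate phytoene. Complementation analysis has identified two genes, called carB and carRA, whose products may be organized in an enzyme complex in which the four dehydrogenations are catalyzed by four identical phytoene dehydrogenase units and the two cyclizations by two identical lycopene cyclase units (Candau R. et al. (1991) Proceedings of the National Academy of Sciences USA 88:4936-4940). The carB gene (Ruiz-Hidalgo M.J. et al. (1997) Molecular and General Genetics 253:734-744) and the carRA gene (Arrach N. et al. (2001) Proceedings of the National Academy of Sciences USA 98:1687-1692) have been cloned and characterized. The carRA gene has two distinct domains: (i) domain R, located nearer the 5' end, which encodes lycopene cyclase activity, and (ii) domain A, which is responsible for phytoene synthase activity. In addition, the carB gene of P. blakesleeanus has been expressed in M. circinelloides (Ruiz-Hidalgo M.J. et al. (1999) Current Microbiology 39:259-264).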
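The carB gene (Velayos A. et al. (2000) Planta 210:938-946) and the carRP gene (Velayos A. et al. (2000) European Journal of Biochemistry 267:5509-5519) of M. circinelloides have also been cloned and characterized. The carRP gene encodes the bifunctional enzyme lycopene cyclase/phytoene synthase, which has two domains: (i) domain R, located nearer the 5' end, which encodes lycopene cyclase activity, and (ii) domain P, located at the 3' end, which is responsible for phytoene synthase activity. Domain R is functional even in the absence of domain P, whereas domain P requires the presence of domain R in order to function properly.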
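Detailed description of the invention
The present invention describes for the first time the carB and carRP genes of B. trispora. As previously described for P. blakesleeanus and M. circinelloides, the carB gene encodes phytoene dehydrogenase and the carRP gene encodes the bifunctional enzyme lycopene cyclase/phytoene synthase, which has a domain R (nearer the 5' end, encoding lycopene cyclase activity) and a domain P (at the 3' end, responsible for phytoene synthase activity). Specifically, the product encoded by carB converts phytoene into lycopene, and the bifunctional enzyme encoded by carRP catalyzes both the biosynthesis of phytoene from geranylgeranyl-PP and the conversion of lycopene into β-carotene. The two genes (i) are involved in the β-carotene biosynthetic pathway, (ii) are adjacent in the genome and (iii) are expressed under the control of a bidirectional promoter located between them. This bidirectional promoter allows gene expression in one direction or in the opposite direction. Mutants lacking either of these two genes cannot produce β-carotene and instead accumulate the corresponding biosynthetic intermediates (see Scheme).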
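B. trispora is a fungus of great industrial importance for the biotechnological production of β-carotene. Indeed, this process has become competitive with the synthetic process currently used in industry. The carRP and carB genes described above can be used, for example, (i) to improve the yield of β-carotene by increasing their expression and (ii) to modify the β-carotene biosynthetic pathway by generating strains able to biosynthesize other carotenoids. Expression of carRP and carB can be increased by inserting additional copies of these genes into B. trispora, either in their native state or in a form expressed under the control of a strong promoter. The β-carotene biosynthetic pathway can be modified, for example, by manipulating the carRP gene so as to inactivate the lycopene cyclase activity. This would give strains able to produce lycopene, a carotenoid of great industrial and commercial interest.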
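The scheme on the following page shows the genes and enzymes involved in the β-carotene biosynthetic pathway of B. trispora. The phytoene synthase activity joins two molecules of geranylgeranyl pyrophosphate to give phytoene. The phytoene dehydrogenase activity introduces four double bonds into the phytoene molecule to synthesize lycopene. The lycopene cyclase activity uses lycopene as substrate and forms the rings located at both ends of the β-carotene molecule.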
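The three steps of the pathway, and the intermediate expected to accumulate when one activity is missing, can be sketched as a small lookup. This is an illustrative model only: the names `PATHWAY` and `end_product` are hypothetical, and the model assumes a strictly linear pathway in which the substrate of the first blocked step accumulates.

```python
# Each step: (substrate, product, enzymatic activity, gene encoding it).
PATHWAY = [
    ("geranylgeranyl pyrophosphate", "phytoene", "phytoene synthase", "carRP"),
    ("phytoene", "lycopene", "phytoene dehydrogenase", "carB"),
    ("lycopene", "beta-carotene", "lycopene cyclase", "carRP"),
]


def end_product(inactive_activities):
    """Return the compound that accumulates when the given activities are inactive."""
    compound = PATHWAY[0][0]
    for _substrate, product, activity, _gene in PATHWAY:
        if activity in inactive_activities:
            break  # the pathway stops before this step
        compound = product
    return compound


print(end_product(set()))                       # beta-carotene
print(end_product({"lycopene cyclase"}))        # lycopene
print(end_product({"phytoene dehydrogenase"}))  # phytoene
```

The second call corresponds to the modification discussed in the description, in which inactivating the lycopene cyclase activity of carRP would give a lycopene-producing strain.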
Scheme
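Genomic DNA of the fungus B. trispora was used to construct a gene library in the phage vector λ-GEM12. To this end, a partial digestion was carried out with the restriction enzyme Sau3AI and the resulting fragments were ligated into λ-GEM12, as described in Sambrook J. et al. (1989) Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York, USA. The genes carRP (lycopene cyclase/phytoene synthase) and carB (phytoene dehydrogenase) were cloned by screening this B. trispora gene library with a probe corresponding to the carRP gene of M. circinelloides. The probe was obtained by PCR (polymerase chain reaction) using primers designed on the basis of the carRP gene of M. circinelloides (Velayos A. et al. (2000) European Journal of Biochemistry 267:5509-5519). In this way a 560-bp DNA fragment was amplified and used to screen the gene library, and four recombinant phages designated fALBT1, fALBT4, fALBT12 and fALBT15 were isolated (Fig. 1). These clones were analyzed with a series of restriction enzymes, and a 2.4-kb HindIII fragment was then subcloned into a plasmid vector of Escherichia coli. The restriction map of this fragment is shown in the scheme. Next, the sequence of the 2430 bp comprised in this HindIII fragment was determined, and two incomplete open reading frames (ORFs) transcribed in opposite directions were found. On the basis of their similarity to sequences present in the databases, they were named carRP (SEQ ID NO:1) and carB (SEQ ID NO:2). Subsequent subcloning and sequencing of the DNA regions adjacent to the ends of the 2.4-kb HindIII fragment made it possible to complete the nucleotide sequences of both ORFs.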
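Because the two ORFs are transcribed in opposite directions, SEQ ID NO:1 and SEQ ID NO:2 appear to present the same 8786-nt region on opposite strands. The sketch below checks this only for the terminal residues printed in the sequence listing (the last 22 nt of SEQ ID NO:1 against the first 22 nt of SEQ ID NO:2); that the relationship holds over the full length is an assumption, not something this snippet verifies.

```python
# Translation table for complementing unambiguous DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


# Last 22 nt of SEQ ID NO:1 and first 22 nt of SEQ ID NO:2, copied from the listing.
seq1_end = "ATGCTTCATC ACCAAACTCG AG".replace(" ", "")
seq2_start = "CTCGAGTTTG GTGATGAAGC AT".replace(" ", "")

print(reverse_complement(seq1_end) == seq2_start)  # True
```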
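The ORF corresponding to the carRP gene is 1894 bp long and is interrupted by a 70-bp intron between positions 406 and 475. This ORF encodes a protein of 608 amino acids with a molecular mass of 69,581 Da and an isoelectric point of 7.78. Comparison of its deduced amino acid sequence (SEQ ID NO:3) with the SwissProt database showed high similarity to the products of the genes encoding lycopene cyclase/phytoene synthase in M. circinelloides (67%), P. blakesleeanus (55%) and Neurospora crassa (29%) (Velayos A. et al. (2000) European Journal of Biochemistry 267:5509-5519; Arrach N. et al. (2001) Proceedings of the National Academy of Sciences USA 98:1687-1692; Schmidhauser T.J. et al. (1994) The Journal of Biological Chemistry 269:12060-12066).
The ORF corresponding to the carB gene is 1955 bp long and is interrupted by two introns of 141 bp and 68 bp, located at positions 594-734 and 1584-1651, respectively. This ORF encodes a protein of 582 amino acids with a molecular mass of 66,426 Da and an isoelectric point of 6.9. Comparison of its deduced amino acid sequence (SEQ ID NO:4) with the SwissProt database showed high similarity to the products of the genes encoding phytoene dehydrogenase in M. circinelloides (80%), P. blakesleeanus (72%) and N. crassa (48%) (Velayos A. et al. (2000) Planta 210:938-946; Ruiz-Hidalgo et al. (1997) Molecular and General Genetics 253:734-744; Ruiz-Hidalgo et al. (1999) Current Microbiology 39:259-264; Schmidhauser et al. (1990) Molecular and Cellular Biology 10:5064-5070).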
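The ORF lengths, intron positions and protein lengths quoted above can be cross-checked against the CDS and exon coordinates given in the sequence listing. The following is a minimal sketch; the dictionary layout and the function name `summarize` are illustrative choices, while the coordinates are copied from the feature tables of SEQ ID NO:1 (carRP) and SEQ ID NO:2 (carB).

```python
# CDS and exon coordinates (1-based, inclusive) from the sequence listing.
FEATURES = {
    "carRP": {"cds": (5688, 7581), "exons": [(5688, 6092), (6163, 7581)]},
    "carB": {"cds": (3711, 5665), "exons": [(3711, 4303), (4445, 5293), (5362, 5665)]},
}


def summarize(feature):
    """Derive ORF length, ORF-relative intron positions and codon count."""
    start, end = feature["cds"]
    exons = feature["exons"]
    # An intron spans the gap between consecutive exons; convert to ORF-relative positions.
    introns = [
        (prev_end + 1 - start + 1, next_start - 1 - start + 1)
        for (_, prev_end), (next_start, _) in zip(exons, exons[1:])
    ]
    coding_nt = sum(e - s + 1 for s, e in exons)
    return {
        "orf_length": end - start + 1,
        "introns": introns,
        "intron_lengths": [b - a + 1 for a, b in introns],
        "codons": coding_nt // 3,
        "in_frame": coding_nt % 3 == 0,
    }


for gene, feature in FEATURES.items():
    print(gene, summarize(feature))
```

For carRP this gives an ORF of 1894 bp, one intron at 406-475 (70 bp) and 608 codons; for carB it gives 1955 bp, introns at 594-734 (141 bp) and 1584-1651 (68 bp) and 582 codons, in agreement with the figures in the text.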
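Using suitable restriction sites, a 6.9-kb DNA fragment comprising both genes (carRP and carB) can be subcloned into a plasmid vector capable of integrating into the B. trispora genome in one or more copies, which allows these genes to be expressed in the host fungus. Similarly, the promoters of the carRP and carB genes can be obtained (using suitable restriction sites or by PCR) and used to express homologous and heterologous genes in B. trispora. The examples below describe the expression of the phleomycin resistance gene (bleR) of Streptoalloteichus hindustanus (Drocourt D. et al. (1990) Nucleic Acids Research 18:4009) under the promoters of the carRP and carB genes. As stated above, the promoters of the carRP and carB genes can be used for the correct expression of heterologous genes in B. trispora or for the overexpression of weakly transcribed homologous genes.
Expressing xanthophyll biosynthesis genes in B. trispora can give transformants able to biosynthesize carotenoids such as zeaxanthin, canthaxanthin, astaxanthin or new carotenoids. In addition, gene disruption methods will make it possible to block the β-carotene biosynthetic pathway and obtain transformants that produce, for example, lycopene as the end product.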
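Deposit of microorganisms under the Budapest Treaty
In accordance with the provisions of the Budapest Treaty, two E. coli strains carrying plasmids pALBT9 (which contains the carRP gene) and pALBT52 (which contains the carB gene) were deposited on 09.17.01 at the Spanish Type Culture Collection (CECT), Department of Microbiology, Faculty of Biology, University of Valencia, Campus de Burjasot, 46100 Burjasot, Valencia (Spain), under accession numbers CECT 5982 and CECT 5981, respectively.
The following examples describe the invention in detail without limiting it.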
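Example 1
Construction of gene libraries of the (+) and (-) strains of B. trispora
The gene libraries of the (+) and (-) strains of B. trispora were constructed as follows. A total of 300 μg of total DNA was partially digested with 20 units of Sau3AI at 37°C in a reaction volume of 600 μl, and three 200-μl aliquots were collected at 45 seconds, 1 minute and 2 minutes, the digestion being stopped with cold 20 mM EDTA. After the digests had been checked on a 0.7% agarose gel, they were pooled, heated at 68°C for 10 minutes, allowed to cool slowly to room temperature and loaded onto a 38-ml sucrose gradient (10-40%). The gradient was centrifuged at 26,000 rpm and 25°C for 24 hours, after which 0.5-ml aliquots were collected and 10 μl of each was analyzed on a 0.4% agarose gel. The aliquots whose DNA was between 18 and 22 kb in size were pooled and diluted with distilled water to approximately 10% sucrose. The DNA was then precipitated with ethanol and resuspended in 50 μl of TE buffer, and 3 μl of this solution was analyzed on a 0.4% agarose gel. The gel confirmed that the DNA fragments were of the correct size and that their concentration was approximately 50 ng/μl.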
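In parallel, DNA of phage λ-GEM12 (Promega) was prepared by previously described methods (Sambrook J. et al. (1989) Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York, USA). 50 μg of the phage DNA was digested with the endonucleases BamHI and EcoRI at 37°C for 2 hours. The double digest was extracted with phenol/CIA and CIA, precipitated with ethanol and resuspended in 50 μl of TE buffer. After a 2-μl aliquot had been collected, MgCl2 was added to the remainder to 10 mM, and it was then incubated at 42°C for 1 hour to promote circularization of the vector arms via their cohesive ends. A further 2-μl fraction was collected and analyzed together with the previous one on a 0.5% agarose gel to confirm correct circularization at the cohesive ends and to estimate the concentration (approximately 100 ng/μl).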
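Next, a series of ligations was carried out in which the insert/vector ratio was varied, using 0.25 μg of insert and amounts of vector ranging from 0.25 to 0.75 μg. The reactions were incubated at 12-14°C for 16 hours. The recombinant phage DNA produced by ligation was packaged using the 'Packagene' in vitro packaging extract (Promega). The product of the packaging reaction was resuspended in 500 μl of SM and used to infect E. coli NM538 (Promega), in order to determine the number of phages present, and E. coli NM539 (Promega), in order to determine the percentage of recombinant phages. E. coli NM539 is a lysogenic strain of phage P2 and gives lysis plaques only when the infecting phage lacks the non-essential central region.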
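The volumes implied by these ligations follow from the approximate stock concentrations estimated earlier in this example (about 50 ng/μl for the size-fractionated insert and about 100 ng/μl for the vector arms). The sketch below is simple arithmetic under those estimates; the intermediate vector amount of 0.50 μg is an assumed point within the stated 0.25-0.75 μg range.

```python
INSERT_NG_PER_UL = 50    # approximate insert concentration after size fractionation
VECTOR_NG_PER_UL = 100   # approximate concentration of the prepared vector arms
INSERT_UG = 0.25
VECTOR_UG_SERIES = (0.25, 0.50, 0.75)  # 0.50 is an assumed intermediate point


def volume_ul(mass_ug, conc_ng_per_ul):
    """Volume of stock (in microliters) that contains the requested mass."""
    return mass_ug * 1000 / conc_ng_per_ul


insert_ul = volume_ul(INSERT_UG, INSERT_NG_PER_UL)
vector_ul = [volume_ul(m, VECTOR_NG_PER_UL) for m in VECTOR_UG_SERIES]
mass_ratios = [m / INSERT_UG for m in VECTOR_UG_SERIES]  # vector:insert by mass

print(insert_ul)    # 5.0
print(vector_ul)    # [2.5, 5.0, 7.5]
print(mass_ratios)  # [1.0, 2.0, 3.0]
```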
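The titers of the gene libraries constructed were as follows: 125,000 pfu for B. trispora (+) and 50,000 pfu for B. trispora (-). In both cases approximately 80% of the phages carried an exogenous DNA fragment.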
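From these figures, the number of recombinant phages and a rough estimate of the total cloned DNA in each library can be worked out. This is a back-of-the-envelope sketch: the mean insert size of 20 kb is an assumption (the midpoint of the 18-22 kb fraction selected on the sucrose gradient), and the 80% figure is itself approximate.

```python
LIBRARY_TITER_PFU = {"(+)": 125_000, "(-)": 50_000}
RECOMBINANT_PERCENT = 80   # approximate share of phages carrying an insert
MEAN_INSERT_KB = 20        # assumed midpoint of the 18-22 kb size fraction

# Integer arithmetic avoids floating-point rounding on the percentages.
recombinants = {
    strain: titer * RECOMBINANT_PERCENT // 100
    for strain, titer in LIBRARY_TITER_PFU.items()
}
cloned_dna_mb = {
    strain: count * MEAN_INSERT_KB // 1000
    for strain, count in recombinants.items()
}

print(recombinants)   # {'(+)': 100000, '(-)': 40000}
print(cloned_dna_mb)  # {'(+)': 2000, '(-)': 800}
```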
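Example 2
Cloning and characterization of the carB and carRP genes of B. trispora
Both gene libraries were transferred to nitrocellulose filters and screened with a 560-bp probe corresponding to the carRP gene of M. circinelloides. The probe was obtained by PCR amplification using primers #61 (SEQ ID NO:5) and #62 (SEQ ID NO:6), which were designed on the basis of the DNA sequence of the carRP gene of M. circinelloides (Velayos A. et al. (2000) European Journal of Biochemistry 267:5509-5519). Positive phages were selected following previously described hybridization procedures (Sambrook J. et al. (1989) Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York, USA). After prehybridization, hybridization, washing and autoradiography, four clones that gave positive signals with the probe DNA were selected and designated fALBT1, fALBT4, fALBT12 and fALBT15. The DNA claimed in the present invention, which comprises the carRP and carB genes, can be obtained by digesting phage fALBT4 or fALBT15 (Fig. 1) with suitable restriction enzymes using standard techniques (Sambrook J. et al. (1989) Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York, USA). DNA fragments containing SEQ ID NO:1 and SEQ ID NO:2 can be obtained with one or more restriction enzymes (Fig. 1) and subcloned into a suitable vector (for example pBluescript).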
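The 2.4-kb HindIII DNA fragment obtained from phage fALBT1 (Fig. 1) was subcloned in both possible orientations into the HindIII restriction site of plasmid pBluescript I KS(+). The resulting plasmids were designated pALBT3 and pALBT11. Sequential deletions of these plasmids were generated with the 'Erase-a-base' kit following the manufacturer's instructions. The sequencing reactions of the deletion clones were carried out with the 'Dye Terminator Cycle Sequencing Ready Reaction Kit', and the DNA fragments were separated in an ABI PRISM automatic sequencer (Perkin Elmer) following the manufacturer's instructions. In this way we obtained a nucleotide sequence totaling 2430 bp on both strands. Using the Geneplot program (DNASTAR), two incomplete ORFs corresponding to the carRP and carB genes were identified. To complete the nucleotide sequences of both genes, we subcloned and then sequenced the DNA regions adjacent to this HindIII fragment. Starting from phage fALBT1, a 1.0-kb XhoI fragment that includes the 3' end of the carRP gene was subcloned into pBluescript I KS(+) (Fig. 1), giving plasmid pALBT12. Sequencing both strands of the fragment contained in this plasmid made it possible to complete the nucleotide sequence of the carRP gene.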
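In addition, to determine the sequence of the 3' region of the carB gene, a 5.1-kb ClaI-NotI fragment obtained from fALBT15 was subcloned into pBluescript I KS(+). Sequential deletions of the resulting plasmids (both orientations), designated pALBT59 and pALBT82, were generated with the 'Erase-a-base' kit (Promega) following the manufacturer's instructions. The sequencing reactions of the deletion clones were carried out with the 'Dye Terminator Cycle Sequencing Ready Reaction Kit', and the DNA fragments were separated in an ABI PRISM automatic sequencer (Perkin Elmer) following the manufacturer's instructions.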
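The nucleotide sequence of the complete carRP gene shows an ORF of 1894 bp interrupted by a 70-bp intron located between positions 406 and 475 and flanked by the following intron/exon splice sequences: 5' (g/GTATGCA) and 3' (TTAG/c). These are conserved sequences when compared with the consensus sequences described for fungi: 5' (g/GTAHGTYW) and 3' (MYAG/g). The gene encodes a polypeptide of 608 amino acids with a deduced molecular mass of 69,581 Da and an isoelectric point of 7.78. The protein deduced from this sequence (SEQ ID NO:3) shows high similarity to the enzymes with lycopene cyclase/phytoene synthase activity of M. circinelloides (67%), P. blakesleeanus (55%) and N. crassa (29%) described in the databases (Velayos A. et al. (2000) European Journal of Biochemistry 267:5509-5519; Arrach N. et al. (2001) Proceedings of the National Academy of Sciences USA 98:1687-1692; Schmidhauser T.J. et al. (1994) The Journal of Biological Chemistry 269:12060-12066).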
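The carB gene, 1955 bp long, is interrupted by two introns of 141 and 68 bp, located at positions 594-734 and 1584-1651, respectively, and flanked by conserved fungal intron/exon splice sequences similar to those described for the carRP gene [5' (g/GTAAGTA) and 3' (ATAG/t) for the 141-bp intron; 5' (g/GTAATAC) and 3' (GTAG/t) for the 68-bp intron]. The carB gene encodes a polypeptide of 582 amino acids with a deduced molecular mass of 66,426 Da and an isoelectric point of 6.9. The protein deduced from this sequence (SEQ ID NO:4) shows high similarity to the enzymes with phytoene dehydrogenase activity of M. circinelloides (80%), P. blakesleeanus (72%) and N. crassa (48%) described in the databases (Velayos A. et al. (2000) Planta 210:938-946; Ruiz-Hidalgo et al. (1997) Molecular and General Genetics 253:734-744; Ruiz-Hidalgo et al. (1999) Current Microbiology 39:259-264; Schmidhauser et al. (1990) Molecular and Cellular Biology 10:5064-5070).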
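How closely the intron boundaries of carRP and carB follow the fungal consensus sequences can be checked position by position using the IUPAC ambiguity codes that appear in the consensus (H = A/C/T, M = A/C, W = A/T, Y = C/T). The sketch below is illustrative; the function name `match_count` is hypothetical, and because the donor sites are quoted as 7 nt while the 5' consensus has 8 positions, only the first 7 positions are compared.

```python
# IUPAC nucleotide codes used in the consensus sequences.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "H": "ACT", "M": "AC", "W": "AT", "Y": "CT",
}

FIVE_PRIME_CONSENSUS = "GTAHGTYW"   # intronic bases after the exon/intron junction
THREE_PRIME_CONSENSUS = "MYAG"      # intronic bases before the intron/exon junction

DONORS = {"carRP 70-bp": "GTATGCA", "carB 141-bp": "GTAAGTA", "carB 68-bp": "GTAATAC"}
ACCEPTORS = {"carRP 70-bp": "TTAG", "carB 141-bp": "ATAG", "carB 68-bp": "GTAG"}


def match_count(consensus, observed):
    """Count positions where the observed base is allowed by the consensus code.

    zip() stops at the shorter sequence, so a 7-nt site is compared with the
    first 7 consensus positions only.
    """
    return sum(base in IUPAC[code] for code, base in zip(consensus, observed))


for intron, site in DONORS.items():
    print(intron, "5' site", match_count(FIVE_PRIME_CONSENSUS, site), "of", len(site))
for intron, site in ACCEPTORS.items():
    print(intron, "3' site", match_count(THREE_PRIME_CONSENSUS, site), "of", len(site))
```

In all three introns the GT at the 5' end and the AG at the 3' end match the consensus; the deviations fall in the less constrained neighboring positions.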
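Example 3
Construction of plasmids for heterologous expression in B. trispora
Plasmid pALFleo7 (Díez B. et al. (1999) Applied Microbiology and Biotechnology 52:196-207) was used as the starting point for constructing plasmids for heterologous expression in B. trispora. This plasmid contains the phleomycin resistance gene (bleR) of S. hindustanus (Drocourt D. et al. (1990) Nucleic Acids Research 18:4009), with an NcoI restriction site at the ATG translation start codon and a BamHI restriction site upstream of NcoI. pALFleo7 also carries the terminator of the trpC gene (TtrpC) of Aspergillus nidulans. The cassette formed by the bleR gene and TtrpC was purified by double digestion of pALFleo7 with BamHI plus BglII and subcloned into the BamHI site of pBluescript I KS(+). The resulting plasmid was designated pALfleo8. Next, two NcoI sites were introduced at the ATG translation start codons of the carB and carRP genes by site-directed mutagenesis with the 'QuikChange™ Site-Directed Mutagenesis Kit' (Stratagene). To introduce the NcoI cleavage site into the carB gene, the oligonucleotides represented by SEQ ID NO:7 and SEQ ID NO:8 were designed. To introduce the NcoI cleavage site into the carRP gene, the oligonucleotides represented by SEQ ID NO:9 and SEQ ID NO:10 were designed. In both cases pALBT3 was used as the template for PCR amplification. Once the corresponding mutations had been introduced, plasmid pALBT56 was obtained.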
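Starting from plasmid pALBT56, a 0.6-kb NcoI fragment corresponding to the bidirectional promoter of the carB and carRP genes (PcarB-PcarRP) was purified with the 'QIAEX II' kit (Qiagen). This fragment was ligated into plasmid pALfleo8 previously digested with the endonuclease NcoI, giving two different plasmids: pALBT57 (which expresses the bleR gene under the PcarRP promoter) and pALBT58 (which expresses the bleR gene under the PcarB promoter). Plasmids pALBT57 and pALBT58 (Fig. 2) allow heterologous expression of the bleR gene of S. hindustanus in B. trispora.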
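The size of the promoter fragment can be compared with the distance between the two start codons in the sequence listing. The sketch below converts the carB start position from SEQ ID NO:2 numbering to SEQ ID NO:1 numbering, assuming that the two listings are exact reverse complements of each other over their full 8786 nt (an assumption; the names used are illustrative).

```python
SEQ_LENGTH = 8786
CARRP_ATG_IN_SEQ1 = 5688   # first base of the carRP CDS in SEQ ID NO:1
CARB_ATG_IN_SEQ2 = 3711    # first base of the carB CDS in SEQ ID NO:2


def to_other_strand(position, length=SEQ_LENGTH):
    """Map a 1-based position to the numbering of the reverse-complement listing."""
    return length + 1 - position


carb_atg_in_seq1 = to_other_strand(CARB_ATG_IN_SEQ2)
# Bases lying strictly between the two divergently transcribed start codons.
intergenic_bp = CARRP_ATG_IN_SEQ1 - carb_atg_in_seq1 - 1

print(carb_atg_in_seq1)  # 5076
print(intergenic_bp)     # 611
```

The result of 611 bp is in line with the 0.6-kb NcoI fragment described above, since the NcoI sites were introduced at the two ATG start codons.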
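Example 4
Construction of plasmids for introducing additional copies of the carRP and carB genes into B. trispora
To introduce additional copies of the carRP and carB genes into B. trispora, plasmids pALBT83, pALBT84 and pALBT85 were constructed (Fig. 2). In all three cases the previously described plasmid pALBT57 was used as the starting point. pALBT83 was constructed as follows: fALBT4 was digested with the endonuclease SphI, its ends were then blunted with the Klenow fragment of E. coli DNA polymerase I (no dNTPs were added, so as to favor its exonuclease activity), and it was finally digested with the endonuclease NotI. Next, the 6.9-kb fragment containing the carB and carRP genes was purified with the 'QIAEX II' kit (Qiagen). In parallel, plasmid pALBT57 was digested with the endonuclease SacI, then incubated with the Klenow fragment of DNA polymerase I to obtain blunt ends (no dNTPs were added, so as to favor its exonuclease activity) and finally digested with NotI. The compatible ends of plasmid and insert were ligated, and the ligation product was transformed into E. coli DH5α by electroporation. Transformants resistant to ampicillin (the resistance marker present in the pALBT57 vector) were selected on LB medium (Sambrook J. et al. (1989) Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York, USA). The desired construct was selected by restriction analysis, giving plasmid pALBT83 (Fig. 2).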
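pALBT84 was constructed as follows: pALBT9 was digested with the endonuclease NdeI and its ends were then filled in with the Klenow fragment of E. coli DNA polymerase I. Next, the 3.5-kb fragment containing the carRP gene was purified with the 'QIAEX II' kit (Qiagen). In parallel, plasmid pALBT57 was digested with the endonuclease SacI and then incubated with the Klenow fragment of DNA polymerase I to blunt the ends (no dNTPs were added, so as to favor its exonuclease activity). The blunt ends of plasmid and insert were ligated, and the ligation product was transformed into E. coli DH5α by electroporation. Transformants resistant to ampicillin (the resistance marker present in the pALBT57 vector) were selected on LB medium (Sambrook J. et al. (1989) Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York, USA). The desired construct was selected by restriction analysis, giving plasmid pALBT84 (Fig. 2).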
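pALBT85 was constructed as follows: pALBT52 was digested with the endonucleases XhoI and XbaI and its ends were then filled in with the Klenow fragment of E. coli DNA polymerase I. Next, the 2.6-kb fragment containing the carB gene was purified with the 'QIAEX II' kit (Qiagen). In parallel, plasmid pALBT57 was digested with the endonuclease SacI and then incubated with the Klenow fragment of DNA polymerase I to blunt the ends (no dNTPs were added, so as to favor its exonuclease activity). The blunt ends of plasmid and insert were ligated, and the ligation product was transformed into E. coli DH5α by electroporation. Transformants resistant to ampicillin (the resistance marker present in the pALBT57 vector) were selected on LB medium (Sambrook J. et al. (1989) Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York, USA). The desired construct was selected by restriction analysis, giving plasmid pALBT85 (Fig. 2).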
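To illustrate the invention clearly, a total of 2 figures are provided. The abbreviations BamHI, HindIII, etc. are the conventional abbreviations for restriction enzymes. The approximate lengths of the DNA fragments are given in kilobases (kb), as determined by agarose gel electrophoresis. The figures do not show all the restriction sites present. The DNA of the invention can be inserted into any suitable vector by various methods: direct ligation of cohesive ends, homopolymer tailing, the use of adaptor or linker molecules, etc.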
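Detailed description of the figures
Figure 1. Restriction map of the B. trispora genomic region containing the carB and carRP genes. The arrows indicate the direction of transcription of each gene, and the rectangles indicate the presence of introns. The lines in the lower part of the figure represent the DNA fragments cloned in the phage vector λ-GEM12 (fALBT) or in plasmids (pALBT). The nucleotide sequence of this DNA fragment is given in SEQ ID NO:1 and SEQ ID NO:2.
Figure 2. Physical maps of plasmids pALBT57, pALBT58, pALBT83, pALBT84 and pALBT85. pALBT57 consists of pBluescript KS(+) with a 1.7-kb HindIII-BamHI insert comprising the phleomycin resistance construct PcarRP-bleR-TtrpC. pALBT58 consists of pBluescript KS(+) with a 1.7-kb HindIII-BamHI insert comprising the phleomycin resistance construct PcarB-bleR-TtrpC. pALBT83 consists of pALBT57 with a 6.9-kb SphI-NotI insert comprising the carB and carRP genes. pALBT84 consists of pALBT57 with a 3.5-kb NdeI insert comprising the carRP gene. pALBT85 consists of pALBT57 with a 2.6-kb XhoI-XbaI insert comprising the carB gene.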
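Sequence listing
<110>Antibióticos, S.A.U.
<120>"Biosynthetic genes of Blakeslea trispora β-carotene encoding lycopene cyclase/phytoene synthase (carRP) and phytoene dehydrogenase (carB)"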
<130>P-99669
<160>10
<210>SEQ ID NO:1
<211>8786
<212>DNA
<213>Blakeslea trispora
<220>CDS
<221>carRP
<222>5688...7581
<220>exon
<222>5688...6092
<220>exon
<222>6163...7581
<223>Immediate source: plasmids pALBT9 (CECT 5982) and pALBT52 (CECT 5981).
<223>β-Carotene biosynthesis genes.
<400>
GGGCTCACTT GTATCATCAC TGTTGTTAAT AAATTGTTGA GAATAAAGCT TGGATTGTAT 60
ACGACTGAGC CAGTTGGAGC CACTTGTAGG CCTTGTGGGT GATACAACAG CAGGCTTTGA 120
TACAGTGGTA TTTATTTCTG GCCTAAACTT CAATATACTT TTTGGACTGT TCATGCCTTG 180
ATATAACAAA ATAAAATAAA AAGGGACCTT GTACACTGAA AGCCAAATGA ATAAAATAAA 240
AAATAAAAAA TAAAAAGTTA TTTGCAGAAA CCCAAAAAGC CCGAAATTTC TTCTAAAAGA 300
AAGAAAATAA AAAAAAAGAG AGACTTACTT TTAGGCGATC AACAGTCTTT TTTTTTTGCT 360
TGAGCTTCTT TTAAGCTTTA CATTTAGGCA CTTTTATAAT TGCTGTATTG AATATGACAT 420
CTATCTTTTT TAACTTGAAC GATTTGTAGC TGTTTTCTCT TGACAATGTT AATTTTAAAG 480
CATGATTTTT TTTCTGAAAA AATAAAGACC GCTTACATAA TACAAAAGTC AGAATTGTAT 540
GGCTATTTTA TCCTGTCTCT TCTTTTCAGG AACTCAACCA TCTCCTTGTC TTTATTGTCT 600
GAGAATGCTT TATCTGACTC TTATTAATCC TTTGAGAAAC CAAAAATGGG GTGGATTGAG 660
ACTTGGACTC GCTTTGTGCA TTTGCTCCTT CTGTTCGATC ATTATTGAGC AATGTCATAC 720
AAACATTCAA TGACAAGCAA CATCAAACTA GGAGACAGAT TAAGCAAGCA GATGATTTAT 780
AAAGGGCTAC TCTGCTCATT GATTTACATG CTTTAGCATC ATGAATTGTG TTGCTTCCAT 840
TCCAAAATAG ATGTCTTTTT CTCTTTCCAC TTGCCTCTTT CTTTTTTTGG CTATTCAGCA 900
TAATGCAAGA GTAGCTATTC CTCAACCCAT CAGTAATAAG TGAATGGCAC GGTAGCATTG 960
TGTTTGTATG TTTTCTATCT GCGTTTTGAT TTTCGAATCT CATCTTTAGG ATAGTATCTT 1020
GAGATTGTAT TGATTTTGCC TGATACACTT TTGTTTTTGT GAATATGAGC TTCAGGGAAG 1080
TGATCTGTTT AAGTACAATT GAAATAAAGT GATAGTCTTT GAGGTTTGGT ATATGGCTTT 1140
CTAAAGCATG GACAGAGCCA TGACAAAAAA AAGGGCGGTA TAAAGCCTCA GATAACTCCC 1200
TCTCTTGCAC TCTGACAAAT GTAGTACAGG CTTATTGCCA ATGGAATGAC TCAATAGATG 1260
GCTAAATGTG AAGAAAAACG TATTGGAAAA GCCTACAAAA CTTTCCTTAA TAAACATCAT 1320
GGCCATCTTT TTTGAAAGAT TTCAAAGACT AGGTTAAAAA TACCATTACT ATCATCTTGC 1380
TTCTAGTCCT CTGACTATTT CTTTGTTTTC AACAACACGT GTAGGAAGGA CAAGAAAGAT 1440
GTAAGCGACA GAATAGCTTA GTTTTACTAT ACTCTACTTC TTTTTTTTTT GTAGTTTTCC 1500
ATCTTCCTTG CTTTGAAGAT AGACAATTTT ACTCACACAA TTTCTTTTTC GTTAAAGCTG 1560
ATAGCTTGAA TATCATCATA TCAAACAGAA GACATATATC GCTAAATAGT ACTTGCTATA 1620
ACAGAGAAAG CGATCCATGT GCAACTCTGA TGCTTATCGC CGTAGACGAC TTTCCTTAAA 1680
AAAAACTGAC AAACACCGTA GAATGTGACA CATACACACA ACAATACTGT TGCTCAAAAC 1740
TTTAACTAAT AGTTAAAGTA CAATGAAAAT ATATTGCAGT CTAGAATGAT AACAATTTTG 1800
CTTTAACGAT GTGGTGCAGT CTTAATTCAA AGCTCAAAGA AAAGAAAAAG AATTTGCTAG 1860
CTATCATGAG CTAAATCTCT GTTTTTCTTG AAACAATCTT ACAAAGAAAA GCTATATTGC 1920
TTGACAAAGG GCAAATCCAC CGAGATTCTT TTACTGCATG CCCGAATAAA AAGGGAGAGG 1980
AAGAGGAGTA TGTTACACTT GAATGTATTT TTGAGGAAGC ATCCGCATTA ATTTGGTGTT 2040
ATAAACACTG AGTAATCATG TTATTTGAAG ACTGAACCTT TACCAAAAAG GTCTTGTAGA 2100
TCGCTTGTTG CAATTGAGAT GGGTTAAAAA TAAGTCTAAA GTTAAGATAA GCACAATGAA 2160
AAGAACGTTT TTATTTTCTA TCAGGCAAAG TAAAACCACT TTTCTAGATG TGGCAATAAG 2220
CAATCAAGCC AAAGGGAGAA AAAAGCTTAT CAAAGCTATG GCTTTCAAGA GAATAGTAAT 2280
TTAGGTACTA CACAAAGCCA CTGTTTATGC TTCTTTGCAA TATCAACAAA GAGACATTGT 2340
GTCTGTTGAA ATGTTTTGTT TGACATGTTT AATCAGATCA AGTGAGGATG CTTTACTCTT 2400
TGGTTTAGTA AAAGAAACAC ACCAGCAACT CCGGTGAATG TTATGATTAT GACGTTTCAA 2460
ACGAAAAATC TCTATTTTCG TTTTAAGGTT AGTCCTTTTA GAATACCGTT TTTTTTTTTA 2520
CCATTTCATT GTCTTGAAAA CCCTCCCAAG CTAATGATTA TTTTCTTTTT TTACGCAAGA 2580
CTCGTATCAC TCACCTACCT TACAGACGTG TTTTGCTTTT TTGGATAATG CTGTGCTTGA 2640
TCTATGTATG GCTCCTTTGC CTTATTTTTA AAAAGAAATG TTTGCTAGGA TTGATTTTTA 2700
ATGGTTACTC TCAATCAAAT ACCACATTTA GTAGAAACAA AATTTGTGCA TATCATAATA 2760
AAACTAAATT CGATTATTTT TTCTAAAATC AGGATAATTT GTTTTTCCAA TATTTGTTTG 2820
TAGAATTGTC TGTCCTACCA AACAATTCAG TTTTCTATTT GCGTCGAGTC ATTTATTTTG 2880
GGTTTCTTTG TTTGAGCTGA TTCTGATACA CATGTGAATT GTCTTTTTAG ACACTATTCT 2940
AGAATTCATT CCATTCGAAA GGATCAACAT ACACCAATTT AATGACGTGC TAGATAATGG 3000
ATACAAATAT ACGCACAAAA AAAGAAAGAA TTCTATGATC AAAGAGAACG CAGACACAGA 3060
GTGATACATT TAAATGGTTA AGTTCATATG ATGTTAAAAT GGTAGCTTTA TTATTGAACT 3120
AAATGCGAAT ATCGTTGCTG TTTTGTCCTT GGAAAACGTT AGGTAAAAGT TGGTTAATGA 3180
AAGAAGCAGG AGTTGTAGTA TCATCTCTTG GGAAGAAATA GAAAAAGAGG AAAGTAACAA 3240
AGTAACAAGC AAGACAATAA TAGATCCAAT GGCTTTCGGT CTTACGAGTT TGTTCAGGAG 3300
CATACTTCTT TTGGCTATCT TGTAACTTTC TTGGTAAGGG ATTCTGGCCA AAGCTTTTAC 3360
AGACTTGGTC GGAAGTAAGC TTACTTCCAG CAAGAACGAT AGGAACACCA GTACCTGGAT 3420
GTGTACTACA AAGAAAAGAG AAATGAGTAC GTGCGTTATT AAAAAAAAGA AAAAAAGAGG 3480
GCAAAAGTAT TACCTAGCTC CGACAAAGAA AAGATTATCA TAACGGTTTG TGGAATCCTT 3540
GGTACTAGGT CTGAACCAGA GAACTTGGAA CACATCATGA GAAAGACCAA GAATAGAACC 3600
TCTCCAAAGG TTAAACTTGC TTTGCCAAAC ACTAGGATCA TTCACTTCTT CATGTTCAAT 3660
CAAATTAGCA AAGTTGTTTA CTCCCAAACG ACGTTCGATA ACTTCCAGAA CCATCTTGCG 3720
TGCACGGTTT ACCAACTCAG GATAATTTTC TTCAGCACTG TTTCCTGTCT TACTCTTCAT 3780
ATGGCCAATT GGAACCAACA CAATAATGGA GTCCTTGTTG GGAGGTGCGG CAGATTCATC 3840
AATTCGAGAT GGAACGTTGA CATAGAATGA AGCTTCAGAG GGCAAACCGA AGTCGTTGAA 3900
AATCTCATCA AAACTTTCCT TGTAGGCTTC AGCCAAGAAG ATATTGTGTA CGTCTAATTG 3960
AGGCACCTTT GTTGACATGG ACCAATAAAA CGAAATAGAT GATGAAGTGA GTTTCTTTGA 4020
GGCTAATGTC TTCTTTGTCC AATTGCAAGG AGGTAACAGA TGGTGATAAG CATAAACAAG 4080
ATCCGCATTA CATACGACTG CATCGGCTTC AATGACTTCT CCGCTTTCCA AAGTGACACC 4140
GGTTACACGC TTGTCTTTAT CGACAGTGTT AATTTTAGCA ACAGGCGATT GATATCTGAA 4200
TTCAGCACCG TACTTTTTGG AGGCGATAGA CTCAAGCTTC TGAACAACCA TGTTGAAACC 4260
ACCACGAGGA TACCAGATAC CTTCAGCAAA CTCGGTGTAT TGTAACAAAC TGTAAACTGC 4320
TGGAGCATCA TAAGGCGACA TACTATATTC CAAAAATAGA AAATAGAACA ATGAATATCA 4380
AAATTCCTTT CACTTGCCCT TTTTCACATT TCTCTTTTCC CACCCCCGAC CGGTCTCACT 4440
CATTTTTTTT TCATCCCACA CCACGCGTTG TATGTGTACT TACCCCATAT ACATTGTTTG 4500
AAAAGTAAAA GCCATACGCA TTTTCTTGGT TTGGAAATAT TTACTGGCTC GGTCATAGAT 4560
CTTACCAAAC AAGTGCAAGC GAAAGATTTC AGGCACATAC TGAAGACGAA TCAAATCCCA 4620
AATGGTTTCA AAGTTGCGCT TGATAGCAAT AAATGTACCT TGTTCATAAT GGACATGTGT 4680
TTCCTTCATG AAATCCAAGA ATCTACCAAA TCCAAGGGGA CCCTCAATAC GGTCCAATTC 4740
GCCCTTCATC TTGGTTAAAT CGGAAGAGAG TTGTACGGCA TCACCGTCGT CAAAATGAAC 4800
CTTATAGTTA TTGTCACAGC GAAGCAAATC CAAATGATCA CCAATACGTT CATCCAAATC 4860
AGCAAATGCA TCTTCAAAAA GCTTAGGCAT CAAATAGAGT GAGGGACCCT GATCAAAGCG 4920
ATGACCATCG TGATGAATGA ATGAACAACG GCCACCGGAA AAGTCGTTCT TTTCAACAAC 4980
AGTAACTCGA AAACCTTCAC GAGCAAGACG AGCAGCAGTA GCAGTTCCGC CAATACCGGC 5040
ACCAATGACA ACAATATGCT TCTTTTGATC AGACATGAGA TTAAAATAGA TAAGGAAAAG 5100
AAAGTGAAAA GAAATTCGGA AGCATGGCAC ATTCTTCTTT TTATAAATAC ATGCCTGACT 5160
TTCTTTTTCC ATCGATATGA TATATGCATA TGATAGATAT ACAAGCAATC TTCTTCAAGG 5220
AGTTTGAAAT TTTGTCCTCC AGGAGCAAAA AAAAGTTTTT TTTTATACAT GTTTGTACAC 5280
AAGAATAGTT ACCAATTTGC TTTGGTCTTA CGTGCTGCAA GTTTATATCG TTTTCAATTT 5340
CTTTGTCTTT ACATTTTCTT TGTCCTTTAT CTTTCCTCAT TTAGTCTTTG GGAGAATTAG 5400
GAAAAGGGAG CGGAAAGGTA AGAAATGCTT GCGTATTTTA CTAATTCGGC AAACATCCAA 5460
TTTGGCAAAC AGCAGCCTGT GCAACGCTCT CGAGATGACA GTATCTTTGA TTACACTCTA 5520
AATCTCGATG ACCCGACCAA AAAGAGCGAA CAAAGAAATA ATCTTGTGCA TTCGAATATG 5580
ATGGAAGATT TTTTCCCCCT TATTCTAAAT GTTGACATAG CGTGTATGTT ATATAAACAA 5640
AAAGAAATTG TACAAACTTT CTTTTCTTCT CTTTTTATTT TATCTCT ATG TCA ATA 5696
Met Ser Ile
1
CTC ACT TAT CTG GAA TTT CAT CTC TAC TAT ACA CTA CCT GTC CTT GCG 5744
Leu Thr Tyr Leu Glu Phe His Leu Tyr Tyr Thr Leu Pro Val Leu Ala
5 10 15
GCA TTG TGT TGG CTG CTA AAG CCG TTT CAC TCA CAG CAA GAC AAT CTC 5792
Ala Leu Cys Trp Leu Leu Lys Pro Phe His Ser Gln Gln Asp Asn Leu
20 25 30 35
AAG TAT AAA TTT TTA ATG TTG ATG GCC GCC TCT ACC GCA TCG ATT TGG 5840
Lys Tyr Lys Phe Leu Met Leu Met Ala Ala Ser Thr Ala Ser Ile Trp
40 45 50
GAC AAT TAT ATC GTT TAT CAT CGC GCT TGG TGG TAC TGT CCT ACT TGT 5888
Asp Asn Tyr Ile Val Tyr His Arg Ala Trp Trp Tyr Cys Pro Thr Cys
55 60 65
GTT GTG GCT GTC ATT GGC TAT GTA CCT CTA GAA GAA TAC ATG TTC TTT 5936
Val Val Ala Val Ile Gly Tyr Val Pro Leu Glu Glu Tyr Met Phe Phe
70 75 80
ATC ATC ATG ACT TTA ATG ACT GTC GCG TTC TCA AAC TTT GTT ATG CGT 5984
Ile Ile Met Thr Leu Met Thr Val Ala Phe Ser Asn Phe Val Met Arg
85 90 95
TGG CAC TTG CAT ACT TTC TTT ATT AGA CCC AAC ACT TCT TGG AAG CAA 6032
Trp His Leu His Thr Phe Phe Ile Arg Pro Asn Thr Ser Trp Lys Gln
100 105 110 115
ACA CTA TTA GTA CGC CTT GTG CCT GTT TCA GCT TTA TTG GCA ATC ACT 6080
Thr Leu Leu Val Arg Leu Val Pro Val Ser Ala Leu Leu Ala Ile Thr
120 125 130
TAT CAT GCT TGG GTATGCAAAA TAAACAAACA CTAAAAAAAA ATAATAGCGA 6132
Tyr His Ala Trp
135
TAATTATTTT ACTCATTTTT CTTTTTTTAG CAC TTG ACA CTG CCA AAT AAA TCT 6186
His Leu Thr Leu Pro Asn Lys Ser
140
TCA TTT TAT GGT TCA TGC ATC CTT TGG TAT GCT TGT CCT GTG TTG GCT 6234
Ser Phe Tyr Gly Ser Cys Ile Leu Trp Tyr Ala Cys Pro Val Leu Ala
145 150 155
ATT CTT TGG CTG GGT GCT GGT GAA TAT ATC TTG CGT CGA CCT GTG GCT 6282
Ile Leu Trp Leu Gly Ala Gly Glu Tyr Ile Leu Arg Arg Pro Val Ala
160 165 170 175
GTC CTT TTG TCT ATT GTT ATC CCT AGT GTA TAC CTA TGT TGG GCT GAT 6330
Val Leu Leu Ser Ile Val Ile Pro Ser Val Tyr Leu Cys Trp Ala Asp
180 185 190
ATC GTC GCT ATT AGT GCT GGC ACA TGG CAT ATT TCT CTT AGA ACA AGC 6378
Ile Val Ala Ile Ser Ala Gly Thr Trp His Ile Ser Leu Arg Thr Ser
195 200 205
ACT GGC AAA ATG GTA GTA CCC GAT TTA CCT GTA GAA GAA TGC CTG TTT 6426
Thr Gly Lys Met Val Val Pro Asp Leu Pro Val Glu Glu Cys Leu Phe
210 215 220
TTT ACT TTG ATC AAC ACA GTC TTG GTT TTT GCT ACC TGT GCT ATA GAC 6474
Phe Thr Leu Ile Asn Thr Val Leu Val Phe Ala Thr Cys Ala Ile Asp
225 230 235
CGC GCT CAG GCC ATC CTC CAT CTG TAC AAA TCA TCT GTT CAA AAT CAA 6522
Arg Ala Gln Ala Ile Leu His Leu Tyr Lys Ser Ser Val Gln Asn Gln
240 245 250 255
AAC CCT AAA CAA GCC ATT TCC CTT TTC CAG CAT GTC AAA GAG CTA GCA 6570
Asn Pro Lys Gln Ala Ile Ser Leu Phe Gln His Val Lys Glu Leu Ala
260 265 270
TGG GCC TTC TGT CTT CCT GAC CAA ATG CTC AAC AAT GAA TTG TTT GAT 6618
Trp Ala Phe Cys Leu Pro Asp Gln Met Leu Asn Asn Glu Leu Phe Asp
275 280 285
GAT CTT ACT ATC AGC TGG GAT ATT TTA CGT AAA GCC TCA AAG TCA TTC 6666
Asp Leu Thr Ile Ser Trp Asp Ile Leu Arg Lys Ala Ser Lys Ser Phe
290 295 300
TAT ACT GCA TCT GCC GTT TTT CCA AGT TAT GTA CGT CAA GAC TTG GGT 6714
Tyr Thr Ala Ser Ala Val Phe Pro Ser Tyr Val Arg Gln Asp Leu Gly
305 310 315
GTT CTC TAT GCT TTC TGC AGA GCT ACC GAT GAC CTG TGC GAT GAT GAA 6762
Val Leu Tyr Ala Phe Cys Arg Ala Thr Asp Asp Leu Cys Asp Asp Glu
320 325 330 335
TCC AAA TCT GTT CAA GAA AGA AGA GAC CAA TTA GAT CTT ACT CGA CAA 6810
Ser Lys Ser Val Gln Glu Arg Arg Asp Gln Leu Asp Leu Thr Arg Gln
340 345 350
TTT GTT CGT GAT CTC TTT AGC CAA AAG ACC AGT GCG CCT ATT GTG ATT 6858
Phe Val Arg Asp Leu Phe Ser Gln Lys Thr Ser Ala Pro Ile Val Ile
355 360 365
GAT TGG GAA TTG TAT CAA AAC CAA CTT CCT GCT TCT TGT ATA TCA GCC 6906
Asp Trp Glu Leu Tyr Gln Asn Gln Leu Pro Ala Ser Cys Ile Ser Ala
370 375 380
TTT AGA GCC TTT ACT CGC CTT CGC CAT GTC CTT GAA GTA GAC CCT GTA 6954
Phe Arg Ala Phe Thr Arg Leu Arg His Val Leu Glu Val Asp Pro Val
385 390 395
GAA GAA CTA TTA GAT GGT TAC AAA TGG GAT CTT GAG CGT CGT CCT ATC 7002
Glu Glu Leu Leu Asp Gly Tyr Lys Trp Asp Leu Glu Arg Arg Pro Ile
400 405 410 415
CTT GAT GAA CAA GAC TTG GAG GCA TAC TCT GCT TGT GTG GCC AGT AGT 7050
Leu Asp Glu Gln Asp Leu Glu Ala Tyr Ser Ala Cys Val Ala Ser Ser
420 425 430
GTG GGT GAA ATG TGC ACA CGT GTG ATT CTT GCT CAA GAC CAA AAG GAA 7098
Val Gly Glu Met Cys Thr Arg Val Ile Leu Ala Gln Asp Gln Lys Glu
435 440 445
AAT GAT GCT TGG ATA ATT GAC CGT GCA CGT GAG ATG GGG CTG GTG CTA 7146
Asn Asp Ala Trp Ile Ile Asp Arg Ala Arg Glu Met Gly Leu Val Leu
450 455 460
CAA TAC GTT AAC ATT GCT CGA GAC ATT GTG ACT GAT AGC GAG ACT CTG 7194
Gln Tyr Val Asn Ile Ala Arg Asp Ile Val Thr Asp Ser Glu Thr Leu
465 470 475
GGT CGA TGT TAT CTG CCT CAA CAA TGG CTT AGA AAA GAA GAA ACA GAA 7242
Gly Arg Cys Tyr Leu Pro Gln Gln Trp Leu Arg Lys Glu Glu Thr Glu
480 485 490 495
CAA ATA CAG CAA GGC AAC GCC CGT AGC CTA GGT GAT CAA AGA CTG TTG 7290
Gln Ile Gln Gln Gly Asn Ala Arg Ser Leu Gly Asp Gln Arg Leu Leu
500 505 510
GGC TTG TCT CTG AAG CTT GTA GGA AAG GCA GAC GCT ATC ATG GTG AGA 7338
Gly Leu Ser Leu Lys Leu Val Gly Lys Ala Asp Ala Ile Met Val Arg
515 520 525
GCT AAG AAG GGC ATT GAC AAG TTG CCG GCA AAC TGT CAA GGC GGT GTA 7386
Ala Lys Lys Gly Ile Asp Lys Leu Pro Ala Asn Cys Gln Gly Gly Val
530 535 540
CGA GCT GCT TGC CAA GTA TAT GCT GCA ATT GGA TCT GTA CTC AAG CAG 7434
Arg Ala Ala Cys Gln Val Tyr Ala Ala Ile Gly Ser Val Leu Lys Gln
545 550 555
CAG AAG ACA ACA TAT CCT ACA AGA GCT CAT CTA AAA GGA AGC GAA CGT 7482
Gln Lys Thr Thr Tyr Pro Thr Arg Ala His Leu Lys Gly Ser Glu Arg
560 565 570 575
GCC AAG ATT GCT CTG TTG AGT GTA TAC AAC CTC TAT CAA TCT GAA GAC 7530
Ala Lys Ile Ala Leu Leu Ser Val Tyr Asn Leu Tyr Gln Ser Glu Asp
580 585 590
AAG CCT GTG GCT CTC CGT CAA GCT AGA AAG ATT AAG AGT TTT TTT GTT 7578
Lys Pro Val Ala Leu Arg Gln Ala Arg Lys Ile Lys Ser Phe Phe Val
595 600 605
GAT TAG TGAATTTTTG TTTTATTTAT GTCTGATAGT TCAATAAAGA GACAACACAT 7634
Asp
ACAATATAAA ATCATTGTCT TTAAATGTTA ATTTAGTAGA GTGTAAAGCC TGCATTTTTT 7694
TTGTACGCAT AAACAATGAG TTCACCCCGC TTCTGGTTTT TAAATAATTA TGTCAAACTA 7754
GGGAAAATTC TTTTTTTTCT CTTCGTTCTT TTTTTGGCTT GTTGTGGAGT CACAGGCTTG 7814
TCTTCAGATT GATAGAGGTT GTATACACTC AACAGAGCAA TCTTGGCACG TTCGCTTCCT 7874
TTTAGATGAG CTCTTGTAGG ATATGTTGTC TTCTGCTGCT TGAGTACAGA TCCAATTGCA 7934
GCATATACTT GGCAAGCAGC TCGTACACCG CCTTGACAGT TTGCCGGCAA CTTGTCAATG 7994
CCCTTCTTAG CTCTCACCAT GATAGCGTCT GCCTTTCCTA CAAGCTTCAG AGACAAGCCC 8054
AACAGTCTTT GATCACCTAG GCTACGGGCG TTGCCTTGCT GTATTTGTTC TGTTTCTTCT 8114
TTTCTAAGCC ATTGTTGAGG CAGATAACAT CGACCCAACA TCCTCGAGCC ATACTACAGC 8174
ATAAAAGGAT ACGTTTTCTT TAACAGAAAT TTACCCTTTT GTTATCAGCA CATACAAAAA 8234
AAAAGAAATT TAAGATGAGT AGGACTTCCA TTCTCTCAAA AATTTTATTC AATCCATAAA 8294
TGAATTATTT TTGGACAAAA AAGAAAGATT ATGCCTGATT TTCTCTATTT TTTTTTTTTT 8354
TACAACTCCA CCAATACTTT CTAGAGACAC ATTTGAGCGA TGTGACAGTC GGACTCGAGA 8414
AGTACAAGAA GGTACAGAAA TAGCAGCTGA GCGTATGATT GGGTCCCAAG GTTCTTCTCG 8474
ACTTTCTCTC TTACCCTGGA ACAGAGAGAA AAAAAAAATA TTTCGTCTTT TTTGGATAAT 8534
ATTATAAAAA AGGGAATTTA GTAAAGAAAA CGGTTGTTTC CTTTTCTTTT TTTTTTTCCT 8594
TCTCCACTAC ATGAATAAAC ATCGCCACCC AAATTTACCT TCCATATCTA CTCTACTTAC 8654
TGGACCACCT TCTCCGCCAC CTCCCATTAT TGTGATAGAT GAGGAACATT CTCCCAGTTG 8714
TTCACCTAAC AAATACCACT TGTCTCCTGT CTTATCACCT ATTGATTCAT ATGCTTCATC 8774
ACCAAACTCG AG 8786
<210>SEQ ID NO:2
<211>8786
<212>DNA
<213>Blakeslea trispora
<220>CDS
<221>carB
<222>3711...5665
<220>exon
<222>3711...4303
<220>exon
<222>4445...5293
<220>exon
<222>5362...5665
<223>Immediate source: plasmids pALBT9 (CECT 5982) and pALBT52 (CECT 5981).
<223>β-Carotene biosynthesis genes.
<400>
CTCGAGTTTG GTGATGAAGC ATATGAATCA ATAGGTGATA AGACAGGAGA CAAGTGGTAT 60
TTGTTAGGTG AACAACTGGG AGAATGTTCC TCATCTATCA CAATAATGGG AGGTGGCGGA 120
GAAGGTGGTC CAGTAAGTAG AGTAGATATG GAAGGTAAAT TTGGGTGGCG ATGTTTATTC 180
ATGTAGTGGA GAAGGAAAAA AAAAAAGAAA AGGAAACAAC CGTTTTCTTT ACTAAATTCC 240
CTTTTTTATA ATATTATCCA AAAAAGACGA AATATTTTTT TTTTCTCTCT GTTCCAGGGT 300
AAGAGAGAAA GTCGAGAAGA ACCTTGGGAC CCAATCATAC GCTCAGCTGC TATTTCTGTA 360
CCTTCTTGTA CTTCTCGAGT CCGACTGTCA CATCGCTCAA ATGTGTCTCT AGAAAGTATT 420
GGTGGAGTTG TAAAAAAAAA AAAAATAGAG AAAATCAGGC ATAATCTTTC TTTTTTGTCC 480
AAAAATAATT CATTTATGGA TTGAATAAAA TTTTTGAGAG AATGGAAGTC CTACTCATCT 540
TAAATTTCTT TTTTTTTGTA TGTGCTGATA ACAAAAGGGT AAATTTCTGT TAAAGAAAAC 600
GTATCCTTTT ATGCTGTAGT ATGGCTCGAG GATGTTGGGT CGATGTTATC TGCCTCAACA 660
ATGGCTTAGA AAAGAAGAAA CAGAACAAAT ACAGCAAGGC AACGCCCGTA GCCTAGGTGA 720
TCAAAGACTG TTGGGCTTGT CTCTGAAGCT TGTAGGAAAG GCAGACGCTA TCATGGTGAG 780
AGCTAAGAAG GGCATTGACA AGTTGCCGGC AAACTGTCAA GGCGGTGTAC GAGCTGCTTG 840
CCAAGTATAT GCTGCAATTG GATCTGTACT CAAGCAGCAG AAGACAACAT ATCCTACAAG 900
AGCTCATCTA AAAGGAAGCG AACGTGCCAA GATTGCTCTG TTGAGTGTAT ACAACCTCTA 960
TCAATCTGAA GACAAGCCTG TGACTCCACA ACAAGCCAAA AAAAGAACGA AGAGAAAAAA 1020
AAGAATTTTC CCTAGTTTGA CATAATTATT TAAAAACCAG AAGCGGGGTG AACTCATTGT 1080
TTATGCGTAC AAAAAAAATG CAGGCTTTAC ACTCTACTAA ATTAACATTT AAAGACAATG 1140
ATTTTATATT GTATGTGTTG TCTCTTTATT GAACTATCAG ACATAAATAA AACAAAAATT 1200
CACTAATCAA CAAAAAAACT CTTAATCTTT CTAGCTTGAC GGAGAGCCAC AGGCTTGTCT 1260
TCAGATTGAT AGAGGTTGTA TACACTCAAC AGAGCAATCT TGGCACGTTC GCTTCCTTTT 1320
AGATGAGCTC TTGTAGGATA TGTTGTCTTC TGCTGCTTGA GTACAGATCC AATTGCAGCA 1380
TATACTTGGC AAGCAGCTCG TACACCGCCT TGACAGTTTG CCGGCAACTT GTCAATGCCC 1440
TTCTTAGCTC TCACCATGAT AGCGTCTGCC TTTCCTACAA GCTTCAGAGA CAAGCCCAAC 1500
AGTCTTTGAT CACCTAGGCT ACGGGCGTTG CCTTGCTGTA TTTGTTCTGT TTCTTCTTTT 1560
CTAAGCCATT GTTGAGGCAG ATAACATCGA CCCAGAGTCT CGCTATCAGT CACAATGTCT 1620
CGAGCAATGT TAACGTATTG TAGCACCAGC CCCATCTCAC GTGCACGGTC AATTATCCAA 1680
GCATCATTTT CCTTTTGGTC TTGAGCAAGA ATCACACGTG TGCACATTTC ACCCACACTA 1740
CTGGCCACAC AAGCAGAGTA TGCCTCCAAG TCTTGTTCAT CAAGGATAGG ACGACGCTCA 1800
AGATCCCATT TGTAACCATC TAATAGTTCT TCTACAGGGT CTACTTCAAG GACATGGCGA 1860
AGGCGAGTAA AGGCTCTAAA GGCTGATATA CAAGAAGCAG GAAGTTGGTT TTGATACAAT 1920
TCCCAATCAA TCACAATAGG CGCACTGGTC TTTTGGCTAA AGAGATCACG AACAAATTGT 1980
CGAGTAAGAT CTAATTGGTC TCTTCTTTCT TGAACAGATT TGGATTCATC ATCGCACAGG 2040
TCATCGGTAG CTCTGCAGAA AGCATAGAGA ACACCCAAGT CTTGACGTAC ATAACTTGGA 2100
AAAACGGCAG ATGCAGTATA GAATGACTTT GAGGCTTTAC GTAAAATATC CCAGCTGATA 2160
GTAAGATCAT CAAACAATTC ATTGTTGAGC ATTTGGTCAG GAAGACAGAA GGCCCATGCT 2220
AGCTCTTTGA CATGCTGGAA AAGGGAAATG GCTTGTTTAG GGTTTTGATT TTGAACAGAT 2280
GATTTGTACA GATGGAGGAT GGCCTGAGCG CGGTCTATAG CACAGGTAGC AAAAACCAAG 2340
ACTGTGTTGA TCAAAGTAAA AAACAGGCAT TCTTCTACAG GTAAATCGGG TACTACCATT 2400
TTGCCAGTGC TTGTTCTAAG AGAAATATGC CATGTGCCAG CACTAATAGC GACGATATCA 2460
GCCCAACATA GGTATACACT AGGGATAACA ATAGACAAAA GGACAGCCAC AGGTCGACGC 2520
AAGATATATT CACCAGCACC CAGCCAAAGA ATAGCCAACA CAGGACAAGC ATACCAAAGG 2580
ATGCATGAAC CATAAAATGA AGATTTATTT GGCAGTGTCA AGTGCTAAAA AAAGAAAAAT 2640
GAGTAAAATA ATTATCGCTA TTATTTTTTT TTAGTGTTTG TTTATTTTGC ATACCCAAGC 2700
ATGATAAGTG ATTGCCAATA AAGCTGAAAC AGGCACAAGG CGTACTAATA GTGTTTGCTT 2760
CCAAGAAGTG TTGGGTCTAA TAAAGAAAGT ATGCAAGTGC CAACGCATAA CAAAGTTTGA 2820
GAACGCGACA GTCATTAAAG TCATGATGAT AAAGAACATG TATTCTTCTA GAGGTACATA 2880
GCCAATGACA GCCACAACAC AAGTAGGACA GTACCACCAA GCGCGATGAT AAACGATATA 2940
ATTGTCCCAA ATCGATGCGG TAGAGGCGGC CATCAACATT AAAAATTTAT ACTTGAGATT 3000
GTCTTGCTGT GAGTGAAACG GCTTTAGCAG CCAACACAAT GCCGCAAGGA CAGGTAGTGT 3060
ATAGTAGAGA TGAAATTCCA GATAAGTGAG TATTGACATA GAGATAAAAT AAAAAGAGAA 3120
GAAAAGAAAG TTTGTACAAT TTCTTTTTGT TTATATAACA TACACGCTAT GTCAACATTT 3180
AGAATAAGGG GGAAAAAATC TTCCATCATA TTCGAATGCA CAAGATTATT TCTTTGTTCG 3240
CTCTTTTTGG TCGGGTCATC GAGATTTAGA GTGTAATCAA AGATACTGTC ATCTCGAGAG 3300
CGTTGCACAG GCTGCTGTTT GCCAAATTGG ATGTTTGCCG AATTAGTAAA ATACGCAAGC 3360
ATTTCTTACC TTTCCGCTCC CTTTTCCTAA TTCTCCCAAA GACTAAATGA GGAAAGATAA 3420
AGGACAAAGA AAATGTAAAG ACAAAGAAAT TGAAAACGAT ATAAACTTGC AGCACGTAAG 3480
ACCAAAGCAA ATTGGTAACT ATTCTTGTGT ACAAACATGT ATAAAAAAAA ACTTTTTTTT 3540
GCTCCTGGAG GACAAAATTT CAAACTCCTT GAAGAAGATT GCTTGTATAT CTATCATATG 3600
CATATATCAT ATCGATGGAA AAAGAAAGTC AGGCATGTAT TTATAAAAAG AAGAATGTGC 3660
CATGCTTCCG AATTTCTTTT CACTTTCTTT TCCTTATCTA TTTTAATCTC ATG TCT 3716
Met Ser
1
GAT CAA AAG AAG CAT ATT GTT GTC ATT GGT GCC GGT ATT GGC GGA ACT 3764
Asp Gln Lys Lys His Ile Val Val Ile Gly Ala Gly Ile Gly Gly Thr
5 10 15
GCT ACT GCT GCT CGT CTT GCT CGT GAA GGT TTT CGA GTT ACT GTT GTT 3812
Ala Thr Ala Ala Arg Leu Ala Arg Glu Gly Phe Arg Val Thr Val Val
20 25 30
GAA AAG AAC GAC TTT TCC GGT GGC CGT TGT TCA TTC ATT CAT CAC GAT 3860
Glu Lys Asn Asp Phe Ser Gly Gly Arg Cys Ser Phe Ile His His Asp
35 40 45 50
GGT CAT CGC TTT GAT CAG GGT CCC TCA CTC TAT TTG ATG CCT AAG CTT 3908
Gly His Arg Phe Asp Gln Gly Pro Ser Leu Tyr Leu Met Pro Lys Leu
55 60 65
TTT GAA GAT GCA TTT GCT GAT TTG GAT GAA CGT ATT GGT GAT CAT TTG 3956
Phe Glu Asp Ala Phe Ala Asp Leu Asp Glu Arg Ile Gly Asp His Leu
70 75 80
GAT TTG CTT CGC TGT GAC AAT AAC TAT AAG GTT CAT TTT GAC GAC GGT 4004
Asp Leu Leu Arg Cys Asp Asn Asn Tyr Lys Val His Phe Asp Asp Gly
85 90 95
GAT GCC GTA CAA CTC TCT TCC GAT TTA ACC AAG ATG AAG GGC GAA TTG 4052
Asp Ala Val Gln Leu Ser Ser Asp Leu Thr Lys Met Lys Gly Glu Leu
100 105 110
GAC CGT ATT GAG GGT CCC CTT GGA TTT GGT AGA TTC TTG GAT TTC ATG 4100
Asp Arg Ile Glu Gly Pro Leu Gly Phe Gly Arg Phe Leu Asp Phe Met
115 120 125 130
AAG GAA ACA CAT GTC CAT TAT GAA CAA GGT ACA TTT ATT GCT ATC AAG 4148
Lys Glu Thr His Val His Tyr Glu Gln Gly Thr Phe Ile Ala Ile Lys
135 140 145
CGC AAC TTT GAA ACC ATT TGG GAT TTG ATT CGT CTT CAG TAT GTG CCT 4196
Arg Asn Phe Glu Thr Ile Trp Asp Leu Ile Arg Leu Gln Tyr Val Pro
150 155 160
GAA ATC TTT CGC TTG CAC TTG TTT GGT AAG ATC TAT GAC CGA GCC AGT 4244
Glu Ile Phe Arg Leu His Leu Phe Gly Lys Ile Tyr Asp Arg Ala Ser
165 170 175
AAA TAT TTC CAA ACC AAG AAA ATG CGT ATG GCT TTT ACT TTT CAA ACA 4292
Lys Tyr Phe Gln Thr Lys Lys Met Arg Met Ala Phe Thr Phe Gln Thr
180 185 190
ATG TAT ATG GG GTAAGTACAC ATACAACGCG TGGTGTGGGA TGAAAAAAAA 4343
Met Tyr Met Gly
195
ATGAGTGAGA CCGGTCGGGG GTGGGAAAAG AGAAATGTGA AAAAGGGCAA GTGAAAGGAA 4403
TTTTGATATT CATTGTTCTA TTTTCTATTT TTGGAATATA G T ATG TCG CCT TAT 4457
Met Ser Pro Tyr
200
GAT GCT CCA GCA GTT TAC AGT TTG TTA CAA TAC ACC GAG TTT GCT GAA 4505
Asp Ala Pro Ala Val Tyr Ser Leu Leu Gln Tyr Thr Glu Phe Ala Glu
205 210 215
GGT ATC TGG TAT CCT CGT GGT GGT TTC AAC ATG GTT GTT CAG AAG CTT 4553
Gly Ile Trp Tyr Pro Arg Gly Gly Phe Asn Met Val Val Gln Lys Leu
220 225 230
GAG TCT ATC GCC TCC AAA AAG TAC GGT GCT GAA TTC AGA TAT CAA TCG 4601
Glu Ser Ile Ala Ser Lys Lys Tyr Gly Ala Glu Phe Arg Tyr Gln Ser
235 240 245 250
CCT GTT GCT AAA ATT AAC ACT GTC GAT AAA GAC AAG CGT GTA ACC GGT 4649
Pro Val Ala Lys Ile Asn Thr Val Asp Lys Asp Lys Arg Val Thr Gly
255 260 265
GTC ACT TTG GAA AGC GGA GAA GTC ATT GAA GCC GAT GCA GTC GTA TGT 4697
Val Thr Leu Glu Ser Gly Glu Val Ile Glu Ala Asp Ala Val Val Cys
270 275 280
AAT GCG GAT CTT GTT TAT GCT TAT CAC CAT CTG TTA CCT CCT TGC AAT 4745
Asn Ala Asp Leu Val Tyr Ala Tyr His His Leu Leu Pro Pro Cys Asn
285 290 295
TGG ACA AAG AAG ACA TTA GCC TCA AAG AAA CTC ACT TCA TCA TCT ATT 4793
Trp Thr Lys Lys Thr Leu Ala Ser Lys Lys Leu Thr Ser Ser Ser Ile
300 305 310
TCG TTT TAT TGG TCC ATG TCA ACA AAG GTG CCT CAA TTA GAC GTA CAC 4841
Ser Phe Tyr Trp Ser Met Ser Thr Lys Val Pro Gln Leu Asp Val His
315 320 325 330
AAT ATC TTC TTG GCT GAA GCC TAC AAG GAA AGT TTT GAT GAG ATT TTC 4889
Asn Ile Phe Leu Ala Glu Ala Tyr Lys Glu Ser Phe Asp Glu Ile Phe
335 340 345
AAC GAC TTC GGT TTG CCC TCT GAA GCT TCA TTC TAT GTC AAC GTT CCA 4937
Asn Asp Phe Gly Leu Pro Ser Glu Ala Ser Phe Tyr Val Asn Val Pro
350 355 360
TCT CGA ATT GAT GAA TCT GCC GCA CCT CCC AAC AAG GAC TCC ATT ATT 4985
Ser Arg Ile Asp Glu Ser Ala Ala Pro Pro Asn Lys Asp Ser Ile Ile
365 370 375
GTG TTG GTT CCA ATT GGC CAT ATG AAG AGT AAG ACA GGA AAC AGT GCT 5033
Val Leu Val Pro Ile Gly His Met Lys Ser Lys Thr Gly Asn Ser Ala
380 385 390
GAA GAA AAT TAT CCT GAG TTG GTA AAC CGT GCA CGC AAG ATG GTT CTG 5081
Glu Glu Asn Tyr Pro Glu Leu Val Asn Arg Ala Arg Lys Met Val Leu
395 400 405 410
GAA GTT ATC GAA CGT CGT TTG GGA GTA AAC AAC TTT GCT AAT TTG ATT 5129
Glu Val Ile Glu Arg Arg Leu Gly Val Asn Asn Phe Ala Asn Leu Ile
415 420 425
GAA CAT GAA GAA GTG AAT GAT CCT AGT GTT TGG CAA AGC AAG TTT AAC 5177
Glu His Glu Glu Val Asn Asp Pro Ser Val Trp Gln Ser Lys Phe Asn
430 435 440
CTT TGG AGA GGT TCT ATT CTT GGT CTT TCT CAT GAT GTG TTC CAA GTT 5225
Leu Trp Arg Gly Ser Ile Leu Gly Leu Ser His Asp Val Phe Gln Val
445 450 455
CTC TGG TTC AGA CCT AGT ACC AAG GAT TCC ACA AAC CGT TAT GAT AAT 5273
Leu Trp Phe Arg Pro Ser Thr Lys Asp Ser Thr Asn Arg Tyr Asp Asn
460 465 470
CTT TTC TTT GTC GGA GCT AG GTAATACTTT TGCCCTCTTT TTTTCTTTTT 5323
Leu Phe Phe Val Gly Ala Ser
475 480
TTTAATAACG CACGTACTCA TTTCTCTTTT CTTTGTAG T ACA CAT CCA GGT ACT 5377
Thr His Pro Gly Thr
485
GGT GTT CCT ATC GTT CTT GCT GGA AGT AAG CTT ACT TCC GAC CAA GTC 5425
Gly Val Pro Ile Val Leu Ala Gly Ser Lys Leu Thr Ser Asp Gln Val
490 495 500
TGT AAA AGC TTT GGC CAG AAT CCC TTA CCA AGA AAG TTA CAA GAT AGC 5473
Cys Lys Ser Phe Gly Gln Asn Pro Leu Pro Arg Lys Leu Gln Asp Ser
505 510 515
CAA AAG AAG TAT GCT CCT GAA CAA ACT CGT AAG ACC GAA AGC CAT TGG 5521
Gln Lys Lys Tyr Ala Pro Glu Gln Thr Arg Lys Thr Glu Ser His Trp
520 525 530
ATC TAT TAT TGT CTT GCT TGT TAC TTT GTT ACT TTC CTC TTT TTC TAT 5569
Ile Tyr Tyr Cys Leu Ala Cys Tyr Phe Val Thr Phe Leu Phe Phe Tyr
535 540 545 550
TTC TTC CCA AGA GAT GAT ACT ACA ACT CCT GCT TCT TTC ATT AAC CAA 5617
Phe Phe Pro Arg Asp Asp Thr Thr Thr Pro Ala Ser Phe Ile Asn Gln
555 560 565
CTT TTA CCT AAC GTT TTC CAA GGA CAA AAC AGC AAC GAT ATT CGC ATT 5665
Leu Leu Pro Asn Val Phe Gln Gly Gln Asn Ser Asn Asp Ile Arg Ile
570 575 580
TAGT TCAATAATAA AGCTACCATT TTAACATCAT ATGAACTTAA CCATTTAAAT 5719
GTATCACTCT GTGTCTGCGT TCTCTTTGAT CATAGAATTC TTTCTTTTTT TGTGCGTATA 5779
TTTGTATCCA TTATCTAGCA CGTCATTAAA TTGGTGTATG TTGATCCTTT CGAATGGAAT 5839
GAATTCTAGA ATAGTGTCTA AAAAGACAAT TCACATGTGT ATCAGAATCA GCTCAAACAA 5899
AGAAACCCAA AATAAATGAC TCGACGCAAA TAGAAAACTG AATTGTTTGG TAGGACAGAC 5959
AATTCTACAA ACAAATATTG GAAAAACAAA TTATCCTGAT TTTAGAAAAA ATAATCGAAT 6019
TTAGTTTTAT TATGATATGC ACAAATTTTG TTTCTACTAA ATGTGGTATT TGATTGAGAG 6079
TAACCATTAA AAATCAATCC TAGCAAACAT TTCTTTTTAA AAATAAGGCA AAGGAGCCAT 6139
ACATAGATCA AGCACAGCAT TATCCAAAAA AGCAAAACAC GTCTGTAAGG TAGGTGAGTG 6199
ATACGAGTCT TGCGTAAAAA AAGAAAATAA TCATTAGCTT GGGAGGGTTT TCAAGACAAT 6259
GAAATGGTAA AAAAAAAAAC GGTATTCTAA AAGGACTAAC CTTAAAACGA AAATAGAGAT 6319
TTTTCGTTTG AAACGTCATA ATCATAACAT TCACCGGAGT TGCTGGTGTG TTTCTTTTAC 6379
TAAACCAAAG AGTAAAGCAT CCTCACTTGA TCTGATTAAA CATGTCAAAC AAAACATTTC 6439
AACAGACACA ATGTCTCTTT GTTGATATTG CAAAGAAGCA TAAACAGTGG CTTTGTGTAG 6499
TACCTAAATT ACTATTCTCT TGAAAGCCAT AGCTTTGATA AGCTTTTTTC TCCCTTTGGC 6559
TTGATTGCTT ATTGCCACAT CTAGAAAAGT GGTTTTACTT TGCCTGATAG AAAATAAAAA 6619
CGTTCTTTTC ATTGTGCTTA TCTTAACTTT AGACTTATTT TTAACCCATC TCAATTGCAA 6679
CAAGCGATCT ACAAGACCTT TTTGGTAAAG GTTCAGTCTT CAAATAACAT GATTACTCAG 6739
TGTTTATAAC ACCAAATTAA TGCGGATGCT TCCTCAAAAA TACATTCAAG TGTAACATAC 6799
TCCTCTTCCT CTCCCTTTTT ATTCGGGCAT GCAGTAAAAG AATCTCGGTG GATTTGCCCT 6859
TTGTCAAGCA ATATAGCTTT TCTTTGTAAG ATTGTTTCAA GAAAAACAGA GATTTAGCTC 6919
ATGATAGCTA GCAAATTCTT TTTCTTTTCT TTGAGCTTTG AATTAAGACT GCACCACATC 6979
GTTAAAGCAA AATTGTTATC ATTCTAGACT GCAATATATT TTCATTGTAC TTTAACTATT 7039
AGTTAAAGTT TTGAGCAACA GTATTGTTGT GTGTATGTGT CACATTCTAC GGTGTTTGTC 7099
AGTTTTTTTT AAGGAAAGTC GTCTACGGCG ATAAGCATCA GAGTTGCACA TGGATCGCTT 7159
TCTCTGTTAT AGCAAGTACT ATTTAGCGAT ATATGTCTTC TGTTTGATAT GATGATATTC 7219
AAGCTATCAG CTTTAACGAA AAAGAAATTG TGTGAGTAAA ATTGTCTATC TTCAAAGCAA 7279
GGAAGATGGA AAACTACAAA AAAAAAAGAA GTAGAGTATA GTAAAACTAA GCTATTCTGT 7339
CGCTTACATC TTTCTTGTCC TTCCTACACG TGTTGTTGAA AACAAAGAAA TAGTCAGAGG 7399
ACTAGAAGCA AGATGATAGT AATGGTATTT TTAACCTAGT CTTTGAAATC TTTCAAAAAA 7459
GATGGCCATG ATGTTTATTA AGGAAAGTTT TGTAGGCTTT TCCAATACGT TTTTCTTCAC 7519
ATTTAGCCAT CTATTGAGTC ATTCCATTGG CAATAAGCCT GTACTACATT TGTCAGAGTG 7579
CAAGAGAGGG AGTTATCTGA GGCTTTATAC CGCCCTTTTT TTTGTCATGG CTCTGTCCAT 7639
GCTTTAGAAA GCCATATACC AAACCTCAAA GACTATCACT TTATTTCAAT TGTACTTAAA 7699
CAGATCACTT CCCTGAAGCT CATATTCACA AAAACAAAAG TGTATCAGGC AAAATCAATA 7759
CAATCTCAAG ATACTATCCT AAAGATGAGA TTCGAAAATC AAAACGCAGA TAGAAAACAT 7819
ACAAACACAA TGCTACCGTG CCATTCACTT ATTACTGATG GGTTGAGGAA TAGCTACTCT 7879
TGCATTATGC TGAATAGCCA AAAAAAGAAA GAGGCAAGTG GAAAGAGAAA AAGACATCTA 7939
TTTTGGAATG GAAGCAACAC AATTCATGAT GCTAAAGCAT GTAAATCAAT GAGCAGAGTA 7999
GCCCTTTATA AATCATCTGC TTGCTTAATC TGTCTCCTAG TTTGATGTTG CTTGTCATTG 8059
AATGTTTGTA TGACATTGCT CAATAATGAT CGAACAGAAG GAGCAAATGC ACAAAGCGAG 8119
TCCAAGTCTC AATCCACCCC ATTTTTGGTT TCTCAAAGGA TTAATAAGAG TCAGATAAAG 8179
CATTCTCAGA CAATAAAGAC AAGGAGATGG TTGAGTTCCT GAAAAGAAGA GACAGGATAA 8239
AATAGCCATA CAATTCTGAC TTTTGTATTA TGTAAGCGGT CTTTATTTTT TCAGAAAAAA 8299
AATCATGCTT TAAAATTAAC ATTGTCAAGA GAAAACAGCT ACAAATCGTT CAAGTTAAAA 8359
AAGATAGATG TCATATTCAA TACAGCAATT ATAAAAGTGC CTAAATGTAA AGCTTAAAAG 8419
AAGCTCAAGC AAAAAAAAAA GACTGTTGAT CGCCTAAAAG TAAGTCTCTC TTTTTTTTTA 8479
TTTTCTTTCT TTTAGAAGAA ATTTCGGGCT TTTTGGGTTT CTGCAAATAA CTTTTTATTT 8539
TTTATTTTTT ATTTTATTCA TTTGGCTTTC AGTGTACAAG GTCCCTTTTT ATTTTATTTT 8599
GTTATATCAA GGCATGAACA GTCCAAAAAG TATATTGAAG TTTAGGCCAG AAATAAATAC 8659
CACTGTATCA AAGCCTGCTG TTGTATCACC CACAAGGCCT ACAAGTGGCT CCAACTGGCT 8719
CAGTCGTATA CAATCCAAGC TTTATTCTCA ACAATTTATT AACAACAGTG ATGATACAAG 8779
TGAGCCC 8786
<210>SEQ ID NO:3
<211>608
<212>Polypeptide
<213>Deduced
<223>Pm 69581 Da
<223>Sequence type: deduced amino acid sequence encoded by the carRP gene of Blakeslea trispora.
<400>
Met Ser Ile Leu Thr Tyr Leu Glu Phe His Leu Tyr Tyr Thr
1 5 10
Leu Pro Val Leu Ala Ala Leu Cys Trp Leu Leu Lys Pro Phe His Ser
15 20 25 30
Gln Gln Asp Asn Leu Lys Tyr Lys Phe Leu Met Leu Met Ala Ala Ser
35 40 45
Thr Ala Ser Ile Trp Asp Asn Tyr Ile Val Tyr His Arg Ala Trp Trp
50 55 60
Tyr Cys Pro Thr Cys Val Val Ala Val Ile Gly Tyr Val Pro Leu Glu
65 70 75
Glu Tyr Met Phe Phe Ile Ile Met Thr Leu Met Thr Val Ala Phe Ser
80 85 90
Asn Phe Val Met Arg Trp His Leu His Thr Phe Phe Ile Arg Pro Asn
95 100 105 110
Thr Ser Trp Lys Gln Thr Leu Leu Val Arg Leu Val Pro Val Ser Ala
115 120 125
Leu Leu Ala Ile Thr Tyr His Ala Trp His Leu Thr Leu Pro Asn Lys
130 135 140
Ser Ser Phe Tyr Gly Ser Cys Ile Leu Trp Tyr Ala Cys Pro Val Leu
145 150 155
Ala Ile Leu Trp Leu Gly Ala Gly Glu Tyr Ile Leu Arg Arg Pro Val
160 165 170
Ala Val Leu Leu Ser Ile Val Ile Pro Ser Val Tyr Leu Cys Trp Ala
175 180 185 190
Asp Ile Val Ala Ile Ser Ala Gly Thr Trp His Ile Ser Leu Arg Thr
195 200 205
Ser Thr Gly Lys Met Val Val Pro Asp Leu Pro Val Glu Glu Cys Leu
210 215 220
Phe Phe Thr Leu Ile Asn Thr Val Leu Val Phe Ala Thr Cys Ala Ile
225 230 235
Asp Arg Ala Gln Ala Ile Leu His Leu Tyr Lys Ser Ser Val Gln Asn
240 245 250
Gln Asn Pro Lys Gln Ala Ile Ser Leu Phe Gln His Val Lys Glu Leu
255 260 265 270
Ala Trp Ala Phe Cys Leu Pro Asp Gln Met Leu Asn Asn Glu Leu Phe
275 280 285
Asp Asp Leu Thr Ile Ser Trp Asp Ile Leu Arg Lys Ala Ser Lys Ser
290 295 300
Phe Tyr Thr Ala Ser Ala Val Phe Pro Ser Tyr Val Arg Gln Asp Leu
305 310 315
Gly Val Leu Tyr Ala Phe Cys Arg Ala Thr Asp Asp Leu Cys Asp Asp
320 325 330
Glu Ser Lys Ser Val Gln Glu Arg Arg Asp Gln Leu Asp Leu Thr Arg
335 340 345 350
Gln Phe Val Arg Asp Leu Phe Ser Gln Lys Thr Ser Ala Pro Ile Val
355 360 365
Ile Asp Trp Glu Leu Tyr Gln Asn Gln Leu Pro Ala Ser Cys Ile Ser
370 375 380
Ala Phe Arg Ala Phe Thr Arg Leu Arg His Val Leu Glu Val Asp Pro
385 390 395
Val Glu Glu Leu Leu Asp Gly Tyr Lys Trp Asp Leu Glu Arg Arg Pro
400 405 410
Ile Leu Asp Glu Gln Asp Leu Glu Ala Tyr Ser Ala Cys Val Ala Ser
415 420 425 430
Ser Val Gly Glu Met Cys Thr Arg Val Ile Leu Ala Gln Asp Gln Lys
435 440 445
Glu Asn Asp Ala Trp Ile Ile Asp Arg Ala Arg Glu Met Gly Leu Val
450 455 460
Leu Gln Tyr Val Asn Ile Ala Arg Asp Ile Val Thr Asp Ser Glu Thr
465 470 475
Leu Gly Arg Cys Tyr Leu Pro Gln Gln Trp Leu Arg Lys Glu Glu Thr
480 485 490
Glu Gln Ile Gln Gln Gly Asn Ala Arg Ser Leu Gly Asp Gln Arg Leu
495 500 505 510
Leu Gly Leu Ser Leu Lys Leu Val Gly Lys Ala Asp Ala Ile Met Val
515 520 525
Arg Ala Lys Lys Gly Ile Asp Lys Leu Pro Ala Asn Cys Gln Gly Gly
530 535 540
Val Arg Ala Ala Cys Gln Val Tyr Ala Ala Ile Gly Ser Val Leu Lys
545 550 555
Gln Gln Lys Thr Thr Tyr Pro Thr Arg Ala His Leu Lys Gly Ser Glu
560 565 570
Arg Ala Lys Ile Ala Leu Leu Ser Val Tyr Asn Leu Tyr Gln Ser Glu
575 580 585 590
Asp Lys Pro Val Ala Leu Arg Gln Ala Arg Lys Ile Lys Ser Phe Phe
595 600 605
Val Asp
<210>SEQ ID NO:4
<211>582
<212>Polypeptide
<213>Deduced
<223>Pm 66426 Da
<223>Sequence type: deduced amino acid sequence encoded by the carB gene of Blakeslea trispora.
<400>
Met Ser Asp Gln Lys Lys His Ile Val Val Ile Gly Ala Gly Ile Gly
1 5 10 15
Gly Thr Ala Thr Ala Ala Arg Leu Ala Arg Glu Gly Phe Arg Val Thr
20 25 30
Val Val Glu Lys Asn Asp Phe Ser Gly Gly Arg Cys Ser Phe Ile His
35 40 45
His Asp Gly His Arg Phe Asp Gln Gly Pro Ser Leu Tyr Leu Met Pro
50 55 60
Lys Leu Phe Glu Asp Ala Phe Ala Asp Leu Asp Glu Arg Ile Gly Asp
65 70 75 80
His Leu Asp Leu Leu Arg Cys Asp Asn Asn Tyr Lys Val His Phe Asp
85 90 95
Asp Gly Asp Ala Val Gln Leu Ser Ser Asp Leu Thr Lys Met Lys Gly
100 105 110
Glu Leu Asp Arg Ile Glu Gly Pro Leu Gly Phe Gly Arg Phe Leu Asp
115 120 125
Phe Met Lys Glu Thr His Val His Tyr Glu Gln Gly Thr Phe Ile Ala
130 135 140
Ile Lys Arg Asn Phe Glu Thr Ile Trp Asp Leu Ile Arg Leu Gln Tyr
145 150 155 160
Val Pro Glu Ile Phe Arg Leu His Leu Phe Gly Lys Ile Tyr Asp Arg
165 170 175
Ala Ser Lys Tyr Phe Gln Thr Lys Lys Met Arg Met Ala Phe Thr Phe
180 185 190
Gln Thr Met Tyr Met Gly Met Ser Pro Tyr Asp Ala Pro Ala Val Tyr
195 200 205
Ser Leu Leu Gln Tyr Thr Glu Phe Ala Glu Gly Ile Trp Tyr Pro Arg
210 215 220
Gly Gly Phe Asn Met Val Val Gln Lys Leu Glu Ser Ile Ala Ser Lys
225 230 235 240
Lys Tyr Gly Ala Glu Phe Arg Tyr Gln Ser Pro Val Ala Lys Ile Asn
245 250 255
Thr Val Asp Lys Asp Lys Arg Val Thr Gly Val Thr Leu Glu Ser Gly
260 265 270
Glu Val Ile Glu Ala Asp Ala Val Val Cys Asn Ala Asp Leu Val Tyr
275 280 285
Ala Tyr His His Leu Leu Pro Pro Cys Asn Trp Thr Lys Lys Thr Leu
290 295 300
Ala Ser Lys Lys Leu Thr Ser Ser Ser Ile Ser Phe Tyr Trp Ser Met
305 310 315 320
Ser Thr Lys Val Pro Gln Leu Asp Val His Asn Ile Phe Leu Ala Glu
325 330 335
Ala Tyr Lys Glu Ser Phe Asp Glu Ile Phe Asn Asp Phe Gly Leu Pro
340 345 350
Ser Glu Ala Ser Phe Tyr Val Asn Val Pro Ser Arg Ile Asp Glu Ser
355 360 365
Ala Ala Pro Pro Asn Lys Asp Ser Ile Ile Val Leu Val Pro Ile Gly
370 375 380
His Met Lys Ser Lys Thr Gly Asn Ser Ala Glu Glu Asn Tyr Pro Glu
385 390 395 400
Leu Val Asn Arg Ala Arg Lys Met Val Leu Glu Val Ile Glu Arg Arg
405 410 415
Leu Gly Val Asn Asn Phe Ala Asn Leu Ile Glu His Glu Glu Val Asn
420 425 430
Asp Pro Ser Val Trp Gln Ser Lys Phe Asn Leu Trp Arg Gly Ser Ile
435 440 445
Leu Gly Leu Ser His Asp Val Phe Gln Val Leu Trp Phe Arg Pro Ser
450 455 460
Thr Lys Asp Ser Thr Asn Arg Tyr Asp Asn Leu Phe Phe Val Gly Ala
465 470 475 480
Ser Thr His Pro Gly Thr Gly Val Pro Ile Val Leu Ala Gly Ser Lys
485 490 495
Leu Thr Ser Asp Gln Val Cys Lys Ser Phe Gly Gln Asn Pro Leu Pro
500 505 510
Arg Lys Leu Gln Asp Ser Gln Lys Lys Tyr Ala Pro Glu Gln Thr Arg
515 520 525
Lys Thr Glu Ser His Trp Ile Tyr Tyr Cys Leu Ala Cys Tyr Phe Val
530 535 540
Thr Phe Leu Phe Phe Tyr Phe Phe Pro Arg Asp Asp Thr Thr Thr Pro
545 550 555 560
Ala Ser Phe Ile Asn Gln Leu Leu Pro Asn Val Phe Gln Gly Gln Asn
565 570 575
Ser Asn Asp Ile Arg Ile
580
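The relationship between the genomic listing and SEQ ID NO:4 can be checked with a short script. The following is a minimal sketch, not part of the original filing: it assumes the standard genetic code, and the names `CODON_TABLE`, `translate`, `FIRST_CODONS`, `EXON1_END` and `EXON2_START` are illustrative. The fragments are transcribed from the carB region of the genomic listing (the first 18 codons, and the exon ends flanking the first intron).

```python
# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate DNA codon by codon; spaces and a trailing partial codon are ignored."""
    dna = dna.replace(" ", "").upper()
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


# First 18 codons of carB, copied from the genomic listing.
FIRST_CODONS = "ATG TCT GAT CAA AAG AAG CAT ATT GTT GTC ATT GGT GCC GGT ATT GGC GGA ACT"

# Exon ends on either side of the first intron; the Gly codon is split (GG | T).
EXON1_END = "ATG TAT ATG GG"
EXON2_START = "T ATG TCG CCT TAT"

n_terminus = translate(FIRST_CODONS)
junction = translate(EXON1_END + EXON2_START)

print(n_terminus)  # one-letter code for Met Ser Asp Gln Lys Lys His ...
print(junction)    # residues spanning the first exon-exon junction
```

Joining the two exon ends restores the split glycine codon (GGT), which corresponds to the run Met Tyr Met Gly Met Ser Pro Tyr in SEQ ID NO:4.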
<210>SEQ ID NO:5
<211>20
<212>DNA
<213>Artificial sequence
<223>PCR primer
<400>
CGCGCCGACT GCCATTGACT 20
<210>SEQ ID NO:6
<211>19
<212>DNA
<213>Artificial sequence
<223>PCR primer
<400>
CACGCACGCC GCCTTGACA 19
<210>SEQ ID NO:7
<211>28
<212>DNA
<213>Artificial sequence
<223>Oligonucleotide for inserting an NcoI cleavage site into the carB gene
<400>
CTATTTTAAT CCCATGGCTG ATCAAAAG 28
<210>SEQ ID NO:8
<211>28
<212>DNA
<213>Artificial sequence
<223>Oligonucleotide for inserting an NcoI cleavage site into the carB gene
<400>
CTTTTGATCA GCCATGGGAT TAAAATAG 28
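How the oligonucleotide of SEQ ID NO:7 introduces the NcoI site can be illustrated by comparing it with the corresponding genomic stretch around the carB start codon. This is a hedged sketch rather than a description of the applicants' procedure: `WILD_TYPE` is transcribed from the genomic listing (the 13 nucleotides preceding the ATG, the ATG itself, and the next 12 coding nucleotides), the NcoI recognition sequence is taken to be CCATGG, and the helper name `mismatches` is illustrative.

```python
NCOI_SITE = "CCATGG"  # NcoI recognition sequence

# 28 nt around the carB start codon, copied from the genomic listing.
WILD_TYPE = "CTATTTTAATCTCATGTCTGATCAAAAG"
# SEQ ID NO:7 as given in the listing.
PRIMER_7 = "CTATTTTAATCCCATGGCTGATCAAAAG"


def mismatches(reference: str, primer: str) -> list[tuple[int, str, str]]:
    """Return (1-based position, reference base, primer base) for each difference."""
    if len(reference) != len(primer):
        raise ValueError("sequences must have the same length")
    return [
        (pos, ref_base, primer_base)
        for pos, (ref_base, primer_base) in enumerate(zip(reference, primer), start=1)
        if ref_base != primer_base
    ]


changes = mismatches(WILD_TYPE, PRIMER_7)
print(changes)
print("site in wild type:", NCOI_SITE in WILD_TYPE)
print("site in primer:", NCOI_SITE in PRIMER_7)
```

Under these assumptions the two sequences differ at two positions, which together create the CCATGG site overlapping the ATG. The second change falls in codon 2 (TCT to GCT), which under the standard genetic code would replace Ser with Ala.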
<210>SEQ ID NO:9
<211>31
<212>DNA
<213>Artificial sequence
<223>Oligonucleotide for inserting an NcoI cleavage site into the carRP gene
<400>
CTTTTTATTT TATCTCCATG GCAATACTCA C 31
<210>SEQ ID NO:10
<211>31
<212>DNA
<213>Artificial sequence
<223>Oligonucleotide for inserting an NcoI cleavage site into the carRP gene
<400>
GTGAGTATTG CCATGGAGAT AAAATAAAAA G 31
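The oligonucleotides of SEQ ID NO:7 to 10 appear to form two complementary pairs, as would be expected for site-directed mutagenesis primers. The sketch below is an illustration under that assumption, not a statement of the applicants' protocol: it checks each sequence length against its <211> field, tests whether the members of each pair are reverse complements, and confirms that the NcoI recognition sequence (taken to be CCATGG) is present. The names `OLIGOS`, `DECLARED_LENGTH` and `reverse_complement` are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")
NCOI_SITE = "CCATGG"  # palindromic, so it reads the same on both strands

# Sequences and <211> lengths copied from the listing, keyed by SEQ ID NO.
OLIGOS = {
    7: "CTATTTTAATCCCATGGCTGATCAAAAG",
    8: "CTTTTGATCAGCCATGGGATTAAAATAG",
    9: "CTTTTTATTTTATCTCCATGGCAATACTCAC",
    10: "GTGAGTATTGCCATGGAGATAAAATAAAAAG",
}
DECLARED_LENGTH = {7: 28, 8: 28, 9: 31, 10: 31}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


lengths_ok = all(len(seq) == DECLARED_LENGTH[n] for n, seq in OLIGOS.items())
pairs_ok = all(reverse_complement(OLIGOS[a]) == OLIGOS[b] for a, b in [(7, 8), (9, 10)])
sites_ok = all(NCOI_SITE in seq for seq in OLIGOS.values())

print("lengths match <211>:", lengths_ok)
print("pairs are reverse complements:", pairs_ok)
print("NcoI site present in every oligo:", sites_ok)
```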