
THE JOURNAL OF BIOLOGICAL CHEMISTRY © 1984 by The American Society of Biological Chemists, Inc.

Vol. 259, No. 16, Issue of August 25, pp. 10606-10613, 1984 Printed in U.S.A.

Sequences of the malE Gene and of Its Product, the Maltose-binding Protein of Escherichia coli K12* (Received for publication, February 21, 1984)

Pascale Duplay‡§, Hugues Bedouelle‡§¶, Audree Fowler‖**, Irving Zabin‖**, William Saurin‡§, and Maurice Hofnung‡§ From the ‡Programmation Moléculaire et Toxicologie Génétique (Centre National de la Recherche Scientifique LA 271 - Institut National de la Santé et de la Recherche Médicale U 163), Institut Pasteur, 75015 Paris, France and the ‖Department of Biological Chemistry, School of Medicine and Molecular Biology Institute, University of California, Los Angeles, California 90024

The sequences of the malE gene and of its mature product, the maltose-binding protein, have been determined and are in good agreement. The malE gene encodes the pre-protein (396 amino acid residues) which yields, upon cleavage of the NH2-terminal extension (26 amino acid residues), the mature maltose-binding protein (370 amino acid residues). The malE mRNA could form stable stem and loop structures, some of which may account for translational pauses observed by Randall et al. (Randall, L., Josefsson, L. G. & Hardy, S. J. S. (1980) Eur. J. Biochem. 107, 375-379). The sequence change due to an in-frame nonpolar deletion of 765 nucleotides in malE is also presented as well as homologies between the maltose-binding protein and other sugar-binding proteins.

The malB region of the Escherichia coli chromosome is composed of two operons, malE-malF-malG and malK-lamB, transcribed divergently from a control region located between malE and malK (Fig. 1) (Hofnung, 1974; Silhavy et al., 1979; Raibaud et al., 1979; Bedouelle et al., 1982). malB encodes the components of the system that transports maltose and maltodextrins across the bacterial envelope (reviewed in Shuman, 1982a; Hengge and Boos, 1983). The LamB protein is located in the outer membrane (Randall-Hazelbauer and Schwartz, 1973); the MalE protein (or maltose-binding protein) is in the periplasmic region (Kellermann and Szmelcman, 1974); the MalF, MalG, and MalK proteins are associated with the cytoplasmic membrane (Shuman et al., 1980; Shuman and Silhavy, 1981; Shuman, 1982b). The nucleotide sequences of the entire malK-lamB operon (Bedouelle and Hofnung, 1982; Gilson et al., 1982; Clément and Hofnung, 1981) and of the malF gene¹ have been determined.

* The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
§ Supported by grants from the Direction Générale à la Recherche Scientifique et Technique, the Fondation pour la Recherche Médicale, the Ligue Nationale Française contre le Cancer, the Association pour le Développement de la Recherche sur le Cancer, Grant CP.960002-ATP.956144 from the Centre National de la Recherche Scientifique, and Grant 1297 from the North Atlantic Treaty Organization.
¶ Recipient of a special stipend from the Institut Français du Pétrole. Present address, Medical Research Council, Laboratory of Molecular Biology, Hills Road, Cambridge CB2 2QH, England.
** Supported by Grant PCM-8118/112 from the National Science Foundation and Grant AI-04181 from the United States Public Health Service.
¹ S. Froshauer and J. Beckwith, personal communication.

MBP² is a binding protein specific for maltose and maltodextrins with a KD around 1 µM; the affinity is maximum for maltotriose (Schwartz et al., 1976; Szmelcman et al., 1976; Wandersman et al., 1979). Studies on the binding specificity have suggested that the binding site recognizes the glycosidic bond linking the glucose moieties of maltose (Kellermann and Szmelcman, 1974; Ferenci, 1980). MBP can be purified in dimeric form from bacteria that are constitutive for the expression of the maltose system and are grown in the absence of maltose. Maltose induces the conversion of the protein dimers to monomers (Richarme, 1982). There is one binding site for maltose/monomer (Schwartz et al., 1976), and upon binding of the substrate, the protein undergoes a conformational change that can be monitored by fluorescence techniques (Szmelcman et al., 1976; Zukin, 1979). MBP is a multisite protein which, in addition to binding its substrates, interacts with at least two proteins located in different layers of the cell envelope: with LamB (KD = 0.15 µM) to specifically facilitate the diffusion of maltose and maltodextrins through the outer membrane (Wandersman et al., 1979; Heuzenroeder and Reeves, 1980; Boos and Staehelin, 1981; Bavoil and Nikaido, 1981; Bavoil et al., 1983; Neuhaus et al., 1983) and with the inner membrane methyl-accepting protein MCP II (KD around 1 µM) to induce the chemotactic response (Koiwai and Hayashi, 1979; Hayashi and Ohba, 1982; Richarme, 1982). In addition, MBP might interact with the MalG or MalF proteins to allow translocation of the substrate through the cytoplasmic membrane.³ The protein has been crystallized, and the elucidation of its three-dimensional structure is in progress (Quiocho et al., 1979). MBP is synthesized in large amounts. A fully induced cell may contain up to 40,000 monomers, i.e. about one MBP monomer/LamB trimer (Kellermann and Szmelcman, 1974; Dietzel et al., 1978).
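To give a feel for what a dissociation constant of about 1 µM with one site per monomer implies, the following minimal sketch computes the fractional occupancy of a single binding site. It assumes a simple 1:1 equilibrium with free ligand in excess, which is a textbook simplification and not a model taken from this paper; the function name and the ligand concentrations are illustrative only.

```python
def fraction_bound(ligand_um: float, kd_um: float) -> float:
    """Fractional occupancy of one binding site for a simple 1:1 equilibrium.

    Assumes the free ligand concentration equals the total ligand
    concentration (ligand in large excess over protein).
    """
    if ligand_um < 0 or kd_um <= 0:
        raise ValueError("ligand must be >= 0 and KD must be > 0")
    return ligand_um / (kd_um + ligand_um)


# KD "around 1 µM" is the value quoted in the text for maltose binding.
KD_MALTOSE_UM = 1.0

# Illustrative (hypothetical) free maltose concentrations in µM.
for conc_um in (0.1, 1.0, 10.0):
    occupancy = fraction_bound(conc_um, KD_MALTOSE_UM)
    print(f"[maltose] = {conc_um:5.1f} µM -> occupancy = {occupancy:.2f}")
```

Under this assumption the site is half-occupied when the free sugar concentration equals KD, so MBP responds most steeply to substrate in the micromolar range.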
Expression of the malE gene is activated by the MalT protein in the presence of maltose or maltodextrins and by the cyclic AMP receptor protein; a detailed genetic analysis of the malE promoter has revealed potential interaction sites for MalT and the cyclic AMP receptor protein and a long-range interaction with the diverging malK promoter (Bedouelle, 1983). Like most outer membrane and periplasmic proteins, MBP is synthesized initially with an amino-terminal signal peptide which is necessary for the initiation of export through the

² The abbreviations used are: MBP, maltose-binding protein; SDS, sodium dodecyl sulfate; HPLC, high-pressure liquid chromatography; PTH, phenylthiohydantoin; dansyl, 5-dimethylaminonaphthalene-1-sulfonyl.
³ H. Shuman, personal communication.

TABLE I
Bacterial strains

Strain     Genotype                                                                                                                             Reference
pop1741    HfrG6 malBΔ114 his                                                                                                                   Hofnung et al. (1974)
JC10289    F- thr1 leuB6 proA2 his4 argE3 thi1 mtl1 gatC? ara14 lacY1 galK2 xyl5 rpsL31 tsx33 supE44 Δ(srlR-recA)306::Tn10                      Csonka and Clark (1979)
HB101      F- proA2 ara14 lacY1 galK2 xyl5 mtl1 rpsL20 supE44 hsdS20 (r-B, m-B) recA13                                                          Boyer and Roulland-Dussoix (1969)
MC4100     F- thiA relA araD139 ΔlacU169 rpsL                                                                                                   Casadaban (1976)
HS2019     MC4100 ΔmalE444                                                                                                                      Shuman (1982b)
pop3971    MC4100 malTp1 malTp7                                                                                                                 Chapon (1982)
PD1        pop3971 ΔmalE444                                                                                                                     This study
PD2        MC4100 Δ(srlR-recA)306::Tn10                                                                                                         This study
PD3        HS2019 Δ(srlR-recA)306::Tn10                                                                                                         This study
PD4        pop3971 Δ(srlR-recA)306::Tn10                                                                                                        This study
PD5        PD1 Δ(srlR-recA)306::Tn10                                                                                                            This study

cytoplasmic membrane. It interacts with the cell secretory apparatus and is cleaved during the export process or shortly after (Bedouelle et al., 1980; reviewed in Bassford et al., 1984). The MBP precursor is active in binding maltose (Ferenci and Randall, 1979).

MBP is essential for the energy-dependent translocation of maltose and maltodextrins through the cytoplasmic membrane. This property was demonstrated using a deletion, ΔmalE444, that is internal to malE, is nonpolar on malF and malG, and abolishes transport (Shuman, 1982b; Brass et al., 1981).

In this paper, we present the amino acid sequence of the MalE protein, the nucleotide sequence of the malE gene, and the location in this sequence of the ΔmalE444 deletion.

MATERIALS AND METHODS

Media and General Techniques-The growth media, genetic techniques (Bedouelle, 1983), and standard DNA technology (Maniatis et al., 1982) were as described.

Bacterial Strains and Phages-The bacterial strains are listed in Table I. Strains harboring a plasmid are noted with their name followed by the name of the plasmid between parentheses, for example, HB101 (pPD1). The Δ(srlR-recA)306::Tn10 deletion was introduced into various strains by transduction to tetracycline resistance using a P1v grown on strain JC10289. The particular clone of phage λaph80malB13 (Raibaud et al., 1979) used in this work was labeled λaph80malB130. It carries a wild-type malB region, except for the 3'-terminal end of the lamB gene which is deleted. It transduced strain pop1741, which harbors a complete deletion of the malE gene, to Mal+. The ΔmalE444 deletion was transferred from strain HS2019 onto phage λaph80malB130 by the integration-excision technique, as described (Bedouelle, 1983). The resulting phage was labeled λaph80malB130ΔmalE444. ΔmalE444 was transferred from the latter phage to the chromosome of strain pop3971 by the same technique. The resulting strain is PD1.

Restriction Mapping and DNA Sequencing-The restriction fragments were labeled at their 5' ends with [γ-32P]ATP and T4 polynucleotide kinase by the exchange reaction (Maniatis et al., 1982). The two labeled ends were segregated by secondary restriction cuts. The labeled fragments were separated by electrophoresis through thin (0.35 mm) non-denaturing 5% polyacrylamide gels and eluted off the gel pieces by overnight diffusion at 37 °C in 0.1% SDS, 0.3 M sodium acetate, pH 8. Fine restriction maps of the purified fragments were obtained by the method of partial digests and their nucleotide sequences by the Maxam and Gilbert technique as described (Maniatis et al., 1982).

Protein Sequencing-The corresponding materials and methods are given in Miniprint⁴ and include Fig. 1a and Tables Ia-IIIa.

RESULTS

Cloning of the malE Gene-The transducing phage λaph80malB130 carries most of the wild-type malB region (Raibaud et al., 1979). The ΔmalE444 deletion was transferred from the chromosome of the bacterial strain, HS2019, to the chromosome of phage λaph80malB130 by in vivo recombination ("Materials and Methods"). The resulting phage was labeled λaph80malB130ΔmalE444.

The malB region contains a unique EcoRI restriction site at the beginning of the malK gene (Bedouelle et al., 1982) and a StuI site upstream of malF⁵ (Fig. 1). Analysis and comparison of phage λaph80malB130 and of its ΔmalE444 derivative with endonucleases revealed that the ΔmalE444 mutation is contained within an EcoRI-StuI fragment and deletes about 765 base pairs. The length of the wild-type fragment, about 1712 base pairs, was a priori enough to contain the malB control region (Bedouelle and Hofnung, 1982) and the entire malE coding sequence (Fowler and Zabin, 1982). The wild-type and deleted EcoRI-StuI fragments were purified from the transducing phages and inserted between EcoRI and PvuII sites of plasmid vector pBR322. The recombinant plasmid that carries the wild-type malE gene was labeled pPD1, and the one that contains ΔmalE444 was labeled pPD2.

Plasmid pPD1 Carries a Functional malE Gene-The parental plasmid pBR322 and the recombinant plasmids pPD1 and pPD2 were introduced into strain HB101, which is wild type for the maltose system. The transformed strain HB101 (pPD1) synthesized a polypeptide that was predominant, was inducible with maltose, and co-migrated with purified MalE protein when total cell extracts were analyzed by polyacrylamide gel electrophoresis (Fig. 2). In contrast, strains HB101 (pPD2) and HB101 (pBR322) synthesized much lower amounts of the same polypeptide. These results showed that pPD1 directs the synthesis of a full-length MBP; they confirmed that the malE promoter is functional and under maltose control when carried by a multicopy plasmid (Bedouelle et al., 1982).

A Mal+ strain generally becomes partially Mal- when transformed with a multicopy plasmid that carries the malB control region. This effect has been attributed to a titration of the transcription activator MalT by the multiple copies of the malE and malK promoters; under these conditions, the intracellular concentration of MalT would become limiting for the expression of the maltose operons.⁶ Accordingly, strain PD2



⁴ Portions of this paper (including part of "Materials and Methods," part of "Results," part of "Discussion," Fig. 1a, Tables Ia-IIIa, and additional references) are presented in miniprint at the end of this paper. Miniprint is easily read with the aid of a standard magnifying glass. Full size photocopies are available from the Journal of Biological Chemistry, 9650 Rockville Pike, Bethesda, MD 20814. Request Document No. 84-M526, cite the authors, and include a check or money order for $5.20 per set of photocopies. Full size photocopies are also included in the microfilm edition of the Journal that is available from Waverly Press.
⁵ S. Froshauer, personal communication.
⁶ H. Bedouelle, unpublished observations.


FIG. 1. Restriction map and sequencing strategy for the malE gene. The upper section of the figure shows the genetic structure of the malB region. The middle section is an enlargement of the EcoRI-StuI restriction fragment that contains the malE gene. The large arrow shows the position of the sequence encoding the pre-MalE protein in this fragment. The restriction sites that we used to determine the DNA sequence are indicated with the TaqI sites drawn under the line. We also used a Tth111I site which is located 123 base pairs (bp) downstream of the StuI site and is unique in plasmid pPD1. The horizontal arrows in the lower section represent the sequenced DNA fragments. The tails of the arrows indicate the labeled 5'-ends of the fragments. The length of the arrows corresponds to the extent of the sequences read off the gels. The complete nucleotide sequence of the malE gene was obtained a first time from the AvaII, BgnI and HinfI sites and then verified from the BssHII, DdeI, NcoI, and one of the TaqI sites. 81% of the malE gene was sequenced on both strands. This calculation takes into account the fact that the sequence of the first 600 base pairs from the EcoRI site was determined previously (Bedouelle and Hofnung, 1982).

became partially Mal- when transformed with plasmids pPD1 and pPD2, whereas it remained Mal+ when transformed with pBR322 (Table II); strain PD4, a malTp1 malTp7 double mutant that overproduces MalT 30-fold (Chapon, 1982), remained Mal+ when transformed with any of the three plasmids. Strain PD5, a ΔmalE444 malTp1 malTp7 mutant, became Mal+ when transformed with plasmid pPD1 but remained Mal- when transformed with pPD2. This last result demonstrated that pPD1 can complement the ΔmalE444 mutation for maltose fermentation and therefore that it carries a functional malE gene.

Nucleotide Sequence of the malE Gene-Comparison of the DNA fragments obtained after digestion of plasmids pPD1 and pPD2 with various restriction endonucleases allowed the construction of a preliminary restriction map of the EcoRI-StuI fragment that contains the malE gene and the approximate localization of the ΔmalE444 deletion in this fragment. Fig. 1 shows this restriction map and the strategy used to determine both the nucleotide sequence of the malE gene and the position in this sequence of the ΔmalE444 deletion using the Maxam and Gilbert technique. Fig. 3 shows the nucleotide sequence data. The ΔmalE444 mutation removes 765 base pairs, i.e. exactly 255 amino acid codons, and preserves the reading frame.

FIG. 2. Expression of the malE gene on plasmid pPD1. The cells were grown in maltose (+) or in glycerol (-) minimal medium supplemented with 0.2% casamino acids. Whole cell extracts were prepared and analyzed by electrophoresis through 10% polyacrylamide gels in SDS as described (Laemmli, 1970). The gels were stained with Coomassie Brilliant Blue. About 10^8 cells were loaded per well. Lane 1, HB101 (pBR322); lanes 2 and 3, HB101 (pPD2); lanes 4 and 5, HB101 (pPD1); lane 6, 4 µg of purified E. coli MalE protein.

Protein Sequence Strategy-Primary cleavages of the maltose-binding protein were carried out with cyanogen bromide, trypsin, and the glutamic acid-specific Staphylococcus aureus V-8 protease. In the case of the trypsin digestion, the lysines were blocked with citraconyl groups. The only methods needed to separate the peptides were size separation on Sephadex and reverse-phase chromatography by high-pressure liquid chromatography. The protein contains 6 methionine and 6 arginine residues so that seven peptides would be expected from both the cyanogen bromide (Table Ia) and tryptic (Table IIa) digests. In both cases, all seven peptides were isolated. Cleavage of Met-Ser and Met-Thr bonds by cyanogen bromide was quite high. If anything, cleavage of Met-Pro was poor since the yield of CB6 was somewhat low. Since the protein contains no cysteine, incorporation of [35S]sulfate into the cells yielded maltose-binding protein labeled only in methionine. Therefore, the methionine-containing peptides in the tryptic and Staphylococcus protease digests were easily identified. These labeled peptides were sequenced and used to place the cyanogen bromide peptides in order.
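The peptide counts quoted above follow from a simple rule: n internal cleavage sites give n + 1 fragments. The sketch below illustrates this with an in silico digest that cuts on the carboxyl side of chosen residues (Met for cyanogen bromide; Arg only for trypsin when lysines are blocked). The helper name and the toy sequence are hypothetical and are not the MBP sequence; incomplete cleavage, such as the poor Met-Pro yield noted in the text, is not modeled.

```python
def cleave_after(sequence: str, residues: str) -> list[str]:
    """Split a one-letter protein sequence on the C-terminal side of `residues`."""
    fragments = []
    start = 0
    for index, amino_acid in enumerate(sequence):
        if amino_acid in residues:
            fragments.append(sequence[start:index + 1])
            start = index + 1
    if start < len(sequence):
        # Trailing fragment when the last residue is not a cleavage site.
        fragments.append(sequence[start:])
    return fragments


# Hypothetical toy peptide with 3 Met and 2 Arg residues (not MBP).
toy = "AKMGGRSTMPLKRAAMTE"

cnbr_fragments = cleave_after(toy, "M")      # cyanogen bromide: after Met
tryptic_fragments = cleave_after(toy, "R")   # trypsin with blocked Lys: after Arg

print(cnbr_fragments)      # 3 Met -> 4 fragments
print(tryptic_fragments)   # 2 Arg -> 3 fragments
```

By the same rule, a protein with 6 internal methionines and 6 internal arginines, as stated for MBP, is expected to give seven peptides in each digest.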

FIG. 3. Sequences of the malE gene and of its product. The nucleotide sequence of the malE gene is numbered from 5' to 3' with the MalE protein sequence shown below. The vertical arrow indicates the site at which the MalE precursor is cleaved to yield the mature protein. The amino acid residues in the signal peptide have negative numbers. The underlined amino acids correspond to residues that were not characterized by protein sequencing. Two horizontal arrows, at nucleotide positions 28 and 792, bracket the 765 base pairs deleted by the ΔmalE444 mutation.
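The arithmetic behind the in-frame deletion and the precursor length can be checked in a few lines. This is a minimal sketch that assumes the two arrows at positions 28 and 792 mark the first and last deleted nucleotides (an inclusive reading of the figure legend); the function and variable names are illustrative.

```python
CODON_LENGTH = 3


def deletion_summary(first_nt: int, last_nt: int) -> dict:
    """Length, frame status, and codon count for a deletion spanning
    first_nt..last_nt inclusive (1-based nucleotide coordinates)."""
    length = last_nt - first_nt + 1
    return {
        "length_nt": length,
        "in_frame": length % CODON_LENGTH == 0,
        "codons_removed": length // CODON_LENGTH,
    }


# Coordinates quoted in the legend to Fig. 3 (inclusive reading assumed).
summary = deletion_summary(28, 792)
print(summary)

# Residue counts quoted in the text.
precursor_residues = 396
signal_peptide_residues = 26
mature_residues = precursor_residues - signal_peptide_residues

# Coding nucleotides for the precursor, excluding the stop codon.
coding_nucleotides = precursor_residues * CODON_LENGTH
print(mature_residues, coding_nucleotides)
```

Because 765 is a multiple of 3, the deletion removes whole codons and leaves the downstream reading frame intact, which is consistent with its nonpolar effect on malF and malG.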


FIG. 5. Sequence homologies between sugar-binding proteins. The sequences of the maltose (MBP; this work)-, ribose (RBP; Groarke et al., 1983)-, galactose (GBP; Mahoney et al., 1981)-, and arabinose (ABP; Hogg and Hermodson, 1977)-binding proteins were compared using (i) dot matrices (Gibbs and McIntyre, 1970)⁷ and (ii) an adaptation of the program described by Korn et al. (1977) to protein sequence (Brutlag et al., 1982). This analysis revealed two main regions of homology noted A and B and allowed the alignment of the four protein sequences. This alignment is shown schematically in the upper part of the figure. The A and B regions are represented by solid lines, with their coordinates in the sequences (roman numbers) and the lengths of the polypeptide segments involved (italicized numbers, in numbers of residues). Notice that the distances between A and B are smaller in RBP (19 residues less) and in ABP (4 residues less) than in GBP or MBP. The C-terminal end of MBP would be shorter than those of the three other proteins, while the NH2-terminal end of MBP would be longer by over 80 amino acid residues. The lower part of the figure compares the sequences of the A and B regions. The amino acids occurring at the same position in at least three of the four proteins are written as capitals in the one letter code. The boxed amino acids are either identical or belong to the same class of conservative replacement: A/P; D/E/N; F/Y/W; I/L/M/V; K/R; S/T (Dayhoff, 1972). More than 45% of the residues are boxed in region A and more than 39% in B. This comparison extends to MBP the alignment between ABP, GBP, and RBP previously published (Argos et al., 1981).
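The conservative replacement classes listed in the legend can be applied mechanically. The sketch below scores two pre-aligned segments of equal length and reports the percentage of positions that are identical or fall in the same class. It is only an illustration: the pairwise scoring rule, the function names, and the toy segments are assumptions, and it does not reproduce the four-way boxing used in the figure.

```python
# Conservative replacement classes quoted in the legend (Dayhoff, 1972).
CONSERVATIVE_CLASSES = ("AP", "DEN", "FYW", "ILMV", "KR", "ST")


def same_class(residue_a: str, residue_b: str) -> bool:
    """True if two one-letter residues are identical or share a class."""
    if residue_a == residue_b:
        return True
    return any(residue_a in group and residue_b in group
               for group in CONSERVATIVE_CLASSES)


def percent_similar(segment_a: str, segment_b: str) -> float:
    """Percentage of aligned positions that are identical or conservative."""
    if len(segment_a) != len(segment_b) or not segment_a:
        raise ValueError("segments must be non-empty and of equal length")
    matches = sum(same_class(a, b) for a, b in zip(segment_a, segment_b))
    return 100.0 * matches / len(segment_a)


# Hypothetical aligned segments (not taken from the figure).
print(percent_similar("ADKL", "PEKQ"))
```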

TABLE II
Complementation and titration by plasmid pPD1
The indicated recA host strains were transformed with plasmids pPD1, pPD2, and pBR322 and streaked on MacConkey maltose indicator agar containing 100 µg/ml of ampicillin. The coloration of the colonies is symbolized by + and -, going from intense red (+++), indicating strong fermentation of maltose, to red (++), pink (+), and white (-), indicating no fermentation of maltose. Similar results were observed by looking at growth on minimal maltose plates; however, on this medium, strain PD5 (pPD1) presented irregular colonies possibly indicating some growth inhibition.

Plasmids

Host strain

Relevant marker

PD2 PD3 PD4 PD5

Wild type; ΔmalE444; malTp1 Tp7; malTp1 Tp7 ΔmalE444

pPD2

pPD1

+ + +++ +++ ++

pBR322

+-

+++ -

-

-

+++

⁷ W. Saurin, unpublished programs.

FIG. 4. Secondary structure of the malE mRNA. The numbers above the line correspond to the nucleotide sequence of the malE gene; the numbers below the line to the amino acid sequence of the MalE mature protein. The stem and loop structures that could form in the malE mRNA having calculated energy of formation ΔG < -10 kcal/mol (Tinoco et al., 1973) have been numbered from I to VII. The two palindromic sequences that form the stem of each structure are represented by two black boxes; their coordinates in the nucleotide sequence and the ΔG for the structures are as follows: I, 231-251/278-301, ΔG = -17.1; II, 320-343/351-371, ΔG = -14.4; III, 644-666/683-705, ΔG = -17.0; IV, 773-787/804-825, ΔG = -17.5; V, 1046-1071/1079-1106, ΔG = -17.9; VI, 1024-1069/1080-1126, ΔG = -28.3; VII, 1090-1119/1128-1157, ΔG = -15.1.

The fact that 3 of the methionine residues as well as 4 of the arginine residues are clustered in the last 55 residues (of 370) of the protein made the sequence determination more difficult. Three large peptides (CB1, 148 residues; CB4, 97 residues; and CT3, 218 residues) were obtained from the cyanogen bromide and tryptic digests. The Staphylococcus protease digest yielded on the average much smaller peptides since the protein contains 27 glutamic acid residues. These peptides were therefore used to complete the sequence of the large peptides, and in particular, the SP peptides isolated from the 35S-protein were used to order the CB peptides. Some subfragmentations were necessary to complete the sequence determinations of the large peptides. Cleavage with Staphylococcus protease yielded a complex

mixture of peptides (Table IIIa), which might be due to some residual structure of the protein. Actual splitting would be expected in only 19 of the 27 glutamic acids since there are three Glu-X-Pro sequences, one Glu-Pro, two Glu-Glu, and one Glu-Glu-Glu sequence in the protein. Staphylococcus protease does not split the two types of proline sequences and is not an exopeptidase, so it does not release free glutamic acid. Peptides isolated from the Staphylococcus protease digests were frequently sequenced to the carboxyl terminal.

The residues that were not identified by protein sequence analysis are underlined in Fig. 3. The peptides used to deduce the sequence are indicated in Fig. 1a of the Miniprint. Most of the residues were sequenced at least three times from different peptides.

DISCUSSION

The complete nucleotide sequence of the malE gene was determined. It encodes a precursor polypeptide of 396 amino acids; the 26 NH2-terminal amino acids correspond to the signal peptide (Bedouelle et al., 1980), and the 370 remaining amino acids correspond to the mature MBP. The amino acid sequence was obtained both by direct sequencing of most of the protein and by DNA sequencing. The molecular weight of the mature protein, 40,661, predicted from the sequence is in good agreement with the experimental values, 44,000 and 37,000, obtained by equilibrium centrifugation and polyacrylamide gel electrophoresis in the presence of SDS (Kellermann and Szmelcman, 1974). MBP contains 45% nonpolar, 29% uncharged polar, 14% acidic, and 12% basic amino acids. This distribution is similar to that found in other binding proteins (see Groarke et al., 1983, and references therein).

Intermediate polypeptides transiently accumulate during the synthesis of MBP due to drastic reductions in the rate of elongation at specific sites of the mRNA. These pauses occur about 25 and 50 amino acids upstream of the C-terminal end of the protein (Randall et al., 1980). We did not find any accumulation of rare codons in the malE gene that could account for these pauses. A search for mRNA secondary structures using the SEQ program (Brutlag et al., 1982) revealed seven stable stem and loop structures (ΔG < -10 kcal/mol) along the malE message. Three of these structures, including the stablest one (ΔG = -28 kcal/mol), are clustered at the end of the gene. They could transiently arrest the ribosome at the observed sites (Fig. 4) and thus be responsible for at least some of the observed pauses.

Comparison of the primary sequence of the maltose-binding protein with that of other bacterial periplasmic binding proteins revealed two main regions of homologies (Fig. 5, A and B). Alignment of the sequences based on these two regions may correspond to structural similarities (Argos et al., 1981; Vyas et al., 1983) and suggests in particular that MBP may have an extra segment of over 80 amino acid residues at its NH2 terminus compared to the three other proteins (legend to Fig. 5). The amino acid sequence of MBP should help elucidate the three-dimensional structure of this protein. The nucleotide sequence of the gene will allow a detailed genetic analysis of the structure, functions, and interactions in particular with its substrates and with the other cellular components that are involved in the transport of maltose and maltodextrins and in the chemotaxis towards these sugars.

Acknowledgments-We thank S. Froshauer, C. Lee, and J. Beckwith for communication of unpublished data and C. Chapon, J. Clark, and H. Shuman for bacterial strains. We thank W. Boos for the maltose-binding protein and for his continued interest in this project.

REFERENCES

Argos, P., Mahoney, W. C., Hermodson, M. A. & Hanei, M. (1981) J. Biol. Chem. 256, 4357-4361
Bavoil, P. & Nikaido, H. (1981) J. Biol. Chem. 256, 11385-11388
Bavoil, P., Wandersman, C., Schwartz, M. & Nikaido, H. (1983) J. Bacteriol. 155, 919-921
Bassford, P. J., Jr., Bankaitis, V. A., Rasmussen, B. A. & Ryan, J. P. (1984) Microbiology (Wash. D.C.) 8-12
Bedouelle, H. (1983) J. Mol. Biol. 170, 861-882
Bedouelle, H. & Hofnung, M. (1982) Mol. Gen. Genet. 185, 82-87
Bedouelle, H., Bassford, P. J., Jr., Fowler, A. V., Zabin, I., Beckwith, J. & Hofnung, M. (1980) Nature (Lond.) 285, 78-81
Bedouelle, H., Schmeissner, U., Hofnung, M. & Rosenberg, M. (1982) J. Mol. Biol. 161, 519-531
Boos, W. & Staehelin, L. (1981) Arch. Microbiol. 129, 240-246
Boyer, H. W. & Roulland-Dussoix, D. (1969) J. Mol. Biol. 41, 459-472
Brass, J. M., Boos, W. & Hengge, R. (1981) J. Bacteriol. 146, 10-17
Brutlag, D. L., Clayton, J., Friedland, P. & Kedes, L. H. (1982) Nucleic Acids Res. 10, 279-294
Casadaban, M. (1976) J. Mol. Biol. 104, 541-555
Chapon, C. (1982) EMBO J. 1, 369-374
Clément, J. M. & Hofnung, M. (1981) Cell 27, 507-514
Csonka, L. N. & Clark, A. J. (1979) Genetics 93, 321-343
Dayhoff, M. O. (1972) Atlas of Protein Sequence and Structure, Vol. 5, p. 96, National Biomedical Research Foundation, Washington, D. C.
Dietzel, I., Kolb, V. & Boos, W. (1978) Arch. Microbiol. 179, 921-934
Ferenci, T. (1980) Eur. J. Biochem. 108, 631-636
Ferenci, T. & Randall, L. L. (1979) J. Biol. Chem. 254, 9979-9981
Fowler, A. V. & Zabin, I. (1982) Ann. Microbiol. (Paris) 133A, 49-53
Gibbs, A. J. & McIntyre, G. A. (1970) Eur. J. Biochem. 16, 1-11
Gilson, E., Nikaido, H. & Hofnung, M. (1982) Nucleic Acids Res. 10, 7449-7458
Groarke, J. M., Mahoney, W. C., Hope, J. N., Furlong, C. E., Robb, F. T., Zalkin, H. & Hermodson, M. A. (1983) J. Biol. Chem. 258, 12952-12956
Hayashi, H. & Ohba, M. (1982) Ann. Microbiol. (Paris) 133A, 195-197
Hengge, R. & Boos, W. (1983) Biochim. Biophys. Acta 737, 443-478
Heuzenroeder, M. W. & Reeves, P. (1980) J. Bacteriol. 141, 431-435
Hofnung, M. (1974) Genetics 76, 169-184
Hofnung, M., Hatfield, D. & Schwartz, M. (1974) J. Bacteriol. 117, 40-47
Hogg, R. W. & Hermodson, M. A. (1977) J. Biol. Chem. 252, 5135-5141
Kellermann, O. & Szmelcman, S. (1974) Eur. J. Biochem. 47, 139-149
Koiwai, O. & Hayashi, H. (1979) J. Biochem. (Tokyo) 86, 27-34
Korn, L. J., Queen, C. L. & Wegman, M. N. (1977) Proc. Natl. Acad. Sci. U. S. A. 74, 4401-4405
Laemmli, U. K. (1970) Nature (Lond.) 227, 680-685
Maniatis, T., Fritsch, E. F. & Sambrook, J. (1982) Molecular Cloning, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY
Neuhaus, J. M., Schindler, H. & Rosenbusch, J. P. (1983) EMBO J. 2, 1987-1991
Quiocho, F. A., Meador, W. E. & Pflugrath, J. W. (1979) J. Mol. Biol. 133, 181-184
Raibaud, O., Clément, J. M. & Hofnung, M. (1979) Mol. Gen. Genet. 174, 261-267
Randall, L., Josefsson, L. G. & Hardy, S. J. S. (1980) Eur. J. Biochem. 107, 375-379
Randall-Hazelbauer, L. & Schwartz, M. (1973) J. Bacteriol. 116, 1436-1446
Richarme, G. (1982) Biochem. Biophys. Res. Commun. 105, 476-481
Schwartz, M., Kellermann, O., Szmelcman, S. & Hazelbauer, G. (1976) Eur. J. Biochem. 71, 167-170
Shuman, H. A. (1982a) Ann. Microbiol. (Paris) 133A, 153-159
Shuman, H. A. (1982b) J. Biol. Chem. 257, 5455-5461
Shuman, H. A. & Silhavy, T. J. (1981) J. Biol. Chem. 256, 560-562
Shuman, H. A., Silhavy, T. J. & Beckwith, J. R. (1980) J. Biol. Chem. 255, 168-174
Silhavy, T. J., Brickman, E., Bassford, P. J., Jr., Casadaban, M. J., Shuman, H. A., Schwartz, V., Guarente, L., Schwartz, M. & Beckwith, J. (1979) Mol. Gen. Genet. 174, 249-259
Szmelcman, S., Schwartz, M., Silhavy, T. & Boos, W. (1976) Eur. J. Biochem. 65, 13-19
Tinoco, I., Jr., Borer, P. N., Dengler, B., Levine, M., Uhlenbeck, O. C., Crothers, D. M. & Gralla, J. (1973) Nature New Biol. 246, 40-41
Wandersman, C., Schwartz, M. & Ferenci, T. (1979) J. Bacteriol. 140, 1-13
Zukin, S. (1979) Biochemistry 18, 2139-2145
Vyas, N. K., Vyas, M. N. & Quiocho, F. A. (1983) Proc. Natl. Acad. Sci. U. S. A. 80, 1792-1796
Additional references are found below.


TABLE IIId
Amino acid composition of SP peptides

TABLE IIa
Amino acid compositions of tryptic peptides

[Tables IIId and IIa. Rows: aspartic acid, threonine, serine, glutamic acid, proline, glycine, alanine, valine, isoleucine, leucine, tyrosine, phenylalanine, methionine, total, residues, amino terminal, and yield (%). Legible column labels: CB1, CB2, CB3, CB4, CB5, CB6, CB7, SP3-4-5, SP7, SP9. Legible residue ranges, with peptide lengths in parentheses: 1-66 (66), 67-98 (32), 99-316 (218), 317-344 (28), 345-354 (10), 355-367 (13), 368-370 (3), 23-78 (56), 112-138 (27), 173-214 (42), 215-274 (60), 282-310 (29), 323-370 (48).]

ND, not determined. The numbers in parentheses are from the sequence.