
Jun 9, 1992 - further original mechanisms underlying DNA plasticity and structural .... examined by PCR amplification, using as primers two sets of synthetic ..... alternatively on - 200 ng of Paramecium DNA, using the Taq polymerase.
The EMBO Journal vol. 11 no. 10 pp. 3713-3719, 1992

The β-tubulin genes of Paramecium are interrupted by two 27 bp introns

Pascale Dupuis

Centre de Génétique Moléculaire du Centre National de la Recherche Scientifique, Associé à l'Université Pierre et Marie Curie, 91198 Gif-sur-Yvette, France

Communicated by P.Slonimski

In the ciliate Paramecium tetraurelia, analysis of the tubulin gene family has revealed the existence of four α and three β genes. We show here that the coding sequence of the first β-tubulin gene to be cloned and sequenced is interrupted by two short non-coding sequences of 27 bp each, which present at their extremities the pairs GT/AG, characteristic of eukaryotic pre-mRNA introns, and the internal pentanucleotide TTAAT, a consensus in Tetrahymena introns. We demonstrate by PCR experiments that the three macronuclear β-tubulin genes contain these sequences in similar positions, thereby ruling out the possibility that these sequences are ciliate IES present in the micronucleus and eliminated in the transcriptionally active macronucleus. S1 mapping analysis and mRNA sequencing show that the sequences are absent from the β-tubulin transcripts. These sequences are the first introns described in protein-encoding genes in P.tetraurelia and the shortest known introns altogether.

Key words: introns/multigenic family/Paramecium/tubulin

Introduction

Because of their position in the evolutionary scale, between the most primitive eukaryotes and the emergence of metazoa (Sogin et al., 1986; Baroin et al., 1988), the ciliated protozoa are of interest in the analysis of the evolution of eukaryotes. Their molecular biology has already disclosed unexpected phenomena or properties such as self-splicing introns in Tetrahymena (Cech et al., 1981) and the deviant genetic codes of Paramecium (Caron and Meyer, 1985; Preer et al., 1985), Oxytricha (Herrick et al., 1987), Stylonychia (Helftenbein, 1985), Tetrahymena (Hanyu et al., 1986) and Euplotes (Meyer et al., 1991); for a review see Martindale (1989). More recently, the analysis of a ciliate-specific process, nuclear differentiation, has revealed further original mechanisms underlying DNA plasticity and structural organization of the genome. Indeed, ciliates are characterized by a nuclear dimorphism comprising a highly polyploid somatic macronucleus and a diploid, transcriptionally inactive, germinal micronucleus. The macronuclear chromosomes derive from the micronuclear ones by a series of processes including fragmentation, sequence elimination, amplification and addition of telomeres. Clues to some of these complex mechanisms have been obtained: in Tetrahymena, Yao et al. (1990) identified chromosome breakage sequences involved in the fragmentation process. Other studies in Tetrahymena (Spangler et al., 1988; Yu and Blackburn, 1991) and also in Oxytricha (Zahler and Prescott, 1988) and Paramecium (Baroin et al., 1987; Bourgain and Katinka, 1991) are contributing to the understanding of telomerization. Finally, studies in hypotrichous ciliates have revealed the existence of 'DNA splicing', which consists in the specific elimination of short intragenic sequences (IES) during the differentiation of micronuclear into macronuclear chromosomes (Klobutcher et al., 1984). Moreover, Greslin et al. (1989) showed that the micronuclear actin gene of Oxytricha nova is interrupted by eight short 'intron-like sequences', whose mean size ranges from 15 to 20 bp and which are eliminated during macronuclear development; the remaining nine exons are then reordered to form a functional gene by rejoining short homologous terminal repeats present at the exon boundaries.

Ciliates are also of interest for cell biologists because of their complex morphogenesis and their varied cytoskeletal networks, in particular their microtubular networks, whose function and morphology are exceptionally diverse. In this context, we have undertaken the characterization of the tubulin genes in Paramecium. We show here that the first sequenced β-tubulin gene of Paramecium tetraurelia comprises two intervening sequences of 27 bp which appear to be spliced from the mRNA and therefore behave as introns. To date, very little is known about introns in ciliates, but the feature that distinguishes the first Paramecium introns from those recorded in Tetrahymena thermophila (Csank et al., 1990) seems to be their particularly small size. The significance of these data is discussed in relation to splicing mechanisms in eukaryotic pre-mRNA, and in the context of the molecular evolution of the highly conserved tubulin gene family (Little and Seehaus, 1988).

© Oxford University Press

Results

The α- and β-tubulin genes of P.tetraurelia hybridize with the homologous genes (αTT and βTT1) of another ciliate, Tetrahymena pyriformis (Barahona et al., 1988). Southern blots, probed either with the complete genes or with the 5' part of the coding sequences of αTT and βTT1, disclosed four α- and three β-tubulin genes in Paramecium (P.Dupuis, C.Klotz and J.Beisson, in preparation). The screening of a lambda library of Paramecium tetraurelia total DNA (see Materials and methods) with the βTT1 gene of Tetrahymena led to the isolation of eight positive clones. Three of them displayed the same restriction pattern and revealed a 3.6 kb BglII fragment hybridizing with the βTT1 gene. This fragment was entirely subcloned in a pBS plasmid vector, designated plasmid B24. Sequencing showed it to contain a complete β-tubulin sequence of 1383 nucleotides, called βPT1. Further subcloning of βPT1 was carried out as shown in Figure 1. Hybridizations with the B1 probe were carried out as shown in Figure 2. We can note in particular the three bands of equal intensity at 3.6, 4.1 and 6 kb in the BglII pattern. These data confirm the preliminary Southern analysis using heterologous probes, which indicated the presence of three genes, and support the conclusion that each of the bands includes a whole β-tubulin gene.

Fig. 1. Representation of plasmids B24 and B1. The B24 plasmid was obtained by insertion of the 3.6 kb BglII fragment into the BamHI cloning site of the pBS-SK+ vector. The β-tubulin gene with its surrounding non-coding parts was subcloned by XbaI digestion of B24 and subsequent ligation, leading to the B1 plasmid.

Fig. 2. Southern blot of Paramecium genomic DNA hybridized with the B1 probe. 5 µg of restricted Paramecium total DNA was loaded in each lane, transferred on to a nylon membrane and hybridized with the B1 plasmid overnight at high stringency. Note in particular the BglII digest, which shows three bands of equal intensity, each expected to contain a β-tubulin gene. The size markers (DNA ladder from BRL) are indicated on the right of the figure.

The β1 Paramecium (βPT1) gene contains two intervening 27 bp sequences

Among the 3.6 kb sequenced, only the coding part of βPT1 is analysed here and presented in Figure 3. The alignment of the deduced amino acid sequences of βTT1 and of the βPT1 gene revealed 99% identity and the presence in the Paramecium gene of two 27 bp supplementary sequences, SS1 and SS2, at positions 212 and 1232 bp respectively. In view of the highly conserved structure of tubulins, it is unlikely that such additional sequences contribute to the coding section of the gene, a conclusion which is reinforced by the presence, in SS1, of a TGA stop codon in phase with the start codon. It also seems unlikely that βPT1 is a pseudogene, coding for a truncated non-functional protein, as one would then expect its sequence to have accumulated more mutations than appears to be the case from comparison with the Tetrahymena β-tubulin gene. Other hypotheses must therefore be put forward to account for the intercalated SS1 and SS2 sequences.

SS1 and SS2 are not ciliate IES

A first hypothesis, specific to ciliates, is that these sequences are IES, like those described in hypotrichous ciliates (Greslin et al., 1989). In ciliates, as mentioned in the introduction, the macronuclear chromosomes derive from the micronuclear ones by a multi-step differentiation including amplification, fragmentation, addition of telomeres, and specific elimination of certain sequences called IES, for internal eliminated sequences (Klobutcher et al., 1984). Since the library from which the gene was cloned was composed of total DNA, in which micronuclear DNA represents 0.5%, there was a small probability that the cloned gene was of micronuclear origin, and thus that the supplementary sequences were not present in the macronuclear copies of the gene, having been eliminated as IES during macronuclear differentiation. This possibility was examined by PCR amplification, using as primers two sets of synthetic oligonucleotides bordering each supplementary sequence, as shown in Figures 3 and 5. The amplification reactions were carried out on total DNA (i.e. from micronucleus and macronucleus), and on the DNA extracted from each of the three gel bands of 3.6, 4.1 and 6 kb which contain the β-tubulin genes in a BglII digest, as judged by hybridization with the βPT1 tubulin sequence (see Figure 2). Indeed, considering the great similarity between tubulin sequences even in different species, it was very probable that the oligonucleotides would also hybridize with the other two β-tubulin genes and act as primers for their amplification. Cloned tubulin genes from both Tetrahymena and Paramecium were used as controls. PCR amplification products of the cloned Paramecium gene are 163 and 169 bp long, corresponding respectively to the amplified segments including SS1 and SS2. As the Tetrahymena gene does not contain SS1 and SS2, PCR should only amplify the segments remaining between the oligonucleotides, which are 27 bp shorter, i.e. 136 and 142 bp respectively. The Paramecium cloned gene thus provided a positive reference for the presence of SS1 and SS2 and the Tetrahymena gene a negative reference, as illustrated in Figure 5. The first observation is that each reaction carried out on total DNA leads to a single type of amplification product, of 163 bp for the first amplification and 169 bp for the second one, corresponding respectively to the presence of SS1 and SS2. The second observation is that PCR experiments conducted on the size-limited DNA populations, each containing one of the three β-tubulin genes, uniformly lead to the same size of fragments as those obtained on total DNA.
[Fig. 3 sequence alignment not recoverable from the source text.]

Fig. 3. Coding sequence of the Paramecium β1-tubulin gene (βPT1). The deduced amino acid sequence is aligned with that deduced from the βTT1 gene of Tetrahymena (Barahona et al., 1988). The near-total identity of these two sequences (except for the C-terminal amino acids) reveals the presence in the βPT1 gene of two small non-coding sequences, SS1 and SS2, of 27 bp each. The sequences used as primers in the PCR reactions (see below) are underlined. The star indicates the stop codon, in SS1, in phase with the ATG (EMBL accession no. X67237).

Control DNA extracted from the 2 kb region of a BglII digest gave no PCR amplification products (not shown). These results imply that the macronuclear copies of βPT1 contain the sequences SS1 and SS2, thus ruling out the hypothesis that these sequences are IES and demonstrating that all of the β-tubulin genes in Paramecium contain the two intervening sequences. These results indicate that, since β-tubulin is undoubtedly present in the cell, SS1 and SS2 must necessarily be spliced out in order that the β-tubulin genes may be expressed. Further evidence that the βPT1 gene (or at least one of the other two genes with a similar C-terminal sequence) is actually expressed was obtained by immunodecoration of several microtubule networks by tubulin-specific rabbit antibodies raised against the 12 C-terminal amino acids of the βPT1 polypeptide (P.Dupuis, C.Klotz and J.Beisson, in preparation).
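The size argument behind the PCR test can be written out as a short sketch. The product sizes (163 and 169 bp) and the intron length (27 bp) are taken from the text; the function and variable names are illustrative only and do not come from the original work.

```python
# Illustrative sketch of the PCR size logic; names are hypothetical.
INTRON_LENGTH = 27  # bp, length of SS1 and SS2 as reported in the text

# Product sizes obtained with the cloned Paramecium gene (sequence present).
SIZE_WITH_SEQUENCE = {"SS1": 163, "SS2": 169}


def expected_without_sequence(size_with_sequence, intron_length=INTRON_LENGTH):
    """Size of the PCR product if the supplementary sequence is missing."""
    return size_with_sequence - intron_length


def classify_product(region, observed_bp):
    """Interpret an observed product size for the SS1 or SS2 primer pair."""
    present = SIZE_WITH_SEQUENCE[region]
    absent = expected_without_sequence(present)
    if observed_bp == present:
        return "sequence present"
    if observed_bp == absent:
        return "sequence absent"
    return "unexpected size"


if __name__ == "__main__":
    for region in ("SS1", "SS2"):
        size = SIZE_WITH_SEQUENCE[region]
        print(region, size, expected_without_sequence(size))
```

A macronuclear gene lacking SS1 or SS2 would thus have been detected as a product 27 bp shorter than the reference, which is the size given by the Tetrahymena control.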

[Fig. 4 sequence display not recoverable from the source text.]

Fig. 4. Sequences of SS1 and SS2. SS1 and SS2 present at their extremities the pairs GT/AG specific to eukaryotic introns, and the internal sequence TTAAT, a consensus in Tetrahymena introns (Csank et al., 1990). The stop codon included in SS1, in phase with the protein sequence, is marked by a star.

Fig. 5. PCR amplification of the SS1 and SS2 regions. (a) Sequence of the oligonucleotides used as primers for PCR reactions. The oligonucleotides designed to encompass the regions including SS1 and SS2 are indicated; the expected sizes of the amplification products, depending on whether or not the target sequence includes the supplementary sequence, are represented below. (b) PCR amplification products. From each 25 µl PCR reaction, 1 µl was loaded on a 2% low melting agarose gel. Lanes 1-6 correspond to the PCR products obtained with the first set of oligonucleotides surrounding SS1; lanes 7-12, to those obtained with the second set of oligonucleotides surrounding SS2. Amplification reactions were carried out on total DNA (lanes 3 and 9), and on the three isolated Paramecium β-tubulin genes included in the three BglII fragments (see Materials and methods) of 3.6 kb (lanes 4 and 10), 4.1 kb (lanes 5 and 11) and 6 kb (lanes 6 and 12). The size references were provided by the amplification of the cloned genes βTT1 from Tetrahymena (lanes 2 and 8) and βPT1 from Paramecium (lanes 1 and 7). The background in lane 2 is due to the fact that the sequence of the oligonucleotide presents less homology with βTT1 in this region.

The intervening sequences are absent from β-tubulin mRNAs

That SS1 and SS2 may be introns is suggested by the fact that they present the initial pair GT and the terminal pair AG, universally characteristic of eukaryotic introns; in addition, both contain the pentanucleotide TTAAT (Figure 4), common to a majority of the introns of T.thermophila (Csank et al., 1990). If the SS1 and SS2 sequences are introns, they must be absent from β1-tubulin mRNA. Poly(A) RNA was thus prepared from an exponentially growing culture, and analysed by Northern blot. Hybridization with the βPT1 gene revealed a single band of 1.6 kb for the expected β-tubulin transcripts (Figure 6a). In order to determine whether the SS1 and SS2 sequences were present in β-tubulin mRNAs, S1 mapping experiments were performed on heteroduplex molecules between the poly(A) RNA fraction and a restricted fragment of the B1 plasmid. The DNA target consisted of an AccI fragment of B1 including the SS1 sequence. Figure 6b depicts the expected results of the S1 nuclease treatment depending on whether SS1 is present or absent in the mRNA. Figure 6c shows that the nuclease S1 treatment yields the two fragments of 764 and 210 bp expected if the mRNA does not contain SS1. Another experiment using the whole βPT1 gene as template indicated a double cleavage at both SS1 and SS2 positions (not shown).

More direct evidence for the splicing of sequences SS1 and SS2 is given by the sequencing of the β-tubulin transcripts, initiated by the primers situated downstream of each region (used previously in PCR experiments, see Figure 5a). The results are illustrated for the SS2 region in Figure 7, which shows (i) that SS2 is absent from the poly(A) β-tubulin transcripts, and (ii) that the sequence results from the junction of the two bordering sequences spliced at GT/AG sites, as expected for pre-mRNA introns. A similar result was obtained for SS1 (data not shown). It is clear that these results do not specifically concern the transcription products of the βPT1 gene alone, since we cannot discriminate between the three gene products. Nevertheless, the fact that SS1 and SS2 are present in the three genes and absent from the whole β-tubulin mRNA provides strong support for the conclusion that these intervening sequences are eliminated during mRNA processing and therefore behave as introns.
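The sequence criteria invoked above (GT at the 5' end, AG at the 3' end, an internal TTAAT) can be checked mechanically, as in the minimal sketch below. The 27-mer used is a made-up placeholder, not the actual SS1 or SS2 sequence, and the function names are hypothetical.

```python
# Minimal sketch: test a candidate intervening sequence for the features
# discussed in the text. EXAMPLE_SEQ is a hypothetical placeholder and is
# NOT the real SS1 or SS2 sequence.
EXAMPLE_SEQ = "GTAAATTTAATTTTATAAATTTATTAG"


def intron_signature(seq):
    """Report the GT/AG boundaries and the internal TTAAT motif."""
    seq = seq.upper()
    return {
        "starts_with_GT": seq.startswith("GT"),
        "ends_with_AG": seq.endswith("AG"),
        # Search only the interior, excluding the two boundary dinucleotides.
        "has_internal_TTAAT": "TTAAT" in seq[2:-2],
    }


def gc_fraction(seq):
    """Fraction of G and C residues, for comparing introns with exons."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


if __name__ == "__main__":
    print(len(EXAMPLE_SEQ), intron_signature(EXAMPLE_SEQ))
    print(f"G+C fraction: {gc_fraction(EXAMPLE_SEQ):.2f}")
```

Such a check only flags candidates; as in the experiments above, absence from the mature transcript is what establishes that a sequence is actually spliced.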

Discussion

This article discloses the existence of two small sequences interrupting the first β-tubulin gene cloned and sequenced in Paramecium tetraurelia. PCR experiments revealed that these sequences exist in similar positions in the three macronuclear β-tubulin genes, thus excluding the possibility that these sequences could be micronuclear IES, as described in some hypotrichous genes (Klobutcher et al., 1984; Herrick et al., 1987; Greslin et al., 1989). The fact that they are bounded by the conventional eukaryotic splicing junctions GT/AG, and the demonstration by S1 mapping and reverse sequencing of mRNA that they are excluded from the β-tubulin transcripts, lead us to conclude that these sequences are introns. This represents the first case of introns described in protein-encoding genes in Paramecium, as no intron had yet been found in any of the seven other genes totally sequenced in this genus, namely the surface antigen genes SAg156 (Prat et al., 1986) and SAg168 (Prat, 1990) in P.primaurelia, the surface antigen genes A and D in P.tetraurelia (Nielsen et al., 1991), the calmodulin gene in P.tetraurelia (Kink et al., 1990) and the α1- and α2-tubulin genes in P.tetraurelia (P.Dupuis, C.Klotz and J.Beisson, in preparation). This may not be surprising, as it is recognized that introns are scarcer in rapidly dividing lower eukaryotes than in higher organisms (Gilbert et al., 1986).

The major question raised by our data concerns the fact that SS1 and SS2 are introns despite their small size. Few reports have been made on ciliate introns and, except in yeast, the mechanisms of splicing of pre-mRNA in lower eukaryotes have not been extensively studied. For these reasons, examination of intron sequences in a new organism such as Paramecium may contribute to understanding the evolution of the splicing process. Our knowledge of the steps involved in splicing and of the role of the components of the spliceosome is mostly based on studies in yeast and vertebrates (Green, 1986; Padgett et al., 1986; Woolford, 1989). Recently, however, in comparing Tetrahymena introns with those of organisms from various phyla, Csank et al. (1990) have shown that yeast and vertebrates in fact have atypical sequence requirements for effective splicing. In particular the polypyrimidine tract, required for the branching step and lariat formation in vertebrates, is absent from the introns of all the non-vertebrate species examined. In lower eukaryotes, introns are A+T rich compared with their neighbouring exons. This is also the case for the Paramecium β-tubulin gene, where the average G+C content is 14% in intron sequences, against 42% in the adjacent coding sequences. These figures, similar to those found in Tetrahymena, may reflect a property of ciliate genomes to present a particularly high proportion of A and T in non-coding sequences (Horowitz et al., 1987; Helftenbein et al., 1989; P.Dupuis, unpublished results).
In plants (Goodall and Filipowicz, 1989), the abundance of A+T in introns is absolutely necessary for correct splicing, either because it reduces potential secondary structure or because it acts as a recognition signal for hnRNP. Interestingly, the only common motif observed in the two Paramecium introns is TTAAT, the only consensus present in the majority of Tetrahymena introns and therefore conceivably constituting the branching point. As regards the 5' intron junction, the two sequences in Paramecium, as in Tetrahymena (Csank et al., 1990), deviate from the conventional AG/GTAAGT for the last two bases. Indeed, this sequence at the left junction of the introns is observed among a great number of species, supposedly because it interacts directly with the highly conserved U1 snRNA (Green, 1986). Paramecium introns thus show many common features with those found in Tetrahymena. However, what distinguishes these introns from those of Tetrahymena is their size of 27 bp. Indeed, amongst all eukaryotic split genes listed to date, intron size ranges from 31 bp to 100 kb (Hawkins, 1988; Woolford, 1989). Generally, the introns in vertebrates and plants are rather long, while introns below 150 bp predominate in lower eukaryotes. In species possessing especially small introns, such as Caenorhabditis elegans (Blumenthal and Thomas, 1988), Dictyostelium discoideum (Hawkins, 1988) or T.thermophila (Csank et al., 1990), intron size is at least 50 bp, ranging from 50 to 1000 bp. In vitro experiments on vertebrates (Green, 1986) suggested that there exists a threshold size below which effective splicing cannot occur.

Fig. 6. (a) Northern blot. 0.5 µg and 1 µg of Paramecium poly(A) RNA, respectively, were loaded on a denaturing 1.5% agarose gel and transferred on to a nylon membrane. The hybridization of the Northern blot with the B1 probe reveals β-tubulin transcripts of 1.6 kb. (b) Nuclease S1 mapping: expected products. An AccI restriction fragment of the B1 plasmid was denatured, hybridized with the poly(A) RNA and submitted to nuclease S1 digestion. The expected products are indicated for the two alternative hypotheses. (c) Nuclease S1 mapping: results. The reaction products were run on an alkaline gel, transferred and hybridized with the B1 probe. In the first lane the experiment is made without RNA to serve as control for S1 activity; the band obtained corresponds to the renatured restricted plasmid. In the second lane, the heteroduplex yields three bands: the upper one corresponds to the renatured homoduplex and the lower two have precisely the size expected if the intron SS1 is absent from the transcripts, as shown in the scheme. In addition, we can note the absence of any band expected in the alternative case, where SS1 would be present in the poly(A) RNA.

Fig. 7. mRNA sequence of the SS2 region. The sequencing reaction, performed on poly(A) RNA, was initiated by the oligonucleotide situated 34 bp away from the right junction of SS2 (see Figure 5a). The sequence obtained here is that expected if the SS2 sequence is spliced at the GT/AG sites. Assuming that this sequence probably represents the superposition of the sequences of the three β-tubulin genes, it can be noted that they seem to be very similar in this region.


A study of the rabbit β-globin gene showed that the progressive reduction of its large intron was accompanied by a reduction in splicing efficiency (Wieringa et al., 1984); a critical size of 80 bp was determined. Another experiment consisted in introducing a synthetic oligonucleotide, containing all the consensus sequences for splicing, into a β-globin gene carried by a plasmid (Rautmann et al., 1984). This demonstrated that in vivo, at least in mammalian somatic cells, an absolute minimum size of 29 bp is required for a synthetic intron to be spliced. Furthermore, it is particularly interesting to note that the 31 bp intron observed in the SV40 19S mRNA is remarkable in having a very low splicing efficiency (Treisman et al., 1981). Considering all these data, it would not be unreasonable to wonder whether these unusually short introns could play a part in the regulation of expression of the β-tubulin gene in Paramecium, by way of low splicing efficiency, were it not that tubulins are highly abundant proteins in ciliates. We prefer the hypothesis that ciliates have developed a different mode of splicing, more efficient or accurate for the elimination of very short introns. It would be interesting to find other introns in Paramecium, in order to know whether they all share the same very small size, and more generally to provide further understanding of the evolution of splicing.

The second point of interest raised by our findings concerns the evolution of the tubulin genes. Introns were not found in tubulin genes of either the hypotrichous ciliate Stylonychia lemnae (Conzelmann and Helftenbein, 1987; Helftenbein and Muller, 1988) or of the holotrichous T.thermophila (Barahona et al., 1988) or in the two α-tubulin genes sequenced in Paramecium (P.Dupuis, C.Klotz and J.Beisson, in preparation). These observations fit the currently accepted theory that introns arose by insertion into unsplit genes rather than being present from the outset (Cavalier-Smith, 1991).
[Fig. 8 tree not recoverable from the source text; taxa shown: Paramecium, Tetrahymena, Stylonychia and Saccharomyces, with intron positions 71 and 402 marked for Paramecium.]

Fig. 8. The β-tubulin genes through ciliate evolution. The evolution of the β-tubulin gene subfamily in ciliates is illustrated through a simplified phylogenetic tree, established from the molecular data on ciliate phylogeny (Fleury et al., 1992). The number of genes and the intron positions, if any, are indicated for each species.

Furthermore, they argue in favour of the proposition that insertion of introns interrupting actin or tubulin genes probably occurred late in protozoan evolution. More precisely, this work would place this event still later, after ciliate diversification, at least after the emergence of Paramecium from the holotrichous phylum. This hypothesis is supported by the position of the introns in Paramecium genes, after the first bases of codons 71 and 402. Referring to a study of the intron distribution within the tubulin gene family (Dibb and Newman, 1989), it appears that these positions are not held in common with any previously observed site, be it in plants, fungi, vertebrates or even in the evolutionarily distant protozoan Toxoplasma gondii. Our results confirm the observation by Dibb and Newman (1989) that intron positions are conserved within each of the above groups but differ between them, suggesting that the introns were gained during evolution, after the major speciation events. Moreover, inspection of the sequences bordering the two Paramecium introns confirms the further hypothesis that insertion is not haphazard but occurs between the G and the R of the consensus 'proto-splice' site C/AAG R. Indeed, they are respectively inserted in sites of the following sequences: CAG G and AAG G.

Finally, since all three β-tubulin genes in Paramecium contain the two introns, and in the same positions, it is likely that they evolved from a common ancestral gene by successive duplications. Indeed, since each of the evolutionarily earlier ciliates Tetrahymena and Stylonychia possesses two unsplit β-tubulin genes (Figure 8), it is difficult to imagine that the three Paramecium genes derive from duplication and subsequent addition of introns in similar positions in the three genes. Therefore, two alternative hypotheses could be proposed. The first is that the three Paramecium genes arose from duplications of a split ancestral gene, lost in the other ciliate species examined. This rather argues in favour of an oscillating size of the tubulin family during the course of evolution, with an alternately growing and shrinking number of genes. In the second hypothesis, which does not exclude a linear evolution of the β-tubulin gene family, the introns could have been introduced into the Paramecium genome, with subsequent homogenization by conversion, leading to the three similar genes.
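The codon positions quoted here (after the first bases of codons 71 and 402) can be cross-checked against the nucleotide positions given in the Results (212 and 1232 bp). The sketch below assumes that positions are counted from the first base of the ATG, that the second position includes the 27 bp of the upstream intron, and that the 1383 nucleotides include both introns; these conventions are assumptions made for illustration, and the names are hypothetical.

```python
# Sketch of the coordinate bookkeeping, under the assumptions stated above.
INTRON_LENGTH = 27   # bp, from the text
GENE_LENGTH = 1383   # nucleotides, from the text (assumed to include introns)


def intron_start(codon, bases_of_codon_before_intron, upstream_intron_bp=0):
    """1-based gene position of the first intron base.

    codon: number of the codon interrupted by the intron (1 = ATG)
    bases_of_codon_before_intron: bases of that codon lying upstream
    upstream_intron_bp: total length of introns located further upstream
    """
    coding_bases_before = (codon - 1) * 3 + bases_of_codon_before_intron
    return coding_bases_before + upstream_intron_bp + 1


ss1_start = intron_start(71, 1)
ss2_start = intron_start(402, 1, upstream_intron_bp=INTRON_LENGTH)
coding_length = GENE_LENGTH - 2 * INTRON_LENGTH

if __name__ == "__main__":
    print(ss1_start, ss2_start)              # compare with 212 and 1232
    print(coding_length, coding_length % 3)  # coding length and remainder
```

Under these assumptions the two descriptions of the intron positions agree, and the remaining coding length is a whole number of codons.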
Whatever the case may be, an investigation into the presence of introns in the β-tubulin genes of related ciliate species would certainly be very interesting.
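The 'proto-splice' comparison made in the Discussion can be expressed as a pattern match. In this sketch the junction is represented as the last three exon bases followed by the first base downstream of the intron, and R is taken to be a purine (A or G); the representation and the function name are assumptions chosen for illustration.

```python
import re

# Consensus proto-splice site C/A-A-G | R, written as a regular expression.
# The intron is assumed to sit between the G and the R.
PROTO_SPLICE = re.compile(r"[CA]AG[AG]")


def matches_proto_splice(exon_junction):
    """True if a 4-base exon junction fits the (C/A)AG|R consensus."""
    return PROTO_SPLICE.fullmatch(exon_junction.upper()) is not None


if __name__ == "__main__":
    # Junctions reported in the text for the two Paramecium introns.
    for site in ("CAGG", "AAGG"):
        print(site, matches_proto_splice(site))
```

Both reported sites fit the consensus, whereas a junction such as TTGG or CAGT would not.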

Materials and methods

Cells and culture conditions
The wild-type strain d4-2 used in these experiments is a derivative of the wild-type stock 51 of P.tetraurelia (Sonneborn, 1974). Cells were grown at 27°C in buffered Wheat Grass Powder (Pines International Co., USA), bacterized with Enterobacter aerogenes and supplemented with 0.4 µg/ml β-sitosterol.

DNA purification
Cells were harvested in exponential phase, pelleted, and lysed at 50°C in 0.5 M EDTA pH 9, 1% SDS and 1 mg/ml proteinase K for at least 4 h. The lysis mixture was extracted three times with phenol, dialysed against 20% ethanol, and finally dialysed against 10 mM Tris-HCl pH 8, 1 mM EDTA.

Southern blotting hybridization
Paramecium DNA was digested by restriction endonucleases, electrophoresed on 1% (w/v) agarose gels and blotted onto nylon N+ membranes (Amersham; Buckinghamshire, UK) in 0.4 N NaOH. Prehybridization and hybridization were carried out at 62°C as described by Church and Gilbert (1984). For Southern blots, the probes were 32P-radiolabelled (with [32P]dATP, 400 Ci/mmol, from Amersham), using the random-priming kit from Boehringer (Mannheim, Germany). Filters were autoradiographed on Kodak X-Omat AR5 films at -70°C with intensifying screens.

Genomic library
The genomic library was constructed by J.Cohen (unpublished), using the following procedure: total DNA was partially digested with Sau3A (in order to obtain a majority of fragments of 20-25 kb), then phosphorylated and inserted into the BamHI cloning site of the lambda vector EMBL4 (Frischauf et al., 1983). After encapsidation, the titre of the library was 1.2 x 10^5 particles/ml. As the complexity of the Paramecium genome is estimated to be 2 x 10^8 bp, this library would need to be composed of at least 4.6 x 10^4 phages to be representative. The whole unamplified library was estimated to contain 4.8 x 10^4 recombinant phages.
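The text does not say how the figure of 4.6 x 10^4 phages was obtained. One standard estimate, assumed here, is the Clarke and Carbon formula N = ln(1 - P) / ln(1 - f), where f is the fraction of the genome carried by one insert and P the desired probability of finding any given sequence. With 20 kb inserts and P = 0.99 (both assumptions), it gives a value close to the one quoted; the function names below are hypothetical.

```python
import math


def clones_needed(genome_bp, insert_bp, probability):
    """Clarke-Carbon estimate of the library size required for a given
    probability of containing any particular sequence."""
    fraction = insert_bp / genome_bp
    return math.log(1.0 - probability) / math.log(1.0 - fraction)


def coverage_probability(genome_bp, insert_bp, n_clones):
    """Probability that a given sequence is present among n_clones."""
    fraction = insert_bp / genome_bp
    return 1.0 - (1.0 - fraction) ** n_clones


GENOME_BP = 2e8      # genome complexity, from the text
INSERT_BP = 20_000   # assumed: lower bound of the 20-25 kb insert range

required = clones_needed(GENOME_BP, INSERT_BP, 0.99)
achieved = coverage_probability(GENOME_BP, INSERT_BP, 4.8e4)

if __name__ == "__main__":
    print(f"clones required for P = 0.99: {required:.0f}")
    print(f"P for a library of 4.8e4 phages: {achieved:.4f}")
```

On this estimate, a library of 4.8 x 10^4 recombinant phages lies just above the 99% threshold, in line with the statement that the library is representative.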

Isolation of recombinant clones and DNA sequencing
Clones containing β-tubulin sequences were isolated from the recombinant genomic library screened with the coding region of βTT1 from T.pyriformis isolated from the IBl plasmid, generously given by the authors (Barahona et al., 1988). DNA inserts of phages from this sublibrary were analysed by Southern experiments, then a 3.6 kb BglII fragment, hybridizing with the βTT1 probe, was subcloned in a pBluescript vector (Stratagene, La Jolla, CA). This fragment was gradually digested by exonuclease III in order to obtain overlapping clones for sequencing. DNA sequence was determined according to the method of Sanger et al. (1977), using the sequencing kit from United States Biochemicals (Cleveland, OH).

Nucleotide and protein sequence analysis
DNA and the deduced protein sequences were aligned and analysed using the programs of UWGCG (Devereux et al., 1984).

PCR amplification
The oligonucleotides were chosen such that their Tm would be higher than 80°C. Reactions were performed either on total DNA or on DNA fractions each containing one of the three β-tubulin genes. These fractions were obtained by cutting the regions of the bands at 3.6, 4.1 and 6 kb from an agarose gel, after electrophoresis of a BglII digest of total DNA as shown in Figure 2. DNA was then recovered from the 'bands' by treatment with agarase (Sigma; St Louis, USA) in the following buffer: 100 mM NaCl, 5 mM EDTA, pH 6, for 1 h, then ethanol precipitated. The amplification reactions were carried out separately with the two sets of oligonucleotides (bordering SS1 and SS2 respectively) on 20 ng of plasmid DNA or alternatively on 200 ng of Paramecium DNA, using the Taq polymerase from Bioprobe (Montreuil, France), according to the supplier's instructions. They consisted of 35 cycles of 1 min of denaturation at 92°C and 1 min 15 s of renaturation/polymerization at 63°C, followed by a 10 min cycle at 70°C.

RNA extraction

Exponentially growing cells were centrifuged at 150 g for 2 min and washed twice in 10 mM Tris, pH 8. The pellet was resuspended in 10 vol of an ice-cold lysis buffer (0.1 M Tris pH 8, 0.01 M EDTA, 0.35 M NaCl, 2% SDS, 7 M urea), and immediately extracted three times with phenol/chloroform (1:1), then once with chloroform/isoamyl alcohol (24:1) and finally ethanol precipitated. Poly(A) RNA was prepared by chromatography on oligo(dT) cellulose from Boehringer, as described in Sambrook et al. (1989).
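
As a worked example of the lysis buffer composition given above, the sketch below converts the stated concentrations into masses for a chosen volume. Several points are assumptions rather than details from the text: the reagent forms (Tris base, disodium EDTA dihydrate), their approximate molar masses, the reading of "2% SDS" as weight per volume, and the 100 ml batch size.

```python
# ASSUMED reagent forms and approximate molar masses (g/mol).
MOLAR_MASS = {
    "Tris": 121.14,   # Tris base
    "EDTA": 372.24,   # disodium EDTA dihydrate
    "NaCl": 58.44,
    "urea": 60.06,
}

# Final concentrations from the text (mol/l).
MOLARITY = {"Tris": 0.1, "EDTA": 0.01, "NaCl": 0.35, "urea": 7.0}

SDS_PERCENT_WV = 2.0  # ASSUMPTION: "2% SDS" read as g per 100 ml


def buffer_recipe(volume_ml):
    """Return grams of each solid needed for volume_ml of lysis buffer."""
    litres = volume_ml / 1000.0
    grams = {name: MOLARITY[name] * litres * MOLAR_MASS[name] for name in MOLARITY}
    grams["SDS"] = SDS_PERCENT_WV * volume_ml / 100.0
    return grams


recipe = buffer_recipe(100.0)  # hypothetical 100 ml batch
for name, mass in recipe.items():
    print(f"{name}: {mass:.2f} g")
```

The pH adjustment to 8 is not modelled here; the sketch covers only the weighing step.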

Northern blotting and S1 mapping

Glyoxylated RNAs were fractionated on denaturing gels (1.5% agarose, 6 M urea) according to Sambrook et al. (1989), then electrotransferred to a nylon N membrane from Amersham (Buckinghamshire, UK). S1 mapping experiments were performed as described in Sambrook et al. (1989). 3 µg of poly(A) RNA were hybridized with 10 ng of restricted B1 plasmid and then submitted to nuclease S1 digestion. The products of the reaction were electrophoresed on an alkaline gel and transferred as described previously. Hybridizations were carried out overnight at 45°C with the linearized 32P-labelled B1 plasmid.

mRNA sequencing

mRNA sequencing was performed as described in Di Rago and Colson (1988).

Acknowledgements

I am indebted to Didier Contamine and Anne-Marie Petitjean for their help and suggestions in carrying out the S1 mapping experiments, and to Jean-Paul Di Rago for advice on mRNA sequencing. Linda Sperling was kind enough to provide me with poly(A) RNA for sequencing. I am very grateful to Janine Beisson for constant support and helpful advice throughout this work. I thank Jean Cohen, Linda Sperling, Andre Adoutte, Ed Brody, Josette Banroque, Eric Meyer and Janine Beisson for critical reading of the manuscript. Thanks are especially due to Eric Meyer for stimulating discussions. Thanks are also due to Dominic Williams for kindly correcting the English. This work was supported by the Ligue Nationale Française contre le Cancer and the Fondation pour la Recherche Médicale.

References

Barahona,I., Soares,H., Cyrne,L.P., Penque,D., Denoulet,P. and Rodrigues-Pousada,C. (1988) J. Mol. Biol., 202, 365-382.
Baroin,A., Prat,A. and Caron,F. (1987) Nucleic Acids Res., 15, 1717-1728.
Baroin,A., Perasso,R., Qu,L.H., Brugerolle,G., Bachellerie,J.P. and Adoutte,A. (1988) Proc. Natl. Acad. Sci. USA, 85, 3474-3478.
Blumenthal,T. and Thomas,J. (1988) Trends Genet., 4, 305-308.
Bourgain,F.M. and Katinka,M.D. (1991) Nucleic Acids Res., 19, 1541-1547.
Caron,F. and Meyer,E. (1985) Nature, 314, 185-188.
Cavalier-Smith,T. (1991) Trends Genet., 7, 145-148.
Cech,T.R., Zaug,J. and Grabowski,P.J. (1981) Cell, 27, 487-496.
Church,G.M. and Gilbert,W. (1984) Proc. Natl. Acad. Sci. USA, 81, 1991-1995.
Conzelmann,K.K. and Helftenbein,E. (1987) J. Mol. Biol., 198, 643-653.
Csank,C., Taylor,F.M. and Martindale,D.W. (1990) Nucleic Acids Res., 18, 5133-5141.
Devereux,J., Haeberli,P. and Smithies,O. (1984) Nucleic Acids Res., 12, 387-395.
Dibb,N.J. and Newman,A.J. (1989) EMBO J., 8, 2015-2021.
Di Rago,J.P. and Colson,A.M. (1988) J. Biol. Chem., 263, 12564-12570.
Fleury,A., Delgado,P., Iftode,F. and Adoutte,A. (1992) Dev. Biol., in press.
Frischauf,A.M., Lehrach,H., Poustka,A. and Murray,N. (1983) J. Mol. Biol., 170, 827-842.
Gilbert,W., Marchionni,M. and McKnight,G. (1986) Cell, 46, 151-154.
Goodall,G.J. and Filipowicz,W. (1989) Cell, 58, 473-483.
Green,M.R. (1986) Annu. Rev. Genet., 20, 671-708.
Greslin,A.F., Prescott,D.M., Oka,Y., Loukin,S.H. and Chapell,J.C. (1989) Proc. Natl. Acad. Sci. USA, 86, 6264-6268.
Hanyu,N., Kuchino,Y. and Nishimura,S. (1986) EMBO J., 5, 1307-1311.
Hawkins,J.D. (1988) Nucleic Acids Res., 16, 9893-9908.
Helftenbein,E. (1985) Nucleic Acids Res., 13, 415-433.
Helftenbein,E. and Muller,E. (1988) Curr. Genet., 13, 425-432.
Helftenbein,E., Conzelmann,K.K., Becker,K.F. and Fritzenschaf,H. (1989) Eur. J. Protistol., 25, 158-167.
Herrick,G., Hunter,D., Williams,K. and Kotter,K. (1987) Genes Dev., 1, 1047-1058.
Horowitz,S., Bowen,J.K., Bannon,G.H. and Gorovsky,M.A. (1987) Nucleic Acids Res., 15, 141-160.
Kink,J.A., Maley,M.E., Preston,R.R., Ling,K.Y., Wallen-Friedman,M.A., Saimi,Y. and Kung,C. (1990) Cell, 62, 165-174.
Klobutcher,L.A., Jahn,C.L. and Prescott,D.M. (1984) Cell, 36, 1045-1055.
Little,M. and Seehaus,T. (1988) Comp. Biochem. Physiol., 90B, 655-670.
Martindale,D.W. (1989) J. Protozool., 36, 29-34.
Meyer,F., Schmidt,H.J., Plumper,E., Hasilik,A., Mersmann,G., Meyer,H.E., Engstrom,A. and Heckmann,K. (1991) Proc. Natl. Acad. Sci. USA, 88, 3758-3761.
Nielsen,E., You,Y. and Forney,J. (1991) J. Mol. Biol., 222, 835-841.
Padgett,R.A., Grabowski,P.J., Konarska,M.M., Seiler,S. and Sharp,P.A. (1986) Annu. Rev. Biochem., 55, 1119-1150.
Prat,A. (1990) J. Mol. Biol., 211, 521-535.
Prat,A., Katinka,M., Caron,F. and Meyer,E. (1986) J. Mol. Biol., 189, 47-60.
Preer,J.R., Preer,L.B., Rudman,B.M. and Barnett,A.J. (1985) Nature, 314, 188-190.
Rautmann,G., Matthes,H.W.D., Gait,M.J. and Breathnach,R. (1984) EMBO J., 3, 2021-2028.
Sambrook,J., Fritsch,E.F. and Maniatis,T. (1989) Molecular Cloning. A Laboratory Manual. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY.
Sanger,F., Nicklen,S. and Coulson,A.R. (1977) Proc. Natl. Acad. Sci. USA, 74, 5463.
Spangler,E.A., Ryan,T. and Blackburn,E.H. (1988) Nucleic Acids Res., 16, 5569-5585.
Sogin,M.L., Elwood,H.J. and Gunderson,J.H. (1986) Proc. Natl. Acad. Sci. USA, 83, 1383-1387.
Sonneborn,T.M. (1974) In King,R. (ed.) Handbook of Genetics. Plenum Publishing Corp., New York, pp. 469-594.
Treisman,T., Novak,U., Favaloro,J. and Kamen,R. (1981) Nature, 292, 595-600.
Wieringa,B., Hofer,E. and Weissmann,C. (1984) Cell, 37, 915-925.
Woolford,J.L. (1989) Yeast, 5, 439-457.
Yao,M.C., Yao,C.H. and Monks,B. (1990) Cell, 63, 763-772.
Yu,G.L. and Blackburn,E.H. (1991) Cell, 67, 823-832.
Zahler,A.M. and Prescott,D.M. (1988) Nucleic Acids Res., 16, 6953-6972.

Received on May 13, 1992; revised on June 9, 1992
