18 Several proteins from the same gene (alternative splicing). Human: 60% of genes encode more than one protein. Worm: 22% of genes encode more than one protein.
19 Differences in gene content. Fibroblast growth factors: human 30, fruit fly and worm 2 each. Transforming growth factor β: human 42, fruit fly 9, worm 6. Genes encoding proteins with immunoglobulin domains: human 765, fruit fly 140, worm 64. "Zinc finger" proteins: human has twice as many as fruit fly and 5 times as many as worm.
20 Mouse-human synteny. Human chromosomes can be cut into ~150 pieces, then shuffled into a reasonable approximation of the mouse genome.
21 CpG frequency and CpG islands. The typical density of CpG doublets in mammalian DNA is ~1/100 bp, as seen for a globin gene. In a CpG-rich island, the density is increased to >10 doublets/100 bp. The island in the APRT gene starts ~100 bp upstream of the promoter and extends ~400 bp into the gene. Each vertical line represents a CpG doublet.
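The densities quoted on this slide (about 1 doublet per 100 bp in bulk DNA, more than 10 per 100 bp in an island) can be computed directly from sequence. The following is a minimal sketch; the function names and the toy sequences are hypothetical illustrations, not real gene sequence.

```python
def cpg_density_per_100bp(seq: str) -> float:
    """Return the number of CpG doublets per 100 bp of `seq`."""
    seq = seq.upper()
    if len(seq) < 2:
        return 0.0
    # str.count is safe here: "CG" cannot overlap with itself.
    return 100.0 * seq.count("CG") / len(seq)


def windowed_density(seq: str, window: int = 100, step: int = 100):
    """Yield (start, density) for consecutive windows along `seq`."""
    for start in range(0, max(len(seq) - window + 1, 1), step):
        yield start, cpg_density_per_100bp(seq[start:start + window])


# Hypothetical toy sequences (assumptions for illustration only).
island_like = "CG" * 10 + "AT" * 40   # 100 bp containing 10 CpG doublets
bulk_like = "CG" + "AT" * 49          # 100 bp containing 1 CpG doublet

for start, density in windowed_density(island_like + bulk_like):
    print(start, density)
```

Scanning a real locus with a window of this kind would show an island as a run of windows with density near or above 10.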
22 CpG islands. Diagram showing the structure of three human CpG island genes of different sizes. Vertical lines show the positions of CpGs in the first 10 kb of (a) the desmin (EMBL hsdes01), (b) hypoxanthine phosphoribosyl transferase (HPRT; EMBL hshprt8a) and (c) retinoblastoma (EMBL L11910) genes. The locations of the exons are shown by boxes. Open and tinted portions denote translated and untranslated regions, respectively. Any exons not present in the first 10 kb of genomic DNA are shown fused together to the right. The total genomic length of each gene (in kb) is given in brackets.
23 Maintenance methylation. In maintenance methylation, the methylation pattern on a parental DNA strand induces the corresponding methylation pattern on the complementary strand. In this way a stable methylation pattern can be maintained in a cell lineage.
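The copying step can be sketched in a few lines. Because CpG is symmetric (the complementary strand also reads CpG), a methylated C at position i on the parental strand has a partner C opposite the G at position i + 1. The function name and coordinate convention below are assumptions made for this illustration.

```python
def maintenance_methylate(top_strand: str, parental_methylated: set[int]) -> set[int]:
    """Return positions (in top-strand coordinates) of the cytosines that get
    methylated on the newly synthesized complementary strand.

    `parental_methylated` holds 0-based positions of methylated C on the
    parental (top) strand.
    """
    top = top_strand.upper()
    new_strand_methylated = set()
    for i in parental_methylated:
        # Only a C followed by G forms the symmetric CpG site.
        if top[i:i + 2] == "CG":
            # The partner C on the new strand pairs with the G at i + 1.
            new_strand_methylated.add(i + 1)
    return new_strand_methylated


# Hypothetical toy sequence with CpG sites at positions 2 and 6.
top = "ATCGTTCGAC"
print(maintenance_methylate(top, {2, 6}))
```

Unmethylated CpG sites on the parental strand stay unmethylated on the new strand, which is what keeps the pattern stable across divisions.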
24 CpG: underrepresented in the genome. The CpG doublet occurs in vertebrate DNA at only ~20% of the frequency that would be expected from the proportion of G·C base pairs. (This is because CpG doublets are methylated on C, and spontaneous deamination of methyl-C converts it to T, introducing a mutation that removes the doublet.) In certain regions, however, the density of CpG doublets reaches the predicted value; in fact, it is increased by 10× relative to the rest of the genome. The CpG doublets in these regions are unmethylated.
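The "observed versus expected" comparison can be made concrete. A minimal sketch follows, assuming the common convention that the expected number of CpG doublets is (number of C × number of G) / sequence length; the slide does not state which formula was used, and the toy sequence is hypothetical.

```python
def cpg_observed_over_expected(seq: str) -> float:
    """Ratio of observed CpG doublets to the number expected from base composition."""
    seq = seq.upper()
    n = len(seq)
    c, g = seq.count("C"), seq.count("G")
    if c == 0 or g == 0:
        return 0.0
    observed = seq.count("CG")
    expected = c * g / n   # assumed convention, see lead-in
    return observed / expected


# Hypothetical 100-bp sequence: 20 C, 20 G, but only one CpG doublet.
depleted = "CG" + "CA" * 19 + "GT" * 19 + "AT" * 11
print(cpg_observed_over_expected(depleted))
```

A ratio well below 1 indicates CpG depletion of the kind described for bulk vertebrate DNA; a ratio near 1 is what the slide describes for CpG islands.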
28 Classes of interspersed repeats in the human genome
29 Elements in the human genome that can transpose via an RNA intermediate. RNA-mediated transposable elements in the human genome. Each contains the characteristic flanking direct repeats (arrows). The human endogenous retrovirus contains long terminal repeats (LTRs) (speckled region), gag (group-specific antigen gene), pol (polymerase gene) and env (envelope gene). The THE-1 retrotransposon consists of an open reading frame (ORF) and LTRs. The non-LTR retrotransposon (LINE) contains internal RNA polymerase II promoter sequences (P), two open reading frames, and an A-tail. The Alu element has a dimeric structure of homologous halves separated by a middle A-rich region (striped). The left half contains A- and B-box RNA polymerase III promoter sequences, and the right half contains an additional internal 31 bp. Other shaded regions are sequences unique to the element.
30 SINEs and inference of phylogenetic relationships. A SINE is either present or absent. SINEs insert randomly in non-coding regions. The same position in two species indicates that the insertion took place in a common ancestor. Insertion of a SINE is irreversible; absence is therefore the ancestral state.
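The logic of this slide can be sketched as a compatibility check. If each insertion arose once in a common ancestor and is never lost, the set of species carrying an insertion at a given locus forms a clade, so any two such sets must be either nested or disjoint. The locus names, species labels, and function names below are hypothetical.

```python
from itertools import combinations


def compatible(a: set[str], b: set[str]) -> bool:
    """Two carrier sets fit on one tree if they are nested or disjoint."""
    return a <= b or b <= a or a.isdisjoint(b)


def conflicting_loci(presence: dict[str, set[str]]) -> list[tuple[str, str]]:
    """Return pairs of loci whose carrier sets cannot both be clades."""
    return [
        (x, y)
        for x, y in combinations(sorted(presence), 2)
        if not compatible(presence[x], presence[y])
    ]


# Hypothetical presence data: locus -> species carrying the SINE.
presence = {
    "locus1": {"A", "B"},
    "locus2": {"A", "B", "C"},
    "locus3": {"A", "B", "C", "D"},
}
print(conflicting_loci(presence))   # nested sets, so no conflicts
```

With these toy data the nested sets imply the grouping ((A, B), C), D. A locus carried by only A and C would conflict with locus1 and would need another explanation, such as incomplete lineage sorting or a scoring error.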
31 Alu elements. Length: ~300 bp. Repetitive: >1,000,000 copies in the human genome. Constitute >10% of the human genome. Found mostly in intergenic regions and introns. Propagate in the genome through retroposition (RNA intermediates).
33 Alu elements can be divided into subfamilies. The subfamilies are distinguished by ~16 diagnostic positions.
34 Sequence alignment of Alu families. 14 Alu families in humans, of which 1 is not found in other primates. Alu insertions specific to humans. J, S, Y. Alignment of Alu-subfamily consensus sequences. The consensus sequence for the Alu Sx subfamily is shown at the top, with the sequences of progressively younger Alu subfamilies underneath. The dots represent the same nucleotides as the consensus sequence. Deletions are shown as dashes, and mutations are shown in coloured boxes; all are colour-coded according to the family in which the ancestral mutation arose. Each of the newer subfamilies, such as Ya5 or Yb8, has all the mutations of the ancestral Alu elements, as well as five or eight extra mutations, respectively, that are diagnostic for the particular Alu subfamily. This figure primarily illustrates the newer subfamilies and does not attempt to show many of the older Alu subfamilies.
36 Transposition of a typical human Alu element. The structure of each Alu element is bipartite, with the 3′ half containing an additional 31-bp insertion (not shown) relative to the 5′ half. The total length of each Alu sequence is ~300 bp, depending on the length of the 3′ oligo(dA)-rich tail. The elements also contain a central A-rich region and are flanked by short intact direct repeats that are derived from the site of insertion (black arrows). The 5′ half of each sequence contains an RNA-polymerase-III promoter (A and B boxes). The 3′ terminus of the Alu element almost always consists of a run of As that is only occasionally interspersed with other bases (a). Alu elements increase in number by retrotransposition, a process that involves reverse transcription of an Alu-derived RNA polymerase III transcript. As the Alu element does not code for an RNA-polymerase-III termination signal, its transcript extends into the flanking unique sequence (b). The typical RNA-polymerase-III terminator signal is a run of four or more Ts on the sense strand, which results in three Us at the 3′ terminus of most transcripts. It has been proposed that the run of As at the 3′ end of the Alu might anneal directly at the site of integration in the genome for target-primed reverse transcription (mauve arrow indicates reverse transcription) (c). It seems likely that the first nick at the site of insertion is often made by the L1 endonuclease at the TTAAAA consensus site. The mechanism for making the second-site nick on the other strand and integrating the other end of the Alu element remains unclear. A new set of direct repeats (red arrows) is created during the insertion of the new Alu element (d).
42 Signals of splicing. [Figure: exons 1 and 2 flanking an intron, with the donor site, branch point (A), pyrimidine tract and acceptor site marked; the splicing intermediate shows the branch-point A-OH and the lariat.]
43 Because mRNAs and Alus are frequently reverse transcribed and incorporated into the genome, pyrimidine tracts are ubiquitous. The complementary strand of poly(A) is poly(T), which is a pyrimidine tract.
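The point about the complementary strand can be checked with a reverse-complement helper. This is a minimal sketch; the function names and the minimum tract length of 6 are assumptions chosen for illustration.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the sequence of the opposite strand, read 5' to 3'."""
    return seq.translate(_COMPLEMENT)[::-1]


def is_pyrimidine_tract(seq: str, min_len: int = 6) -> bool:
    """True if `seq` is at least `min_len` long and contains only C and T."""
    s = seq.upper()
    return len(s) >= min_len and set(s) <= {"C", "T"}


poly_a_tail = "A" * 20                      # hypothetical inserted poly(A) run
minus_strand = reverse_complement(poly_a_tail)
print(minus_strand, is_pyrimidine_tract(minus_strand))
```

Every retroposed poly(A) tail therefore places a T-rich run on the opposite strand, which is one of the signals an acceptor site needs.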
44 The minus strand of Alu elements contains “near” splice sites. The minus strand of Alu contains ~3 sites that resemble the acceptor recognition site. Consensus acceptor site: YYYYYYNCAG/R. Alu-J ( ): TTTTTTGtAG/A. The minus strand of Alu contains ~9 sites that resemble the consensus donor site. Consensus donor site: CAG/GTRAGT. Alu-J (25-17): CAG/GTGtGA.
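How close a candidate site is to a consensus can be measured by counting positions that violate the IUPAC codes (Y = C or T, R = A or G, N = any base). The sketch below applies this to the two Alu-J sites quoted on the slide; the function name is hypothetical, and the "/" splice-junction marker is simply stripped before comparison.

```python
# IUPAC codes used in the slide's consensus sequences.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "R": "AG", "Y": "CT", "N": "ACGT"}


def mismatches(consensus: str, site: str) -> list[int]:
    """Return 0-based positions where `site` violates `consensus`."""
    consensus = consensus.replace("/", "").upper()
    site = site.replace("/", "").upper()
    if len(consensus) != len(site):
        raise ValueError("consensus and site must have the same length")
    return [i for i, (c, s) in enumerate(zip(consensus, site)) if s not in IUPAC[c]]


print(mismatches("YYYYYYNCAG/R", "TTTTTTGtAG/A"))   # acceptor-like site
print(mismatches("CAG/GTRAGT", "CAG/GTGtGA"))       # donor-like site
```

A small mismatch count is what makes such a site "near": a single point mutation at a reported position could turn it into a functional splice site.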
45 All Alu-containing exons are alternatively spliced. Our findings: out of 1,182 alternatively spliced cassette exons, 62 have a significant hit to an Alu sequence. Out of 4,151 constitutively spliced exons, none has a significant hit to an Alu sequence.
46 Retention ratio. Retention ratio = number of mRNA molecules containing the alternatively spliced exon divided by the total number of mRNA molecules. The retention ratio for Alu-containing exons was ~10%. The retention ratio for alternatively spliced exons that do not contain Alu was ~45%.
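The counts on this slide allow a quick back-of-the-envelope check. The sketch below computes the fraction of alternatively spliced exons with an Alu hit and, as an illustration only, the hypergeometric probability that all 62 Alu-containing exons would land in the alternative class if Alu hits were spread at random over all 5,333 exons. The slide does not report a statistical test; this calculation and the function name are assumptions added here.

```python
from fractions import Fraction
from math import comb

# Counts taken from the slide.
alt_total, alt_alu = 1182, 62
const_total, const_alu = 4151, 0

alu_fraction_alt = alt_alu / alt_total


def prob_all_in_alternative(alt_total: int, const_total: int, alu_exons: int) -> float:
    """P(all Alu exons fall among the alternative exons) under random placement."""
    total = alt_total + const_total
    return float(Fraction(comb(alt_total, alu_exons), comb(total, alu_exons)))


p = prob_all_in_alternative(alt_total, const_total, alt_alu + const_alu)
print(f"{alu_fraction_alt:.1%} of alternative exons have an Alu hit")
print(f"probability under random placement: {p:.2e}")
```

Exact integer arithmetic via `math.comb` and `Fraction` avoids overflow in the binomial coefficients before the final conversion to a float.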
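The definition translates directly into code. In this minimal sketch the function name and the transcript counts are hypothetical; only the ~10% and ~45% figures come from the slide.

```python
def retention_ratio(with_exon: int, total: int) -> float:
    """Fraction of mRNA molecules that contain the alternatively spliced exon."""
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 <= with_exon <= total:
        raise ValueError("with_exon must be between 0 and total")
    return with_exon / total


# Hypothetical transcript counts chosen to reproduce the slide's ratios.
print(retention_ratio(10, 100))   # an Alu-containing exon
print(retention_ratio(45, 100))   # an exon without Alu
```

In practice the counts would come from transcript evidence for a given gene, for example ESTs or cDNAs that either include or skip the exon.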