http://www.ncbi.nlm.nih.gov/books/NBK21745/
Once transcription start sites in eukaryotic DNA had been identified, analysis of the DNA sequences controlling initiation of transcription could begin. In this section, we take a closer look at various elements in the transcription-control regions that regulate transcription of eukaryotic protein-coding genes.
Transcription of genes with promoters containing a TATA box or initiator element begins at a well-defined initiation site. However, transcription of many protein-coding genes has been shown to begin at any one of multiple possible sites over an extended region, often 20 – 200 base pairs in length. As a result, such genes give rise to mRNAs with multiple alternative 5′ ends. These genes, which generally are transcribed at low rates (e.g., genes encoding the enzymes of intermediary metabolism, often called “housekeeping genes”), do not contain a TATA box or an initiator. Most genes of this type contain a CG-rich stretch of 20 – 50 nucleotides within ≈100 base pairs upstream of the start-site region. As we discuss later, a transcription factor called SP1 recognizes these CG-rich sequences. The dinucleotide CG is statistically underrepresented in vertebrate DNAs, and the presence of CG-rich regions just upstream from start sites is a distinctly nonrandom distribution. Such CpG islands, as they often are called, can be identified by their susceptibility to restriction enzymes (e.g., HpaII) that have CG in their recognition sequences. The presence of a CpG island in a newly cloned DNA fragment suggests that it may contain a transcription-initiation region.
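As a rough illustration of what "CG-rich" and "underrepresented" mean in practice, the sketch below computes the GC fraction of a sequence, the ratio of observed to expected CG dinucleotides, and the number of HpaII sites (CCGG). The function names are hypothetical, and the thresholds in `looks_like_cpg_island` (length of at least 200 bp, GC fraction above 0.5, observed/expected CG above 0.6) are a commonly used heuristic assumed here for illustration; they do not come from the quoted text.

```python
def gc_fraction(seq: str) -> float:
    """Fraction of bases that are G or C."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def cpg_obs_exp(seq: str) -> float:
    """Observed CG dinucleotides divided by the number expected
    from the C and G base counts alone: (CG * N) / (C * G)."""
    seq = seq.upper()
    c, g = seq.count("C"), seq.count("G")
    if c == 0 or g == 0:
        return 0.0
    return seq.count("CG") * len(seq) / (c * g)


def count_hpaii_sites(seq: str) -> int:
    """Number of HpaII recognition sites (CCGG) in the sequence.
    CCGG cannot overlap itself, so str.count is sufficient."""
    return seq.upper().count("CCGG")


def looks_like_cpg_island(seq: str,
                          min_len: int = 200,
                          min_gc: float = 0.5,
                          min_ratio: float = 0.6) -> bool:
    """Heuristic check; the default thresholds are assumptions."""
    return (len(seq) >= min_len
            and gc_fraction(seq) > min_gc
            and cpg_obs_exp(seq) > min_ratio)
```

Calling `looks_like_cpg_island` on a cloned fragment, or on a window slid along it, gives a quick first pass; a high `count_hpaii_sites` value in a short region is the computational analogue of the HpaII susceptibility described above.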
http://www.ncbi.nlm.nih.gov/pubmed/17123746
Prevalence of the initiator over the TATA box in human and yeast genes and identification of DNA motifs enriched in human TATA-less core promoters.
Yang C, Bolotin E, Jiang T, Sladek FM, Martinez E.
76% of human core promoters lack TATA-like elements, have a high GC content, and are enriched in Sp1-binding sites.