Why don't transcription factors get lost?

31 May 2013

Barak Cohen
Center for Genome Sciences and Systems Biology
Department of Genetics
Washington University School of Medicine


Large genomes are packed with millions of copies of the short, degenerate sequence motifs recognized by transcription factors (TFs). TFs must distinguish their true target binding sites from a vast genomic excess of potential binding sites. Detailed maps of TF-bound genomic regions are emerging from consortium-driven efforts such as ENCODE, yet the DNA sequence features that distinguish functional sites from non-functional sites are still poorly understood.

We used CRE-seq [1], a high-throughput enhancer assay conducted in living mouse retinas, to measure, on plasmids, the cis-regulatory activity of 1,300 short (84 bp) genomic sequences centered on ChIP-seq peaks for the transcription factor Crx [2]. We found that DNA sequences from ChIP-seq peaks activate transcription at higher levels than random DNA sequences, and that this activity depends on intact Crx binding sites. Surprisingly, unbound genomic regions with equivalent numbers of Crx motifs do not activate transcription more than random DNA, even in a permissive plasmid environment, demonstrating that unbound Crx motifs are intrinsically non-functional. Random DNA sequences produce a range of cis-regulatory effects, which underscores the importance of including distributions of control sequences in large-scale functional assays, and not simply relying on a small set of "representative" controls. Bound Crx motifs are distinguished from unbound Crx motifs by sequence features associated with high GC content, such as nucleosome positioning sequences and DNA minor groove width.

Our work suggests that most occurrences of Crx motifs in the genome are intrinsically non-functional and not merely inaccessible within repressive chromatin, and that chromatin marks associated with active transcription are a consequence and not a cause of functional sites.

