Exploring molecular mechanisms of RNA-mediated gene regulation
Bacteria and archaea possess adaptive immunity against foreign genetic elements using CRISPR–Cas systems. Upon infection, new foreign DNA sequences are captured and integrated into the host CRISPR locus as new spacers. The CRISPR locus is transcribed and processed to generate mature CRISPR RNAs (crRNAs), each encoding a unique spacer sequence. Each crRNA associates with Cas effector proteins that use crRNAs as guides to silence foreign genetic elements that match the crRNA sequence. We are interested in elucidating the mechanisms underlying CRISPR–Cas immunity, especially in understanding the functions of Cas proteins and developing novel CRISPR-based tools for biotechnology applications.
CRISPR–Cas systems are highly diverse and have been divided into many subtypes. The Type I-E system in E. coli uses the Cascade complex for RNA-guided silencing of foreign DNA. Cascade is composed of Cse1, Cse2, Cas7, Cas5e, and Cas6e subunits and one crRNA, forming a structure that binds and unwinds dsDNA to form an R-loop. Recently, we used single-particle electron microscopy reconstructions of dsDNA-bound Cascade with and without Cas3 to reveal that Cascade positions the PAM-proximal end of the DNA duplex at the Cse1 subunit and near the site of Cas3 association. The finding that the DNA target and Cas3 colocalize with Cse1 implicates this subunit in a key target-validation step during DNA interference. We show biochemically that base pairing of the PAM region is unnecessary for target binding but critical for Cas3-mediated degradation. Together, these data show that the Cse1 subunit of Cascade functions as an essential partner of Cas3 by recognizing DNA target sites and positioning Cas3 adjacent to the PAM to ensure cleavage (in collaboration with Eva Nogales, UC Berkeley, HHMI).
Type II CRISPR–Cas systems use an RNA-guided DNA endonuclease, Cas9, to generate double-strand breaks in invasive DNA during an adaptive bacterial immune response. Cas9-mediated cleavage is strictly dependent on the presence of a protospacer adjacent motif (PAM) in the target DNA. The ability to program Cas9 for DNA cleavage at specific sites defined by guide RNAs has led to its adoption as a versatile platform for genome engineering and gene regulation. To understand how Cas9 uses its guide RNA for interrogation of target DNA sequences, we have solved molecular structures of Cas9 in the apo, guide RNA-bound, and target DNA-bound states. Crystal structures of Cas9 bound to single-guide RNA reveal a conformation distinct from both the apo and DNA-bound states, in which the 10-nucleotide RNA “seed” sequence required for initial DNA interrogation is preordered in an A-form conformation. This segment of the guide RNA is essential for Cas9 to form a DNA recognition–competent structure that is poised to engage double-stranded DNA target sequences. We construe this as convergent evolution of a “seed” mechanism reminiscent of that used by Argonaute proteins during RNA interference in eukaryotes.
CRISPR RNA-guided surveillance complexes target foreign DNA for degradation through RNA–DNA base-pairing and recognition of a unique sequence adjacent to the target DNA called the protospacer adjacent motif (PAM). Addressing how the DNA is unwound during this binding event, and how short 20–30 base-pair target sequences are efficiently located and recognized within entire genomes, has been a recent focus of our research. In collaboration with Eric Greene’s laboratory at Columbia University, we have applied a combination of single-molecule and bulk biochemical experiments to resolve the mechanism of DNA interrogation for two phylogenetically unrelated complexes: Cas9, the DNA-targeting protein found in Type II CRISPR–Cas systems (S. pyogenes), and Cascade, the DNA-targeting complex found in Type I-E CRISPR–Cas systems (E. coli). Our results have revealed that the target search is PAM-guided, and that these distinct RNA-guided complexes have converged on a common mechanism for target DNA recognition.
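The idea of a PAM-guided search can be illustrated with a small sketch: candidate sites are checked for a PAM first, and the protospacer is compared with the guide only where a PAM is present. This is a toy model, not the analysis used in the work described above. It assumes the 5'-NGG-3' PAM commonly associated with S. pyogenes Cas9 (not stated in the text above), a 20-nucleotide spacer written as DNA, and exact matching with no mismatch tolerance. The function names `reverse_complement` and `pam_guided_hits` are hypothetical.

```python
# Toy model of a PAM-first target search (illustrative assumptions only).
# Assumed here: a 5'-NGG-3' PAM immediately 3' of the protospacer,
# and exact spacer/protospacer identity.

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def pam_guided_hits(dna: str, spacer: str) -> list[tuple[str, int]]:
    """Find protospacers identical to `spacer` that are followed by NGG.

    Both strands are scanned. Each hit is (strand, start), where start is
    the 0-based protospacer start in the coordinates of the scanned strand
    (for "-", that means coordinates of the reverse complement).
    """
    hits = []
    n = len(spacer)
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        # i is the position of the "N" of a candidate NGG PAM.
        for i in range(n, len(seq) - 2):
            # Step 1: PAM check. Sites without a PAM are skipped outright.
            if seq[i + 1 : i + 3] != "GG":
                continue
            # Step 2: only at PAM sites, compare the adjacent protospacer.
            if seq[i - n : i] == spacer:
                hits.append((strand, i - n))
    return hits


if __name__ == "__main__":
    spacer = "GATTACAGATTACAGATTAC"
    # First copy is followed by AGG (a PAM); second copy is followed by ATA.
    dna = "TT" + spacer + "AGG" + "CC" + spacer + "ATA"
    print(pam_guided_hits(dna, spacer))
```

In this sketch the second copy of the spacer sequence is never compared against the guide because it lacks an adjacent PAM, which is the sense in which the search is PAM-guided.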
Whereas Type I and Type II CRISPR–Cas surveillance complexes target double-stranded DNA, Type III complexes can target single-stranded RNA and DNA. In collaboration with John van der Oost’s laboratory, we are studying the structure and function of Type III complexes. Near-atomic resolution cryo–electron microscopy reconstructions of native Type III Cmr (CRISPR RAMP module) complexes in the absence and presence of target RNA reveal a helical protein arrangement that positions the crRNA for substrate binding. Thumblike β hairpins intercalate between segments of duplexed crRNA:target RNA to facilitate cleavage of the target at 6-nucleotide intervals. The Cmr complex is architecturally similar to the Type I CRISPR-Cascade complex, suggesting divergent evolution of these immune systems from a common ancestor. This work was done in collaboration with Eva Nogales, UC Berkeley, HHMI.
CRISPR-harboring organisms generate immunological memory of previous infections by capturing short segments of foreign DNA for integration into CRISPR loci as spacer sequences. Central to this process are Cas1 and Cas2 – the only conserved proteins in all CRISPR systems. Using a combination of biochemical, structural, and genetic approaches, we found that Cas1 and Cas2 function as a protein complex. The Cas1–Cas2 complex captures ~30 bp of foreign DNA and integrates it into the CRISPR locus via a direct nucleophilic reaction similar to that of many retroviral integrases and transposases. Our results uncover the structural basis for foreign DNA capture and the mechanism by which Cas1–Cas2 functions as a molecular ruler to dictate the sequence architecture of CRISPR loci.