RNAs store and transfer information among constituents of the cell. From biogenesis through processing, transport, translation, catalysis, and decay, many cellular factors act on RNAs to achieve tight regulation. Since its development, high-throughput DNA sequencing has become an essential tool for scrutinizing RNA molecules in the cell at unprecedented scale and depth. This thesis concerns methodological advances in two aspects of RNA regulation. First, I develop a novel method to survey the global status of polyadenylation that takes a fundamentally different approach from existing techniques. Despite its importance in gene regulation, global investigation of the 3′ extremity of mRNA has not been feasible because of technical challenges associated with homopolymeric sequences and the relative paucity of mRNA. The new technique, named TAIL-seq, allows poly(A) tail length to be measured at the genomic scale for the first time. I also discover widespread uridylation and guanylation downstream of the poly(A) tail. The U-tails are generally attached to short poly(A) tails (<25 nt), while the G-tails are found mainly on longer poly(A) tails (>40 nt), implicating both modifications in general roles in mRNA stability control. Furthermore, TAIL-seq identifies, at single-nucleotide resolution, numerous nucleolytic events involved in microRNA processing and mRNA cleavage. TAIL-seq will enable exploration of the unforeseen diversity of RNA processing and modification.
Second, I describe an array of new analytic methods for crosslinking, immunoprecipitation, and sequencing (CLIP-seq) that enhance its utility in the investigation of RNA-protein interactions. CLIP-seq has emerged in the last few years as one of the standard techniques for retrieving transcriptome-wide information on RNA-protein interactions. However, unlike other RNA-seq applications, it has lacked generalized analysis techniques and tools. In this study, I generalize the analytic workflow for binding site identification by developing new methods. I also provide an open source toolchain that covers most of the common analyses performed for CLIP-seq. In addition, I present ecliptic, a fully automated pipeline that will speed up research on RNA-protein interactions and make more information accessible to researchers.
High-throughput experiments are expanding biology by providing an unbiased view and leading to unexpected observations. In this thesis, I introduce two developments: a method for global investigation of poly(A) tails and a single-nucleotide-resolution survey of RNA-protein interactions. By applying these methods, I discover several phenomena at the 3′ end of RNAs and at the binding interfaces between RNAs and RNA-binding proteins (RBPs). Further development and improvement will offer ample opportunity for the discovery of unforeseen regulatory pathways.