We study many-body problems in biology at the single-molecule level. Methods developed in single-molecule biophysics make it possible to study the action of one molecule in great mechanistic detail, e.g. one RNA polymerase transcribing DNA. Inside a cell, however, many proteins work together at the same time, often creating many-body problems. We want to know whether proteins show “collective behaviors”, that is, new characteristics that are absent when a protein works alone. Is the whole greater than the sum of its parts? How? And why?

I. DNA supercoiling mediates long-distance interactions between RNA polymerases on a DNA template, affecting their speed and processivity

Transcription, the first step of gene expression, is carried out by RNA polymerase and is inseparable from a mechanical property of DNA called supercoiling. During transcription, the translocation of an RNA polymerase produces DNA supercoils (twin-supercoiled-domain model [1]), and DNA supercoiling in turn inhibits the translocation of a single RNA polymerase [2]. However, it remains untested how DNA supercoiling affects transcription when there are multiple RNA polymerases on a DNA template. Our work in E. coli suggested a built-in mechanism by which transcription-induced DNA supercoiling mediates collaborative or antagonistic “group dynamics” of RNA polymerases [3]. This is an amazing example of long-distance interactions between molecular machines and of the ability of DNA to transmit molecular information over long distances, like a wire transmitting electricity! We will investigate the biophysical mechanism of RNA polymerase group dynamics in vitro and in vivo by developing new single-molecule assays. Our study will help establish a quantitative and predictive description of long-range interactions between RNA polymerases in the genome.

II. Collisions between RNA polymerases during transcription elongation are important for the temporal profile of gene expression and cell-to-cell variations in protein abundance


As shown by this famous electron micrograph, multiple RNA polymerases (RNAPs) transcribe a gene at the same time inside cells, indicating the presence of RNAP “traffic”, similar to that of vehicles. When a car stops on a single-lane road, the trailing cars pile up, creating a traffic jam. However, it remains poorly understood what happens when one RNAP pauses while multiple RNAPs are traveling on the same DNA template. There are three scenarios, which are not mutually exclusive: (1) hard-sphere interactions between RNAPs, similar to cars piling up behind a stopped car, (2) trailing RNAPs pushing the paused RNAP forward upon collision [4], and (3) dissociation of RNAPs (premature termination) upon collision. My stochastic model of the first scenario [5] predicts that RNAPs pile up behind a paused RNAP and become separated from one another downstream of the pause site (see Fig. A-B, and Fig. C for the analogy with highway traffic).

Hence, the pause-induced RNAP traffic pattern can override bursty promoter activity and result in nonbursty mRNA and protein production. This means that RNAP traffic during elongation can be a critical factor modulating gene expression output, specifically the average level, variability (noise), and timing of protein production in vivo. We aim to investigate how two or more RNA polymerases interact with each other during transcription elongation. This will help us understand the origin of cell-to-cell variations in protein expression and of phenotypic variations in cell populations.
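
The hard-sphere pile-up scenario (1) can be illustrated with a toy lattice simulation. The sketch below is not the stochastic model of [5]; it is a minimal exclusion-process example written for illustration, and every name and parameter value in it (`simulate_rnap_traffic`, `p_init`, `p_step`, `p_escape`, the lattice length, a single permanent slow site standing in for a pause) is an assumption rather than a measured quantity. Each RNAP occupies one lattice site and can step forward only if the next site is empty.

```python
import random


def simulate_rnap_traffic(length=100, pause_site=50, p_init=0.3, p_step=0.9,
                          p_escape=0.05, n_steps=5000, burn_in=1000, seed=0):
    """Toy hard-sphere (exclusion) model of RNAP traffic with one pause site.

    All parameter values are illustrative assumptions, not fitted rates.
    Returns (density, completed): the time-averaged occupancy of each site
    after burn-in, and the number of RNAPs that reached the end of the gene.
    """
    rng = random.Random(seed)
    occupied = [False] * length
    occupancy_sum = [0] * length
    completed = 0

    for t in range(n_steps):
        # Update from the most downstream site backwards, so that each RNAP
        # attempts at most one step per time step.
        for i in range(length - 1, -1, -1):
            if not occupied[i]:
                continue
            # An RNAP sitting on the pause site escapes slowly.
            p_move = p_escape if i == pause_site else p_step
            if rng.random() >= p_move:
                continue
            if i == length - 1:
                occupied[i] = False      # termination: RNAP leaves the template
                completed += 1
            elif not occupied[i + 1]:    # hard-sphere exclusion: no overlap
                occupied[i + 1] = True
                occupied[i] = False

        # Initiation: load a new RNAP only if the promoter site is free.
        if not occupied[0] and rng.random() < p_init:
            occupied[0] = True

        if t >= burn_in:
            for i in range(length):
                occupancy_sum[i] += occupied[i]

    n_samples = n_steps - burn_in
    density = [s / n_samples for s in occupancy_sum]
    return density, completed


def mean(values):
    return sum(values) / len(values)


if __name__ == "__main__":
    density, completed = simulate_rnap_traffic()
    print("mean occupancy upstream of pause:  ", round(mean(density[:50]), 3))
    print("mean occupancy downstream of pause:", round(mean(density[51:]), 3))
    print("RNAPs that finished the gene:", completed)
```

With these assumed values, initiation is faster than escape from the pause site, so the time-averaged occupancy should be high upstream of the pause (the pile-up) and low downstream of it (RNAPs leave the pause one at a time and travel well separated). Changing `p_escape` relative to `p_init` is a simple way to explore when the pause, rather than the promoter, sets the output.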

III. Gene expression machines work together inside cells

Gene expression is a critical molecular process for life. RNA polymerase transcribes DNA into mRNA, which serves as a template for the ribosome, the protein production machinery. mRNA has a finite lifetime because it is degraded by ribonucleases, so protein synthesis depends on the fine balance between transcription and mRNA degradation. Accurate regulation of gene expression is essential for most aspects of cellular life, and misregulation of gene expression leads to cellular malfunction and disease. We study how the functions of RNA polymerase, ribosome, and ribonuclease are orchestrated inside cells. We hypothesize that dynamic and transient interactions between these gene expression machines have regulatory consequences for gene expression. However, it remains technically challenging to probe dynamic and transient interactions between molecules. Hence, we are developing new methodologies to follow the dynamics of individual molecules in space and time as they function and, most critically, to probe interactions between molecules, which can be transient and can act over short or long distances. We anticipate that this approach will bridge the gap between biochemistry, cell biology, and genomics and allow us to address mechanistic questions relevant to cell physiology and potentially medicine.