The common theme of my research is the application of methods from Statistical Physics to problems of biological relevance. All my projects are carried out in close collaboration with experimental groups, often involving interdisciplinary teams of researchers. Given the complexity of biological systems, all projects involve computation at some level, but the focus of the research is the development of quantitative, computable models rather than the computation in and of itself.
One focus area is the detailed modeling of the biophysical properties of nucleic acid molecules. The nucleic acids DNA and RNA are fundamental players in all living cells. A lot can be learned about their function and their interactions with proteins or other nucleic acid molecules through biophysical experiments such as fluorescence tagging, Förster Resonance Energy Transfer (FRET), force-extension measurements using optical tweezers, magnetic tweezers, or atomic force microscopy (AFM), or nanopore experiments. In all these experiments, the experimental observables (e.g., fluorescence intensities, transfer efficiencies, force-extension curves, or translocation time distributions) are only indirect probes of the real quantities of interest, namely the microscopic mechanisms underlying a specific biological phenomenon. These experiments therefore require theoretical models that link microscopic mechanisms to biophysical observables and so allow mechanistic insight to be extracted from the measurements. The development of these models is at the heart of our research. Recent examples of this work include a model for the sequence-dependent flexibility of DNA molecules, a quantitative model for the unwrapping of DNA from nucleosomes, and a model for the interplay between RNA secondary structure and single-stranded binding proteins.
The other focus area is the analysis of biological sequences. Biological sequences are produced in an astonishing volume, which grows exponentially at a rate faster than Moore's law (the roughly two-year doubling of transistor counts that has driven the growth of computing power). Over the last few years, high-throughput sequencing technologies have revolutionized the way biological systems can be interrogated, putting the power to sequence the equivalent of entire human genomes into the hands of individual researchers. Methods from Statistical Physics are exquisitely suited to extracting biological information from these large quantitative data sets. One major area of interest in our sequence analysis work is RNA editing. When a protein is to be produced in an organism, the process starts by copying the genomic DNA that encodes the protein into an RNA molecule. Normally, the sequence of this RNA molecule is identical to the sequence of the underlying DNA, but sometimes the RNA sequence is changed by the insertion, deletion, or substitution of one or a few nucleotides. This phenomenon is called RNA editing. It occurs in many different organisms and seems to have several different underlying mechanisms. In many cases it is not known how the organism determines where the sequence changes should occur or how they are performed. We develop computational predictions of these editing events as well as approaches to extract their frequency and location, along with mechanistic information, from high-throughput sequencing experiments. Beyond our research in RNA editing, we are involved in many collaborations that use high-throughput sequencing technology for tasks as varied as the assembly of viral and bacterial genomes, RNA processing, the identification of cancer markers, and forensic applications.