BIGLab has been exploring the hidden wonders of the genome “outback” (a.k.a. the non-coding genome) through both computational and experimental investigations. Ultimately, my lab aims to answer the following fundamental questions in biology.

1) What are the biological roles of the non-coding genome?

2) How complex are our genome and transcriptome?

3) Where do the regulatory non-coding elements reside in metazoan genomes, and how do these elements regulate gene expression?

4) How have the coding and non-coding functions of RNAs evolved?

If DNA is Hope, then RNA is Dream.


Armed with a strong background in experimental biology, computational biology, and machine learning, I have taken full advantage of high-throughput sequencing technologies to tackle the questions about gene expression stated above. I am primarily interested in identifying novel regulatory non-coding elements that control gene expression, such as microRNAs (miRNAs) and long non-coding RNAs, and in understanding their mechanisms of action. To achieve these goals, I first reconstructed comprehensive, high-confidence coding and non-coding transcriptome maps in metazoans ranging from C. elegans to human. I then began collaborating with geneticists to identify functionally important segments in non-coding genes using a genome-editing approach, and with clinicians to search for regulatory non-coding elements and genomic variations that contribute to cancer development and progression.

Regulatory ncRNAs

We aim to sequence and analyze whole transcriptomes using orthogonal sequencing methods, including general RNA-seq, deepCAGE, and 3P-seq, to identify regulatory ncRNAs such as miRNAs and lincRNAs in the non-coding genome, and to elucidate their regulatory roles in gene expression and critical developmental processes.

Transcriptome maps

We aim to develop computational methods to comprehensively reconstruct transcriptomes from multimodal RNA-seq data (stranded and unstranded) and orthogonal sequencing data (deepCAGE and 3P-seq). We are also interested in studying the mechanisms of alternative splicing and polyadenylation and their relationship to disease.


Personal Genomics

We aim to develop computational and statistical algorithms and programs for finding and analyzing genetic and epigenetic factors related to personal phenotypes (rare and complex diseases and polymorphisms) from whole-genome and whole-exome sequencing data. We are interested in finding de novo, somatic, and germline variations at the nucleotide and chromosome levels.