Project 1 (DISCOVERING DNA METHYLATION FROM PACBIO LONG-READS)
Whole-genome bisulfite sequencing (WGBS) is considered the "gold standard" for identifying DNA methylation. However, usually fewer than 80% of BS-seq reads (100-150 bp) can be uniquely mapped, leaving many regions unexplored. These unmappable and multi-mapping sequences (e.g. repetitive elements) are of particular interest in epigenetics. Long-read sequencing (LRS) technology provides nearly complete genome coverage and, fortunately, also captures modification information. In this project, I would like to improve the current approach to reading epigenetic modifications directly from the raw sequencing data. By constructing the pan-methylome of Arabidopsis, we will then explore the links between complex genetic variation and DNA methylation.
Project 2 (CONTRIBUTIONS OF GENETIC HETEROGENEITY TO COMPLEX TRAITS)
Genome-wide association studies (GWAS) are now a routine tool for studying complex traits, yet it is widely recognized that mappable loci explain only a tiny fraction of the genetic variation. Although the simplest explanation for this remains polygenicity, genetic heterogeneity may also contribute. I am developing a new pipeline to deal with genetic heterogeneity in plants.