2D multi-scale approaches for analysis of high-throughput sequencing data
Supervisor: Heejung Shim
Available for: MSc and undergraduate research projects.
Location: Melbourne Integrative Genomics, University of Melbourne
Project title: 2D multi-scale approaches for analysis of high-throughput sequencing data
Background: Identifying differences between multiple groups in molecular and cellular phenotypes measured by high-throughput sequencing assays is a common task in genomics. Examples include eQTL mapping using RNA-seq and detecting differences in chromatin accessibility across tissues using DNase-seq or ATAC-seq. These sequencing data provide high-resolution measurements of how traits vary along the whole genome in each sample. We previously developed two multi-scale methods, implemented in the software packages WaveQTL and multiseq, that better exploit this high-resolution information. We are now building on these methods to develop 2D multi-scale methods for the analysis of paired-end reads and Hi-C data.
Proposed projects: The student will develop a software package that implements 2D multi-scale methods. If the student is interested, a second project is to benchmark currently available methods against our approaches using real data and computer simulations.
Learning outcomes: software development, statistics / machine learning, programming in C/C++ (or Python) and R, statistical analysis of complex and large-scale genomic data, data visualization.