Research
Computational Infrastructure
Genomic data processing and analysis are computationally intensive and time-consuming, and they require a computational infrastructure with well-configured hardware and a well-engineered software pipeline. Working closely with the HPCC group of MD Anderson and IBM scientists, we have developed an automated NGS data processing and analysis pipeline that is capable of outputting standardized final results within three days of receiving the raw data. The pipeline has been powering the sequencing efforts of the Moon Shots Program® and the integration of genomic data into the institution's big data platform. Current research includes collaborating with IBM scientists to increase the throughput of the pipeline through hardware and software enhancements and metadata management, and implementing a fast-track genomic profiling system that produces reports in a week starting from genomic materials (in collaboration with faculty from Experimental Therapeutics).
Algorithm and Software Development
While working on the computational infrastructure or with biologists and clinicians on various research projects, we have identified areas where new computational approaches need to be developed. We implement software applications that can be easily deployed to our production pipeline or distributed to users within the MD Anderson community. Several widely used applications have been developed by the lab. We are currently working on algorithms and software for identifying rare genomic variations, analyzing NGS data derived from PDX models, and exploring the use of personalized reference genomes to improve alignment accuracy.
Customized Computational Data Analysis
With an automated production pipeline generating standardized analytical results, our computational scientists are able to concentrate more on interacting with biologists and clinicians to better understand the biological or clinical issues of interest and to tailor the downstream analysis accordingly. We have established long-term collaborations with faculty members in departments throughout the institution and provided customized computational support, with excellent outcomes that are well reflected in our publication list.