Learning mixtures and DNA copy-numbers from bulk sequencing of tumor samples

Speaker: Brian Arnold, Senior Data Scientist, Princeton University.

Abstract: As tumors expand, they evolve via the accumulation of copy-number aberrations (CNAs; amplifications or deletions of DNA) and point mutations, creating a mixture of distinct subclones that contain unique mutations. Quantifying CNAs and the frequency of subclones is critical for understanding how tumors arise and continually evolve, but it is challenging to do so from bulk tumor samples. We have developed an algorithm, HATCHet2 (Holistic Allele-specific Tumor Copy-number Heterogeneity), that uses a variety of machine learning techniques to infer subclone-specific copy-number changes and the presence of whole-genome duplication. HATCHet2 has several features that contribute to its superior performance, including its ability to jointly analyze multiple samples from the same tumor. We show that HATCHet2 identifies subclonal CNAs in prostate cancer samples and detects hypertriploidy and KRAS amplifications in testicular germ cell tumors.

About the speaker: Brian is a senior data scientist in the department of Computer Science at Princeton University, where he collaborates with faculty on a variety of projects involving genomics. Currently, he works with Ben Raphael to develop new methods to study cancer, and he also works with Shane Campbell-Staton to study how human activity shapes the evolution of other species. Brian received his PhD from Harvard University where he studied evolutionary genetics in the department of Organismic and Evolutionary Biology, and he later did postdoctoral studies at the Harvard School of Public Health in the department of Epidemiology. Brian has worked on diverse topics in genomics involving plants, bacteria, elephants, and cancer.
