Computational modelling of the interplay between genetic and epigenetic in tumour evolution

VALERIANI, LUCREZIA
  • doctoral thesis

Abstract
Cancer arises from complex interactions between genetic and epigenetic alterations that accumulate across cell populations over time. Copy number alterations (CNAs), DNA methylation changes, and clonal heterogeneity are fundamental drivers of tumour progression; understanding how these layers interact is key to decoding tumour evolution. This thesis presents a comprehensive computational framework for the integrative analysis of tumour evolution through the joint modelling of genetic, epigenetic, and clonal variation using whole-genome sequencing data. In the first part, we introduce a Bayesian approach for allele-specific copy number inference that leverages the native properties of long-read sequencing, including haplotype phasing and direct DNA methylation detection. The model jointly analyses read depth, B-allele frequency, and variant allele frequency to estimate tumour purity, ploidy, and allele-specific copy number states, while integrating haplotype-resolved methylation profiles. This formulation enables the identification of genomic regions where structural imbalance and methylation asymmetry co-occur, providing new insights into the interplay between genetic and epigenetic regulation. Benchmarking across simulated datasets spanning multiple sequencing depths and purity levels demonstrates accuracy comparable to that of existing short-read methods. Application to colorectal cancer organoids and a preliminary analysis of a 100-patient Genomics England cohort further reveal consistent relationships between CNAs and allele-specific methylation in key regulatory regions, supporting the hypothesis that copy number variation and methylation jointly shape gene regulation during tumour evolution.
The second part of the thesis introduces a population genetics-informed simulation framework designed to generate realistic synthetic tumour genomes and corresponding sequencing data for benchmarking and methodological development. The simulator models tumour evolution as a stochastic branching process in which each subclone acquires somatic point mutations, CNAs, and epigenetic alterations under user-defined evolutionary parameters such as mutation rate, selection strength, and clonal expansion dynamics. This approach captures the genetic heterogeneity observed in real tumours while maintaining explicit control over the ground-truth evolutionary history. The framework includes a read-level sequencing simulator capable of producing whole-genome sequencing data with customizable coverage, error rate, read length, and tumour purity, thereby allowing systematic evaluation of analytical tools under controlled conditions. To ensure reproducibility and accessibility, the simulation platform is complemented by a standardized Nextflow pipeline, nf-core/tumourevo, which integrates modules for variant and driver annotation, copy number quality control, mutational signature analysis, and subclonal reconstruction. Benchmarking experiments demonstrate the framework's capacity to reproduce realistic tumour evolutionary scenarios and to quantify the accuracy and limitations of existing inference methods. Together, these contributions establish a unified framework that connects Bayesian modelling, evolutionary simulation, and long-read sequencing technologies. By jointly analysing copy number, methylation, and clonal structure, this work advances our capacity to interpret tumour evolution and provides a robust methodological foundation for integrating multi-layer molecular data in cancer genomics.
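As an illustration of how read depth and B-allele frequency jointly constrain purity, ploidy, and allele-specific copy number, the sketch below computes the expected signals for a single segment under a textbook tumour/normal mixture model. It is not taken from the thesis: the function name `expected_signals`, its parameters, and the simplifying assumptions (diploid normal cells, one clonal tumour population, no sequencing bias) are all hypothetical and chosen only for illustration.

```python
import math


def expected_signals(purity, ploidy, n_major, n_minor):
    """Expected depth ratio and BAF for one segment (illustrative only).

    Assumptions (not from the thesis): normal cells are diploid with one copy
    of each allele, the tumour is a single clone, and reads are sampled in
    proportion to DNA content.
    """
    if not 0.0 <= purity <= 1.0:
        raise ValueError("purity must be in [0, 1]")
    if n_minor > n_major or n_minor < 0:
        raise ValueError("require 0 <= n_minor <= n_major")

    total = n_major + n_minor
    # Average number of copies of this segment per cell in the mixed sample.
    segment_copies = purity * total + 2.0 * (1.0 - purity)
    # Average number of copies per cell genome-wide (normalisation term).
    mean_copies = purity * ploidy + 2.0 * (1.0 - purity)

    depth_ratio = segment_copies / mean_copies
    if segment_copies == 0.0:
        # Homozygous deletion in a pure tumour: no reads, BAF undefined.
        return depth_ratio, math.nan
    # Share of reads from the minor-allele haplotype (one copy in normal cells).
    baf = (purity * n_minor + (1.0 - purity)) / segment_copies
    return depth_ratio, baf
```

Under these assumptions, a copy-neutral loss of heterozygosity segment (`n_major=2`, `n_minor=0`) at purity 0.5 and ploidy 2 gives a depth ratio of 1.0 and a BAF of 0.25: depth alone cannot distinguish it from a normal segment, which is why the two signals are modelled together.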
Archive
https://hdl.handle.net/11368/3126320
https://ricerca.unityfvg.it/handle/11368/3126320
Rights
embargoed access
FVG URL
https://arts.units.it/bitstream/11368/3126320/2/Valeriani_PhDThesis_Final.pdf
Subjects
  • Bayesian inference

  • Hidden Markov Model

  • Bioinformatics

  • Copy Number Calling

  • Simulations

  • Settore INF/01 - Info...
