Estimating Concordance Between Measured and Predicted Genetic Variant Effects on Chromatin Accessibility

dc.contributor.advisor: Alasoo, Kaur, supervisor
dc.contributor.author: Kuningas, Kristiina
dc.contributor.other: Tartu Ülikool. Loodus- ja täppisteaduste valdkond
dc.contributor.other: Tartu Ülikool. Arvutiteaduse instituut
dc.date.accessioned: 2023-10-18T08:14:53Z
dc.date.available: 2023-10-18T08:14:53Z
dc.date.issued: 2023
dc.description.abstract: Many GWAS have identified genetic variants associated with human traits or diseases. However, understanding the molecular mechanisms underlying those associations has been challenging. Chromatin accessibility is one molecular trait that has been associated with a higher risk of disease. If chromatin is not accessible, transcription factors cannot bind to it and gene expression or protein synthesis cannot be initiated, which can alter the risk for some diseases. Therefore, it is essential to study quantitative trait loci that affect chromatin accessibility (caQTLs). One approach to finding such genetic variants is caQTL mapping, which uses open chromatin data and genotype imputation to find associations between genetic variants and chromatin accessibility. Additional fine-mapping distinguishes the potentially causal variants. In addition, deep learning models that predict the effects of genetic variants on molecular traits have been integrated into such studies to better understand the biological mechanisms behind the associations between genetic variants and phenotypic traits. However, the predictive accuracy of these models is still unclear. In this thesis, we created five caQTL datasets for five different cell types based on the fine-mapping results. These datasets were then used to validate the performance of the state-of-the-art deep learning model Enformer in predicting genetic variant effects on chromatin accessibility. Although other studies have already evaluated Enformer predictions, they have done so from a gene expression perspective, based on effects measured from RNA-seq data. This thesis instead compares the effects of genetic variants on chromatin accessibility measured from ATAC-seq data with Enformer’s predicted effects. It compares both the size and the direction of the effects, and provides an initial overview of how Enformer performs on chromatin accessibility.
Results showed that Enformer performs reasonably well, especially on the variants for which it predicts stronger effects. In addition, it gave the expected results when the cell type of the measured variant differed from the cell type of the predicted variant: there were more opposite-direction effects than with a similar cell type. On the other hand, in many cases it predicted near-zero effect scores when the measured effect was larger.
dc.identifier.uri: https://hdl.handle.net/10062/93584
dc.language.iso: eng
dc.publisher: Tartu Ülikool
dc.rights: openAccess
dc.rights: Attribution-NonCommercial-NoDerivatives 4.0 International
dc.rights.uri: http://creativecommons.org/licenses/by-nc-nd/4.0/
dc.subject: bioinformatics
dc.subject: caQTLs
dc.subject: chromatin accessibility
dc.subject.other: magistritööd
dc.subject.other: informaatika
dc.subject.other: infotehnoloogia
dc.subject.other: informatics
dc.subject.other: infotechnology
dc.title: Estimating Concordance Between Measured and Predicted Genetic Variant Effects on Chromatin Accessibility
dc.type: Thesis

Files

Original bundle
Name: Kuningas_andmeteadus_2023.pdf
Size: 2.15 MB
Format: Adobe Portable Document Format
License bundle
Name: license.txt
Size: 1.71 KB
Format: Item-specific license agreed upon to submission