Selection Bias and Methods for Correcting It: The Example of the Estonian Biobank of the University of Tartu
Publisher
Tartu Ülikool
Abstract
The aim of this bachelor's thesis was to examine the extent to which selection bias affects a model of atrial fibrillation risk estimated from the data of the Estonian Biobank of the University of Tartu. Selection bias arises when an analysis is not based on a random sample and joining the study may instead be influenced by certain factors (including ones associated with the outcome under study). Because the biobank has used different recruitment methods, it effectively consists of two cohorts, and there is reason to believe that analysing them together may introduce selection bias. When models were estimated for the risk of developing atrial fibrillation within five years of joining, the parameter estimates turned out to differ between the two cohorts. To reduce the differences, inverse probability weights were computed from a model of the probability of joining the second cohort, and a weighted arrhythmia model was estimated in the second cohort. Weighting brought the parameter estimates closer together only slightly.
In addition, the models were estimated in the cohorts after excluding people with a very small or very large probability of joining, which likewise did not yield more similar estimates. Because weighting did not bring the estimates appreciably closer, it can be concluded that the analysis did not include variables that describe the between-cohort differences well, or that a temporal factor is involved: the biobank's data have been collected since 2002, and disease diagnosis has changed over that period, which may also affect whether an arrhythmia diagnosis is received.
The aim of this bachelor’s thesis was to investigate to what extent selection bias affects a model for the risk of cardiac arrhythmia estimated using data from the Estonian Biobank of the University of Tartu. Selection bias arises when the analysis is not based on a random sample and participation in the study may instead be influenced by certain factors. As different recruitment strategies have been used in the biobank, the data can be considered to consist of two cohorts, suggesting that selection bias may be present in joint analyses. Logistic regression models were estimated to assess the risk of developing cardiac arrhythmia within five years of joining the biobank. The results indicated that the parameter estimates differed between the cohorts. To reduce these differences, inverse probability weights were calculated and a weighted model was estimated. However, weighting reduced the differences only slightly. In addition, models were estimated using trimmed cohorts, which also did not reduce the difference in the estimates significantly. This suggests that the variables included in the analysis may not sufficiently capture the differences between the cohorts or that some temporal factors may play a role, as the biobank data have been collected over a long period during which diagnostic practices may have changed, potentially affecting the probability of receiving an arrhythmia diagnosis.
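The weighting and trimming steps described in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: the data are synthetic, the single covariate `x`, the weight form `1/p`, the trimming bounds 0.05 and 0.95, and all function names (`fit_logistic_1d`, `ipw_weights`, `trim_mask`, `simulate`, `run_demo`) are assumptions made here for illustration only.

```python
import math
import random


def sigmoid(z):
    """Logistic function."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic_1d(x, y, w=None, n_iter=50, tol=1e-10):
    """Fit P(y=1 | x) = sigmoid(b0 + b1*x) by (weighted) Newton-Raphson."""
    if w is None:
        w = [1.0] * len(x)
    b0 = b1 = 0.0
    for _ in range(n_iter):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi, wi in zip(x, y, w):
            p = sigmoid(b0 + b1 * xi)
            r = wi * (yi - p)          # weighted score contribution
            v = wi * p * (1.0 - p)     # weighted information contribution
            g0 += r
            g1 += r * xi
            h00 += v
            h01 += v * xi
            h11 += v * xi * xi
        det = h00 * h11 - h01 * h01
        if det <= 1e-12:               # information matrix (near-)singular
            break
        d0 = (h11 * g0 - h01 * g1) / det
        d1 = (h00 * g1 - h01 * g0) / det
        b0 += d0
        b1 += d1
        if abs(d0) + abs(d1) < tol:
            break
    return b0, b1


def ipw_weights(p_join):
    """Inverse probability weights 1/p (an assumed form; variants exist)."""
    return [1.0 / p for p in p_join]


def trim_mask(p_join, lower=0.05, upper=0.95):
    """True where the joining probability lies within [lower, upper]."""
    return [lower <= p <= upper for p in p_join]


def simulate(n=400, seed=0):
    """Synthetic toy data: covariate, second-cohort indicator, outcome."""
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in range(n)]
    cohort2 = [int(rng.random() < sigmoid(-0.3 + 0.8 * xi)) for xi in x]
    y = [int(rng.random() < sigmoid(-1.5 + 0.7 * xi)) for xi in x]
    return x, cohort2, y


def run_demo():
    x, cohort2, y = simulate()
    # Step 1: model the probability of joining the second cohort.
    a0, a1 = fit_logistic_1d(x, cohort2)
    # Step 2: weights for second-cohort members only.
    idx = [i for i, c in enumerate(cohort2) if c == 1]
    p_join = [sigmoid(a0 + a1 * x[i]) for i in idx]
    weights = ipw_weights(p_join)
    # Step 3: unweighted and weighted outcome models in the second cohort.
    x2 = [x[i] for i in idx]
    y2 = [y[i] for i in idx]
    unweighted = fit_logistic_1d(x2, y2)
    weighted = fit_logistic_1d(x2, y2, weights)
    # Step 4: refit after dropping extreme joining probabilities.
    keep = trim_mask(p_join)
    trimmed = fit_logistic_1d([v for v, k in zip(x2, keep) if k],
                              [v for v, k in zip(y2, keep) if k])
    return weights, unweighted, weighted, trimmed


if __name__ == "__main__":
    weights, unweighted, weighted, trimmed = run_demo()
    print("unweighted:", unweighted)
    print("weighted:  ", weighted)
    print("trimmed:   ", trimmed)
```

Comparing the `unweighted`, `weighted`, and `trimmed` coefficient pairs mirrors, in miniature, the comparison the abstract reports. In a real analysis a statistics library with proper standard errors would replace the hand-written Newton-Raphson fit.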
Keywords
selection bias, inverse probability weighting, heart arrhythmias