Toward an Automated Data Quality Rule Detection in Data Warehouses

dc.contributor.advisor: Nikiforova, Anastasija, supervisor
dc.contributor.author: Martinsaari, Heidi Carolina
dc.contributor.other: Tartu Ülikool. Loodus- ja täppisteaduste valdkond [et]
dc.contributor.other: Tartu Ülikool. Arvutiteaduse instituut [et]
dc.date.accessioned: 2023-10-24T08:33:55Z
dc.date.available: 2023-10-24T08:33:55Z
dc.date.issued: 2023
dc.description.abstract: Data is a valuable asset from which information and knowledge are derived. However, business success depends not only on the amount of data but also on its quality. Data quality management, in turn, requires a well-designed system and the cooperation of several parties, which is time-consuming and costly. This raises the question of whether using artificial intelligence to ensure data quality would help to avoid human errors, complement human actions, and reduce personnel costs and the workload of data quality specialists. The objective of this thesis is to explore the current landscape of data quality solutions to find out whether they are able to automatically detect data quality rules using machine learning methods, with a focus on data warehouses. For this, a systematic review of data quality software available on the market and described in academic publications was conducted. It was found that most data quality tools are used for data cleansing and fixing and are meant for domain-specific databases rather than data warehouses. Only a few tools were capable of detecting data quality rules, let alone doing so in data warehouses. Because the subject of automated data quality rule detection is insufficiently covered in the academic landscape and poorly represented in the market, this thesis makes a call for action in this area. [et]
dc.identifier.uri: https://hdl.handle.net/10062/93699
dc.language.iso: eng [et]
dc.publisher: Tartu Ülikool [et]
dc.rights: openAccess [et]
dc.rights: Attribution-NonCommercial-NoDerivatives 4.0 International [*]
dc.rights.uri: http://creativecommons.org/licenses/by-nc-nd/4.0/ [*]
dc.subject: Data Quality [et]
dc.subject: Data Quality Rule [et]
dc.subject: Data Quality Management [et]
dc.subject: Data Warehouse [et]
dc.subject.other: magistritööd [et]
dc.subject.other: informaatika [et]
dc.subject.other: infotehnoloogia [et]
dc.subject.other: informatics [et]
dc.subject.other: infotechnology [et]
dc.title: Toward an Automated Data Quality Rule Detection in Data Warehouses [et]
dc.type: Thesis [et]

Files

Original bundle

Name: Martinsaari_Andmeteadus_2023.pdf
Size: 1.61 MB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 1.71 KB
Format: Item-specific license agreed upon to submission