Toward an Automated Data Quality Rule Detection in Data Warehouses
Publisher
Tartu Ülikool
Abstract
Data is a valuable asset from which information and knowledge are derived. However,
business success depends not only on the amount of data but also on its quality. At the
same time, data quality management requires a good system and the cooperation of
several parties, which is time-consuming and costly. It is therefore worth considering
whether using artificial intelligence to ensure data quality would help to avoid human
errors, complement human actions, and reduce personnel costs and the workload of data
quality specialists.
The objective of this thesis is to explore the current landscape of data quality solutions
and find out whether they can automatically detect data quality rules using machine
learning methods, with a focus on data warehouses. To this end, a systematic review was
conducted of data quality software available on the market and described in academic
publications.
It was found that most data quality tools are used for data cleansing and fixing and are
meant for domain-specific databases rather than data warehouses. Only a few tools were
capable of detecting data quality rules, let alone doing so in data warehouses.
Because automated data quality rule detection is insufficiently covered in the academic
literature and poorly represented in the market, this thesis calls for action in this
area.
Keywords
Data Quality, Data Quality Rule, Data Quality Management, Data Warehouse