Adaptive out-of-order handling in streaming conformance checking
Date
2024-10-29
Abstract
Errors in business processes can have a wide-ranging impact on an organization. It is therefore important to find deviations from the prescribed process quickly, accurately, and in an explainable manner. The state-of-the-art approach to finding deviations via conformance checking is an alignment, which shows step by step how real-life activities match the process. Unfortunately, computing an alignment with current methods is slow and impractical for fast-arriving data. The first part of the thesis introduces a trie-based conformance checking approach that is computationally far more efficient than previous methods, at the cost of a small impact on accuracy.
The longer it takes to discover a deviation after it occurs, the greater its potential impact. Streaming data, i.e., data that arrives in near real time, is important because it allows conformance checking to take place close to the actual occurrence of events. The second contribution presents an algorithm that works on streaming data, outputs an alignment, and is, in some experiments, several orders of magnitude faster than the previous state of the art for streaming data.
Analyzing streaming data is inherently complex, as the data arrives continuously and the stream is theoretically infinite. The third contribution analyzes how to extend the algorithm so that it can assess the completeness of a case in a business process and the confidence that the case has concluded.
Fast-paced and distributed data streams can cause out-of-order event arrival: an event that occurred later arrives in the system before an event that occurred earlier. The final contribution introduces a method for handling out-of-order event arrival that is, to the best of our knowledge, the first to tackle this problem in conformance checking. The method is adaptive, regulating itself based on the volume of out-of-order events in the stream.
In this summary, the topics of the title were discussed out of order, but hopefully it was possible to adapt to this during the reading process.