Bench-ranking: a prescriptive analysis approach for large knowledge graphs query workloads

Ragab, Mohamed

dc.contributor.advisor	Tommasini, Riccardo, supervisor
dc.contributor.advisor	Awad, Ahmed, supervisor
dc.contributor.author	Ragab, Mohamed
dc.contributor.other	Tartu Ülikool. Loodus- ja täppisteaduste valdkond	et
dc.date.accessioned	2022-12-21T13:14:48Z
dc.date.available	2022-12-21T13:14:48Z
dc.date.issued	2022-12-21
dc.description.abstract	The use of relational big data (BD) processing frameworks for processing large knowledge graphs carries the potential to optimize query performance. At the same time, modern BD systems are complex data systems whose configurations have a significant impact on performance. Benchmark studies of different frameworks and configurations offer the community best practices for achieving better performance. However, most of these benchmark studies can be classified only as descriptive and diagnostic analytics. In addition, there is no unified standard for comparing these studies in a quantitatively ranked form. Furthermore, designing the pipelines needed to process large graphs requires additional design decisions that arise from the non-native (relational) graph processing paradigm. Such design decisions, e.g., the choice of relational schema, partitioning technique, and storage formats, cannot be made automatically. In this thesis we discuss how we fill this research gap. First, we show the impact of design-decision trade-offs on the replicability of BD systems' performance when querying large knowledge graphs. We also show the limitations of descriptive and diagnostic analyses of BD frameworks' performance when querying large graphs. We then investigate how to enable prescriptive analytics through ranking functions and multi-dimensional optimization techniques (so-called "Bench-Ranking"). This approach hides the complexity of descriptive performance analysis, guiding the practitioner directly to actionable, informed decisions.	et
dc.description.abstract	Using relational Big Data (BD) processing frameworks to process large knowledge graphs has generated great interest in optimizing query performance. Modern BD systems are, however, complex data systems whose configurations notably affect performance. Benchmarking different frameworks and configurations provides the community with best practices for better performance. However, most of these benchmarking efforts can be classified only as descriptive and diagnostic analytics. Moreover, there is no standard for comparing these benchmarks based on quantitative ranking techniques. In addition, designing mature pipelines for processing big graphs entails additional design decisions that emerge with the non-native (relational) graph processing paradigm. These decisions, e.g., the choice of relational schema, partitioning technique, and storage format, cannot be made automatically. In this thesis, we discuss how our work fills this timely research gap. In particular, we first show the impact of the trade-offs among those design decisions on the replicability of BD systems' performance when querying large knowledge graphs. We also show the limitations of descriptive and diagnostic analyses of BD frameworks' performance for querying large graphs. We then investigate how to enable prescriptive analytics via ranking functions and multi-dimensional optimization techniques (called "Bench-Ranking"). This approach abstracts away the complexity of descriptive performance analysis, guiding the practitioner directly to actionable, informed decisions.	en
dc.description.uri	https://www.ester.ee/record=b5533321	et
dc.identifier.isbn	978-9916-27-114-8
dc.identifier.isbn	978-9916-27-115-5 (pdf)
dc.identifier.issn	2613-5906
dc.identifier.issn	2806-2345 (pdf)
dc.identifier.uri	http://hdl.handle.net/10062/88356
dc.language.iso	eng	et
dc.relation.ispartofseries	Dissertationes informaticae Universitatis Tartuensis;40
dc.rights	openAccess	et
dc.rights	Attribution-NonCommercial-NoDerivatives 4.0 International	*
dc.rights.uri	http://creativecommons.org/licenses/by-nc-nd/4.0/	*
dc.subject	big data	en
dc.subject	graphs	en
dc.subject	data processing	en
dc.subject	semantic web	en
dc.subject.other	dissertations	et
dc.subject.other	ETD	et
dc.subject.other	theses	et
dc.subject.other	big data	et
dc.subject.other	graphs	et
dc.subject.other	data processing	et
dc.subject.other	semantic web	et
dc.title	Bench-ranking: a prescriptive analysis approach for large knowledge graphs query workloads	et
dc.title.alternative	Bench-Ranking: ettekirjutav analüüsimeetod suurte teadmiste graafide päringutele	et
dc.type	Thesis	et

Files

Original bundle

Name: ragab_mohamed.pdf
Size: 7.58 MB
Format: Adobe Portable Document Format

Collections

1. University of Tartu theses since 2004. Defended doctoral theses and research master's theses. Doctoral theses, PhD, MSc, MPhil.