Algebraic approaches to problems arising in decentralized systems
Files
Date
2021-09-20
Authors
Abstract
With the growing adoption of cloud platforms and new wireless connection technologies, the paradigm of digital communication has shifted from the server-client model to complex distributed models. Service providers must distribute services across different data centers to handle large data volumes and to be physically close to users. However, duplicating data between data centers wastes resources and is costly. We examine three approaches that make it possible to reduce the volume of data transmitted between servers and users. We recast the communication problems under study as mathematical problems and then apply methods from algebra to solve the corresponding mathematical problem.
First, we study the data synchronization problem. In data synchronization, the nodes hold their own data sets, and their goal is to find the union of all the data sets. We improve an existing method based on invertible Bloom filters by removing the requirement to know the size of the symmetric difference of the sets.
Second, we study the data distribution problem. In the data distribution problem, the network topology can be described by an arbitrary strongly connected directed graph. The goal of each node is to reconstruct the requested elements from the data sets of other nodes. We describe protocols for both single-round and multi-round network topologies. We show that the single-round protocol is optimal in terms of data exchange and that the multi-round protocol requires the minimum number of rounds.
Finally, we study the problem of computing a function on synchronized data. This problem differs from the data synchronization problem because the nodes' goal is to learn the value of a specific function on the union of the data. We show that this change in the problem definition allows us to significantly reduce the volume of transmitted data. For a certain family of functions, we give upper and lower bounds on the amount of transmitted data in both the deterministic and the randomized model.
With the increased use of cloud hosting platforms and new wireless technologies, the communication paradigm has shifted from server-client models to complex decentralized models. Service providers need to distribute their services across different data centers to handle the enormous traffic loads generated by customers and to stay close to clients, keeping latency low for a good user experience. However, duplicating data across multiple servers wastes resources and is costly. We consider three directions that reduce the communication between servers and users. In each direction, we represent the problems as mathematical structures and apply algebraic methods to solve them. First, we consider the data synchronization problem, in which each node holds its own data set and the nodes' goal is to obtain the union of the sets. We propose an improvement over an existing method based on invertible Bloom filters that removes the requirement to know the size of the symmetric set difference. Second, we consider the data distribution problem, in which the network topology is represented by an arbitrary strongly connected directed graph and each node's goal is to recover particular elements from other nodes. We describe protocols for single- and multi-round network topologies. We show that the single-round protocol is optimal in terms of communication and that the multi-round protocol uses the minimal number of rounds. Finally, we consider the problem of function computation on synchronized data. This problem differs from data synchronization in that the nodes' goal is to compute the value of a function on the union of the sets. This change in the problem definition allows a significant reduction in the number of transmitted bits. We give upper and lower bounds on the communication complexity for a family of functions in both the deterministic and randomized models.
Description
The electronic version of the dissertation does not include the publications
Keywords
distributed systems, client-server systems, data communication, network protocols