Real time vs micro-batching in streaming data processing: performance and guidelines

Date

2021

Publisher

Tartu Ülikool

Abstract

Data is used in every second of our lives, and today most of it arrives over the Internet. To provide fast and scalable services, the underlying technologies must be efficient and must scale with those needs. The motivation for this thesis is to provide a simple workload for comparing engines. In this master's thesis, I focus on Apache Flink, Spark Streaming, Apache Kafka, Apache Storm, and Storm Trident for real-time and micro-batch streaming data processing. The thesis aims to compare these technologies.

Keywords

Stream processing, Apache Kafka, Apache Spark, Apache Flink, Real-time streaming, Micro-batch processing
