Benchmarking Stream Processing Frameworks for Large Scale Data Shuffling

Softwaretechnik-Trends, 2023

Distributed stream processing frameworks help build scalable and reliable applications that perform transformations and aggregations on continuous data streams. We outline our ongoing research on designing a new benchmark for distributed stream processing frameworks. In contrast to other benchmarks, it focuses on use cases where stream processing frameworks are mainly used for redistributing data records to perform state-local aggregations, while the actual aggregation logic is treated as black-box software components. We describe our benchmark architecture based on a real-world use case, show how we implemented it with four state-of-the-art frameworks, and give an overview of initial experimental results.