Thesis Research Project Johannes Rank

Automatic Performance Prediction of Stream Processing Systems

Johannes Rank
Aim: This dissertation aims to automatically predict the performance of stream processing systems by simulating the individual processing tasks of the application logic, in order to provide better insights such as bottleneck identification, over-queuing prediction, and fine-grained resource consumption.

Abstract: Stream processing systems have become the major engine whenever data processing with low latency or real-time capability is required. Meeting the performance demands of these systems is therefore crucial to ensure stable operation and correct functioning. Existing performance modelling approaches address this issue by predicting performance characteristics at the architecture level, e.g. by measuring existing deployments and estimating the impact of related changes. However, these approaches neglect the actual application logic of such systems and are therefore unable to identify bottlenecks at the task level or to predict performance during the development phase. This dissertation aims to develop a more comprehensive performance modelling approach that captures both the deployment architecture and the actual processing logic of stream processing systems, using the SAP HANA streaming solution as an example. The contributions of this dissertation are therefore a theoretical concept for predicting the performance of stream processing systems at the application level and a tool implementation that is able to utilize these predictions automatically for SAP HANA streaming during operation and development.