Backend Jun 2, 2020

The Long Road Towards Unified APIs in Apache Flink

The parallel data processing system Apache Flink features APIs for different use cases. There is the DataSet API for batch-style programs, the DataStream API for real-time programs, and the Table API/SQL for analytical use cases on both batch and streaming data.

For some time now, the Flink community has discussed how to reduce the number of APIs by unifying them. In this talk I will discuss the motivations for wanting more unified APIs, what it actually means to unify APIs, and how we plan to finally achieve this goal. Along the way, I will also examine why this large undertaking has taken the community longer than one might expect.

This talk does not require any prior knowledge of Apache Flink. I will give a brief introduction and also discuss other systems, such as Apache Beam, to map out the landscape of unified APIs and show where Flink fits in.
