Navigating Modern Data Platform Complexity – Challenges and Open Source Approaches
Data-driven business models are prevalent across industries. Data systems have evolved from relational databases via NoSQL and Big Data to near real-time data processing. Data streaming requires rethinking conventional paradigms. The rise of AI/ML adds complexity, requiring MLOps to bridge engineering and science. Building cost-effective, future-proof platforms involves technical and non-technical perspectives. This talk maps challenges in real-time data architecture and AI, offering insights for technical managers on open-source solutions.
Target Audience: Technical managers
Prerequisites: Basic understanding of Data Platforms and AI/ML
Level: Basic
Extended Abstract:
Data-driven business models are now ubiquitous across all industries. The underlying data architectures have evolved significantly over the past decades, moving from traditional relational databases to NoSQL, Big Data systems, and now to modern real-time storage and processing solutions, each introducing its own challenges. Data streaming platforms and their use cases are inherently complex, because conventional data management paradigms, such as relational table structures, do not directly apply to continuously flowing data streams. As a consequence, data teams must address both technical and organizational hurdles.
In recent years, the widespread adoption of artificial intelligence (AI) and machine learning (ML) has further complicated the data landscape. Data teams are now faced with the additional challenge of integrating state-of-the-art models to continuously enrich real-time data streams. Machine Learning Operations (MLOps) has emerged as a solution to bridge the gap between data engineering and the practical application of scientific methodologies. However, MLOps itself introduces a new set of challenges and a diverse array of tools that must be effectively managed.
As a result, data platforms have become more complex than ever before. Designing a successful platform requires a comprehensive approach that considers both technical and non-technical perspectives to ensure scalability, future-proofing, and cost-effectiveness. Organizations that choose not to rely solely on Platform-as-a-Service (PaaS) or Software-as-a-Service (SaaS) providers can explore the world of open-source technologies, selecting and integrating components, tools, and frameworks to meet their specific needs.
In this talk, I will present a map of the key challenges associated with modern data platforms, focusing on enabling near real-time data processing and enrichment through AI. I will share insights and lessons learned from real-world use cases applied to real-time data in production.
Attendees will gain a better understanding of the current data landscape and the dependencies between various open-source tools, and will be able to identify the strategic decisions required for successful implementation. The session is designed for technical managers, offering them perspectives to foster discussions and deepen their technical understanding of potential open-source solutions.
Dr.-Ing. Christoph Böhm is a passionate data enthusiast with over a decade of industry experience. As a founder and partner at bakdata, he serves as a tech team lead, driving the development of modern data platforms. Collaborating with customers from diverse industries, he refines requirements and leads technical development in close coordination with both technical and non-technical stakeholders.