The Who, What, and Why of Data Lake Table Formats
A comprehensive exploration of the intricacies of Data Lake Table Formats and their impact on business analytics.
Data lake table formats are a critical component of modern data analytics. They provide a way to organize and manage data in a data lake, and they offer several benefits for business analytics and AI.
Target Audience: Technical Leaders and practitioners (CDOs, CTOs, anyone working with data)
Prerequisites: None
Level: Basic
Extended Abstract:
Data lake table formats are a critical component of modern data analytics. They provide a way to organize and manage data in a data lake, and they offer several benefits for business analytics and AI:
Scalability: Data lake table formats can scale to handle large amounts of data.
Performance: Data lake table formats can improve the performance of queries on large datasets.
Durability: Data lake table formats can ensure that data is durable and recoverable.
Auditability: Data lake table formats can help to ensure that data is auditable and compliant.
This presentation will explore the who, what, and why of data lake table formats. We will discuss the different formats, such as Apache Iceberg, Apache Hudi, and Delta Lake, and the benefits they bring to business analytics.
By the end of this presentation, you will better understand data lake table formats and how they can be used to improve business analytics.
Key takeaways:
Data lake table formats are a critical component of modern data analytics.
They offer a number of benefits for business analytics and AI, including scalability, performance, durability, and auditability.
There are a variety of data lake table formats available, including Apache Iceberg, Apache Hudi, and Delta Lake.
Head of Developer Relations
Andrew Madson leads Developer Relations at Fivetran, where he builds global programs that help developers and data teams adopt modern data and AI tooling. He has built DevRel, education, and evangelism functions at Fivetran, Tobiko Data, and Dremio, delivering technical content, large-scale community programs, and keynotes across major industry conferences.
Andrew is an author of the O’Reilly “Definitive Guide to Apache Polaris” and a graduate professor of data science and engineering. He previously led AI and data teams at Arizona State University, J.P. Morgan Chase, and MassMutual, among others.
