The Data Stack Show

Each week we’ll talk to data engineers, analysts, and data scientists about their experience around building and maintaining data infrastructure, delivering data and data products, and driving better outcomes across their businesses with data.

https://datastackshow.com

185: The Evolution of Data Processing, Data Formats, and Data Sharing with Ryan Blue of Tabular


Highlights from this week’s conversation include:

  • The Evolution of Data Processing (2:36)
  • Ryan’s Background and Journey in Data (4:52)
  • Challenges in Transitioning to S3 (8:47)
  • Impact of Latency on Query Performance (11:43)
  • Challenges with Table Representation (15:26)
  • Designing a New Metadata Format (21:36)
  • Integration with Existing Tools and Open Source Project (24:07)
  • Initial Features of Iceberg (26:11)
  • Challenges of Manual Partitioning (31:49)
  • Designing the Iceberg Table Format (37:31)
  • Trade-offs in Writing Workloads (47:22)
  • Database Systems and File Systems (55:00)
  • Vendor Influence on Access Controls (1:01:58)
  • Restructuring Data Security (1:03:39)
  • Delegating Access Controls (1:07:22)
  • Column-level Access Controls (1:14:19)
  • Exciting Releases and Future Plans (1:17:47)
  • Centralization of Components in Data Infrastructure (1:25:37)
  • Fundamental Shift in Data Architecture (1:28:28)

The Data Stack Show is a weekly podcast powered by RudderStack, the CDP for developers.

RudderStack helps businesses make the most of their customer data while ensuring data privacy and security. To learn more about RudderStack, visit rudderstack.com.


 April 10, 2024  1h29m