Total length of all episodes: 16 days 1 hour 36 minutes
Data lineage is the common thread that ties together all of your data pipelines, workflows, and systems. In order to get a holistic understanding of your data quality, where errors are occurring, or how a report was constructed, you need to track the lineage of the data from beginning to end. The complicating factor is that every framework, platform, and product has its own concepts of how to store, represent, and expose that information...
An interview about how you can build your data warehouse on top of PostgreSQL for flexibility and full control over your data.
A conversation about how Tinybird invested in ClickHouse to power analytical APIs that are fast to build and operate.
A conversation about how the team at Data Mechanics is bringing Apache Spark into the cloud native world and the positive impact that has on your development experience.
A conversation about the grand vision and current realities of DataOps and how you can start on the journey toward more maintainable and reliable data systems.
An interview with Maxime Beauchemin about how to use Apache Superset as a platform for self-service data exploration and analytics.
An interview about how the team at Cherre built an internal machine learning project to use as a service in their data pipelines to make dealing with messy address data less painful.
An interview with Josh Benamram about the emerging roles across the data ecosystem and how they interact with data systems.
In this episode Prukalpa Sankar discusses how Atlan uses metadata from all of your workflows to bring everyone onto the same page, letting you deliver on your data projects in record time.
An interview about the Soda Data platform and the open source components that they are building to level up the quality of your data pipelines.