Gesamtlänge aller Episoden: 16 days 1 hour 36 minutes
With all of the messaging about treating data as a product it is becoming difficult to know what that even means. Vishal Singh is the head of products at Starburst which means that he has to spend all of his time thinking and talking about the details of product thinking and its application to data...
Encryption and security are critical elements in data analytics and machine learning applications. We have well developed protocols and practices around data that is at rest and in motion, but security around data in use is still severely lacking. Recognizing this shortcoming and the capabilities that could be unlocked by a robust solution Rishabh Poddar helped to create Opaque Systems as an outgrowth of his PhD studies...
Five years of hosting the Data Engineering Podcast has provided Tobias Macey with a wealth of insight into the work of building and operating data systems at a variety of scales and for myriad purposes. In order to condense that acquired knowledge into a format that is useful to everyone Scott Hirleman turns the tables in this episode and asks Tobias about the tactical and strategic aspects of his experiences applying those lessons to the work of building a data platform from scratch.
The data ecosystem has seen a constant flurry of activity for the past several years, and it shows no signs of slowing down. With all of the products, techniques, and buzzwords being discussed it can be easy to be overcome by the hype. In this episode Juan Sequeda and Tim Gasper from data.world share their views on the core principles that you can use to ground your work and avoid getting caught in the hype cycles.
One of the reasons that data work is so challenging is because no single person or team owns the entire process. This introduces friction in the process of collecting, processing, and using data. In order to reduce the potential for broken pipelines some teams have started to adopt the idea of data contracts...
An interview with Frank Liu about how the open source Towhee library simplifies the work of building pipelines to generate vector embeddings of your data for building machine learning projects.
An interview with Nick van Wiggeren about the Planetscale serverless MySQL service built on top of the open source Vitess project and the impact on developer productivity that it offers when you don't have to worry about database operations.
An interview with Sabin Thomas about how Zing Data is lets you bring business intelligence with you when you're on the go with first-class support for mobile devices
An interview with Arjun Narayan about how to enable organizations of all sizes to take advantage of real-time data, including the technical and organizational investments required to make it happen.
An interview with Wes McKinney about his work at Voltron Data to support and grow the Arrow project and its integration with the broader data ecosystem