Data Engineering Podcast

This show goes behind the scenes for the tools, techniques, and difficulties associated with the discipline of data engineering. Databases, workflows, automation, and data manipulation are just some of the topics that you will find here.

https://www.dataengineeringpodcast.com

An average episode of this podcast runs 53m. 430 episodes have been published so far. This is a weekly podcast.

Total length of all episodes: 16 days 42 minutes

data.world with Bryon Jacob - Episode 9


Data.World: The Platform For The Web Of Linked Data (Interview)


 December 3, 2017  46m
 
 
 November 22, 2017  51m
 
 

Buzzfeed Data Infrastructure with Walter Menendez - Episode 7


Summary

Buzzfeed needs to understand how its users interact with the myriad articles, videos, etc. that it publishes, so that it can keep producing content that is well received. To surface the insights needed to grow the business, it needs a robust data infrastructure that reliably captures all of those interactions...


 November 14, 2017  43m
 
 

Astronomer with Ry Walker - Episode 6


Summary

Building a data pipeline that is reliable and flexible is a difficult task, especially when you have a small team. Astronomer is a platform that lets you skip straight to processing your valuable business data. Ry Walker, the CEO of Astronomer, explains how the company got started, how the platform works, and their commitment to open source...


 August 6, 2017  42m
 
 

Rebuilding Yelp's Data Pipeline with Justin Cunningham - Episode 5


Summary

Yelp needs to consume and process all of the user interactions that happen on its platform as close to real time as possible...


 June 18, 2017  42m
 
 

ScyllaDB with Eyal Gutkind - Episode 4


Summary

If you like the features of Cassandra DB but wish it ran faster with fewer resources, then ScyllaDB is the answer you have been looking for. In this episode, Eyal Gutkind explains how Scylla was created and how it differentiates itself in the crowded database market.

Preamble
  • Hello and welcome to the Data Engineering Podcast, the show about modern data infrastructure
  • Go to dataengineeringpodcast...


 March 18, 2017  35m
 
 

Defining Data Engineering with Maxime Beauchemin - Episode 3


Summary

What exactly is data engineering? How has it evolved in recent years and where is it going? How do you get started in the field? In this episode, Maxime Beauchemin joins me to discuss these questions and more.

Transcript provided by CastSource

Preamble
  • Hello and welcome to the Data Engineering Podcast, the show about modern data infrastructure
  • Go to dataengineeringpodcast...


 March 5, 2017  45m
 
 

Dask with Matthew Rocklin - Episode 2


Summary

There is a vast constellation of tools and platforms for processing and analyzing your data. In this episode, Matthew Rocklin talks about how Dask fills the gap between a task-oriented workflow tool and an in-memory processing framework, and how it brings the power of Python to bear on the problem of big data...


 January 22, 2017  46m
 
 

Pachyderm with Daniel Whitenack - Episode 1


Summary

Do you wish that you could track the changes in your data the same way that you track the changes in your code? Pachyderm is a platform for building a data lake with a versioned file system. Its container-based task graph also lets you run your analysis in whatever languages you want...


 January 14, 2017  44m
 
 

Introducing The Show


Preamble

  • Hello and welcome to the Data Engineering Podcast, the show about modern data infrastructure
  • Go to dataengineeringpodcast.com to subscribe to the show, sign up for the newsletter, read the show notes, and get in touch.
  • You can help support the show by checking out the Patreon page which is linked from the site...


 January 8, 2017  4m