Data Engineering Podcast

This show goes behind the scenes for the tools, techniques, and difficulties associated with the discipline of data engineering. Databases, workflows, automation, and data manipulation are just some of the topics that you will find here.

https://www.dataengineeringpodcast.com

An average episode of this podcast lasts 53m. 431 episode(s) have been released so far. This is a weekly podcast.

Total length of all episodes: 16 days 1 hour 36 minutes

Exploring The Nuances Of Building An Intentional Data Culture


The ecosystem for data professionals has matured to the point that there are a large and growing number of distinct roles. With the scope and importance of data steadily increasing, organizations need to ensure that everyone is aligned and operating in a positive environment. To help facilitate the nascent conversation about what constitutes an effective and productive data culture, the team at Data Council has dedicated an entire conference track to the subject...


 March 6, 2023  45m
 
 

Building A Data Mesh Platform At PayPal


There has been a lot of discussion about the practical application of data mesh and how to implement it in an organization. Jean-Georges Perrin was tasked with designing a new data platform implementation at PayPal and wound up building a data mesh. In this episode he shares that journey and the combination of technical and organizational challenges that he encountered in the process.


 February 27, 2023  46m
 
 

The View Below The Waterline Of Apache Iceberg And How It Fits In Your Data Lakehouse


Cloud data warehouses have unlocked a massive amount of innovation and investment in data applications, but they are still inherently limiting. Because they take complete ownership of your data, they constrain what data you can store and how it can be used. Projects like Apache Iceberg provide a viable alternative in the form of data lakehouses, which offer the scalability and flexibility of data lakes combined with the ease of use and performance of data warehouses...


 February 19, 2023  55m
 
 

Let The Whole Team Participate In Data With The Quilt Versioned Data Hub


Data is a team sport, but it's often difficult for everyone on the team to participate. For a long time the mantra of data tools has been "by developers, for developers", which automatically excludes a large portion of the business users who play a crucial role in the success of any data project. Quilt Data was created to make it easier for everyone to contribute to the data being used by an organization and to collaborate on its application...


 February 11, 2023  52m
 
 

Reflecting On The Past 6 Years Of Data Engineering


This podcast started almost exactly six years ago, and the technology landscape was much different than it is now. In that time there have been a number of generational shifts in how data engineering is done. In this episode I reflect on some of the major themes and take a brief look forward at some of the upcoming changes.


 February 6, 2023  32m
 
 

Let Your Business Intelligence Platform Build The Models Automatically With Omni Analytics


Business intelligence has gone through many generational shifts, but each generation has largely maintained the same workflow. Data analysts create reports that the business uses to understand and direct its operations, but the process is very labor- and time-intensive. The team at Omni has taken a new approach by automatically building models based on the queries that are executed...


 January 30, 2023  50m
 
 

Safely Test Your Applications And Analytics With Production Quality Data Using Tonic AI


The most interesting and challenging bugs always happen in production, but recreating them is difficult due to differences in the data that you are working with. Building your own scripts to replicate data from production is time-consuming and error-prone. Tonic is a platform designed to solve the problem of having reliable, production-like data available for developing and testing your software, analytics, and machine learning projects...


 January 23, 2023  45m
 
 

Building Applications With Data As Code On The DataOS


The modern data stack has made it more economical to use enterprise-grade technologies to power analytics at organizations of every scale. Unfortunately it has also introduced new overhead to manage the full experience as a single workflow. At the Modern Data Company they created the DataOS platform as a means of driving your full analytics lifecycle through code, while providing automatic knowledge graphs and data discovery...


 January 16, 2023  48m
 
 

Automate Your Pipeline Creation For Streaming Data Transformations With SQLake


Managing end-to-end data flows becomes complex and unwieldy as the scale of data and its variety of applications in an organization grows. Part of this complexity is due to the transformation and orchestration of data living in disparate systems. The team at Upsolver is taking aim at this problem with the latest iteration of their platform in the form of SQLake...


 January 8, 2023  44m
 
 

Increase Your Odds Of Success For Analytics And AI Through More Effective Knowledge Management With AlignAI


Making effective use of data requires proper context around the information that is being used. As the size and complexity of your organization increases, the difficulty of ensuring that everyone has the necessary knowledge about how to get their work done scales exponentially. Wikis and intranets are a common way to attempt to solve this problem, but they are frequently ineffective...


 December 30, 2022  59m