Data Engineering Podcast

This show goes behind the scenes for the tools, techniques, and difficulties associated with the discipline of data engineering. Databases, workflows, automation, and data manipulation are just some of the topics that you will find here.

https://www.dataengineeringpodcast.com

Eine durchschnittliche Folge dieses Podcasts dauert 53m. Bisher sind 429 Folge(n) erschienen. Dieser Podcast erscheint wöchentlich.

Gesamtlänge aller Episoden: 15 days 23 hours 52 minutes

subscribe
share






Addressing The Challenges Of Component Integration In Data Platform Architectures


Building a data platform that is enjoyable and accessible for all of its end users is a substantial challenge. One of the core complexities that needs to be addressed is the fractal set of integrations that need to be managed across the individual components...


share








 November 27, 2023  29m
 
 

Unlocking Your dbt Projects With Practical Advice For Practitioners


The dbt project has become overwhelmingly popular across analytics and data engineering teams. While it is easy to adopt, there are many potential pitfalls. Dustin Dorsey and Cameron Cyr co-authored a practical guide to building your dbt project. In this episode they share their hard-won wisdom about how to build and scale your dbt projects.


share








 November 20, 2023  1h16m
 
 

Enhancing The Abilities Of Software Engineers With Generative AI At Tabnine


Software development involves an interesting balance of creativity and repetition of patterns. Generative AI has accelerated the ability of developer tools to provide useful suggestions that speed up the work of engineers. Tabnine is one of the main platforms offering an AI powered assistant for software engineers...


share








 November 13, 2023  1h7m
 
 

Shining Some Light In The Black Box Of PostgreSQL Performance


Databases are the core of most applications, but they are often treated as inscrutable black boxes. When an application is slow, there is a good probability that the database needs some attention. In this episode Lukas Fittl shares some hard-won wisdom about the causes and solution of many performance bottlenecks and the work that he is doing to shine some light on PostgreSQL to make it easier to understand how to keep it running smoothly.


share








 November 6, 2023  54m
 
 

Surveying The Market Of Database Products


Databases are the core of most applications, whether transactional or analytical. In recent years the selection of database products has exploded, making the critical decision of which engine(s) to use even more difficult. In this episode Tanya Bragin shares her experiences as a product manager for two major vendors and the lessons that she has learned about how teams should approach the process of tool selection.


share








 October 30, 2023  47m
 
 

Defining A Strategy For Your Data Products


The primary application of data has moved beyond analytics. With the broader audience comes the need to present data in a more approachable format. This has led to the broad adoption of data products being the delivery mechanism for information. In this episode Ranjith Raghunath shares his thoughts on how to build a strategy for the development, delivery, and evolution of data products.


share








 October 23, 2023  1h3m
 
 

Reducing The Barrier To Entry For Building Stream Processing Applications With Decodable


Building streaming applications has gotten substantially easier over the past several years. Despite this, it is still operationally challenging to deploy and maintain your own stream processing infrastructure. Decodable was built with a mission of eliminating all of the painful aspects of developing and deploying stream processing systems for engineering teams...


share








 October 16, 2023  1h8m
 
 

Using Data To Illuminate The Intentionally Opaque Insurance Industry


The insurance industry is notoriously opaque and hard to navigate. Max Cho found that fact frustrating enough that he decided to build a business of making policy selection more navigable. In this episode he shares his journey of data collection and analysis and the challenges of automating an intentionally manual industry.


share








 October 9, 2023  51m
 
 

Building ETL Pipelines With Generative AI


Artificial intelligence applications require substantial high quality data, which is provided through ETL pipelines. Now that AI has reached the level of sophistication seen in the various generative models it is being used to build new ETL workflows. In this episode Jay Mishra shares his experiences and insights building ETL pipelines with the help of generative AI.


share








 October 2, 2023  51m
 
 

Powering Vector Search With Real Time And Incremental Vector Indexes


The rapid growth of machine learning, especially large language models, have led to a commensurate growth in the need to store and compare vectors. In this episode Louis Brandy discusses the applications for vector search capabilities both in and outside of AI, as well as the challenges of maintaining real-time indexes of vector data.


share








 September 25, 2023  59m