Summary
Five years of hosting the Data Engineering Podcast has given Tobias Macey a wealth of insight into the work of building and operating data systems at a variety of scales and for myriad purposes. To condense that knowledge into a format that is useful to everyone, Scott Hirleman turns the tables in this episode and asks Tobias about the tactical and strategic aspects of applying those lessons to the work of building a data platform from scratch.
Announcements
How did you get involved in the area of data management?
Data platform building journey
General build vs buy and vendor selection process
Guest call out
Tobias' advice and learnings from building out a data platform:
Tobias' data platform components: data lakehouse paradigm, Airbyte for data integration (chosen over Meltano), Trino/Starburst Galaxy for distributed querying, AWS S3 for the storage layer, AWS Glue for very basic metadata cataloguing, Dagster as the crucial orchestration layer, and dbt for transformations
Contact Info
The intro and outro music is from The Hug by The Freak Fandango Orchestra / CC BY-SA