Software Engineering Daily

Technical interviews about software topics.

https://softwareengineeringdaily.com/

Prefect Dataflow Scheduler with Jeremiah Lowin


A data workflow scheduler is a tool that connects multiple systems to build data processing pipelines. A data pipeline might include a Hadoop task for ETL, a Spark task for stream processing, and a TensorFlow task to train a machine learning model.

The workflow scheduler manages the tasks in that data pipeline and the logical flow between them. Airflow is a popular data workflow scheduler that was originally created at Airbnb. Since then, the project has been adopted by numerous companies that need workflow orchestration for their data pipelines. Jeremiah Lowin was a core committer to Airflow for several years before he identified several features of Airflow that he wanted to change.
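As a rough illustration of what "managing the logical flow between tasks" means, here is a minimal sketch in plain Python. It is not Airflow's API. The task names (`etl`, `stream`, `train`) and the helper `run_pipeline` are hypothetical, chosen to mirror the example pipeline above, and the ordering relies on the standard library's `graphlib` module (Python 3.9+).

```python
from graphlib import TopologicalSorter


def run_pipeline(tasks, dependencies):
    """Run every task after all of its upstream tasks; return the run order.

    tasks:        maps a task name to a zero-argument callable (hypothetical)
    dependencies: maps a task name to the set of task names it depends on
    """
    order = list(TopologicalSorter(dependencies).static_order())
    for name in order:
        tasks[name]()
    return order


# Stand-ins for real work: each task only records that it ran.
log = []
tasks = {
    "etl": lambda: log.append("etl"),
    "stream": lambda: log.append("stream"),
    "train": lambda: log.append("train"),
}
dependencies = {"etl": set(), "stream": {"etl"}, "train": {"stream"}}

order = run_pipeline(tasks, dependencies)
```

A production scheduler adds much more on top of this ordering step, such as retries, scheduling, and distributed execution, none of which is shown here.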

Prefect is a dataflow scheduler that was born out of Jeremiah’s experience working with Airflow. Prefect’s features include data sharing between tasks, task parameterization, and an API that differs from Airflow’s. Jeremiah joins the show to discuss Prefect, and how his experience with Airflow led to his current work in dataflow scheduling.
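The two features named above, data sharing between tasks and task parameterization, can be sketched in plain Python. This is an assumption-laden toy and not Prefect's actual API: `run_flow`, the task names, and the parameter `n` are all hypothetical. Each task's return value is handed to its downstream tasks, and parameters are supplied at run time.

```python
from graphlib import TopologicalSorter


def run_flow(tasks, upstream, parameters):
    """Run tasks in dependency order, passing upstream results as arguments.

    tasks:      maps a task name to a callable (hypothetical helper names)
    upstream:   maps a task name to an ordered list of inputs, each either
                another task's name or a parameter name
    parameters: run-time values, treated as already-computed inputs
    """
    results = dict(parameters)
    # Only task-to-task edges matter for ordering; parameters are already known.
    graph = {
        name: set(upstream.get(name, ())) - set(parameters) for name in tasks
    }
    for name in TopologicalSorter(graph).static_order():
        args = [results[dep] for dep in upstream.get(name, ())]
        results[name] = tasks[name](*args)
    return results


tasks = {
    "load": lambda n: list(range(n)),               # parameterized by n
    "double": lambda rows: [2 * r for r in rows],   # consumes load's output
    "total": lambda rows: sum(rows),                # consumes double's output
}
upstream = {"load": ["n"], "double": ["load"], "total": ["double"]}

results = run_flow(tasks, upstream, {"n": 4})
```

Running the same flow again with a different value, for example `run_flow(tasks, upstream, {"n": 3})`, reuses the task definitions unchanged, which is the point of parameterization.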

Sponsorship inquiries: sponsor@softwareengineeringdaily.com



 April 29, 2020  1h4m