This show goes behind the scenes for the tools, techniques, and difficulties associated with the discipline of data engineering. Databases, workflows, automation, and data manipulation are just some of the topics that you will find here.
Welcome to The Data Flowcast: Mastering Airflow for Data Engineering & AI — the podcast where we keep you up to date with insights and ideas propelling the Airflow community forward. Join us each week, as we explore the current state, future and potential of Airflow with leading thinkers in the community, and discover how best to leverage this workflow management system to meet the ever-evolving needs of data engineering and AI ecosystems. Podcast Webpage: https://www.astronomer.io/podcast/
The Data Engineering Show is a podcast for data engineering and BI practitioners to go beyond theory. Learn from the biggest influencers in tech about their practical day-to-day data challenges and solutions in a casual and fun setting. SEASON 1 DATA BROS Eldad and Boaz Farkash shared the same stuffed toys growing up as well as a big passion for data. After founding Sisense and building it to become a high-growth analytics unicorn, they moved on to their next venture, Firebolt, a leading hig ...
Discussions around Data Engineering
Databases and data engineering episodes of Software Engineering Daily
"Unlocking the Power of Data: A Guide for Leaders and Executives" As a leader or executive, you know the importance of data in driving business decisions and staying ahead of the competition. But with the increasing amount of data generated daily, it can be overwhelming to know where to start and how to use this valuable asset effectively. This blog covers multiple topics and explains the technical terminology of data engineering and analytics on the cloud.
Evolving Responsibilities in AI Data Management
38:57
Summary: In this episode of the Data Engineering Podcast Bartosz Mikulski talks about preparing data for AI applications. Bartosz shares his journey from data engineering to MLOps and emphasizes the importance of data testing over software development in AI contexts. He discusses the types of data assets required for AI applications, including exten…
Hybrid Testing Solutions for Autonomous Driving at Bosch with Jens Scheffler and Christian Schilling
33:45
Testing autonomous vehicles demands precision, scalability and powerful orchestration tools: enter Apache Airflow, a key component of Bosch’s cutting-edge testing framework. In this episode, we sit down with Jens Scheffler, Test Execution Cluster Technical Architect, and Christian Schilling, Product Owner Open Loop Testing Automated Driving, both …
AI and Data Movement: Trends and Best Practices with Estuary’s Daniel Pálma
30:33
In this episode of The Data Engineering Show, the bros sit down with Daniel Pálma, Head of Marketing at Estuary. Join them as they: Talk about Daniel’s career transition from data engineering to marketing and how his data engineering background has helped his marketing work. Discuss the role of AI in the evolution of data mov…
Overcoming Airflow Scaling Challenges at Monzo Bank with Jonathan Rainer
43:39
Scaling a data orchestration platform to manage thousands of tasks daily demands innovative solutions and strategic problem-solving. In this episode, we explore the complexities of scaling Airflow and the challenges of orchestrating thousands of tasks in dynamic data environments. Jonathan Rainer, Former Platform Engineer at Monzo Bank, joins us to…
Orchestrating Analytics and AI Workflows at Telia with Arjun Anandkumar
26:00
The future of data engineering lies in seamless orchestration and automation. In this episode, Arjun Anandkumar, Data Engineer at Telia, shares how his team uses Airflow to drive analytics and AI workflows. He highlights the challenges of scaling data platforms and how adopting best practices can simplify complex processes for teams across the orga…
The Role of Airflow in Finance Transformation at Etraveli Group with Mihir Samant
21:19
Transforming bottlenecked finance processes into streamlined, automated systems requires the right tools and a forward-thinking approach. In this episode, Mihir Samant, Senior Data Analyst at Etraveli Group, joins us to share how his team leverages Airflow to revolutionize finance automation. With extensive experience in data workflows and a passio…
Inside Ford’s Data Transformation: Advanced Orchestration Strategies with Vasantha Kosuri-Marshall
38:54
Data engineering is entering a new era, where orchestration and automation are redefining how large-scale projects operate. This episode features Vasantha Kosuri-Marshall, Data and ML Ops Engineer at Ford Motor Company. Vasantha shares her expertise in managing complex data pipelines. She takes us through Ford's transition to cloud platforms, the a…
CSVs Will Never Die And OneSchema Is Counting On It
54:40
Summary: In this episode of the Data Engineering Podcast Andrew Luo, CEO of OneSchema, talks about handling CSV data in business operations. Andrew shares his background in data engineering and CRM migration, which led to the creation of OneSchema, a platform designed to automate CSV imports and improve data validation processes. He discusses the ch…
Powering Finance With Advanced Data Solutions at Ramp with Ryan Delgado
24:35
Data is the backbone of every modern business, but unlocking its full potential requires the right tools and strategies. In this episode, Ryan Delgado, Director of Engineering at Ramp, joins us to explore how innovative data platforms can transform business operations and fuel growth. He shares insights on integrating Apache Airflow, optimizing dat…
AI and Data Change Management with Chad Sanderson, CEO Gable AI
36:43
In this episode of The Data Engineering Show, host Benjamin and co-host Eldad sit down with Chad Sanderson, CEO and co-founder of Gable AI, to explore the world of data change management. Join them as they: Delve into the challenges of data quality, how it degrades over time and the one-sided data quality checks on the “last mile” of the data sup…
Breaking Down Data Silos: AI and ML in Master Data Management
57:30
Summary: In this episode of the Data Engineering Podcast Dan Bruckner, co-founder and CTO of Tamr, talks about the application of machine learning (ML) and artificial intelligence (AI) in master data management (MDM). Dan shares his journey from working at CERN to becoming a data expert and discusses the challenges of reconciling large-scale organiz…
Building a Data Vision Board: A Guide to Strategic Planning
49:59
Summary: In this episode of the Data Engineering Podcast Lior Barak shares his insights on developing a three-year strategic vision for data management. He discusses the importance of having a strategic plan for data, highlighting the need for data teams to focus on impact rather than just enablement. He introduces the concept of a "data vision boar…
Exploring the Power of Airflow 3 at Astronomer with Amogh Desai
30:24
What does it take to go from fixing a broken link to becoming a committer for one of the world’s leading open-source projects? Amogh Desai, Senior Software Engineer at Astronomer, takes us through his journey with Apache Airflow. From small contributions to building meaningful connections in the open-source community, Amogh’s story provides actiona…
How Orchestration Impacts Data Platform Architecture
59:39
Summary: The core task of data engineering is managing the flows of data through an organization. Ensuring that those flows execute on schedule and without error is the role of the data orchestrator. Which orchestration engine you choose impacts the ways that you architect the rest of your data platform. In this episode Hugo Lu shares his…
Using Airflow To Power Machine Learning Pipelines at Optimove with Vasyl Vasyuta
24:11
Data orchestration and machine learning are shaping how organizations handle massive datasets and drive customer-focused strategies. Tools like Apache Airflow are central to this transformation. In this episode, Vasyl Vasyuta, R&D Team Leader at Optimove, joins us to discuss how his team leverages Airflow to optimize data processing, orchestrate ma…
An Exploration Of The Impediments To Reusable Data Pipelines
51:32
Summary: In this episode of the Data Engineering Podcast the inimitable Max Beauchemin talks about reusability in data pipelines. The conversation explores the "write everything twice" problem, where similar pipelines are built without code reuse, and discusses the challenges of managing different SQL dialects and relational databases. Max also touc…
Maximizing Business Impact Through Data at GlossGenius with Katie Bauer
25:49
Bridging the gap between data teams and business priorities is essential for maximizing impact and building value-driven workflows. Katie Bauer, Senior Director of Data at GlossGenius, joins us to share her principles for creating effective, aligned data teams. In this episode, Katie draws from her experience at GlossGenius, Reddit and Twitter to h…
Optimizing Large-Scale Deployments at LinkedIn with Rahul Gade
27:47
Scaling deployments for a billion users demands innovation, precision and resilience. In this episode, we dive into how LinkedIn optimizes its continuous deployment process using Apache Airflow. Rahul Gade, Staff Software Engineer at LinkedIn, shares his insights on building scalable systems and democratizing deployments for over 10,000 engineers. …
The Art of Database Selection and Evolution
59:56
Summary: In this episode of the Data Engineering Podcast Sam Kleinman talks about the pivotal role of databases in software engineering. Sam shares his journey into the world of data and discusses the complexities of database selection, highlighting the trade-offs between different database architectures and how these choices affect system design, q…
Bridging Code and UI in Data Orchestration with Kestra
44:30
Summary: In this episode of the Data Engineering Podcast, Anna Geller talks about the integration of code and UI-driven interfaces for data orchestration. Anna defines data orchestration as automating the coordination of workflow nodes that interact with data across various business functions, discussing how it goes beyond ETL and analytics to enabl…
Tech Stacks and Tradeoffs: Xudo's Founder on Picking the Right Tools for BI Success
24:56
Wouter Trappers, the founder of Xudo, shares his slightly unconventional path from philosopher to data consultant with the Bros in this latest episode of The Data Engineering Show. Wouter’s grounding in philosophy has proved to be a shaping influence on his approach to business intelligence. Much more than just a software solution, for Wouter,…
Streaming Data Into The Lakehouse With Iceberg And Trino At Going
39:49
In this episode, I had the pleasure of speaking with Ken Pickering, VP of Engineering at Going, about the intricacies of streaming data into a Trino and Iceberg lakehouse. Ken shared his journey from product engineering to becoming deeply involved in data-centric roles, highlighting his experiences in ecommerce and InsurTech. At Going, Ken leads th…
How Uber Manages 1 Million Daily Tasks Using Airflow, with Shobhit Shah and Sumit Maheshwari
28:44
When data orchestration reaches Uber’s scale, innovation becomes a necessity, not a luxury. In this episode, we discuss the innovations behind Uber’s unique Airflow setup. With our guests Shobhit Shah and Sumit Maheshwari, both Staff Software Engineers at Uber, we explore how their team manages one of the largest data workflow systems in the world.…
An Opinionated Look At End-to-end Code Only Analytical Workflows With Bruin
56:11
Summary: The challenges of integrating all of the tools in the modern data stack have led to a new generation of tools that focus on a fully integrated workflow. At the same time, there have been many approaches to how much of the workflow is driven by code vs. not. Burak Karakan is of the opinion that a fully integrated workflow that is driven entir…
Building Resilient Data Systems for Modern Enterprises at Astrafy with Andrea Bombino
28:29
Efficient data orchestration is the backbone of modern analytics and AI-driven workflows. Without the right tools, even the best data can fall short of its potential. In this episode, Andrea Bombino, Co-Founder and Head of Analytics Engineering at Astrafy, shares insights into his team’s approach to optimizing data transformation and orchestration …
Feldera: Bridging Batch and Streaming with Incremental Computation
47:36
Summary: In this episode of the Data Engineering Podcast, the creators of Feldera talk about their incremental compute engine designed for continuous computation of data, machine learning, and AI workloads. The discussion covers the concept of incremental computation, the origins of Feldera, and its unique ability to handle both streaming and batch …
Data Rewind: Conversation Highlights from Zach Wilson, Matthew Housley, Joe Reis, and Krishnan Viswanathan
28:02
In this special roundup episode of The Data Engineering Show, the Bros revisit some of the best bits from episodes with data thought leaders Zach Wilson, Matthew Housley, Joe Reis, and Krishnan Viswanathan, spotlighting essential trends and lessons learned across the evolving data engineering landscape. From data observability to bridging academia…
Inside Airflow 3: Redefining Data Engineering with Vikram Koka
30:08
Data orchestration is evolving faster than ever and Apache Airflow 3 is set to revolutionize how enterprises handle complex workflows. In this episode, we dive into the exciting advancements with Vikram Koka, Chief Strategy Officer at Astronomer and PMC Member at The Apache Software Foundation. Vikram shares his insights on the evolution of Airflow…
Accelerate Migration Of Your Data Warehouse with Datafold's AI Powered Migration Agent
48:50
Summary: Gleb Mezhanskiy, CEO and co-founder of DataFold, joins Tobias Macey to discuss the challenges and innovations in data migrations. Gleb shares his experiences building and scaling data platforms at companies like Autodesk and Lyft, and how these experiences inspired the creation of DataFold to address data quality issues across teams. He out…
Building a Data-Driven HR Platform at 15Five with Guy Dassa
20:25
Data and AI are revolutionizing HR, empowering leaders to measure performance and drive strategic decisions like never before. In this episode, we explore the transformation of HR technology with Guy Dassa, Chief Technology Officer at 15Five, as he shares insights into their evolving data platform. Guy discusses how 15Five equips HR leaders with to…
Bring Vector Search And Storage To The Data Lake With Lance
58:01
Summary: The rapid growth of generative AI applications has prompted a surge of investment in vector databases. While there are numerous engines available now, Lance is designed to integrate with data lake and lakehouse architectures. In this episode Weston Pace explains the inner workings of the Lance format for table definitions and file storage, …
The Role of Python in Shaping the Future of Data Platforms with DLT
54:08
Summary: In this episode of the Data Engineering Podcast, Adrian Broderieux and Marcin Rudolph, co-founders of DLT Hub, delve into the principles guiding DLT's development, emphasizing its role as a library rather than a platform, and its integration with lakehouse architectures and AI application frameworks. The episode explores the impact of the P…
Build Your Data Transformations Faster And Safer With SDF
42:36
Summary: In this episode of the Data Engineering Podcast Lukas Schulte, co-founder and CEO of SDF, explores the development and capabilities of this fast and expressive SQL transformation tool. From its origins as a solution for addressing data privacy, governance, and quality concerns in modern data management, to its unique features like static an…
The Intersection of AI and Data Management at Dosu with Devin Stein
20:18
Unlocking engineering productivity goes beyond coding; it’s about managing knowledge efficiently. In this episode, we explore the innovative ways in which Dosu leverages Airflow for data orchestration and supports the Airflow project. Devin Stein, Founder of Dosu, shares his insights on how engineering teams can focus on value-added work by automa…
The Resurgence of SQL: Insights from Ryanne Dolan from LinkedIn
32:57
In this episode of The Data Engineering Show, the bros, Eldad and Benjamin, are joined by Ryanne Dolan from LinkedIn to discuss the innovative Hoptimator (H2) project. This conversation reveals how LinkedIn has improved its data pipelines by automating the setup and management of complex workflows. Together they cover: Automated Data Pipelines: Ryan…