How Denormalized is Building ‘DuckDB for Streaming’ with Apache DataFusion

Tech on the Rocks

Added thirty-four weeks ago

Content provided by Kostas Pardalis, Nitay Joffe. All podcast content, including episodes, graphics, and podcast descriptions, is uploaded and provided directly by Kostas Pardalis, Nitay Joffe, or their podcast platform partner. If you believe someone is using your copyrighted work without your permission, you can follow the process outlined here https://ms.player.fm/legal.

about a year ago 1:02:01

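In this episode, Kostas and Nitay are joined by Amey Chaugule and Matt Green, co-founders of Denormalized. They delve into how Denormalized is building an embedded stream processing engine (think “DuckDB for streaming”) to simplify real-time data workloads. Drawing on their extensive backgrounds at companies like Uber, Lyft, Stripe, and Coinbase, Amey and Matt discuss the challenges of existing stream processing systems like Spark, Flink, and Kafka. They explain how their approach leverages Apache DataFusion to create a single-node solution that reduces the complexities inherent in distributed systems.

The conversation explores topics such as developer experience, fault tolerance, state management, and the future of stream processing interfaces. Whether you’re a data engineer, application developer, or simply interested in the evolution of real-time data infrastructure, this episode offers valuable insights into making stream processing more accessible and efficient.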

Contacts & Links
Amey Chaugule
Matt Green
Denormalized
Denormalized GitHub Repo

Chapters
00:00 Introduction and Background
12:03 Building an Embedded Stream Processing Engine
18:39 The Need for Stream Processing in the Current Landscape
22:45 Interfaces for Interacting with Stream Processing Systems
26:58 The Target Persona for Stream Processing Systems
31:23 Simplifying Stream Processing Workloads and State Management
34:50 State and Buffer Management
37:03 Distributed Computing vs. Single-Node Systems
42:28 Cost Savings with Single-Node Systems
47:04 The Power and Extensibility of DataFusion
55:26 Integrating Data Store with DataFusion
57:02 The Future of Streaming Systems
01:00:18 intro-outro-fade.mp3

16 episodes

#Tech #Kostas Pardalis, Nitay Joffe #Technology #Infrastructure #Cloud IT #Systems #Data

All episodes

From Data Mesh to Lake House: Revolutionizing Metadata with Lakekeeper

29 days ago 57:25

Summary
In this episode, Viktor Kessler shares his journey and insights from his extensive experience in data management, from building risk management systems and data warehouses to working as a solutions architect at MongoDB and Dremio, and now co-founding a startup. Initially exploring data mesh concepts, Viktor explains how real-world challenges (such as the disconnect between technical data models and business needs, inconsistent definitions across departments, and the difficulty in managing actionable metadata) led him and his co-founder to pivot toward building a lake house solution. His startup is developing Lakekeeper, an open source REST catalog for Apache Iceberg, which aims to bridge the gap between decentralized data production and centralized metadata management. The conversation also delves into the evolution of data catalogs, the necessity for self-service analytics, and how creating consumption-ready data products can transform data functions from cost centers into profit centers. Finally, Viktor outlines ways for interested listeners to get involved with the Lakekeeper community through GitHub, upcoming meetups, and a dedicated Discord channel.

Chapters
00:00 Introduction to Viktor Kessler and His Journey
04:57 Transitioning from Data Mesh to Lake House
09:15 Understanding Data Mesh: Pain Points and Solutions
13:47 The Role of Metadata in Data Management
18:16 The Evolution of Catalogs and Metadata Management
28:14 Stabilizing the Consumption Pipeline
31:18 Centralizing Metadata for Decentralized Organizations
37:09 Bridging the Gap: Technical and Business Perspectives
43:17 Rethinking Data Products and Consumption
50:45 Finding Balance: Control and Flexibility in Data Management…

Reinventing Stream Processing: From LinkedIn to Responsive with Apurva Mehta

6 weeks ago 58:13

Summary
In this episode, Apurva Mehta, co-founder and CEO of Responsive, recounts his extensive journey in stream processing, from his early work at LinkedIn and Confluent to his current venture at Responsive. He explains how stream processing evolved from simple event ingestion and graph indexing to powering complex, stateful applications such as search indexing, inventory management, and trade settlement. Apurva clarifies the often-misunderstood concept of “real time,” arguing that low latency (often in the one- to two-second range) is more accurate for many applications than the instantaneous response many assume. He delves into the challenges of state management, discussing the limitations of embedded state stores like RocksDB and traditional databases (e.g., Postgres) when faced with high update rates and complex transactional requirements. The conversation also covers the trade-offs between SQL-based streaming interfaces and more flexible APIs, and how Responsive is innovating by decoupling state from compute, leveraging remote state solutions built on object stores (like S3) with specialized systems such as SlateDB, to improve elasticity, cost efficiency, and operational simplicity in mission-critical applications.

Chapters
00:00 Introduction to Apurva Mehta and Streaming Background
08:50 Defining Real-Time in Streaming Contexts
14:18 Challenges of Stateful Stream Processing
19:50 Comparing Streaming Processing with Traditional Databases
26:38 Product Perspectives on Streaming vs Analytical Systems
31:10 Operational Rigor and Business Opportunities
38:31 Developers' Needs: Beyond SQL
45:53 Simplifying Infrastructure: The Cost of Complexity
51:03 The Future of Streaming Applications

Semantic Layers: The Missing Link Between AI and Data with David Jayatillake from Cube

8 weeks ago 59:03

In this episode, we chat with David Jayatillake, VP of AI at Cube, about semantic layers and their crucial role in making AI work reliably with data. We explore how semantic layers act as a bridge between raw data and business meaning, and why they're more practical than pure knowledge graphs. David shares insights from his experience at Delphi Labs, where they achieved 100% accuracy in natural language data queries by combining semantic layers with AI, compared to just 16% accuracy with direct text-to-SQL approaches. We discuss the challenges of building and maintaining semantic layers, the importance of proper naming and documentation, and how AI can help automate their creation. Finally, we explore the future of semantic layers in the context of AI agents and enterprise data systems, and learn about Cube's upcoming AI-powered features for 2025.

Chapters
00:00 Introduction to AI and Semantic Layers
05:09 The Evolution of Semantic Layers Before and After AI
09:48 Challenges in Implementing Semantic Layers
14:11 The Role of Semantic Layers in Data Access
18:59 The Future of Semantic Layers with AI
23:25 Comparing Text to SQL and Semantic Layer Approaches
27:40 Limitations and Constraints of Semantic Layers
30:08 Understanding LLMs and Semantic Errors
35:03 The Importance of Naming in Semantic Layers
37:07 Debugging Semantic Issues in LLMs
38:07 The Future of LLMs as Agents
41:53 Discovering Services for LLM Agents
50:34 What's Next for Cube and AI Integration…

From black holes to AI in mathematics: AI Innovation in Mathematics and Health with Yaron Hadad

11 weeks ago 59:24

In this episode, we chat with Yaron Hadad, a fascinating individual who transitioned from theoretical physics to entrepreneurship. We explore his groundbreaking work on black holes and gravitational waves, and learn about the Ramanujan Machine, an algorithmic system he helped develop that discovers new mathematical formulas and democratizes mathematical research. We'll hear about the scientific community's mixed reactions to this innovative approach. The conversation then shifts to his work with Neutrino, a company he founded that uses AI and continuous monitoring devices to understand how food affects individual health. We delve into the complexities of nutrition science, the challenges of processing multiple data streams, and the future of personalized health monitoring. Throughout the episode, Yaron shares insights on bridging theoretical research with practical applications, and the role of AI in advancing both pure mathematics and healthcare.

Chapters
00:00 Yaron Hadad's Journey: From Physics to AI in Healthcare
04:50 The Complexity of Einstein's Equations and Their Solutions
10:12 AI in Mathematics: The Ramanujan Machine and Conjectures
15:41 Navigating Criticism: The Scientific Community's Response to Innovation
29:24 The Impact of Algorithms in Mathematics
35:30 The Planck Machine: A New Approach
41:15 Neutrino: A Personal Journey in Nutrition
50:11 Connecting Food Complexity to Health Metrics…

Building a Native Search Engine in PostgreSQL: ParadeDB's Journey to Replace Elasticsearch with Philippe Noël

13 weeks ago 1:00:21

In this episode, we chat with Philippe Noël, founder of ParadeDB, about building an Elasticsearch alternative natively on PostgreSQL. We explore the challenges and benefits of extending PostgreSQL versus building a separate system, diving into topics like full-text search, faceted analytics, and why organizations need these capabilities. We discuss the emerging bring-your-own-cloud deployment model, the state of the PostgreSQL extension ecosystem, and what makes a truly production-ready database extension. Philippe shares insights on the future of search technology and how recent AI developments are actually increasing the demand for traditional search capabilities. The conversation also covers the misconceptions around PostgreSQL's scalability and the trade-offs between multi-tenant and single-tenant architectures in modern data infrastructure.

Chapters
00:00 Introduction to ParadeDB and Its Mission
06:35 User-Facing Search and Analytics
11:45 The Role of Postgres in Modern Data Solutions
17:30 Future of Multimodal Databases
31:04 The Rise of Fintech and Data Integrity
36:36 Deployment Models: BYOC and Control Plane
43:41 The Evolution of Cloud Infrastructure and Serverless Databases
49:38 The Future of Search and Community Engagement

Optimizing SQL with LLMs: Building Verified AI Systems at Espresso AI with Ben Lerner

15 weeks ago 1:06:04

In this episode, we chat with Ben, founder of Espresso AI, about his journey from building Excel Python integrations to optimizing data warehouse compute costs. We explore his experience at companies like Uber and Google, where he worked on everything from distributed systems to ML and storage infrastructure. We learn about the evolution of his latest venture, which started as a C++ compiler optimization project and transformed into a system for optimizing Snowflake workloads using ML. Ben shares insights about applying LLMs to SQL optimization, the challenges of verified code transformation, and the importance of formal verification in ML systems. Finally, we discuss his practical approach to choosing ML models and the critical lesson he learned about talking to users before building products.

Chapters
00:00 Ben's Journey: From Startups to Big Tech
13:00 The Importance of Timing in Entrepreneurship
19:22 Consulting Insights: Learning from Clients
23:32 Transitioning to Big Tech: Experiences at Uber and Google
30:58 The Future of AI: End-to-End Systems and Data Utilization
35:53 Transitioning Between Domains: From ML to Distributed Systems
44:24 Espresso's Mission: Optimizing SQL with ML
51:26 The Future of Code Optimization and AI

Security as Code: Building Developer-First Security Tools with David Mytton

17 weeks ago 1:03:51

In this episode, we chat with David Mytton, founder and CEO of Arcjet and creator of console.dev. We explore his journey from building a cloud monitoring startup to founding a security-as-code company. David shares fascinating insights about bot detection, the challenges of securing modern applications, and why traditional security approaches often fail to meet developers' needs. We discuss the innovative use of WebAssembly for high-performance security checks, the importance of developer experience in security tools, and the delicate balance between security and latency. The conversation also covers his work on environmental technology and cloud computing sustainability, as well as his experience reviewing developer tools for console.dev, where he emphasizes the critical role of documentation in distinguishing great developer tools from mediocre ones.

Chapters
00:00 Introduction to David Mytton and Arcjet
07:09 The Evolution of Observability
12:37 The Future of Observability Tools
18:19 Innovations in Data Storage for Observability
23:57 Challenges in AI Implementation
31:33 The Dichotomy of AI and Human Involvement
36:17 Detecting Bots: Techniques and Challenges
42:46 AI's Role in Enhancing Security
47:52 Latency and Decision-Making in Security
52:40 Managing Software Lifecycle and Observability
58:58 The Role of Documentation in Developer Tools

Dev Environments in the AI Era: Standardizing Development Infrastructure with Daytona's Ivan

19 weeks ago 1:09:23

In this episode, we chat with Ivan, co-founder and CEO of Daytona, about the evolution of developer environments and tooling. We explore his journey from founding CodeAnywhere in 2009, one of the first browser-based IDEs, to creating the popular Shift developer conference, and now building Daytona's dev environment automation platform. We discuss the changing landscape of development environments, from local-only setups to today's complex hybrid configurations, and why managing these environments has become increasingly challenging. Ivan shares insights about open source business models, the distinction between users and buyers in dev tools, and what the future holds for AI-assisted development. We also learn about Daytona's unique approach to solving dev environment complexity through standardization and automation, and get Ivan's perspective on the future of IDE companies in an AI-driven world.

Chapters
00:00 Introduction to Ivan and Daytona
07:22 Understanding Development Environments
13:59 The User vs. Buyer Dilemma
22:20 Open Source Strategy and Community Building
29:22 How Daytona Works and Its Value Proposition
37:44 Emerging Trends in Collaborative Coding
44:38 Latency Challenges in AI-Assisted Development
50:41 The Future of Developer Tooling Companies
01:02:29 Lessons from Organizing Conferences…

Evolving Data Infrastructure for the AI Era: AWS, Meta, and Beyond with Roy Ben-Alta

21 weeks ago 1:03:28

In this episode, we chat with Roy Ben-Alta, co-founder of Oakminer AI and former director at Meta AI Research, about his fascinating journey through the evolution of data infrastructure and AI. We explore his early days at AWS when cloud adoption was still controversial, his experience building large language models at Meta, and the challenges of training and deploying AI systems at scale. Roy shares valuable insights about the future of data warehouses, the emergence of knowledge-centric systems, and the critical role of data engineering in AI. We'll also hear his practical advice on building AI companies today, including thoughts on model evaluation frameworks, vendor lock-in, and the eternal "build vs. buy" decision. Drawing from his extensive experience across Amazon, Meta, and now as a founder, Roy offers a unique perspective on how AI is transforming traditional data infrastructure and what it means for the future of enterprise software.

Chapters
00:00 Introduction to Roy Ben-Alta and AI Background
04:07 Warren Buffett Experience and MBA Insights
06:45 Lessons from Amazon and Meta Leadership
09:15 Early Days of AWS and Cloud Adoption
12:12 Redshift vs. Snowflake: A Data Warehouse Perspective
14:49 Navigating Complex Data Systems in Organizations
31:21 The Future of Personalized Software Solutions
32:19 Building Large Language Models at Meta
39:27 Evolution of Data Platforms and Infrastructure
50:50 Engineering Knowledge and LLMs
58:27 Build vs. Buy: Strategic Decisions for Startups…

From Functions to Full Applications: How Serverless Evolved Beyond AWS Lambda with Nitzan Shapira

23 weeks ago 58:18

In this episode, we chat with Nitzan Shapira, co-founder and former CEO of Epsagon, which was acquired by Cisco in 2021. We explore Nitzan's journey from working in cybersecurity to building an observability platform for cloud applications, particularly focused on serverless architectures. We learn about the early days of serverless adoption, the challenges in making observability tools developer-friendly, and why distributed tracing was a key differentiator for Epsagon. We discuss the evolution of observability tools, the future impact of AI on both observability and software development, and the changing landscape of serverless computing. Finally, we hear Nitzan's current perspective on enterprise AI adoption from his role at Cisco, where he helps evaluate and build new AI-focused business lines.

Chapters
03:17 Transition from Security to Observability
09:52 Exploring Ideas and Choosing Serverless
16:43 Adoption of Distributed Tracing
20:54 The Future of Observability
25:26 Building a Product that Developers Love
31:03 Challenges in Observability and Data Costs
32:47 The Excitement and Evolution of Serverless
35:44 Serverless as a Horizontal Platform
37:15 The Future of Serverless and No-Code/Low-Code Tools
38:15 Technical Limits and the Future of Serverless
40:38 Navigating Near-Death Moments and Go-to-Market Challenges
48:36 Cisco's Gen AI Ecosystem and New Business Lines
50:25 The State of the AI Ecosystem and Enterprise Adoption
53:54 Using AI to Enhance Engineering and Product Development
55:02 Using AI in Go-to-Market Strategies…

From GPU Compilers to architecting Kubernetes: A Conversation with Brian Grant

26 weeks ago 1:01:45

From GPU computing pioneer to Kubernetes architect, Brian Grant takes us on a fascinating journey through his career at the forefront of systems engineering. In this episode, we explore his early work on GPU compilers in the pre-CUDA era, where he tackled unique challenges in high-performance computing when graphics cards weren't yet designed for general computation. Brian then shares insights from his time at Google, where he helped develop Borg and later became the original lead architect of Kubernetes. He explains key architectural decisions that shaped Kubernetes, from its extensible resource model to its approach to service discovery, and why they chose to create a rich set of abstractions rather than a minimal interface. The conversation concludes with Brian's thoughts on standardization challenges in cloud infrastructure and his vision for moving beyond infrastructure as code, offering valuable perspective on both the history and future of distributed systems.

Links
Brian Grant LI

Chapters
00:00 Introduction and Background
03:11 Early Work in High-Performance Computing
06:21 Challenges of Building Compilers for GPUs
13:14 Influential Innovations in Compilers
31:46 The Future of Compilers
33:11 The Rise of Niche Programming Languages
34:01 The Evolution of Google's Borg and Kubernetes
39:06 Challenges of Managing Applications in a Dynamically Scheduled Environment
48:12 The Need for Standardization in Application Interfaces and Management Systems
01:00:55 Driving Network Effects and Creating Cohesive Ecosystems

Proving Code Correctness: FizzBee and the Future of Formal Methods in Software Design with FizzBee's creator JP

28 weeks ago 1:01:28

In this episode, we chat with JP, creator of FizzBee, about formal methods and their application in software engineering. We explore the differences between coding and engineering, discussing how formal methods can improve system design and reliability. JP shares insights from his time at Google and explains why tools like FizzBee are crucial for distributed systems. We delve into the challenges of adopting formal methods in industry, the potential of FizzBee to make these techniques more accessible, and how it compares to other tools like TLA+. Finally, we discuss the future of software development, including the role of LLMs in code generation and the ongoing importance of human engineers in system design.

Links
FizzBee
FizzBee GitHub Repo
FizzBee Blog

Chapters
00:00 Introduction and Overview
02:42 JP's Experience at Google and the Growth of the Company
04:51 The Difference Between Engineers and Coders
06:41 The Importance of Rigor and Quality in Engineering
10:08 The Limitations of QA and the Need for Formal Methods
14:00 The Role of Best Practices in Software Engineering
14:56 Design Specification Languages for System Correctness
21:43 The Applicability of Formal Methods in Distributed Systems
31:20 Getting Started with FizzBee: A Practical Example
36:06 Common Assumptions and Misconceptions in Distributed Systems
43:23 The Role of FizzBee in the Design Phase
48:04 The Future of FizzBee: LLMs and Code Generation
58:20 Getting Started with FizzBee: Tutorials and Online Playground

MLOps Evolution: Data, Experiments, and AI with Dean Pleban from DagsHub

29 weeks ago 53:55

In this episode, we chat with Dean Pleban, CEO of DagsHub, about machine learning operations. We explore the differences between DevOps and MLOps, focusing on data management and experiment tracking. Dean shares insights on versioning various components in ML projects and discusses the importance of user experience in MLOps tools. We also touch on DagsHub's integration of AI in their product and Dean's vision for the future of AI and machine learning in industry.

Links
DagsHub
The MLOps Podcast
Dean on LI

Chapters
00:00 Introduction and Background
03:03 Challenges of Managing Machine Learning Projects
10:00 The Concept of Experiments in Machine Learning
12:51 Data Curation and Validation for High-Quality Data
27:07 Connecting the Components of Machine Learning Projects with DagsHub
29:12 The Importance of Data and Clear Interfaces
43:29 Incorporating Machine Learning into DagsHub
51:27 The Future of ML and AI…

How Denormalized is Building ‘DuckDB for Streaming’ with Apache DataFusion

31 weeks ago 1:02:01

In this episode, Kostas and Nitay are joined by Amey Chaugule and Matt Green, co-founders of Denormalized. They delve into how Denormalized is building an embedded stream processing engine (think “DuckDB for streaming”) to simplify real-time data workloads. Drawing on their extensive backgrounds at companies like Uber, Lyft, Stripe, and Coinbase, Amey and Matt discuss the challenges of existing stream processing systems like Spark, Flink, and Kafka. They explain how their approach leverages Apache DataFusion to create a single-node solution that reduces the complexities inherent in distributed systems. The conversation explores topics such as developer experience, fault tolerance, state management, and the future of stream processing interfaces. Whether you’re a data engineer, application developer, or simply interested in the evolution of real-time data infrastructure, this episode offers valuable insights into making stream processing more accessible and efficient.

Contacts & Links
Amey Chaugule
Matt Green
Denormalized
Denormalized GitHub Repo

Chapters
00:00 Introduction and Background
12:03 Building an Embedded Stream Processing Engine
18:39 The Need for Stream Processing in the Current Landscape
22:45 Interfaces for Interacting with Stream Processing Systems
26:58 The Target Persona for Stream Processing Systems
31:23 Simplifying Stream Processing Workloads and State Management
34:50 State and Buffer Management
37:03 Distributed Computing vs. Single-Node Systems
42:28 Cost Savings with Single-Node Systems
47:04 The Power and Extensibility of DataFusion
55:26 Integrating Data Store with DataFusion
57:02 The Future of Streaming Systems
01:00:18 intro-outro-fade.mp3

Unifying structured and unstructured data for AI: Rethinking ML infrastructure with Nikhil Simha and Varant Zanoyan

33 weeks ago 1:01:45

In this episode, we dive deep into the future of data infrastructure for AI and ML with Nikhil Simha and Varant Zanoyan, two seasoned engineers from Airbnb and Facebook. Nikhil and Varant share their journey from building real-time data systems and ML infrastructure at tech giants to launching their own venture. The conversation explores the intricacies of designing developer-friendly APIs, the complexities of handling both batch and streaming data, and the delicate balance between customer needs and product vision in a startup environment.

Contacts & Links
Nikhil Simha
Varant Zanoyan
Chronon project

Chapters
00:00 Introduction and Past Experiences
04:38 The Challenges of Building Data Infrastructure for Machine Learning
08:01 Merging Real-Time Data Processing with Machine Learning
14:08 Backfilling New Features in Data Infrastructure
20:57 Defining Failure in Data Infrastructure
26:45 The Choice Between SQL and Data Frame APIs
34:31 The Vision for Future Improvements
38:17 Introduction to Chronon and Open Source
43:29 The Future of Chronon: New Computation Paradigms
48:38 Balancing Customer Needs and Vision
57:21 Engaging with Customers and the Open Source Community
01:01:26 Potential Use Cases and Future Directions

Stream processing, LSMs and leaky abstractions with Chris Riccomini

34 weeks ago 53:06

In this episode, we chat with Chris Riccomini about the evolution of stream processing and the challenges in building applications on streaming systems. We also chat about leaky abstractions, good and bad API designs, what Chris loves and hates about Rust, and, finally, his exciting new project that involves object storage and LSMs.

Connect with Chris at:
LinkedIn
X
Blog
Materialized View Newsletter - His newsletter
The Missing README - His book
SlateDB - His latest OSS Project

Chapters
00:00 Introduction and Background
04:05 The State of Stream Processing Today
08:53 The Limitations of SQL in Streaming Systems
14:00 Prioritizing the Developer Experience in Stream Processing
18:15 Improving the Usability of Streaming Systems
27:54 The Potential of State Machine Programming in Complex Systems
32:41 The Power of Rust: Compiling and Language Bindings
34:06 The Shift from Sidecar to Embedded Libraries Driven by Rust
35:49 Building an LSM on Object Storage: Cost-Effective State Management
39:47 The Unbundling and Composable Nature of Databases
47:30 The Future of Data Systems: More Companies and Focus on Metadata
