Blog

Insights and updates from the Synnada team

Company

Recap of the Data & Drinks Meetup Featuring Apache DataFusion: Amsterdam 2025

The first Data & Drinks event of 2025 took place in Amsterdam on January 23, hosted by Xomnia at their HQ. This edition focused on Apache DataFusion, drawing a highly technical audience eager to explore its real-world applications and inner workings.

Jan 29, 2025 · Synnada

Recap of the First Apache DataFusion Meetup in Europe: Belgrade 2024

On September 27, 2024, the first Apache DataFusion Meetup in Europe took place in Belgrade, bringing together nearly 70 attendees. The event was held at the Microsoft office, where speakers showcased their work and shared insights on how they are utilizing DataFusion in various projects.

Oct 3, 2024 · Synnada

Apache DataFusion is a Top Level Project!

Apache DataFusion has been elevated to a Top-Level Project by the Apache Software Foundation, underscoring its maturity and essential role in data processing. This recognition reflects DataFusion's rapid growth, robust performance, and active community engagement.

Aug 2, 2024 · Synnada

Apache Arrow, Arrow/DataFusion, AI-native Data Infra — An Interview with Our CEO Ozan

Our CEO Ozan recently joined the Streaming Caffeine podcast (E10: "Ozan from Synnada, about Arrow Datafusion, Rust, Databases, SQL, AI") to discuss our perspective on DataFusion and the future of data infrastructure.

Nov 8, 2023 · Synnada

Modern Data Stack and the Data Chasm Part 2: A Path to Leaner Data Systems

This post explores how pioneering teams at Airbnb, Uber, and Apache Arrow overcame the data chasm, followed by an introduction to the Lean Data Stack paradigm as a way to build durable, economical, and flexible data systems.

Oct 20, 2023 · Sami Can Tandoğdu, Mehmet Ozan Kabak

Modern Data Stack and the Data Chasm Part 1: Emergence of Complexity in Data Systems

The data ecosystem is rapidly expanding and fragmenting, posing integration challenges industry-wide. Many companies fall into a "data chasm", where they must abruptly scale from 2-4 tools to 15-20, which compounds complexity. Some organizations pioneered methodologies to cross this chasm and extract value. How can others navigate it?

Sep 11, 2023 · Sami Can Tandoğdu, Mehmet Ozan Kabak

AI / ML: The Race for Specialized Electricity Supremacy

This blog post explores the AI/ML landscape, comparing it to a gold rush where the focus is on providing "specialized electricity" in the form of computing, storage, and networking resources.

Apr 5, 2023 · Sami Can Tandoğdu, Mehmet Ozan Kabak

Next Frontier: Action-Capable Intelligent Agents

The world of AI and data is undergoing a rapid transformation. Enabling technologies are maturing to a level where we should be able to deploy action-capable, autonomous intelligent agents at scale. But what will it take to make this a reality?

Feb 13, 2023 · Sami Can Tandoğdu, Mehmet Ozan Kabak

Engineering

Running Windowing Queries in Stream Processing

Windowing queries in stream processing play a pivotal role in handling time-series data. This post shows how to use streaming-friendly window functions in queries using only ANSI SQL, emphasizing the importance of ordering for achieving optimal results on streaming datasets.

Aug 13, 2023 · Mustafa Akur, Mehmet Ozan Kabak

Sliding Window Hash Join: Efficiently Joining Infinite Streams with Order Preservation

The Sliding Window Hash Join (SWHJ) algorithm joins potentially infinite streams while preserving order. It builds hash tables incrementally and stores only the build-side rows that fall within a sliding window, so streams can be processed efficiently without materializing all the data.

Jul 28, 2023 · Metehan Yıldırım, Mehmet Ozan Kabak

Probabilistic Data Structures in Streaming: Count-Min Sketch

The Count-Min Sketch uses hash functions to map streamed items into a 2D counter array. As the stream is processed, each item is hashed once per row and the matching counter in each row is incremented. An item's frequency is then estimated by taking the minimum of its counters across all rows.

Jul 13, 2023 · Metehan Yıldırım, Mehmet Ozan Kabak
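
As a rough illustration of the Count-Min Sketch idea summarized above, here is a minimal Python sketch. It is not taken from the post itself: the class name, the default `width` and `depth`, and the use of salted BLAKE2b as the per-row hash function are all assumptions made for this example.

```python
import hashlib


class CountMinSketch:
    """Illustrative Count-Min Sketch; names and defaults are assumptions."""

    def __init__(self, width=1024, depth=4):
        self.width = width    # counters per row
        self.depth = depth    # number of rows, one hash function per row
        self.table = [[0] * width for _ in range(depth)]

    def _index(self, item, row):
        # Derive one hash function per row by salting BLAKE2b with the row number.
        digest = hashlib.blake2b(
            item.encode("utf-8"),
            digest_size=8,
            salt=row.to_bytes(8, "little"),
        ).digest()
        return int.from_bytes(digest, "little") % self.width

    def add(self, item, count=1):
        # Increment the item's counter in every row.
        for row in range(self.depth):
            self.table[row][self._index(item, row)] += count

    def estimate(self, item):
        # Collisions can only inflate counters, so the minimum across rows
        # is the tightest available estimate and never undercounts.
        return min(
            self.table[row][self._index(item, row)]
            for row in range(self.depth)
        )


sketch = CountMinSketch()
for word in ["a", "b", "a", "c", "a"]:
    sketch.add(word)
print(sketch.estimate("a"))
```

The estimate is an upper bound on the true frequency: a wider table lowers the chance of collisions, and more rows lower the chance that every row collides for the same item.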

General-purpose Stream Joins via Pruning Symmetric Hash Joins

Sliding window joins for stream processing bring DataFusion a step closer to unified data processing. Find out how to join streams efficiently with less memory usage and how to intelligently buffer both join sides.

Feb 20, 2023 · Metehan Yıldırım, Mehmet Ozan Kabak, Mustafa Akur