What is lambda architecture?

Lambda architecture is a big data systems design pattern that combines both batch and real-time stream processing to meet the latency and throughput needs of large-scale data processing pipelines.

It serves requests using a read-optimized view generated by batch processing while using fast stream processing on recent data to fill the batch latency gap. The batch and stream results are stitched together at query time.

Lambda architecture emerged as a popular pattern but adds complexity of managing separate pipelines. The Kappa architecture and unified processing approaches simplify this by collapsing the batch and streaming into a single pipeline.

What does it do? How does it work?

In a lambda architecture, incoming data is dispatched to both a batch processing system that stores the master dataset, and a lower-latency stream processing system that handles recent data.

Queries are resolved by merging results from both - taking the pre-computed batch results for historical data, and fast real-time stream results for recent data.

Why is it important? Where is it used?

Lambda architecture gained popularity as a pattern for handling large data volumes with high throughput batch processing and low latency real-time analytics.

It can be useful for workloads that require a combination of real-time data queries and complex analytics on the full historical dataset. However, complexity and operational costs have led to alternative unified approaches.

FAQ

What are the components of a lambda architecture?

The core components of a lambda architecture are:

Batch processing layer for master dataset
Stream processing layer for recent real-time data
Serving layer to respond to queries
Coordination glue code to orchestrate and reconcile batch and stream processing

What are the pros and cons of lambda architecture?

Pros:

Tuned for latency and throughput
Robust and fault-tolerant

Cons:

Complexity of operating disparate systems
Reconciling batch and stream results
Repeated reprocessing of data

How is it different from Kappa architecture?

The Kappa architecture aims to simplify lambda architecture by eliminating the batch processing component and relying solely on stream processing for both real-time and historical processing.

When is lambda architecture a good choice?

Lambda architecture excels for workloads requiring:

Very low latency reads on recent data
High throughput storage and queries on the full master dataset
Complex analytical batch processing jobs

Lambda Architecture

What is lambda architecture?

What does it do? How does it work?

Why is it important? Where is it used?

FAQ

What are the components of a lambda architecture?

What are the pros and cons of lambda architecture?

How is it different from Kappa architecture?

When is lambda architecture a good choice?

References:

Related Entries

Kappa Architecture

Unified Processing

Batch Processing

Fresh Perspectives And Stories
From Our Team

Get early access to AI-native data infrastructure

Lambda Architecture

What is lambda architecture?

What does it do? How does it work?

Why is it important? Where is it used?

FAQ

What are the components of a lambda architecture?

What are the pros and cons of lambda architecture?

How is it different from Kappa architecture?

When is lambda architecture a good choice?

References:

Related Entries

Kappa Architecture

Unified Processing

Batch Processing

Fresh Perspectives And StoriesFrom Our Team

Get early access to AI-native data infrastructure

Fresh Perspectives And Stories
From Our Team