Kader Khan

Event Sourcing - System Design Pattern

“Imagine every action in your system writes to a timeline. This timeline can be read later to rebuild any version of the system — like time travel.”

The Problem with Traditional CRUD Systems

In traditional systems (like most apps we’ve built):

  • We update the database to change state (e.g., set status = “processed”)
  • We overwrite old values
  • We lose history — we only store the latest state

📌 This leads to real problems such as:

  1. No audit trail
    We often can’t answer questions like: “What exactly happened to this order between 10:01 and 10:03?”

  2. Inconsistencies due to partial failures
    If part of a workflow fails (e.g., processing succeeds, but updating state fails), the system goes into an inconsistent state with no clear way to fix it.

  3. Hard to debug or replay history
    We cannot rewind to a point in time and reconstruct what the state should have been.

👉 As systems scale and workloads grow, these problems get worse. We need a better way to track changes than just “update this value now.”


Event Sourcing — The Core Idea (and How It Solves This)

Event Sourcing says:
👉 Instead of saving only the current state in the database, save every change as an event in order.

These events are:

✔ Immutable (never changed after they’re written)
✔ Ordered (every event has a timestamp or sequence)
✔ Replayable (we rebuild the current state by replaying them in order)

So instead of doing:

| Product Price |
| ------------- |
| 100           |

We store events like:

  • PriceChanged from 90 ➝ 100 at 10:01AM
  • PriceChanged from 100 ➝ 110 at 10:10AM

To compute the current state, we simply replay those events.
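
Here’s a tiny sketch of that replay in Python (the event shape and field names are just assumptions for illustration):

```python
# Each event records a change; we never overwrite the final value itself.
events = [
    {"type": "PriceChanged", "from": 90, "to": 100, "at": "10:01AM"},
    {"type": "PriceChanged", "from": 100, "to": 110, "at": "10:10AM"},
]

def current_price(events, initial=90):
    """Fold the ordered event list into the latest state."""
    price = initial
    for event in events:
        if event["type"] == "PriceChanged":
            price = event["to"]
    return price

print(current_price(events))  # 110
```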


💡 What Event Sourcing Solves (In Simple Terms)

| Traditional CRUD | Event Sourcing |
| --- | --- |
| Only current state | Full history of all changes |
| Hard to track why something happened | We can replay to see why something happened |
| Race conditions can corrupt data | We always record events in a safe log |
| Hard to debug | We’ve got an audit trail |

So the problem being solved is not just scaling — it’s:

“How do we store every change in a way we can trace, debug, and rebuild the system state reliably?”


📦 Event Sourcing Architecture (AWS)


🧱 AWS Architecture Example — Ride Booking (From AWS Guidance)

AWS provides a real architecture pattern for event sourcing:

1. User Action — Client Calls API Gateway

A user does something, e.g., Book a Ride.
This request first hits Amazon API Gateway, which exposes a public API endpoint.
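
As a rough sketch (the endpoint URL and request body here are completely made up), the client call is just a normal HTTP request:

```python
import requests  # third-party: pip install requests

# Hypothetical API Gateway endpoint for the ride-booking command.
API_URL = "https://example.execute-api.us-east-1.amazonaws.com/prod/rides"

response = requests.post(API_URL, json={"riderId": "user-42", "pickup": "Downtown"})
print(response.status_code, response.text)
```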


2. Lambda Writes an Event to Amazon Kinesis (AWS’s Managed Alternative to Kafka)

The Lambda function acts as a command handler:

✔ It checks business logic
✔ It creates an event like RideBooked
✔ It sends this event to Amazon Kinesis Data Streams — an append-only event storage and streaming service

📌 Why Kinesis?
Because it can handle very high write throughput and acts as an event log we can replay.
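
A minimal sketch of what such a command handler Lambda might look like, assuming an API Gateway proxy integration, a stream named ride-events, and a simple JSON body (all hypothetical, not part of the AWS guidance):

```python
import json
import boto3

kinesis = boto3.client("kinesis")
STREAM_NAME = "ride-events"  # hypothetical stream name

def handler(event, context):
    """API Gateway -> Lambda command handler: validate, then append a RideBooked event."""
    body = json.loads(event["body"])
    if not body.get("riderId"):
        return {"statusCode": 400, "body": "riderId is required"}

    ride_event = {"type": "RideBooked", "riderId": body["riderId"], "pickup": body.get("pickup")}

    # Append-only write; the partition key keeps one rider's events in order.
    kinesis.put_record(
        StreamName=STREAM_NAME,
        Data=json.dumps(ride_event).encode("utf-8"),
        PartitionKey=body["riderId"],
    )
    return {"statusCode": 202, "body": json.dumps({"status": "accepted"})}
```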


3. Events Are Stored & Archived

Kinesis doesn’t just stream — we can also:

✔ Archive events in Amazon S3 for long-term retention (for compliance & audits)
✔ Retain events for replay or future analysis

This means our system keeps a complete history of every change, backed up in S3 for as long as we choose to retain it (for compliance and audits).


4. Event Processor Lambda Builds Materialized Views

Another Lambda function consumes events from Kinesis to build read models (optimized tables that are easy to query). Typical read stores are:

✔ Amazon Aurora (MySQL/PostgreSQL)
✔ Amazon DynamoDB

This process creates current state views for read-heavy use cases.
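
A sketch of such an event processor using DynamoDB as the read store (the table name and attribute names are assumptions):

```python
import base64
import json
import boto3

table = boto3.resource("dynamodb").Table("rides-read-model")  # hypothetical table

def handler(event, context):
    """Kinesis -> Lambda: project each event into a query-friendly row."""
    for record in event["Records"]:
        # Kinesis delivers each payload base64-encoded.
        payload = json.loads(base64.b64decode(record["kinesis"]["data"]))
        if payload["type"] == "RideBooked":
            table.put_item(Item={
                "riderId": payload["riderId"],
                "status": "booked",
                "pickup": payload.get("pickup", "unknown"),
            })
```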


5. Replay to Rebuild State (Hydration Model)

If something goes wrong, or we want to compute the state at any point in time, we simply replay the events stored in Kinesis (and archived in S3) through the hydration model.

This is called Hydration: re-deriving the current or historical state of the system from the event log.
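
For example, if the archive lands in S3 as JSON-lines objects (that format, plus the bucket and prefix names, are assumptions), a replay job could stream the history back like this:

```python
import json
import boto3

s3 = boto3.client("s3")
BUCKET, PREFIX = "ride-events-archive", "events/2024/"  # hypothetical archive location

def replay_archived_events():
    """Yield every archived event. S3 lists keys in lexicographic order,
    so timestamp-prefixed keys come back oldest first."""
    paginator = s3.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=BUCKET, Prefix=PREFIX):
        for obj in page.get("Contents", []):
            body = s3.get_object(Bucket=BUCKET, Key=obj["Key"])["Body"].read()
            for line in body.decode("utf-8").splitlines():
                if line.strip():
                    yield json.loads(line)

# These events can then be fed through the same hydration logic used for live processing.
```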


🧠 Hydration Model Explained (Simple)

Think of hydration as:

🎬 Re-running the entire timeline of events
so that our system always ends up in the correct state.

For example, in a video streaming platform:

  • Event 1: VideoUploaded
  • Event 2: VideoProcessingStarted
  • Event 3: VideoProcessingSucceeded

To know the current state:

state = "initial"
apply VideoUploaded → state="uploaded"
apply VideoProcessingStarted → state="processing"
apply VideoProcessingSucceeded → state="success"

That’s Hydration — it rebuilds state by replaying events in order, not by reading a single “status” value.
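
In code, hydration is just a fold over the event list. A minimal sketch, assuming the event names from the example above plus a simple transition table:

```python
# Maps each event type to the status it produces (an assumption for this example).
TRANSITIONS = {
    "VideoUploaded": "uploaded",
    "VideoProcessingStarted": "processing",
    "VideoProcessingSucceeded": "success",
    "VideoProcessingFailed": "failed",
}

def hydrate(events, initial="initial"):
    """Replay events in order to re-derive the current state."""
    state = {"status": initial, "error": None}
    for event in events:
        state["status"] = TRANSITIONS.get(event["type"], state["status"])
        if event["type"] == "VideoProcessingFailed":
            state["error"] = event.get("error")  # keep *why* it failed
    return state

events = [
    {"type": "VideoUploaded"},
    {"type": "VideoProcessingStarted"},
    {"type": "VideoProcessingSucceeded"},
]
print(hydrate(events))  # {'status': 'success', 'error': None}
```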


🐘 Why Kafka or Kinesis Are Used

Both Kafka (used in the video streaming example below) and Kinesis (the AWS alternative) are event streaming platforms: essentially massive, durable, ordered logs of events. On top of that, Kafka’s consumer group and topic partition concepts ensure that each processor receives its events in sequence and applies them sequentially.

Why this matters

✔ We can replay events — essential for event sourcing
✔ We can scale horizontally (many consumers)
✔ We guarantee event order within partitions — crucial for replay and consistent state reconstruction


📌 Consumer Groups & Topic Partitions (Why They Matter)

When the event volume is large, we cannot have one server read everything.

So we use:

🔹 Kafka Consumer Group

Multiple workers form a group and share the work.
Each worker is assigned its own set of partitions, so no event is processed twice within the group.

🔹 Topic Partitions

A topic (event category) is split into partitions — think of partitions as divided lanes of the event log. This allows:

✔ Parallel processing
✔ Ordered event consumption per partition
✔ Scale without losing order for each entity

For example, in the video streaming pipeline, video A’s events always land in partition 0 and video B’s in partition 1, so each video’s events are processed in order even across many workers.
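
Here’s a sketch using the kafka-python client (the library choice, topic name, and broker address are assumptions) showing how keying by video ID keeps each video’s events in one partition, while a consumer group spreads partitions across workers:

```python
import json
from kafka import KafkaProducer, KafkaConsumer  # pip install kafka-python

# Producer: events with the same key always hash to the same partition,
# so each video's events stay strictly ordered.
producer = KafkaProducer(
    bootstrap_servers="localhost:9092",
    key_serializer=str.encode,
    value_serializer=lambda v: json.dumps(v).encode("utf-8"),
)
producer.send("video-events", key="video-A", value={"type": "VideoUploaded"})
producer.send("video-events", key="video-A", value={"type": "VideoProcessingStarted"})
producer.flush()

# Consumer: every worker that joins the "video-processors" group is assigned
# a disjoint subset of partitions, so the group scales without duplicate processing.
consumer = KafkaConsumer(
    "video-events",
    bootstrap_servers="localhost:9092",
    group_id="video-processors",
    value_deserializer=lambda v: json.loads(v.decode("utf-8")),
)
for message in consumer:
    print(message.partition, message.key, message.value)
```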


Problem Being Solved

Traditional system:

Database:
video_id | status
------------------
123      | "processing"

Problems:
✔ What if the update failed?
✔ What do you show to the user?
✔ What if you need to know the exact steps the video went through?

Event Sourcing Pattern solves it:

Event Log:
1. VideoUploaded(videoID=123)
2. VideoProcessingStarted(videoID=123)
3. VideoProcessingProgress(videoID=123, percent=50)
4. VideoProcessingFailed(videoID=123, error="timeout")

To get state:

Hydration Model reads:

apply VideoUploaded → status="uploaded"
apply VideoProcessingStarted → status="processing"
apply VideoProcessingProgress → status="processing:50%"
apply VideoProcessingFailed → status="failed"

We can even show why the failure happened, something a single “status” column simply can’t tell us.

🧩 AWS Services we can use

| Role | AWS Service |
| --- | --- |
| API entrypoint | API Gateway |
| Command processor | AWS Lambda |
| Event storage | Kinesis Data Streams |
| Archive & audit log | Amazon S3 |
| Event distribution | EventBridge / DynamoDB Streams |
| Read-optimized views | Aurora / DynamoDB |
| Async processing | Lambda consumers |
