Splitting a Monolith into Microservices

Introduction

  • Core Idea: Moving from a monolithic architecture to microservices involves breaking down applications into smaller, independently deployable units.

  • Challenge: Transitioning to a distributed system introduces complexities like inter-service communication, data sharing, and failure modes.

  • Goal: Enable scalability—not just technically but organizationally—by allowing teams to work autonomously.


Why Split a Monolith?

  1. Scalability in Team Dynamics:

    • Microservices allow teams to iterate independently.

    • Each service has its own deployment cycle, reducing bottlenecks in large teams.

  2. Addressing Real-World Distribution:

    • Businesses are inherently distributed systems; services model this reality.

    • Enables better integration between interdependent systems (e.g., Single Sign-On, Order Management).

  3. Avoid Coordination Overhead:

    • Monolithic changes are easier to synchronize but lead to rigidity.

    • Microservices allow loosely coupled, bounded contexts that can evolve independently.


Challenges in Splitting Monoliths

Ecommerce Example

Once an order flow is split across Order, Inventory, Payment, and Notification services, a single purchase spans several services and must be coordinated as a saga: either through a central orchestrator or through choreography.

Orchestrator

Orchestration-based sagas are easier to understand and manage for complex processes. However, they introduce a central point of control, which can become a bottleneck or a single point of failure if not designed carefully. Communication between the orchestrator and the other services also remains complex and error-prone.

  • GCP Workflows is ideal for lightweight, event-driven, API-first automation where you need to quickly integrate services or microservices (e.g., processing user requests, serverless automation).

  1. The Orchestrator (GCP Workflows) receives the order request (a sketch of this flow follows the list).

    • The workflow calls the Order Service API to create an order.

    • It calls the Inventory Service API to reserve items.

    • It calls the Payment Service API to charge the customer.

    • Finally, it sends a notification through an email/SMS API.
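
As a rough illustration of the flow above, here is a minimal Python sketch of the orchestration pattern with a compensating action on failure. The service endpoints and payloads are made up for the example; an actual GCP Workflows implementation would express the same sequence of HTTP calls in its YAML definition rather than in Python.

```python
# Minimal orchestration-style saga sketch. The endpoints and payloads are
# hypothetical; GCP Workflows would describe the same steps declaratively.
import requests

ORDER_SVC = "http://order-service/orders"            # hypothetical endpoints
INVENTORY_SVC = "http://inventory-service/reservations"
PAYMENT_SVC = "http://payment-service/charges"
NOTIFY_SVC = "http://notification-service/messages"


def place_order(customer_id: str, items: list[dict]) -> dict:
    # Step 1: create the order.
    order = requests.post(ORDER_SVC, json={"customer": customer_id, "items": items}).json()
    try:
        # Step 2: reserve inventory.
        requests.post(INVENTORY_SVC, json={"order_id": order["id"], "items": items}).raise_for_status()
        # Step 3: charge the customer.
        requests.post(PAYMENT_SVC, json={"order_id": order["id"]}).raise_for_status()
    except requests.HTTPError:
        # Compensating action: cancel the order (a fuller sketch would also
        # release any inventory that was already reserved).
        requests.delete(f"{ORDER_SVC}/{order['id']}")
        raise
    # Step 4: notify the customer.
    requests.post(NOTIFY_SVC, json={"order_id": order["id"], "channel": "email"})
    return order
```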

Cons:

  • The orchestrator can become a single point of failure: if it goes down, the whole flow stalls.

  • Less flexible and scalable since the orchestrator must handle all communication.

  • Adds overhead in terms of managing and scaling the central controller.

Choreography

In choreographed transactions there is no central coordinator. Instead, each service listens for events from other services and decides when to execute its local transaction or the necessary corrective actions. More importantly, each service decides for itself which actions are correct to execute.

For example, in an e-commerce system (see the sketch below):

  1. The Order service receives a message, creates an order, and publishes an "OrderCreated" event.

  2. The Inventory service listens for "OrderCreated", reserves the items, and publishes "InventoryReserved".

  3. The Payment service listens for "InventoryReserved", processes the payment, and publishes "PaymentProcessed".

  4. The Order service listens for "PaymentProcessed" and marks the order as complete.

If any step fails (e.g., payment fails), the service publishes a failure event. Other services listen for this and execute compensating actions (e.g., Inventory releases the reserved items).
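
Below is a minimal sketch of this choreography, with a tiny in-memory publish/subscribe bus standing in for Kafka or Pub/Sub. The handler names, event payloads, and success flags are assumptions made for the example.

```python
# Minimal choreography sketch: each handler mimics one service reacting to the
# previous service's event; an in-memory bus stands in for the message broker.
from collections import defaultdict

handlers = defaultdict(list)

def subscribe(event_type):
    def register(fn):
        handlers[event_type].append(fn)
        return fn
    return register

def publish(event_type, payload):
    for fn in handlers[event_type]:
        fn(payload)

@subscribe("OrderCreated")
def reserve_inventory(event):          # Inventory service
    if event.get("in_stock", True):
        publish("InventoryReserved", event)
    else:
        publish("InventoryFailed", event)

@subscribe("InventoryReserved")
def process_payment(event):            # Payment service
    if event.get("payment_ok", True):
        publish("PaymentProcessed", event)
    else:
        publish("PaymentFailed", event)

@subscribe("PaymentProcessed")
def complete_order(event):             # Order service
    print(f"order {event['order_id']} completed")

@subscribe("PaymentFailed")
def release_inventory(event):          # compensating action in Inventory
    print(f"releasing reserved items for order {event['order_id']}")

# The Order service kicks off the chain by publishing the first event.
publish("OrderCreated", {"order_id": 42})
```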

To implement a choreographed transaction you would typically use a messaging system like Kafka. Something like GCP Pub/Sub could also be used, but it operates on an eventual consistency model, which may not be suitable for use cases that require strong consistency across services.

Pros:

  • Decentralized, with no single point of failure.

  • More scalable and flexible, as services interact directly with each other.

  • Reduces bottlenecks by eliminating the need for a central controller.

Idempotency

There is no "exactly once" delivery of events. There is either "at most once" or "at least once". To ensure no messages are lost, you'll want "at least once". That means your services need to be prepared to receive duplicates, and either be able to identify and drop them, or only execute idempotent actions.

By the way, idempotency means that an action can be executed an arbitrary number of times and have the same result as if it had been executed once. Ideally all actions in all systems should be idempotent, but in event-driven systems it's even more important. Fortunately, event IDs help a lot with this.
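
A minimal sketch of an idempotent consumer keyed on event IDs follows. The in-memory set stands in for a durable deduplication store, and the handler and payload shapes are assumptions for the example.

```python
# Idempotent consumer sketch: the event ID is recorded alongside the side
# effect, so redelivered ("at least once") events are recognized and dropped.
processed_ids = set()   # in production: a durable store, updated in the same
                        # transaction as the side effect

def charge_customer(order_id: str, amount: int) -> None:
    print(f"charging {amount} for order {order_id}")   # the actual side effect

def handle_payment_event(event: dict) -> None:
    event_id = event["event_id"]
    if event_id in processed_ids:
        return                      # duplicate delivery: safe to ignore
    charge_customer(event["order_id"], event["amount"])
    processed_ids.add(event_id)

# Delivering the same event twice has the same effect as delivering it once.
evt = {"event_id": "evt-123", "order_id": "ord-1", "amount": 999}
handle_payment_event(evt)
handle_payment_event(evt)
```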

The Data Dichotomy:

  • Conflict:

    • Services encourage encapsulation (hiding data).

      • They expose only a limited set of functionalities through APIs or interfaces, while hiding their internal data and implementation details.

    • Databases and storage systems promote making data accessible.

      • Usually centralized and easy for multiple applications to access.

  • Impact: Data across services may become inconsistent and harder to keep in sync.

  1. Shared Data Complexities:

    • Inter-service dependencies on common datasets lead to tight coupling.

    • Duplicated or locally altered data can diverge, causing consistency issues.

  2. Design Flaws to Avoid:

    • "Kookie" Databases: Service interfaces that evolve into complex, shared databases.

    • (A) Service interfaces are poorly suited to sharing data at any level of scale.

    • (B) Messaging moves data, but provides no historical reference, and this leads to data corruption over time.

    • (C) Shared databases concentrate too much in one place.


Solution: Event-Driven Architectures and Streaming Platforms

  1. Distributed Log (e.g., Apache Kafka), sketched in code after this list:

    • Acts as a central source for immutable event streams.

    • Retains historical data for reproducibility and recovery.

  2. Stateful Stream Processing:

    • Services process shared datasets locally while keeping the "golden source" in the log.

    • Combines the benefits of encapsulation with data accessibility.

  3. Advantages:

    • Balance: Resolves the data dichotomy by keeping the data immutable and operations decentralized.

    • Flexibility: Services can join and process streams as needed without polluting the source.

    • Consistency: Reduces data divergence by maintaining a regenerable source of truth.
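
As a rough illustration of points 1 and 2 above, here is a small sketch that treats a Kafka topic as the golden source and rebuilds a service-local view from it. It assumes a local broker and the confluent_kafka Python client; the topic name, consumer group, and message shape are made up for the example.

```python
# Sketch of the "distributed log as golden source" idea with Kafka.
import json
from confluent_kafka import Consumer, Producer

# One service appends immutable order events to the log...
producer = Producer({"bootstrap.servers": "localhost:9092"})
producer.produce("orders", key="ord-1", value=json.dumps({"status": "created"}))
producer.flush()

# ...and another service replays the topic from the beginning to build its own
# local view, keeping its data private while the log remains the shared source.
consumer = Consumer({
    "bootstrap.servers": "localhost:9092",
    "group.id": "billing-view",
    "auto.offset.reset": "earliest",   # replay history, not just new events
})
consumer.subscribe(["orders"])

local_view = {}                         # service-local, regenerable state
msg = consumer.poll(5.0)
if msg is not None and msg.error() is None:
    local_view[msg.key().decode()] = json.loads(msg.value())
consumer.close()
```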


Practical Steps to Splitting Monoliths

  1. Identify Boundaries:

    • Break down domains into bounded contexts (e.g., SSO, Order Processing).

  2. Define Interfaces:

    • Use APIs or messaging systems for inter-service communication.

  3. Adopt Event-Driven Patterns:

    • Replace synchronous dependencies with event streams to decouple services.

  4. Incremental Transition:

    • Gradually replace monolithic components with microservices.

    • Start with non-critical features to avoid disruption.

  5. Centralize Streams, Decentralize Logic:

    • Use a distributed log for event storage and stream processing engines for business logic.


Key Tools and Concepts

  • Apache Kafka:

    • Stores event streams for long-term, immutable data sharing.

    • Facilitates integration with connectors to export/import data on demand.

  • Stateful Stream Processing (see the sketch after this list):

    • Embeds database-like processing within services to maintain independence.

  • Decentralized Processing:

    • Encapsulates function (not data) within services, ensuring autonomy.
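
A tiny sketch of the stateful stream processing idea: the service derives and keeps its own local, regenerable state from the event stream instead of querying a shared database. The event source is assumed; in practice the events would come from a Kafka topic consumed as in the earlier sketch.

```python
# Stateful stream processing sketch: a service-local, database-like view
# (orders per customer) derived entirely from the event stream.
from collections import Counter

orders_per_customer = Counter()         # local state, rebuildable from the log

def on_order_created(event: dict) -> None:
    orders_per_customer[event["customer_id"]] += 1

# Stand-in for events consumed from the stream.
for event in [{"customer_id": "alice"}, {"customer_id": "bob"}, {"customer_id": "alice"}]:
    on_order_created(event)

print(orders_per_customer["alice"])     # 2: queryable locally, no shared database needed
```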


Summary for Interview

When asked about splitting a monolith:

  • Begin: Highlight why companies adopt microservices: scalability, adaptability, and better alignment with distributed business realities.

  • Explain Challenges: Discuss the data dichotomy, shared data issues, and the pitfalls of shared databases or batch transfers.

    • In the real world, business services rarely have a clean separation of concerns.

  • Present Solutions: Advocate for event-driven architectures with distributed logs like Kafka, emphasizing their role in balancing encapsulation and accessibility.

  • Provide a Roadmap: Outline an incremental approach—starting with identifying domains, defining interfaces, and adopting event-driven patterns.
