Distributed Systems Content on InfoQ

Articles

Architecture & Design

Replacing Database Sequences at Scale Without Breaking 100+ Services

The article discusses the challenges faced during a migration from a relational database to NoSQL, focusing on the importance of database sequences for unique identifiers. It outlines the development of a new sequence service using DynamoDB and a two-tier caching architecture.

Saumya Tyagi
on Apr 03, 2026
Architecture & Design

Event-Driven Patterns for Cloud-Native Banking: Lessons from What Works and What Hurts

Event-driven architecture helps banks decouple systems, scale services, and create clear activity trails. But it also introduces complexity, new failure modes, and operational challenges. Chris Tacey-Green explains where it adds value in banking systems and the practical patterns, such as inbox/outbox and stable event contracts, needed to make it reliable.

Chris Tacey-Green
on Mar 31, 2026
DevOps

Configuration as a Control Plane: Designing for Safety and Reliability at Scale

Configuration has evolved from static deployment files into a live control plane that directly shapes system behavior. The evolution of configuration management highlights why misconfigurations can trigger large outages and how hyperscalers deploy changes safely using staged rollouts, validation, blast radius limits, and automated rollback at scale.

Karthiek Maralla
on Mar 20, 2026
Development

One Cache to Rule Them All: Handling Responses and In-Flight Requests with Durable Objects

Traditional caching fails to stop "thundering herds" where multiple clients trigger the same work during a miss. This article proposes using Cloudflare Durable Objects to treat in-flight work and finished results as two states of one cache entry. By routing to a single owner, systems eliminate redundant tasks. This pattern replaces complex locks with simple promises, simplifying the system design.

Gabor Koos
on Jan 28, 2026
Architecture & Design

Scaling Cloud and Distributed Applications: Lessons and Strategies

The article shares goals and strategies for scaling cloud and distributed applications, drawing on lessons learned from the cloud migration of Chase.com at JP Morgan Chase. It centers on three primary goals and the strategies that address them, and concludes with how these approaches were applied in practice. The lessons offer guidance for anyone managing large-scale systems.

Durai Arasan
on Dec 04, 2025
Cloud

Engineering Principles for Building a Successful Cloud-Prem Solution

Discover how Cloud-Prem solutions combine cloud efficiency with on-premise control, meeting data sovereignty and compliance demands while optimizing operational costs and enhancing customer security.

Satyam Dhar
on Jun 26, 2025
DevOps

Analyzing Apache Kafka Stretch Clusters: WAN Disruptions, Failure Scenarios, and DR Strategies

This article analyzes the dynamics of Apache Kafka Stretch Clusters, assesses the impact of WAN disruptions, and outlines Disaster Recovery (DR) strategies. It focuses on maintaining high availability and data integrity across multi-region deployments, and on operational resilience that protects vital services from service level agreement violations.

Srikanth Daggumalli, Nishchai Jayanna Manjula
on Jun 20, 2025
Cloud

Designing Resilient Event-Driven Systems at Scale

Learn how to design resilient event-driven systems that scale. Explore key patterns like shuffle sharding and decoupling queues to handle load spikes and failures. Understand common pitfalls like over-relying on retries and neglecting observability for robust, scalable architectures.

Rajesh Kumar Pandey
on May 30, 2025
Cloud

Distributed Cloud Computing: Enhancing Privacy with AI-Driven Solutions

Distributed cloud, privacy-enhancing technologies (PETs), and AI enable secure, private data processing. Combining them improves collaboration, security, and compliance across marketing, finance, and healthcare, addressing the growing need for data protection.

Rohit Garg, Ankit Awasthi
on Apr 25, 2025
Java

Reactive Real-Time Notifications with SSE, Spring Boot, and Redis Pub/Sub

Explore the power of reactive programming for building scalable real-time notification systems. Using Spring Boot Reactive and Spring WebFlux, leverage non-blocking operations to handle high-volume, asynchronous data flows efficiently. Discover how Redis Pub/Sub enables event-driven messaging and how the SSE protocol provides persistent connections for instant client updates without polling.

Matteo Rossi
on Nov 21, 2024
Architecture & Design

Taking Advantage of Cell-Based Architectures to Build Resilient and Fault-Tolerant Systems

Cell-based architectures offer a robust approach to building resilient systems. They achieve this through the core principles of isolation, autonomy, and replication. Each cell manages its resources and makes decisions autonomously. Observability for cell-based architecture requires a tailored approach to address the unique challenges and opportunities presented by this distributed system design.

Yury Niño Roa
on Oct 21, 2024
Architecture & Design

Article Series: Cell-Based Architectures: How to Build Scalable and Resilient Systems

In this article series, we take readers on a journey of discovery and provide a comprehensive overview and in-depth analysis of many key aspects of cell-based architectures, as well as practical advice for applying this approach to existing and new architectures.

Rafal Gancarz
on Oct 14, 2024
