Database Sharding
Distributing data across multiple database instances by splitting rows based on a shard key, enabling horizontal scaling beyond the capacity of a single server.
Description
Sharding partitions a database horizontally so that different rows reside on different database servers (shards). A shard key (e.g., tenant_id, user_id, or geographic region) determines which shard holds each row. The application or a proxy layer routes queries to the correct shard based on the shard key. This lets the system scale write throughput and storage capacity near-linearly by adding shards, overcoming the vertical scaling limits of a single database server.
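The routing step can be sketched minimally as follows; the DSN strings, the tenant_id shard key, and the modulo placement rule are illustrative assumptions, not a specific product's API:

```python
class ShardRouter:
    """Route each query to the shard that owns its shard key.

    A production router would hold one connection pool per shard;
    here a plain DSN string stands in for the connection.
    """

    def __init__(self, shard_dsns):
        self.shard_dsns = shard_dsns  # one connection string per shard

    def dsn_for(self, tenant_id: int) -> str:
        # Simplest placement rule: shard key modulo shard count.
        return self.shard_dsns[tenant_id % len(self.shard_dsns)]

router = ShardRouter(["db-shard-0", "db-shard-1", "db-shard-2"])
```

Every query for a given tenant then opens its connection via `router.dsn_for(tenant_id)`, so the placement rule lives in exactly one place.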
Choosing a shard key is the most critical decision and is difficult to change later. A good shard key distributes data evenly (avoiding hot shards), aligns with query patterns (so that most queries hit a single shard), and is effectively immutable (changing a row's shard key value forces the row to move between shards). Common strategies include hash-based sharding (hash the key and take it modulo the shard count), range-based sharding (e.g., users A-M on shard 1, N-Z on shard 2), and directory-based sharding (a lookup table maps keys to shards).
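The three strategies can be sketched side by side; the shard count, range boundaries, and directory contents below are made-up examples:

```python
import hashlib

NUM_SHARDS = 4

def hash_shard(key: str) -> int:
    """Hash-based: stable hash of the key, then modulo shard count."""
    digest = hashlib.sha256(key.encode()).digest()
    return int.from_bytes(digest[:8], "big") % NUM_SHARDS

# Range-based: ordered (upper bound, shard) pairs on the first letter,
# mirroring the A-M / N-Z split above.
RANGES = [("m", 1), ("z", 2)]

def range_shard(key: str) -> int:
    for upper, shard in RANGES:
        if key[0].lower() <= upper:
            return shard
    raise KeyError(f"no range covers {key!r}")

# Directory-based: an explicit lookup table (in practice a metadata store).
DIRECTORY = {"tenant_7": 0, "tenant_42": 3}

def directory_shard(key: str) -> int:
    return DIRECTORY[key]
```

Note the trade-off the code makes visible: hash-based spreads load evenly but makes range scans cross-shard, range-based keeps ranges local but can hot-spot, and directory-based is the most flexible but adds a lookup on every query.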
Sharding introduces significant operational complexity: cross-shard queries require scatter-gather, cross-shard transactions require two-phase commit or saga patterns, schema migrations must be coordinated across all shards, and rebalancing shards as data grows is operationally risky. Before sharding, exhaust simpler scaling strategies: read replicas for read scaling, connection pooling for connection scaling, partitioning (single-server table partitioning in PostgreSQL), and caching. Consider managed sharded databases like CockroachDB, Vitess (for MySQL), or Citus (for PostgreSQL) that handle routing and rebalancing automatically.
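The scatter-gather pattern mentioned above can be sketched with in-memory stand-ins for the shard connections (the fake per-shard data and the count query are illustrative):

```python
from concurrent.futures import ThreadPoolExecutor

def count_orders(shard) -> int:
    # Stand-in for a per-shard query such as SELECT count(*) FROM orders;
    # here each "shard" is just a dict of fake tables.
    return len(shard["orders"])

def scatter_gather_count(shards) -> int:
    """Scatter: run the query on every shard in parallel.
    Gather: merge the partial results (here, a simple sum)."""
    with ThreadPoolExecutor(max_workers=len(shards)) as pool:
        return sum(pool.map(count_orders, shards))

shards = [
    {"orders": ["o1", "o2", "o3"]},
    {"orders": ["o4"]},
    {"orders": ["o5", "o6"]},
]
```

The cost is also visible here: every cross-shard query fans out to all shards, so its latency is bounded by the slowest shard and its load multiplies with the shard count.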
Prompt Snippet
Before implementing sharding, exhaust vertical scaling (upgrade instance), read replicas, table partitioning (PostgreSQL declarative partitioning by range or hash), and caching. If sharding is necessary, select a shard key with high cardinality and even distribution -- tenant_id for multi-tenant SaaS, or user_id hashed via consistent hashing. Use Citus for PostgreSQL to get transparent distributed queries and avoid manual routing. Plan for resharding by keeping the shard count as a power of 2 and implementing a shard mapping table rather than hardcoded modulo arithmetic.
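The shard-mapping-table idea in the snippet above can be sketched as follows; the bucket count and helper names are assumptions for illustration, not part of any particular system:

```python
import hashlib

NUM_BUCKETS = 256  # fixed logical buckets; a power of 2

# Mapping table: logical bucket -> physical shard. Resharding rewrites
# rows of this table instead of changing the hash function.
bucket_to_shard = {b: b % 2 for b in range(NUM_BUCKETS)}  # start at 2 shards

def bucket_for(key: str) -> int:
    """Keys hash to a fixed bucket; this function never changes."""
    digest = hashlib.sha256(key.encode()).digest()
    return int.from_bytes(digest[:8], "big") % NUM_BUCKETS

def shard_for(key: str) -> int:
    return bucket_to_shard[bucket_for(key)]

def double_shards(new_count: int) -> None:
    """Because shard counts stay powers of 2, b % new_count refines
    b % old_count: every bucket either stays on its shard or moves to
    exactly one new shard, so at most half the data has to move."""
    for b in range(NUM_BUCKETS):
        bucket_to_shard[b] = b % new_count
```

This is why the snippet recommends a mapping table over hardcoded modulo arithmetic: going from 2 to 4 shards only updates table rows, while a hardcoded `key % n` would reshuffle nearly every key when `n` changes.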
Related Terms
Read Replicas
Secondary database instances that asynchronously replicate data from the primary, offloading read queries to scale read throughput.
Database Replication
Copying data from one database server to another in real-time or near-real-time to provide redundancy, failover capability, and read scaling.
Multi-Tenant Database Design
Architectural patterns for storing multiple customers' data in shared database infrastructure while ensuring isolation, security, and performance fairness.