Database (intermediate)

Database Backup Strategies

Systematic approaches to creating, storing, and verifying database backups to protect against data loss from hardware failure, human error, or security incidents.

Also known as: backup and restore, pg_dump, database snapshots

Description

Database backups are the last line of defense against data loss. A comprehensive backup strategy combines several backup types. In PostgreSQL these are logical backups (pg_dump/pg_dumpall), which export schema and data as SQL statements or an archive file; physical backups (pg_basebackup, file-system snapshots), which copy the raw data files; and continuous archiving (WAL archiving), which captures every change so that a physical base backup can be rolled forward for point-in-time recovery. The types trade off backup speed, restore speed, storage requirements, and granularity.
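As a rough illustration of the three backup types, the sketch below only builds the command lines and settings involved; it does not run anything. The function names, database name, and paths are hypothetical, and the archive_command follows the simple copy-to-directory pattern, which a real deployment would likely replace with an upload to object storage.

```python
from __future__ import annotations

from pathlib import Path


def logical_backup_cmd(dbname: str, out_file: Path) -> list[str]:
    """pg_dump in custom archive format (restorable with pg_restore)."""
    return [
        "pg_dump",
        "--format=custom",
        f"--file={out_file}",
        f"--dbname={dbname}",
    ]


def physical_backup_cmd(target_dir: Path) -> list[str]:
    """pg_basebackup copying the whole cluster, with the WAL needed for consistency."""
    return [
        "pg_basebackup",
        f"--pgdata={target_dir}",
        "--format=tar",
        "--gzip",
        "--wal-method=stream",
        "--checkpoint=fast",
    ]


def wal_archiving_settings(archive_dir: Path) -> dict[str, str]:
    """postgresql.conf settings that turn on continuous WAL archiving."""
    return {
        "wal_level": "replica",
        "archive_mode": "on",
        # %p is the path of the WAL segment, %f is its file name.
        "archive_command": f"test ! -f {archive_dir}/%f && cp %p {archive_dir}/%f",
    }
```

A scheduler could pass the returned lists to subprocess.run(cmd, check=True), so that a non-zero exit code surfaces as a failed backup job.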

The 3-2-1 backup rule is a minimum standard: keep 3 copies of your data, on 2 different storage media, with 1 copy offsite (different region or provider). For managed databases (RDS, Cloud SQL), automated daily snapshots are usually enabled by default, but verify retention periods and test restores regularly. For self-managed databases, automate backups with cron or systemd timers and stream them to object storage (S3, GCS) with server-side encryption. Implement retention policies: keep daily backups for 7 days, weekly for 4 weeks, and monthly for 12 months.
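The daily/weekly/monthly retention policy above can be sketched as a pure function that decides which backups to keep. This is a minimal sketch under stated assumptions: the function name is hypothetical, "weekly" is taken to mean the Sunday backup, and "monthly" the backup taken on the first of the month.

```python
from datetime import date


def backups_to_keep(backup_dates, today,
                    daily_days=7, weekly_weeks=4, monthly_months=12):
    """Return the set of backup dates that the retention policy keeps."""
    keep = set()
    for d in backup_dates:
        age_days = (today - d).days
        if age_days < 0:
            continue  # ignore backups dated in the future
        months_old = (today.year - d.year) * 12 + (today.month - d.month)
        if age_days < daily_days:
            keep.add(d)  # daily tier: everything from the last 7 days
        elif d.weekday() == 6 and age_days < weekly_weeks * 7:
            keep.add(d)  # weekly tier: Sunday backups from the last 4 weeks
        elif d.day == 1 and months_old < monthly_months:
            keep.add(d)  # monthly tier: first-of-month backups, last 12 months
    return keep
```

Everything not in the returned set is a candidate for deletion. Keeping the decision separate from the deletion itself makes the policy easy to unit test before it touches real storage.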

The most critical and most neglected aspect of backup strategy is restore testing. A backup that cannot be restored is worthless. Schedule monthly restore drills in which you restore a backup to a test environment and verify data integrity. Define and document the Recovery Time Objective (RTO: how long it may take until the database is operational again) and the Recovery Point Objective (RPO: how much data you can afford to lose), then measure each drill against them. Alert on backup failures and monitor backup size trends to detect anomalies.
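One way to make drill results and size monitoring concrete is shown below. It is a sketch, not a monitoring system: the DrillResult fields, function names, and the 50% size tolerance are all assumptions chosen for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import median


@dataclass
class DrillResult:
    restore_started: datetime
    restore_finished: datetime
    incident_time: datetime          # simulated moment of failure
    last_recovered_change: datetime  # newest change present after the restore


def check_objectives(drill: DrillResult, rto: timedelta, rpo: timedelta) -> dict:
    """Compare the measured recovery time and data-loss window with RTO and RPO."""
    recovery_time = drill.restore_finished - drill.restore_started
    data_loss_window = drill.incident_time - drill.last_recovered_change
    return {
        "recovery_time": recovery_time,
        "rto_met": recovery_time <= rto,
        "data_loss_window": data_loss_window,
        "rpo_met": data_loss_window <= rpo,
    }


def size_is_anomalous(history_bytes, latest_bytes, tolerance=0.5) -> bool:
    """Flag a backup whose size deviates from the recent median by more than tolerance."""
    if not history_bytes:
        return False  # nothing to compare against yet
    baseline = median(history_bytes)
    return abs(latest_bytes - baseline) > tolerance * baseline
```

A sudden drop in size is often more alarming than growth, since it can mean a backup job silently skipped data; the absolute difference catches both directions.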

Prompt Snippet

Implement the 3-2-1 backup strategy: daily pg_basebackup to local storage with 7-day retention, continuous WAL archiving to S3 with 30-day retention for point-in-time recovery, and weekly pg_dump logical backups to a cross-region S3 bucket with 90-day retention. Point-in-time recovery needs a base backup older than the target time, so also copy base backups to S3 and retain them as long as the WAL they anchor. Enable server-side encryption (AES-256) on all backup buckets. Automate monthly restore drills to a staging environment and assert that per-table row counts are within 0.1% of production. Set PagerDuty alerts on backup job failures and monitor backup duration trends: a 2x increase in backup time may indicate table bloat, which should be investigated before resorting to VACUUM FULL, since VACUUM FULL takes an exclusive lock on the table.
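The row-count assertion in the snippet above could look roughly like this. It assumes the per-table counts have already been collected from production and from the restored staging database into plain dictionaries; the function name and the dictionary shape are hypothetical.

```python
def tables_out_of_tolerance(prod_counts, restored_counts, tolerance=0.001):
    """Return {table: reason} for every table that fails the row-count check."""
    failures = {}
    for table, expected in prod_counts.items():
        actual = restored_counts.get(table)
        if actual is None:
            failures[table] = "missing from restore"
            continue
        # 0.1% of the production count is the allowed absolute difference.
        allowed = expected * tolerance
        if abs(actual - expected) > allowed:
            failures[table] = f"expected about {expected}, got {actual}"
    return failures
```

A drill would pass only when the returned dictionary is empty. Row counts are a coarse check, so a stricter drill might also compare checksums of selected tables.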

Tags

operations, disaster-recovery, reliability, security