Soft Deletes vs Hard Deletes
Choosing between marking records as deleted (soft delete) with a flag or timestamp versus permanently removing them (hard delete) from the database.
Description
A hard delete permanently removes a row from the database with a DELETE statement. It is simple, reclaims storage, and ensures deleted data cannot be accidentally accessed. A soft delete marks a row as deleted by setting a flag (is_deleted = true) or timestamp (deleted_at = NOW()) while keeping the row in the table. Subsequent queries must filter out soft-deleted rows, but the data remains available for recovery, auditing, or compliance purposes.
Soft deletes are valuable when you need audit trails (who deleted what and when), undo functionality (users can recover accidentally deleted items), referential integrity preservation (other tables may reference the deleted row), or regulatory compliance (data retention requirements). However, soft deletes add complexity: every query must include WHERE deleted_at IS NULL, indexes must account for the deleted_at column, and table sizes grow indefinitely. ORMs like Prisma and Sequelize support soft deletes via middleware or plugins that automatically filter deleted rows.
A pragmatic approach is to use soft deletes for user-facing data (accounts, documents, projects) where recovery and auditing are valuable, and hard deletes for ephemeral or derived data (sessions, cache entries, temporary tokens). Implement a reaping job that permanently purges soft-deleted records after a retention period (e.g., 90 days) to prevent unbounded table growth. Consider using a separate audit_log or events table instead of soft deletes when you need deletion history but do not need to restore the actual record.
Prompt Snippet
Implement soft deletes using a deleted_at TIMESTAMPTZ column (NULL means active) rather than a boolean flag, as the timestamp provides audit information. Add a partial index CREATE INDEX idx_active ON table (id) WHERE deleted_at IS NULL to maintain query performance on active records. Apply a default scope in the ORM or repository layer that filters WHERE deleted_at IS NULL on all read queries. Implement a scheduled reaper job that runs HARD DELETEs on records where deleted_at < NOW() - INTERVAL '90 days' in batches of 1000 to avoid long-running transactions and lock contention.
Tags
Related Terms
Database Normalization (1NF-3NF)
The process of organizing database columns and tables to reduce data redundancy and improve data integrity through progressive normal forms.
Foreign Key Constraints
Database-enforced rules that guarantee a value in one table's column must reference an existing row in another table, preventing orphaned records.
Cascading Deletes
Automatically deleting child rows when a parent row is deleted, enforced by foreign key constraints with ON DELETE CASCADE.