Back to all terms
Database
Databasebasic

Database Normalization (1NF-3NF)

The process of organizing database columns and tables to reduce data redundancy and improve data integrity through progressive normal forms.

Also known as: normal forms, 1NF 2NF 3NF, relational normalization

Description

Normalization is a systematic approach to decomposing tables to eliminate redundancy and dependency anomalies. First Normal Form (1NF) requires that each column contains atomic (indivisible) values and each row is unique -- no repeating groups or arrays. Second Normal Form (2NF) builds on 1NF by requiring that every non-key column depends on the entire primary key, not just part of it -- this matters for tables with composite primary keys. Third Normal Form (3NF) requires that non-key columns depend only on the primary key, not on other non-key columns (no transitive dependencies).

For example, a denormalized orders table might store customer_name, customer_email, product_name, and product_price alongside order data. This creates update anomalies: changing a customer's email requires updating every order row for that customer. Normalizing to 3NF splits this into customers, products, and orders tables linked by foreign keys, ensuring each fact is stored exactly once.

Most production schemas should aim for 3NF as a starting point. Beyond 3NF, there are Boyce-Codd Normal Form (BCNF), 4NF, and 5NF, but they are rarely needed in practice. The discipline of normalization forces you to think carefully about data dependencies and entity boundaries, which leads to cleaner schemas even if you later selectively denormalize for performance. Always normalize first, then denormalize with clear justification.

Prompt Snippet

Design the initial schema in Third Normal Form: ensure every non-key attribute depends on the key, the whole key, and nothing but the key. Identify transitive dependencies (e.g., order -> customer_id -> customer_email) and extract them into separate tables with foreign key constraints. Use an ER diagram tool (dbdiagram.io or pgModeler) to visualize relationships during design review. Document any intentional violations of 3NF in the schema's ADR with the performance justification and the denormalization strategy used.

Tags

data-modelingfundamentalsschema-designbest-practices