Skip to main content

Denormalization

  • Denormalization is the process of intentionally introducing redundancy into a database by combining tables or adding derived data.
  • It’s the opposite of normalization.
  • The goal is not to “break the rules” but to improve performance in real-world applications.

In short:

  • Normalization = Optimize for consistency & integrity.
  • Denormalization = Optimize for query performance (sometimes at the cost of redundancy).

Why Do We Denormalize?

Fully normalized databases are great for data integrity, but they have some problems in practice:

  1. Performance Issues
    • In normalized design, retrieving data often requires many joins.
    • Joins are expensive for large datasets.
  2. Reporting and Analytics
    • Applications like dashboards, reports, or data warehouses need fast aggregation, not complex joins.
  3. Read-Heavy Workloads
    • If the database is mostly used for reads (not writes), denormalization makes sense.
  4. Caching Frequently Used Data
    • Instead of computing results repeatedly (like totals or counts), we may store them directly.

When to Use Denormalization

You should consider denormalization when:

  • You need faster queries (fewer joins).
  • You have a read-heavy system (e.g., reporting, dashboards).
  • The system can tolerate some redundancy.
  • You want to precompute aggregates (e.g., order totals).
  • You are designing a data warehouse (star schema, OLAP).

Avoid denormalization if:

  • Your system is write-heavy (because updates become harder).
  • Data consistency is critical and anomalies cannot be tolerated.

Example of Denormalization

Normalized Design (3NF)

Customer

CustomerIDNameCity
1AliceDhaka
2BobChittagong
Order
OrderIDCustomerIDOrderDate
---------------------------
10112023-01-01
10222023-01-05

To get Order with Customer Name & City, we must join:

SELECT o.OrderID, o.OrderDate, c.Name, c.City
FROM Order o
JOIN Customer c ON o.CustomerID = c.CustomerID;

Denormalized Design

We combine Order and Customer into one table:

OrderIDCustomerIDNameCityOrderDate
1011AliceDhaka2023-01-01
1022BobChittagong2023-01-05

Types of Denormalization

  1. Combining Tables
    • Merging related tables into one (as in Customer + Order).
  2. Adding Redundant Columns
    • E.g., Store CustomerName directly in Order table.
  3. Precomputing Derived Data
    • E.g., Store OrderTotal in the Order table instead of calculating it from OrderDetails every time.
  4. Duplicating Tables Create summary tables for reporting (like monthly sales).
  5. Star Schema / Snowflake Schema (in Data Warehouses)
    • Fact table (sales) + dimension tables (customer, product, time).

Normalization vs Denormalization

AspectNormalizationDenormalization
GoalReduce redundancy, improve integrityImprove performance, reduce joins
ProsConsistent data, avoids anomaliesFaster queries, better for reporting
ConsSlow queries (many joins), not optimized for readsRedundancy, anomalies possible
Best UseOLTP (transactional systems)OLAP (reporting, analytics, warehouses)