Denormalization
- Denormalization is the process of intentionally introducing redundancy into a database by combining tables or adding derived data.
- It’s the opposite of normalization.
- The goal is not to “break the rules” but to improve performance in real-world applications.
In short:
- Normalization = Optimize for consistency & integrity.
- Denormalization = Optimize for query performance (sometimes at the cost of redundancy).
Why Do We Denormalize?
Fully normalized databases are great for data integrity, but they have some problems in practice:
- Performance Issues
- In normalized design, retrieving data often requires many joins.
- Joins are expensive for large datasets.
- Reporting and Analytics
- Applications like dashboards, reports, or data warehouses need fast aggregation, not complex joins.
- Read-Heavy Workloads
- If the database is mostly used for reads (not writes), denormalization makes sense.
- Caching Frequently Used Data
- Instead of computing results repeatedly (like totals or counts), we may store them directly.
When to Use Denormalization
You should consider denormalization when:
- You need faster queries (fewer joins).
- You have a read-heavy system (e.g., reporting, dashboards).
- The system can tolerate some redundancy.
- You want to precompute aggregates (e.g., order totals).
- You are designing a data warehouse (star schema, OLAP).
Avoid denormalization if:
- Your system is write-heavy (because updates become harder).
- Data consistency is critical and anomalies cannot be tolerated.
Example of Denormalization
Normalized Design (3NF)
Customer
| CustomerID | Name | City |
|---|---|---|
| 1 | Alice | Dhaka |
| 2 | Bob | Chittagong |
Order | ||
| OrderID | CustomerID | OrderDate |
| ------- | ---------- | ---------- |
| 101 | 1 | 2023-01-01 |
| 102 | 2 | 2023-01-05 |
To get Order with Customer Name & City, we must join:
SELECT o.OrderID, o.OrderDate, c.Name, c.City
FROM Order o
JOIN Customer c ON o.CustomerID = c.CustomerID;
Denormalized Design
We combine Order and Customer into one table:
| OrderID | CustomerID | Name | City | OrderDate |
|---|---|---|---|---|
| 101 | 1 | Alice | Dhaka | 2023-01-01 |
| 102 | 2 | Bob | Chittagong | 2023-01-05 |
Types of Denormalization
- Combining Tables
- Merging related tables into one (as in Customer + Order).
- Adding Redundant Columns
- E.g., Store
CustomerNamedirectly in Order table.
- E.g., Store
- Precomputing Derived Data
- E.g., Store
OrderTotalin theOrdertable instead of calculating it fromOrderDetailsevery time.
- E.g., Store
- Duplicating Tables Create summary tables for reporting (like monthly sales).
- Star Schema / Snowflake Schema (in Data Warehouses)
- Fact table (sales) + dimension tables (customer, product, time).
Normalization vs Denormalization
| Aspect | Normalization | Denormalization |
|---|---|---|
| Goal | Reduce redundancy, improve integrity | Improve performance, reduce joins |
| Pros | Consistent data, avoids anomalies | Faster queries, better for reporting |
| Cons | Slow queries (many joins), not optimized for reads | Redundancy, anomalies possible |
| Best Use | OLTP (transactional systems) | OLAP (reporting, analytics, warehouses) |