Partitioning

When working with very large datasets, queries can become slow and storage can be inefficient.

Partitioning is a MySQL technique that splits a large table into smaller, more manageable pieces (called partitions) but still treats them as a single logical table.

Partitioning = dividing a table’s rows into separate physical storage chunks.
Queries can target only the relevant partition instead of scanning the entire table.
MySQL decides automatically which partition to search based on the query condition.

This improves query performance, manageability, and maintenance.

Benefits of Partitioning

Faster queries (partition pruning → scans only relevant partitions).
Better performance with huge tables (billions of rows).
Easier maintenance (you can archive or drop a partition without touching the rest).
Efficient storage (old data can be separated).

Limitations of Partitioning

Partition key must be part of PRIMARY KEY or UNIQUE KEY.
No foreign keys allowed in partitioned tables.
Too many partitions can slow down queries.
Not all storage engines support it (must be InnoDB in modern MySQL).

Types of Partitioning

RANGE → ranges of values (time-based data).
LIST → specific values (categories).
HASH → evenly distribute with a hash function.
KEY → system-managed hash (usually on primary keys).

RANGE Partitioning

Rows are placed into partitions based on ranges of values.

Good for time-based data (e.g., monthly logs).

CREATE TABLE sales (
  id INT,
  sale_date DATE,
  amount DECIMAL(10,2)
)
PARTITION BY RANGE (YEAR(sale_date)) (
  PARTITION p2019 VALUES LESS THAN (2020),
  PARTITION p2020 VALUES LESS THAN (2021),
  PARTITION p2021 VALUES LESS THAN (2022),
  PARTITION pmax VALUES LESS THAN MAXVALUE
);

Data from 2019 goes into p2019, 2020 into p2020, and so on.

Query:

SELECT * FROM sales WHERE sale_date BETWEEN '2020-01-01' AND '2020-12-31';

MySQL only scans p2020, not the whole table.

LIST Partitioning

Rows are partitioned based on a list of specific values.

Good for categorical data (e.g., regions, departments).

CREATE TABLE employees (
  id INT,
  name VARCHAR(50),
  region VARCHAR(10)
)
PARTITION BY LIST COLUMNS (region) (
  PARTITION p_north VALUES IN ('North'),
  PARTITION p_south VALUES IN ('South'),
  PARTITION p_east VALUES IN ('East'),
  PARTITION p_west VALUES IN ('West')
);

HASH Partitioning

Rows are distributed into partitions based on a hash function.

Good for evenly distributing data when values are unpredictable.

CREATE TABLE orders (
  order_id INT,
  customer_id INT,
  amount DECIMAL(10,2)
)
PARTITION BY HASH(customer_id) PARTITIONS 4;

Rows are distributed across 4 partitions using a hash of customer_id. Useful for load balancing queries across partitions.

KEY Partitioning

Similar to HASH, but MySQL chooses the hash function internally.

Typically used with primary keys.

CREATE TABLE products (
  product_id INT PRIMARY KEY,
  name VARCHAR(100),
  price DECIMAL(10,2)
)
PARTITION BY KEY(product_id) PARTITIONS 3;

MySQL automatically hashes product_id into 3 partitions.

Practical Example of Partitioning

Imagine a website access log table with millions of rows per year:

CREATE TABLE access_logs (
  id INT AUTO_INCREMENT PRIMARY KEY,
  log_date DATE NOT NULL,
  user_id INT,
  action VARCHAR(100)
)
PARTITION BY RANGE (YEAR(log_date)*100 + MONTH(log_date)) (
  PARTITION p202201 VALUES LESS THAN (202202),
  PARTITION p202202 VALUES LESS THAN (202203),
  PARTITION p202203 VALUES LESS THAN (202204),
  PARTITION pmax VALUES LESS THAN MAXVALUE
);

January 2022 logs → p202201
February 2022 logs → p202202

Query example:

SELECT * FROM access_logs WHERE log_date BETWEEN '2022-02-01' AND '2022-02-28';

Only scans p202202, making it much faster.

Types of Partitioning​

RANGE Partitioning​

LIST Partitioning​

HASH Partitioning​

KEY Partitioning​

Practical Example of Partitioning​