Indexing
An index in a database is like the index of a book — instead of reading the whole book to find a topic, you can go directly to the page numbers listed.
An Index is a database object that improves the speed of data retrieval (SELECT queries) but can slow down INSERT, UPDATE, and DELETE operations because indexes must also be updated.
- Index columns that are frequently used in
WHERE,JOIN,ORDER BY, andGROUP BYclauses.
In SQL:
- An index is a data structure that stores pointers to rows in a table, organized to make lookups faster.
- Without an index, SQL must perform a full table scan (check every row).
- With an index, SQL can jump directly to the matching rows.
Why Use Indexes?
Indexes improve performance for:
- Searching (
SELECTqueries withWHERE) - Sorting (
ORDER BY) - Joining multiple tables
- Filtering data with conditions
Always index primary keys and foreign keys. Create composite indexes carefully (leftmost rule).
How Indexes Work Internally
Most relational databases (MySQL, PostgreSQL, SQL Server, Oracle) use a B-Tree or B+Tree structure for standard indexes:
- B-Tree structure keeps keys sorted.
- When you search, it’s logarithmic time (
O(log n)), not linear (O(n)). - Each node contains:
- Key (indexed value)
- Pointer to the actual row in the table (or to more index nodes)
- Databases navigate the tree like searching in a phone book.
Example for an index on last_name:
[Brown]
/ \
[Adams] [Smith]
If you search for "Jones", the DB quickly jumps into the right branch instead of scanning all rows.
Types of Indexes
| Index Type | Description | Example Use Case |
|---|---|---|
| Single-column index | Index on one column | Searching by email |
| Composite index | Index on multiple columns | Searching by (last_name, first_name) |
| Unique index | Ensures values are unique | Primary key, unique email |
| Full-text index | For text searching | Search in articles or descriptions |
| Hash index | Uses hash table for equality searches | Exact lookups (PostgreSQL, MySQL MEMORY table) |
| Clustered index | Physically orders table data by the index | Primary key in SQL Server |
| Non-clustered index | Separate from table data, points to rows | Secondary indexes |
Creating an Index
Create a Table with Index
CREATE TABLE employees (
emp_id INT PRIMARY KEY,
name VARCHAR(100),
department VARCHAR(50),
salary DECIMAL(10,2),
INDEX (department), -- normal index
UNIQUE INDEX (name) -- unique index
);
emp_id→ Primary Key index.department→ Regular index (useful for searching/filtering by department).name→ Unique index (prevents duplicate names).
Add Index to Existing Table
-- Add a simple index
CREATE INDEX idx_department ON employees(department);
-- Add a multi-column index
CREATE INDEX idx_dept_salary ON employees(department, salary);
-- Add a unique index
CREATE UNIQUE INDEX idx_name_unique ON employees(name);
Now, a query like:
SELECT * FROM employees WHERE department = 'IT';
Will use idx_department to find the row quickly.
Composite Index Example
CREATE INDEX idx_dept_salary ON employees(department, salary);
-- Efficient use
SELECT * FROM employees WHERE department = 'IT' AND salary > 50000;
-- Also works (prefix)
SELECT * FROM employees WHERE department = 'IT';
-- ❌ Not efficient
SELECT * FROM employees WHERE salary > 50000;
MySQL can only use the leftmost column(s) of a composite index.
Covering Index
If an index contains all the columns needed by a query, MySQL does not even read the table → it uses only the index. This is called a covering index.
CREATE INDEX idx_dept_salary_name ON employees(department, salary, name);
SELECT name, salary FROM employees WHERE department = 'IT';
MySQL can get department, salary, and name directly from the index → no table lookup.
How Indexes Improve Performance
Without index:
- DB scans all rows (
O(n)complexity). - Slower for large datasets.
With index:
- DB uses B-Tree navigation (
O(log n)complexity). - Much faster for lookups, filtering, sorting, joins.
Downsides of Indexes Indexes are not free:
- Storage cost — They take extra disk space.
- Write cost — Inserts, updates, and deletes become slower because the index must also be updated.
- Over-indexing — Too many indexes can harm performance.
Checking Index Usage
You can check if your query uses an index:
EXPLAIN SELECT * FROM employees WHERE department = 'IT';
- Which tables are involved
- The order in which tables are read
- Whether indexes are used
- How many rows MySQL expects to examine
- The join type MySQL chooses
This helps you understand why a query is slow and how to optimize it.
EXPLAIN ANALYZE runs the query and shows actual execution time and rows scanned (not just estimates).
Key Columns in EXPLAIN Output
| Column | Meaning |
|---|---|
| id | The query step/order of execution. Higher = later execution. |
| select_type | The type of query (SIMPLE, PRIMARY, SUBQUERY, etc.). |
| table | The table being accessed. |
| type | The join type (VERY important for performance). |
| possible_keys | Indexes MySQL could use. |
| key | The index MySQL actually used. |
| rows | Estimated number of rows to scan. |
| Extra | Additional details (e.g., “Using where”, “Using index”, “Using temporary”). |
Important type Values (Query Efficiency)
The type column shows how MySQL reads data. Ordered from best → worst:
system/const→ Only 1 row (very fast).eq_ref→ Unique index lookup.ref→ Non-unique index lookup.range→ Index range scan.index→ Full index scan.ALL→ Full table scan (⚠️ slowest).
Goal: Keep queries using ref, range, or const. Avoid ALL.