Skip to main content

Indexing

An index in a database is like the index of a book — instead of reading the whole book to find a topic, you can go directly to the page numbers listed.

An Index is a database object that improves the speed of data retrieval (SELECT queries) but can slow down INSERT, UPDATE, and DELETE operations because indexes must also be updated.

  • Index columns that are frequently used in WHERE, JOIN, ORDER BY, and GROUP BY clauses.

In SQL:

  • An index is a data structure that stores pointers to rows in a table, organized to make lookups faster.
  • Without an index, SQL must perform a full table scan (check every row).
  • With an index, SQL can jump directly to the matching rows.

Why Use Indexes?

Indexes improve performance for:

  • Searching (SELECT queries with WHERE)
  • Sorting (ORDER BY)
  • Joining multiple tables
  • Filtering data with conditions

Always index primary keys and foreign keys. Create composite indexes carefully (leftmost rule).

How Indexes Work Internally

Most relational databases (MySQL, PostgreSQL, SQL Server, Oracle) use a B-Tree or B+Tree structure for standard indexes:

  1. B-Tree structure keeps keys sorted.
  2. When you search, it’s logarithmic time (O(log n)), not linear (O(n)).
  3. Each node contains:
    • Key (indexed value)
    • Pointer to the actual row in the table (or to more index nodes)
  4. Databases navigate the tree like searching in a phone book.

Example for an index on last_name:

        [Brown]
/ \
[Adams] [Smith]

If you search for "Jones", the DB quickly jumps into the right branch instead of scanning all rows.

Types of Indexes

Index TypeDescriptionExample Use Case
Single-column indexIndex on one columnSearching by email
Composite indexIndex on multiple columnsSearching by (last_name, first_name)
Unique indexEnsures values are uniquePrimary key, unique email
Full-text indexFor text searchingSearch in articles or descriptions
Hash indexUses hash table for equality searchesExact lookups (PostgreSQL, MySQL MEMORY table)
Clustered indexPhysically orders table data by the indexPrimary key in SQL Server
Non-clustered indexSeparate from table data, points to rowsSecondary indexes

Creating an Index

Create a Table with Index

CREATE TABLE employees (
emp_id INT PRIMARY KEY,
name VARCHAR(100),
department VARCHAR(50),
salary DECIMAL(10,2),
INDEX (department), -- normal index
UNIQUE INDEX (name) -- unique index
);
  • emp_id → Primary Key index.
  • department → Regular index (useful for searching/filtering by department).
  • name → Unique index (prevents duplicate names).

Add Index to Existing Table

-- Add a simple index
CREATE INDEX idx_department ON employees(department);

-- Add a multi-column index
CREATE INDEX idx_dept_salary ON employees(department, salary);

-- Add a unique index
CREATE UNIQUE INDEX idx_name_unique ON employees(name);

Now, a query like:

SELECT * FROM employees WHERE department = 'IT';

Will use idx_department to find the row quickly.

Composite Index Example

CREATE INDEX idx_dept_salary ON employees(department, salary);

-- Efficient use
SELECT * FROM employees WHERE department = 'IT' AND salary > 50000;

-- Also works (prefix)
SELECT * FROM employees WHERE department = 'IT';

-- ❌ Not efficient
SELECT * FROM employees WHERE salary > 50000;

MySQL can only use the leftmost column(s) of a composite index.

Covering Index

If an index contains all the columns needed by a query, MySQL does not even read the table → it uses only the index. This is called a covering index.

CREATE INDEX idx_dept_salary_name ON employees(department, salary, name);

SELECT name, salary FROM employees WHERE department = 'IT';

MySQL can get department, salary, and name directly from the index → no table lookup.

How Indexes Improve Performance

Without index:

  • DB scans all rows (O(n) complexity).
  • Slower for large datasets.

With index:

  • DB uses B-Tree navigation (O(log n) complexity).
  • Much faster for lookups, filtering, sorting, joins.

Downsides of Indexes Indexes are not free:

  1. Storage cost — They take extra disk space.
  2. Write cost — Inserts, updates, and deletes become slower because the index must also be updated.
  3. Over-indexing — Too many indexes can harm performance.

Checking Index Usage

You can check if your query uses an index:

EXPLAIN SELECT * FROM employees WHERE department = 'IT';
  • Which tables are involved
  • The order in which tables are read
  • Whether indexes are used
  • How many rows MySQL expects to examine
  • The join type MySQL chooses

This helps you understand why a query is slow and how to optimize it.

EXPLAIN ANALYZE runs the query and shows actual execution time and rows scanned (not just estimates).

Key Columns in EXPLAIN Output

ColumnMeaning
idThe query step/order of execution. Higher = later execution.
select_typeThe type of query (SIMPLE, PRIMARY, SUBQUERY, etc.).
tableThe table being accessed.
typeThe join type (VERY important for performance).
possible_keysIndexes MySQL could use.
keyThe index MySQL actually used.
rowsEstimated number of rows to scan.
ExtraAdditional details (e.g., “Using where”, “Using index”, “Using temporary”).

Important type Values (Query Efficiency)

The type column shows how MySQL reads data. Ordered from best → worst:

  • system / const → Only 1 row (very fast).
  • eq_ref → Unique index lookup.
  • ref → Non-unique index lookup.
  • range → Index range scan.
  • index → Full index scan.
  • ALL → Full table scan (⚠️ slowest).

Goal: Keep queries using ref, range, or const. Avoid ALL.