Indexing

An index in a database is like the index of a book — instead of reading the whole book to find a topic, you can go directly to the page numbers listed.

An Index is a database object that improves the speed of data retrieval (SELECT queries) but can slow down INSERT, UPDATE, and DELETE operations because indexes must also be updated.

Index columns that are frequently used in WHERE, JOIN, ORDER BY, and GROUP BY clauses.

In SQL:

An index is a data structure that stores pointers to rows in a table, organized to make lookups faster.
Without an index, SQL must perform a full table scan (check every row).
With an index, SQL can jump directly to the matching rows.

Why Use Indexes?

Indexes improve performance for:

Searching (SELECT queries with WHERE)
Sorting (ORDER BY)
Joining multiple tables
Filtering data with conditions

Always index primary keys and foreign keys. Create composite indexes carefully (leftmost rule).

How Indexes Work Internally

Most relational databases (MySQL, PostgreSQL, SQL Server, Oracle) use a B-Tree or B+Tree structure for standard indexes:

B-Tree structure keeps keys sorted.
When you search, it’s logarithmic time (O(log n)), not linear (O(n)).
Each node contains:
- Key (indexed value)
- Pointer to the actual row in the table (or to more index nodes)
Databases navigate the tree like searching in a phone book.

Example for an index on last_name:

        [Brown]
       /       \
   [Adams]   [Smith]

If you search for "Jones", the DB quickly jumps into the right branch instead of scanning all rows.

Types of Indexes

Index Type	Description	Example Use Case
Single-column index	Index on one column	Searching by `email`
Composite index	Index on multiple columns	Searching by `(last_name, first_name)`
Unique index	Ensures values are unique	Primary key, unique email
Full-text index	For text searching	Search in articles or descriptions
Hash index	Uses hash table for equality searches	Exact lookups (PostgreSQL, MySQL MEMORY table)
Clustered index	Physically orders table data by the index	Primary key in SQL Server
Non-clustered index	Separate from table data, points to rows	Secondary indexes

Creating an Index

Create a Table with Index

CREATE TABLE employees (
    emp_id INT PRIMARY KEY,
    name VARCHAR(100),
    department VARCHAR(50),
    salary DECIMAL(10,2),
    INDEX (department),       -- normal index
    UNIQUE INDEX (name)       -- unique index
);

emp_id → Primary Key index.
department → Regular index (useful for searching/filtering by department).
name → Unique index (prevents duplicate names).

Add Index to Existing Table

-- Add a simple index
CREATE INDEX idx_department ON employees(department);

-- Add a multi-column index
CREATE INDEX idx_dept_salary ON employees(department, salary);

-- Add a unique index
CREATE UNIQUE INDEX idx_name_unique ON employees(name);

Now, a query like:

SELECT * FROM employees WHERE department = 'IT';

Will use idx_department to find the row quickly.

Composite Index Example

CREATE INDEX idx_dept_salary ON employees(department, salary);

-- Efficient use
SELECT * FROM employees WHERE department = 'IT' AND salary > 50000;

-- Also works (prefix)
SELECT * FROM employees WHERE department = 'IT';

-- ❌ Not efficient
SELECT * FROM employees WHERE salary > 50000;

MySQL can only use the leftmost column(s) of a composite index.

Covering Index

If an index contains all the columns needed by a query, MySQL does not even read the table → it uses only the index. This is called a covering index.

CREATE INDEX idx_dept_salary_name ON employees(department, salary, name);

SELECT name, salary FROM employees WHERE department = 'IT';

MySQL can get department, salary, and name directly from the index → no table lookup.

How Indexes Improve Performance

Without index:

DB scans all rows (O(n) complexity).
Slower for large datasets.

With index:

DB uses B-Tree navigation (O(log n) complexity).
Much faster for lookups, filtering, sorting, joins.

Downsides of Indexes Indexes are not free:

Storage cost — They take extra disk space.
Write cost — Inserts, updates, and deletes become slower because the index must also be updated.
Over-indexing — Too many indexes can harm performance.

Checking Index Usage

You can check if your query uses an index:

EXPLAIN SELECT * FROM employees WHERE department = 'IT';

Which tables are involved
The order in which tables are read
Whether indexes are used
How many rows MySQL expects to examine
The join type MySQL chooses

This helps you understand why a query is slow and how to optimize it.

EXPLAIN ANALYZE runs the query and shows actual execution time and rows scanned (not just estimates).

Key Columns in `EXPLAIN` Output

Column	Meaning
id	The query step/order of execution. Higher = later execution.
select_type	The type of query (SIMPLE, PRIMARY, SUBQUERY, etc.).
table	The table being accessed.
type	The join type (VERY important for performance).
possible_keys	Indexes MySQL could use.
key	The index MySQL actually used.
rows	Estimated number of rows to scan.
Extra	Additional details (e.g., “Using where”, “Using index”, “Using temporary”).

Important `type` Values (Query Efficiency)

The type column shows how MySQL reads data. Ordered from best → worst:

system / const → Only 1 row (very fast).
eq_ref → Unique index lookup.
ref → Non-unique index lookup.
range → Index range scan.
index → Full index scan.
ALL → Full table scan (⚠️ slowest).

Goal: Keep queries using ref, range, or const. Avoid ALL.

How Indexes Work Internally​

Types of Indexes​

Creating an Index​

Covering Index​

How Indexes Improve Performance​

Checking Index Usage​

Key Columns in EXPLAIN Output​

Important type Values (Query Efficiency)​