Indexes in RDBMS

An index in an RDBMS is a separate data structure that improves query performance by allowing the database to locate rows quickly without scanning the entire table.

Core idea:
Indexes speed up READ operations by trading extra storage and write overhead.


Why Indexes Are Needed

Without indexes:

  • Database performs full table scan
  • Every row is checked against the WHERE condition
  • Performance degrades as table size grows

With indexes:

  • Database uses an optimized lookup structure
  • Only required rows are accessed
  • Queries run much faster

How Indexes Work (High Level)

When an index is created:

  • Database creates a separate storage structure
  • Indexed column values are stored in sorted order
  • Each value points to the actual row location (ROWID)

Most RDBMS use B-Tree indexes by default.


What Indexes Store

✔ Indexed column value(s)
✔ Pointer (ROWID) to table row

❌ Full row data
❌ Non-indexed columns

Important:
Index does NOT rearrange table rows.
Table data and index data are stored separately.


What Happens Without an Index

SELECT * FROM customer WHERE email = 'abc@gmail.com';

Without index:

  • DB scans every row
  • Applies condition row by row
  • Time complexity ≈ O(N)

This is called a Full Table Scan.

What Happens With an Index

CREATE INDEX idx_customer_email ON customer(email);

With index:

  • DB searches index B-Tree (log N)
  • Finds matching ROWID
  • Fetches only required row(s)
  • Time complexity ≈ O(log N)

Types of Indexes (Common)

1️⃣ B-Tree Index (Default)

  • Balanced tree structure
  • Supports:
    • =
    • <, >
    • BETWEEN
    • ORDER BY
  • Most commonly used

2️⃣ Hash Index

  • Uses hash function
  • Very fast for = queries
  • ❌ Not suitable for range queries
  • Limited support in many RDBMS

3️⃣ Bitmap Index

  • Uses bitmaps
  • Good for low-cardinality columns
  • Common in data warehouses
  • ❌ Not ideal for OLTP systems

Where Indexes Help Most

Indexes are useful on columns used in:

  • WHERE
  • JOIN
  • ORDER BY
  • GROUP BY

Typical examples:

  • Primary keys
  • Foreign keys
  • Email, account number, reference IDs

Indexes and Write Operations (Important Trade-off)

Indexes slow down:

  • INSERT
  • UPDATE
  • DELETE

Because:

  • Table data must be updated
  • Index structure must also be updated and rebalanced

Golden rule:
Indexes speed up reads but slow down writes.


When You SHOULD Create Indexes

✔ Large tables
✔ Frequently queried columns
✔ Columns used in joins
✔ High-selectivity columns (many unique values)

When You SHOULD NOT Create Indexes

❌ Very small tables
❌ Columns rarely used in queries
❌ Columns with very few distinct values
❌ Write-heavy tables with frequent updates


Index Selectivity (Important Concept)

  • Selectivity = uniqueness of column values
  • High selectivity → good index candidate
  • Low selectivity → poor index candidate

Example:

  • email → high selectivity ✔
  • gender → low selectivity ❌

Indexes in JOINs (Very Important)

SELECT *
FROM orders o
JOIN customer c ON o.customer_id = c.id;

Indexes on:

  • customer.id
  • orders.customer_id

Help:

  • Avoid nested full scans
  • Improve join performance drastically

Banking / Enterprise Perspective

In banking systems:

  • Account number → indexed
  • Customer ID → indexed
  • Transaction reference → indexed

But:

  • Balance columns → careful indexing
  • Write-heavy tables → minimal indexes

Wrong indexing can:

  • Reduce throughput
  • Increase lock contention
  • Cause performance issues

Summary

  • Index is a separate data structure
  • Uses extra storage
  • Improves read performance
  • Slows down write operations
  • Most commonly implemented using B-Trees

Indexes are a performance optimization tool.
Use them thoughtfully to balance read speed, write cost, and storage


This site uses Just the Docs, a documentation theme for Jekyll.