Thodoris Kouleris
Software Engineer
SQL Indexes
What is an SQL Index?
Indexes in SQL are data structures that help us retrieve data from a database more quickly. They can be compared to the table of contents of a book. For example, if you have a book about C programming and you want to learn about pointers, you don’t need to read the entire book to find the relevant chapter. You look at the table of contents, see that the chapter on pointers is on page 110, and go directly there. SQL indexes work in a similar way.
You create indexes on a column of a database table. When you run a query based on that column, the SQL engine can return results faster using the index.
How SQL Indexes Work
The most common data structure for SQL indexes is the B-Tree, also known as a Balanced Tree. In this data structure, each leaf of the tree can have more than one key. For instance, if each leaf has two keys, it will have three branches:
The left branch contains values smaller than the left key. The middle branch contains values between the left and right keys. The right branch contains values greater than the right key. A more detailed analysis of balanced trees can be found here:
Types of SQL Indexes
Clustered Indexes:
A clustered index determines the physical order of data in a table. This means that the rows are stored on the disk in the same order as the index. A table can have only one clustered index, as rows can be sorted in only one way. Clustered indexes are typically created based on the primary key of the table.
Example:
CREATE CLUSTERED INDEX idx_employee_id ON Employees(EmployeeID);
Non-Clustered Indexes:
A non-clustered index is independent of the table's data. It contains a sorted list of values from the indexed columns and pointers that reference the actual rows in the table. Unlike clustered indexes, a table can have multiple non-clustered indexes.
Example:
CREATE NONCLUSTERED INDEX idx_employee_name ON Employees(LastName);
A more detailed comparison between clustered and non-clustered indexes for SQL Server can be found here:
Unique Indexes:
A unique index ensures that the values in the column or columns that are indexed are unique. This type of index is often used to enforce data uniqueness, similar to a unique constraint.
Example:
CREATE UNIQUE INDEX idx_unique_email ON Employees(Email);
Full-Text Indexes:
A full-text index is used to perform full-text searches, which involve searching for words or phrases within large text columns. This type of index is ideal for applications such as search engines or document management systems.
Example:
CREATE FULLTEXT INDEX idx_fulltext_description ON Products(Description);
When to Use Indexes
-
Right Column Rule: Create an index on a column that is frequently used
in queries with WHERE, JOIN, or ORDER BY clauses.
- Index Limitation Rule: Excessive use of indexes can have the opposite effect of what we expect. This is because more indexes increase the time required for INSERT operations and also increase the database size.
- Type of Data: The type of data being indexed is very important. For example, integers perform better when indexed.
- No Indexes on Small Tables: If the table is small, a full scan might be faster for finding a record than creating indexes.
Using indexes in a database is a critical tool that can significantly enhance data retrieval speed. However, it is essential to identify which data is worth indexing, avoiding a situation where all columns in all tables are indexed, which could lead to adverse results.