Skip to content

Clustered index

A clustered index determines the physical order of data storage in a database table.^[600-developer-mysql.md] In MySQL's InnoDB engine, the table structure is organized using a B+ tree^[600-developer-mysql.md]. In this structure, leaf nodes act as data pages containing the actual table rows^[600-developer-mysql.md], organized as a "directory + data" composite^[600-developer-mysql.md].

Characteristics

  • Physical Sorting: Because the data pages are the leaf nodes of the tree, the data is sorted physically based on the index key^[600-developer-mysql.md].
  • Primary Key: In InnoDB, the primary key is typically the clustered index^[600-developer-mysql.md].
  • Capacity: A B+ tree with a height of 3 can store data at the "ten million level"^[600-developer-mysql.md].
  • Links: The leaf nodes of the B+ tree are connected via a doubly linked list^[600-developer-mysql.md], which facilitates efficient sequential access and range scans.

Clustered vs. Non-clustered Indexes

The primary difference between clustered and non-clustered indexes lies in how the data is stored:

  • Clustered Index: The leaf nodes store the actual data rows^[600-developer-mysql.md]. There can be only one clustered index per table because the data can only be sorted in one physical order.
  • Non-clustered Index: The leaf nodes do not contain the full data row. Instead, they typically store the primary key value (in InnoDB) which serves as a pointer to the actual data row.^[600-developer-mysql.md]

Sources

  • 600-developer-mysql.md