Multilevel Indexes

In this article, we will discuss multilevel indexes in RDBMS, their types, and examples.

In Relational Database Management Systems (RDBMS), indexes are essential data structures that allow faster data retrieval by reducing the number of disk accesses required to retrieve data. But, traditional indexes can become inefficient as the database size grows. Multilevel indexes provide a solution to this problem by dividing the index into smaller, manageable pieces.

Indexing

Indexing helps to optimize the performance of a database. It minimizes the number of disk accesses required when a query is processed. It is a data structure technique which is used to quickly locate and access the data in a database.

There are two things used in indexing, these are : Search Key or Candidate key and Data Reference or Pointer.

Search key or Candidate key

It contains a copy of the primary key or candidate key of the table. These values are stored in sorted order so that the corresponding data can be accessed quickly. The data may or may not be stored in sorted order.

Data Reference or Pointer

It contains a set of pointers holding the address of the disk block where that particular key value can be found.

There are several categories of indexing that can be used in a database. These are Single- Column Indexing, Multi-Column Indexing, Clustered Indexing, Non-Clustered Indexing, Bitmap Indexing, Full-Text Indexing.

Types of File Organization Mechanism

There are two types of file organization mechanism as given below. We have also listed their indexing methods to store the data for these organization mechanisms.

Sequential File Organization or Ordered Index File

These ordered or sequential file organizations can store data in a dense or sparse format −

Dense Index
Sparse Index

Hash File organization

There are primarily three methods of indexing −

Clustered Indexing
Non-Clustered or Secondary Indexing
Multilevel Indexing

Need of Multilevel Indexes

As the size of the database increases, the size of the index also increases. But, storing a single- level index in main memory may not be practical due to its large size, which requires multiple disk accesses.

To solve this problem multilevel indexing is used. This technique breaks down the main block into smaller blocks. Each smaller block can be stored in a single block. The outer block is further divided into inner blocks, each of which is associated with a data block. This approach reduces overhead and makes it easier to store the index in main memory.

Introduction to Multilevel Indexes

Multilevel indexes refer to a hierarchical structure of indexes. Here, each level of the index provides a more detailed reference to the data. It allows faster data retrieval, reduces disk access, and improves query performance. Multilevel indexes are essential in large databases where traditional indexes are not efficient.

A multi-level index can be created for any first-level index (primary, secondary, or clustering) that has more than one disk block. It's like a search tree, but adding or removing new index entries is challenging because every level of the index is an ordered file.

To solve this problem, most multi-level indexes use B-tree or B+ tree data structures. These structures leave some space in each tree node (disk block) to allow for new index entries. A multi-level index treats the index file like an ordered file with a unique entry for each K(i).

Types of Multilevel Indexes

There are two main types of multilevel indexes: B-Tree Index, B+Tree Index. These are explained as follows below in brief.

B Tree Index

This type of index is used in most database management systems. It is a balanced tree data structure, in which each node in the tree contains a set of keys and pointers to its child nodes.

B- trees are highly efficient for range queries and support insertion and deletion of records without requiring a complete restructuring of the index.

B-T ree Advantages

These are highly efficient for range queries as they allow for quick traversal of the tree structure.
These are balanced and therefore provide predictable search and retrieval times.
These can handle a large number of inserts and deletes efficiently without requiring a complete reorganization of the index.
These work well for both primary and secondary indexes.

B-T ree Disadvantages

These have a higher overhead than other index types, which can become a problem for very large databases.
These can be complex to implement and manage.
These may not be as efficient for point queries as they are for range queries.

B+ Tree Index

This type of index is similar to B-trees, but with some modifications to improve the performance of range queries. In a B+ tree, only the leaf nodes contain actual data records, while the non- leaf nodes act as keys for the child nodes. B+ trees are often used for large databases with high read/write operations as they can handle large amounts of data with relatively low overhead.

B+ Tree Advantages

These are optimized for range queries and are therefore faster for queries that require range searches.
These have a lower overhead than B-trees, which makes them more efficient for large databases.
These perform well for secondary indexes as they store the data only in leaf nodes.
These have a simpler implementation than B-trees.

B+ Tree Disadvantages

These require more disk access than B-trees for point queries, which can impact performance.
These require more overhead for inserts and deletes than B-trees.
These can be more complex to implement and manage than other index types.

Comparison of B-tree and B+ tree indexing methods

This table provides a general comparison. But specific advantages and disadvantages may depend on the particular use case and database system.

	B-Tree	B+ Tree
Data storage	Both internal and leaf nodes store key-value pairs	Only leaf nodes store key- value pairs
Fanout	Lower fanout due to internal nodes storing data	Higher fanout due to internal nodes only storing keys
Range queries	Requires additional disk access to retrieve data from internal nodes	Can perform range queries more efficiently by only accessing leaf nodes
Insertion/deletion	Requires more overhead due to internal nodes storing data	Requires less overhead due to internal nodes only storing keys
Disk space usage	Uses more disk space due to internal nodes storing data	Uses less disk space due to internal nodes only storing keys
Implementation	Simpler to implement and manage	More complex to implement and manage
Query performance	Better for point queries and small datasets	Better for range queries and larger datasets

Summary

In this article, we have discussed an overview of multilevel indexes in Relational Database Management Systems (RDBMS). We explained how traditional indexes become inefficient as the database size grows, and multilevel indexes provide a solution to this problem by dividing the index into smaller pieces.

We have also covered different categories of indexing and file organization mechanisms. It then explains the need for multilevel indexes and introduces the concept of hierarchical structure of indexes. Discussed the two main types of multilevel indexes, i.e., B-tree and B+Tree indexes, their advantages, disadvantages, and a general comparison between the two.

Mithlesh Upadhyay

Updated on: 17-May-2023

5K+ Views

Kickstart Your Career

Get certified by completing the course

Get Started