Multilevel Indexes


In this article, we will discuss multilevel indexes in RDBMS, their types, and examples.

In Relational Database Management Systems (RDBMS), indexes are essential data structures that allow faster data retrieval by reducing the number of disk accesses required to retrieve data. But, traditional indexes can become inefficient as the database size grows. Multilevel indexes provide a solution to this problem by dividing the index into smaller, manageable pieces.

Indexing

Indexing helps to optimize the performance of a database. It minimizes the number of disk accesses required when a query is processed. It is a data structure technique which is used to quickly locate and access the data in a database.

There are two things used in indexing, these are : Search Key or Candidate key and Data Reference or Pointer.

Search key or Candidate key

It contains a copy of the primary key or candidate key of the table. These values are stored in sorted order so that the corresponding data can be accessed quickly. The data may or may not be stored in sorted order.

Data Reference or Pointer

It contains a set of pointers holding the address of the disk block where that particular key value can be found.

There are several categories of indexing that can be used in a database. These are Single- Column Indexing, Multi-Column Indexing, Clustered Indexing, Non-Clustered Indexing, Bitmap Indexing, Full-Text Indexing.

Types of File Organization Mechanism

There are two types of file organization mechanism as given below. We have also listed their indexing methods to store the data for these organization mechanisms.

Sequential File Organization or Ordered Index File

These ordered or sequential file organizations can store data in a dense or sparse format −

  • Dense Index

  • Sparse Index

Hash File organization

There are primarily three methods of indexing −

  • Clustered Indexing

  • Non-Clustered or Secondary Indexing

  • Multilevel Indexing

Need of Multilevel Indexes

As the size of the database increases, the size of the index also increases. But, storing a single- level index in main memory may not be practical due to its large size, which requires multiple disk accesses.

To solve this problem multilevel indexing is used. This technique breaks down the main block into smaller blocks. Each smaller block can be stored in a single block. The outer block is further divided into inner blocks, each of which is associated with a data block. This approach reduces overhead and makes it easier to store the index in main memory.

Introduction to Multilevel Indexes

Multilevel indexes refer to a hierarchical structure of indexes. Here, each level of the index provides a more detailed reference to the data. It allows faster data retrieval, reduces disk access, and improves query performance. Multilevel indexes are essential in large databases where traditional indexes are not efficient.

A multi-level index can be created for any first-level index (primary, secondary, or clustering) that has more than one disk block. It's like a search tree, but adding or removing new index entries is challenging because every level of the index is an ordered file.

To solve this problem, most multi-level indexes use B-tree or B+ tree data structures. These structures leave some space in each tree node (disk block) to allow for new index entries. A multi-level index treats the index file like an ordered file with a unique entry for each K(i).

Types of Multilevel Indexes

There are two main types of multilevel indexes: B-Tree Index, B+Tree Index. These are explained as follows below in brief.

B Tree Index

This type of index is used in most database management systems. It is a balanced tree data structure, in which each node in the tree contains a set of keys and pointers to its child nodes.

B- trees are highly efficient for range queries and support insertion and deletion of records without requiring a complete restructuring of the index.

B-T ree Advantages

  • These are highly efficient for range queries as they allow for quick traversal of the tree structure.

  • These are balanced and therefore provide predictable search and retrieval times.

  • These can handle a large number of inserts and deletes efficiently without requiring a complete reorganization of the index.

  • These work well for both primary and secondary indexes.

B-T ree Disadvantages

  • These have a higher overhead than other index types, which can become a problem for very large databases.

  • These can be complex to implement and manage.

  • These may not be as efficient for point queries as they are for range queries.

B+ Tree Index

This type of index is similar to B-trees, but with some modifications to improve the performance of range queries. In a B+ tree, only the leaf nodes contain actual data records, while the non- leaf nodes act as keys for the child nodes. B+ trees are often used for large databases with high read/write operations as they can handle large amounts of data with relatively low overhead.

B+ Tree Advantages

  • These are optimized for range queries and are therefore faster for queries that require range searches.

  • These have a lower overhead than B-trees, which makes them more efficient for large databases.

  • These perform well for secondary indexes as they store the data only in leaf nodes.

  • These have a simpler implementation than B-trees.

B+ Tree Disadvantages

  • These require more disk access than B-trees for point queries, which can impact performance.

  • These require more overhead for inserts and deletes than B-trees.

  • These can be more complex to implement and manage than other index types.

Comparison of B-tree and B+ tree indexing methods

This table provides a general comparison. But specific advantages and disadvantages may depend on the particular use case and database system.

B-Tree

B+ Tree

Data storage

Both internal and leaf nodes store key-value pairs

Only leaf nodes store key- value pairs

Fanout

Lower fanout due to internal nodes storing data

Higher fanout due to internal nodes only storing keys

Range queries

Requires additional disk access to retrieve data from internal nodes

Can perform range queries more efficiently by only accessing leaf nodes

Insertion/deletion

Requires more overhead due to internal nodes storing data

Requires less overhead due to internal nodes only storing keys

Disk space usage

Uses more disk space due to internal nodes storing data

Uses less disk space due to internal nodes only storing keys

Implementation

Simpler to implement and manage

More complex to implement and manage

Query performance

Better for point queries and small datasets

Better for range queries and larger datasets

Summary

In this article, we have discussed an overview of multilevel indexes in Relational Database Management Systems (RDBMS). We explained how traditional indexes become inefficient as the database size grows, and multilevel indexes provide a solution to this problem by dividing the index into smaller pieces.

We have also covered different categories of indexing and file organization mechanisms. It then explains the need for multilevel indexes and introduces the concept of hierarchical structure of indexes. Discussed the two main types of multilevel indexes, i.e., B-tree and B+Tree indexes, their advantages, disadvantages, and a general comparison between the two.

Updated on: 17-May-2023

5K+ Views

Kickstart Your Career

Get certified by completing the course

Get Started
Advertisements