Hilbert Tree in Data Structure

The Hilbert R-tree, an R-tree variant, is an index for multidimensional objects such as lines, regions, 3-D objects, or high-dimensional feature-based parametric objects. It can be thought of as an extension of the B+-tree to multidimensional objects.

The performance of an R-tree depends on the quality of the algorithm that clusters the data rectangles on a node. Hilbert R-trees use space-filling curves, and specifically the Hilbert curve, to impose a linear ordering on the data rectangles.
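As an illustration of such a linear ordering, the sketch below maps a grid cell to its position along the Hilbert curve, using the well-known iterative bit-manipulation conversion. The function name hilbert_index, and the assumption that coordinates are integers on an n × n grid with n a power of two, are choices made for this example only.

```python
def hilbert_index(n, x, y):
    """Position of cell (x, y) along a Hilbert curve covering an n x n grid.

    Assumes n is a power of two and 0 <= x, y < n (constraints of this sketch).
    """
    d = 0
    s = n // 2
    while s > 0:
        rx = 1 if x & s else 0
        ry = 1 if y & s else 0
        # Each quadrant at this scale contributes s*s cells to the position.
        d += s * s * ((3 * rx) ^ ry)
        # Rotate/flip so the sub-quadrant is traversed in the right orientation.
        if ry == 0:
            if rx == 1:
                x, y = n - 1 - x, n - 1 - y
            x, y = y, x
        s //= 2
    return d


# Order the 16 cells of a 4 x 4 grid along the curve.
cells = sorted(((x, y) for x in range(4) for y in range(4)),
               key=lambda c: hilbert_index(4, *c))
print(cells[:4])  # [(0, 0), (1, 0), (1, 1), (0, 1)]
```

Consecutive cells in this ordering are always neighbors on the grid, which is why objects that are close on the curve tend to be close in space.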

Hilbert R-trees come in two types: one for static databases and one for dynamic databases. In both cases, the Hilbert space-filling curve is used to achieve a better ordering of the multidimensional objects in a node. This ordering has to be ‘good’ in the sense that it should group ‘similar’ data rectangles together, to minimize the area and perimeter of the resulting minimum bounding rectangles (MBRs). Packed Hilbert R-trees are suitable for static databases in which updates are very rare or in which there are no updates at all.
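To make the effect of grouping concrete, the following sketch compares two ways of splitting four rectangles into two nodes. The tuple layout (xlo, ylo, xhi, yhi), the helper names, and the sample coordinates are all assumptions made for this illustration.

```python
def mbr(rects):
    """Minimum bounding rectangle of a list of (xlo, ylo, xhi, yhi) tuples."""
    return (min(r[0] for r in rects), min(r[1] for r in rects),
            max(r[2] for r in rects), max(r[3] for r in rects))


def area(r):
    return (r[2] - r[0]) * (r[3] - r[1])


def perimeter(r):
    return 2 * ((r[2] - r[0]) + (r[3] - r[1]))


# Two unit squares near the origin and two far away (made-up sample data).
rects = [(0, 0, 1, 1), (1, 0, 2, 1), (10, 10, 11, 11), (11, 10, 12, 11)]

good = [rects[:2], rects[2:]]                        # neighbors share a node
bad = [[rects[0], rects[2]], [rects[1], rects[3]]]   # distant rectangles share a node

for name, grouping in (("good", good), ("bad", bad)):
    boxes = [mbr(g) for g in grouping]
    print(name, sum(area(b) for b in boxes), sum(perimeter(b) for b in boxes))
# good 4 12
# bad 242 88
```

Both groupings fill the nodes equally, but the one that keeps nearby rectangles together produces far smaller MBRs, so fewer nodes overlap a given query window.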

The basic idea

Although the following example is meant for a static environment, it illustrates the intuitive principles of good R-tree design. These principles are valid for both static and dynamic databases.

Roussopoulos and Leifker proposed a technique for constructing a packed R-tree that achieves almost 100% space utilization.

The idea is to sort the data on the x or y coordinate of one of the corners of the rectangles. Sorting on any of the four coordinates gives similar results. In this discussion, points or rectangles are sorted on the x coordinate of the lower left corner of the rectangle, which is referred to as a "lowx packed R-tree." The sorted list of rectangles is scanned; successive rectangles are assigned to the same R-tree leaf node until that node is full; a new leaf node is then created, and the scanning of the sorted list continues. Thus, the nodes of the resulting R-tree will be fully packed, with the possible exception of the last node at each level. As a result, space utilization is ≈100%. Higher levels of the tree are built in a similar way.
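The leaf-level packing described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name pack_lowx, the (xlo, ylo, xhi, yhi) tuple layout, and the sample data are assumptions, and only the leaf level is built.

```python
def mbr(rects):
    """Minimum bounding rectangle of a list of (xlo, ylo, xhi, yhi) tuples."""
    return (min(r[0] for r in rects), min(r[1] for r in rects),
            max(r[2] for r in rects), max(r[3] for r in rects))


def pack_lowx(rects, capacity):
    """Build fully packed leaf nodes by sorting on the low x coordinate.

    Returns a list of (leaf_mbr, leaf_rectangles) pairs; every leaf except
    possibly the last holds exactly `capacity` rectangles.
    """
    ordered = sorted(rects, key=lambda r: r[0])  # sort on lowx
    leaves = []
    for i in range(0, len(ordered), capacity):
        group = ordered[i:i + capacity]          # next `capacity` rectangles
        leaves.append((mbr(group), group))
    return leaves


# Made-up sample data: five rectangles, two per leaf.
sample = [(5, 0, 6, 1), (0, 0, 1, 1), (3, 2, 4, 3), (1, 5, 2, 6), (8, 8, 9, 9)]
for box, members in pack_lowx(sample, capacity=2):
    print(box, members)
```

Applying the same grouping step to the leaf MBRs, and then to each resulting level, would produce the upper levels of the tree.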

Algorithm Hilbert-Pack

(packing rectangles into an R-tree)

Step 1. Calculate the Hilbert value for each data rectangle.

Step 2. Sort the data rectangles on ascending Hilbert values.

Step 3. /* Creating leaf nodes (level l=0) */

  • While (there are more rectangles)
      • Generate a new R-tree node
      • Assign the next C rectangles to this node, where C is the node capacity

Step 4. /* Creating nodes at higher level (l + 1) */

  • While (there are > 1 nodes at level l)
      • Sort the nodes at level l ≥ 0 on ascending creation time
      • Repeat Step 3

The assumption here is that either the data are static or the frequency of modification is low. This is a simple heuristic for building an R-tree with ~100% space utilization which at the same time will have a good response time.
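Algorithm Hilbert-Pack above can be sketched in a few lines. This is an illustrative sketch under stated assumptions rather than a reference implementation: rectangles are (xlo, ylo, xhi, yhi) tuples with integer coordinates on an n × n grid (n a power of two), the Hilbert value of a rectangle is taken to be that of its center, and the names Node, hilbert_index, mbr_of, and hilbert_pack are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Node:
    level: int     # 0 for leaf nodes
    mbr: tuple     # (xlo, ylo, xhi, yhi)
    entries: list  # data rectangles in a leaf, child Nodes otherwise


def hilbert_index(n, x, y):
    """Position of cell (x, y) on a Hilbert curve over an n x n grid."""
    d, s = 0, n // 2
    while s > 0:
        rx = 1 if x & s else 0
        ry = 1 if y & s else 0
        d += s * s * ((3 * rx) ^ ry)
        if ry == 0:
            if rx == 1:
                x, y = n - 1 - x, n - 1 - y
            x, y = y, x
        s //= 2
    return d


def mbr_of(rects):
    return (min(r[0] for r in rects), min(r[1] for r in rects),
            max(r[2] for r in rects), max(r[3] for r in rects))


def hilbert_pack(rects, capacity, n):
    if capacity < 2 or not rects:
        raise ValueError("need capacity >= 2 and at least one rectangle")

    # Steps 1 and 2: sort on the Hilbert value of each rectangle's center
    # (using the center is an assumption of this sketch).
    def key(r):
        return hilbert_index(n, (r[0] + r[2]) // 2, (r[1] + r[3]) // 2)

    ordered = sorted(rects, key=key)

    # Step 3: create leaf nodes, C = capacity rectangles at a time.
    nodes = [Node(0, mbr_of(ordered[i:i + capacity]), ordered[i:i + capacity])
             for i in range(0, len(ordered), capacity)]

    # Step 4: nodes are already in creation order, so group them level by level.
    level = 0
    while len(nodes) > 1:
        level += 1
        groups = [nodes[i:i + capacity] for i in range(0, len(nodes), capacity)]
        nodes = [Node(level, mbr_of([c.mbr for c in g]), g) for g in groups]
    return nodes[0]


# Sixteen point-rectangles on a 4 x 4 grid, four entries per node.
points = [(x, y, x, y) for x in range(4) for y in range(4)]
root = hilbert_pack(points, capacity=4, n=4)
print([leaf.mbr for leaf in root.entries])
# [(0, 0, 1, 1), (0, 2, 1, 3), (2, 2, 3, 3), (2, 0, 3, 1)]
```

In this small example each leaf ends up covering one compact quadrant of the grid, which is the kind of clustering the Hilbert ordering is meant to produce.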