Multidimensional Binary Search Trees

Basic concept

The multidimensional binary search tree (abbreviated k-d tree) is a data structure for storing multikey records. It has been used to solve a number of "geometric" problems in statistics and data analysis.

A k-d tree (short for k-dimensional tree) is a space-partitioning data structure for organizing points in a k-dimensional space. k-d trees are useful in several applications, such as searches involving a multidimensional search key (e.g. range searches and nearest neighbor searches). They are a special case of binary space partitioning trees.

Informal description

The k-d tree is a binary tree in which every node is a k-dimensional point. Every non-leaf node can be thought of as implicitly generating a splitting hyperplane (commonly placed at a median point) that divides the space into two parts, known as half-spaces. Points to the left of this hyperplane are represented by the left subtree of that node, and points to the right of the hyperplane are represented by the right subtree. The hyperplane direction is chosen in the following way: every node in the tree is associated with one of the k dimensions, with the hyperplane perpendicular to that dimension's axis. So, for example, if for a particular split the "x" axis is selected, all points in the subtree with a smaller "x" value than the node will appear in the left subtree and all points with a larger "x" value will be in the right subtree. In such a case, the hyperplane would be set by the x-value of the point, and its normal would be the unit x-axis.

A popular practice is to sort a fixed number of randomly selected points and use the median of those points as the splitting plane.
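The splitting rule described above can be illustrated with a minimal Python sketch. The names `Node`, `build`, and `satisfies_split_rule` are hypothetical and chosen only for this example. The sketch assumes that the splitting axis cycles through the k dimensions with depth and that each split is made at the exact median found by sorting, which is one possible choice rather than the only one. Points whose coordinate equals the splitting value may end up on either side.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence, Tuple

Point = Tuple[float, ...]


@dataclass
class Node:
    point: Point                    # the k-dimensional point stored at this node
    axis: int                       # dimension the splitting hyperplane is perpendicular to
    left: Optional["Node"] = None   # points on the "smaller" side of the hyperplane
    right: Optional["Node"] = None  # points on the "larger" side of the hyperplane


def build(points: Sequence[Point], depth: int = 0) -> Optional[Node]:
    """Build a k-d tree by splitting at the median along a cycling axis."""
    if not points:
        return None
    k = len(points[0])
    axis = depth % k                # assumption: the axis cycles with depth
    ordered = sorted(points, key=lambda p: p[axis])
    mid = len(ordered) // 2         # index of the median point
    return Node(
        point=ordered[mid],
        axis=axis,
        left=build(ordered[:mid], depth + 1),
        right=build(ordered[mid + 1:], depth + 1),
    )


def points_in(node: Optional[Node]) -> List[Point]:
    """Collect every point stored in the subtree rooted at node."""
    if node is None:
        return []
    return [node.point] + points_in(node.left) + points_in(node.right)


def satisfies_split_rule(node: Optional[Node]) -> bool:
    """Check the left/right half-space rule at every node of the tree."""
    if node is None:
        return True
    split = node.point[node.axis]
    left_ok = all(p[node.axis] <= split for p in points_in(node.left))
    right_ok = all(p[node.axis] >= split for p in points_in(node.right))
    return (left_ok and right_ok
            and satisfies_split_rule(node.left)
            and satisfies_split_rule(node.right))


points = [(2, 3), (5, 4), (9, 6), (4, 7), (8, 1), (7, 2)]
root = build(points)  # the root splits on x; its children split on y
```

For these six sample points the root stores (7, 2), since that is the median along the x axis, and the two subtrees are then split along y. Sorting at every level keeps the sketch short at the cost of extra work; it is meant to show the invariant, not an efficient construction.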

Given a list of n points, the following algorithm uses a median-finding sort to build a balanced k-d tree containing those points.

function KDtree (list of points PointList, int Depth) {
   // Base case: an empty point list yields an empty subtree
   if PointList is empty return nil;
   // Choose axis based on Depth so that axis cycles through all valid values
   var int axis := Depth mod k;
   // Sort point list and select median as pivot element
   choose median by axis from PointList;
   // Create node1 and construct its subtrees
   var tree_node node1;
   node1.location := median;
   node1.leftChild := KDtree(points in PointList before median, Depth+1);
   node1.rightChild := KDtree(points in PointList after median, Depth+1);
   return node1;