
Optimal Cost Binary Search Trees

A Binary Search Tree (BST) is a tree in which the key values are stored in the internal nodes and the external nodes are null nodes. The keys are kept in sorted (for example, lexicographic) order: for each internal node, all the keys in the left sub-tree are less than the key in the node, and all the keys in the right sub-tree are greater.

When the search frequency of each key is known, the expected cost of a search in a given tree is easy to compute. An optimal binary search tree is a BST that has the minimum expected search cost among all BSTs built on the same keys.

The worst-case search time in an arbitrary BST is O(n), whereas in a balanced BST it is O(log n). An Optimal Cost Binary Search Tree improves the expected search time further by taking the access frequencies into account: frequently searched keys generally end up at or near the root, while rarely searched keys end up at or near the leaves.

The Optimal Binary Search Tree algorithm is presented here. We want to build a BST from a given sorted sequence of n distinct keys < k₁, k₂, k₃, ... k_n >. We assume that the probability of a search for key k_i is p_i. Because some searches may be for values that are not present in the key set K, n + 1 dummy keys (d₀, d₁, d₂, ... d_n) are added: d₀ stands for all values less than k₁, d_n for all values greater than k_n, and every other d_i for the values between k_i and k_{i+1}. For each dummy key d_i, the probability of a search is q_i.
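
The procedure below can be read as filling in tables for a recurrence. As a sketch in the usual dynamic-programming formulation (the wording and the summation index l are assumptions of this note, not taken from the text above): w(i, j) is the total probability of the keys k_i ... k_j together with their dummy keys, and e[i, j] is the expected search cost of an optimal sub-tree holding those keys.

```latex
\begin{align*}
w(i, j) &= \sum_{l=i}^{j} p_l + \sum_{l=i-1}^{j} q_l, \\
e[i, j] &=
\begin{cases}
q_{i-1} & \text{if } j = i - 1, \\
\min\limits_{i \le r \le j} \bigl( e[i, r-1] + e[r+1, j] + w(i, j) \bigr) & \text{if } i \le j.
\end{cases}
\end{align*}
```

The case j = i - 1 describes an empty key range that contains only the dummy key d_{i-1}; choosing r selects k_r as the root of the sub-tree.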

Optimal-Binary-Search-Tree(p, q, n) 
e[1…n + 1, 0…n],  
w[1…n + 1, 0…n], 
root[1…n + 1, 0…n]  
for i = 1 to n + 1 do 
   e[i, i - 1] := q_{i-1} 
   w[i, i - 1] := q_{i-1}  
for l = 1 to n do 
   for i = 1 to n - l + 1 do 
      j := i + l - 1 
      e[i, j] := ∞ 
      w[i, j] := w[i, j - 1] + p_j + q_j 
      for r = i to j do 
         t := e[i, r - 1] + e[r + 1, j] + w[i, j] 
         if t < e[i, j] 
            e[i, j] := t 
            root[i, j] := r 
return e and root
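
The pseudocode can be sketched in Python as follows. This is a minimal illustration, not a reference implementation: the function name `optimal_bst`, the unused padding entry `p[0]`, and the choice to pass the probabilities as integers in hundredths (so that ties between candidate roots are resolved exactly rather than by floating-point rounding) are all assumptions of this sketch. The sample values are the probabilities from the Example section of this chapter.

```python
import math

def optimal_bst(p, q, n):
    """p[1..n] are key probabilities (p[0] is unused padding);
    q[0..n] are dummy-key probabilities."""
    # Rows are indexed 0..n+1 and columns 0..n; row 0 is never used.
    e = [[0] * (n + 1) for _ in range(n + 2)]
    w = [[0] * (n + 1) for _ in range(n + 2)]
    root = [[0] * (n + 1) for _ in range(n + 2)]

    # Empty key ranges contain only the dummy key d_{i-1}.
    for i in range(1, n + 2):
        e[i][i - 1] = q[i - 1]
        w[i][i - 1] = q[i - 1]

    # Solve ranges of increasing length.
    for length in range(1, n + 1):
        for i in range(1, n - length + 2):
            j = i + length - 1
            e[i][j] = math.inf
            w[i][j] = w[i][j - 1] + p[j] + q[j]
            # Try every key k_r in the range as the sub-tree root.
            for r in range(i, j + 1):
                t = e[i][r - 1] + e[r + 1][j] + w[i][j]
                if t < e[i][j]:
                    e[i][j] = t
                    root[i][j] = r
    return e, root

# Probabilities from the Example section, scaled by 100 (assumption of this sketch).
p = [0, 15, 10, 5, 10, 20]   # p[0] is padding
q = [5, 10, 5, 5, 5, 10]
e, root = optimal_bst(p, q, 5)
print(e[1][5] / 100)   # 2.75
print(root[1][5])      # 2
```

Floating-point probabilities also work with this function, but two candidate roots with equal cost (here r = 2 and r = 4 for the full range) may then be ordered by rounding error.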

Analysis

The algorithm requires O(n³) time, since three nested for loops are used and each loop index takes on at most n values. The tables e, w and root require O(n²) space.
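
To make the cubic bound concrete, the following sketch counts how often the innermost loop body runs. The function name `innermost_iterations` is hypothetical; the loop bounds mirror the three nested loops of the algorithm, and the closed form n(n + 1)(n + 2)/6 is the sum of (n - l + 1) · l over l = 1 ... n.

```python
def innermost_iterations(n):
    """Count executions of the innermost loop body for n keys."""
    count = 0
    for length in range(1, n + 1):
        for i in range(1, n - length + 2):
            j = i + length - 1
            for r in range(i, j + 1):
                count += 1
    return count

print(innermost_iterations(5))  # 35

# The count matches n(n + 1)(n + 2) / 6, which grows as n^3.
for n in range(1, 30):
    assert innermost_iterations(n) == n * (n + 1) * (n + 2) // 6
```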

Example

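Consider a tree in which the nodes have the depths listed in the following table. A node at depth d that is accessed with probability p contributes (d + 1) × p to the expected search cost. The cost of this tree is 2.80, which is not an optimal result.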

Node	Depth	Probability	Contribution
k₁	1	0.15	0.30
k₂	0	0.10	0.10
k₃	2	0.05	0.15
k₄	1	0.10	0.20
k₅	2	0.20	0.60
d₀	2	0.05	0.15
d₁	2	0.10	0.30
d₂	3	0.05	0.20
d₃	3	0.05	0.20
d₄	3	0.05	0.20
d₅	3	0.10	0.40
Total			2.80
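
The total in the table can be checked with a few lines of Python. The dictionary `NODES` and the function `expected_cost` are names introduced for this sketch; the depths and probabilities are copied from the table, and each node contributes (depth + 1) × probability.

```python
import math

# (depth, probability) pairs copied from the table above.
NODES = {
    "k1": (1, 0.15), "k2": (0, 0.10), "k3": (2, 0.05),
    "k4": (1, 0.10), "k5": (2, 0.20),
    "d0": (2, 0.05), "d1": (2, 0.10), "d2": (3, 0.05),
    "d3": (3, 0.05), "d4": (3, 0.05), "d5": (3, 0.10),
}

def expected_cost(nodes):
    """Sum of (depth + 1) * probability over all real and dummy keys."""
    return sum((depth + 1) * prob for depth, prob in nodes.values())

cost = expected_cost(NODES)
# Compare with a tolerance because the probabilities are floats.
assert math.isclose(cost, 2.80, abs_tol=1e-9)
```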

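To get an optimal solution, the algorithm discussed in this chapter is applied to the same probabilities, which generates the following tables.

In the following tables, the column index is i and the row index is j. For example, e[1, 5] = 2.75 is the expected search cost of an optimal tree over all five keys, which is lower than the cost of 2.80 obtained above.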

e	1	2	3	4	5	6
5	2.75	2.00	1.30	0.90	0.50	0.10
4	1.75	1.20	0.60	0.30	0.05
3	1.25	0.70	0.25	0.05
2	0.90	0.40	0.05
1	0.45	0.10
0	0.05

w	1	2	3	4	5	6
5	1.00	0.80	0.60	0.50	0.35	0.10
4	0.70	0.50	0.30	0.20	0.05
3	0.55	0.35	0.15	0.05
2	0.45	0.25	0.05
1	0.30	0.10
0	0.05

root	1	2	3	4	5
5	2	4	5	5	5
4	2	2	4	4
3	2	2	3
2	1	2
1	1

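From these tables, the optimal tree can be formed. The entry root[1, 5] = 2 makes k₂ the root of the whole tree; root[1, 1] = 1 makes k₁ the root of its left sub-tree, and root[3, 5] = 5 makes k₅ the root of its right sub-tree. The remaining sub-trees are read off in the same way.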
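
Reading the tree off the root table can be sketched as a short recursion. The dictionary `ROOT` holds the entries of the root table keyed by (i, j), and `build_tree` is a hypothetical helper that returns nested tuples of the form (key, left sub-tree, right sub-tree); an empty key range i ... i - 1 becomes the dummy key d_{i-1}.

```python
# root[i, j] copied from the root table (column index i, row index j).
ROOT = {
    (1, 1): 1, (1, 2): 1, (1, 3): 2, (1, 4): 2, (1, 5): 2,
    (2, 2): 2, (2, 3): 2, (2, 4): 2, (2, 5): 4,
    (3, 3): 3, (3, 4): 4, (3, 5): 5,
    (4, 4): 4, (4, 5): 5,
    (5, 5): 5,
}

def build_tree(root, i, j):
    """Return the optimal sub-tree for keys k_i..k_j as nested tuples."""
    if j == i - 1:
        # Empty key range: only the dummy key d_{i-1} remains.
        return f"d{j}"
    r = root[(i, j)]
    left = build_tree(root, i, r - 1)
    right = build_tree(root, r + 1, j)
    return (f"k{r}", left, right)

tree = build_tree(ROOT, 1, 5)
print(tree)
# ('k2', ('k1', 'd0', 'd1'), ('k5', ('k4', ('k3', 'd2', 'd3'), 'd4'), 'd5'))
```

In the resulting tree, k₂ is the root, k₁ and k₅ are its children, k₄ is the left child of k₅, and k₃ is the left child of k₄.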