 
Optimal Binary Search Tree
In this article, we will discuss a classic Dynamic Programming problem that involves constructing an optimal binary search tree for a given set of keys with their search probabilities.
The discussion assumes basic familiarity with binary search trees and dynamic programming.
Optimal Binary Search Tree Problem
In this problem, you are given:
- A sorted array keys[] of size n containing the keys of the binary search tree.
- An array freq[] of size n, where freq[i] is how many times keys[i] is searched.
The task is to find the minimum cost of a binary search tree that can be constructed using the given keys and their frequencies. The cost of a binary search tree is the sum, over all keys, of each key's frequency multiplied by its level in the tree, where the root is at level 1. The tree must keep the BST property (an in-order traversal visits the keys in sorted order).
$$ \text{Cost}(\text{node}) = \text{freq}(\text{node}) \times \text{level}(\text{node}) $$
$$ \text{Total Search Cost} = \sum_{i=1}^{n} \text{freq}(i) \cdot \text{level}(i) $$
Where level(i) is the depth of the node with key keys[i] in the binary search tree, and freq(i) is the frequency of that key. Return the minimum value possible for the total search cost.
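As a minimal sketch of the formula above, the helper below computes the total search cost from the frequencies and the level of each key in some given tree. The name `totalSearchCost` and the use of `std::vector` are illustrative assumptions, not part of the listings in this article; the root is taken to be at level 1.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: total search cost = sum of freq[i] * level[i],
// where level[i] is the level of keys[i] in the tree (root = level 1).
long long totalSearchCost(const std::vector<int>& freq,
                          const std::vector<int>& level) {
   assert(freq.size() == level.size()); // one level per key
   long long cost = 0;
   for (std::size_t i = 0; i < freq.size(); i++) {
      cost += static_cast<long long>(freq[i]) * level[i];
   }
   return cost;
}
```

For example, with frequencies {34, 8, 50} and levels {1, 2, 3}, the function returns 34*1 + 8*2 + 50*3 = 200.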
Scenario
Input: Keys = {10, 12, 20}, Frequency = {34, 8, 50}
Output: Minimum cost: 142
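Explanation: Three keys can form five different binary search trees.
[Figure: the five possible binary search trees for keys 10, 12 and 20]
- For case 1 (10 at the root, 12 as its right child, 20 as the right child of 12), the cost is: (34*1) + (8*2) + (50*3) = 200
- For case 2 (12 at the root, 10 and 20 as its children), the cost is: (8*1) + (34*2) + (50*2) = 176
- Similarly, for case 5 (20 at the root, 10 as its left child, 12 as the right child of 10), the cost is: (50*1) + (34*2) + (8*3) = 142 (minimum)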
We have three algorithms to solve this problem. We will discuss each of them in detail.
- Brute Force: Using Recursion
- Dynamic Programming: Using Memoization
- Dynamic Programming: Using Tabulation
Recursive Algorithm to Find Optimal Binary Search Tree
In this approach, we will use recursion to explore all possible binary search trees that can be formed with the given keys and their frequencies. Here are the steps to implement the recursive algorithm:
- Try each key at position r, for r between i and j, as the root
- Recursively find the optimal cost of the left subtree (i to r-1)
- Recursively find the optimal cost of the right subtree (r+1 to j)
- Add the sum of frequencies for all keys in [i..j] (because when you go deeper, every node's level increases by 1, so we add the total frequency for the subproblem).
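The steps above can be summarized as a recurrence. Here cost(i, j) is an assumed name for the minimum cost of a tree built from the keys at positions i through j, and an empty range (i > j) costs 0:

$$ \text{cost}(i, j) = \min_{i \le r \le j} \Big[ \text{cost}(i, r-1) + \text{cost}(r+1, j) \Big] + \sum_{k=i}^{j} \text{freq}(k) $$

For frequencies {34, 8, 50}, this gives cost(0, 1) = min(0 + 8, 34 + 0) + 42 = 50, and choosing the last key as the root of the whole range gives cost(0, 1) + 0 + 92 = 142.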
Example
Following is the C++ implementation for finding the optimal binary search tree using recursion:
#include <iostream>
#include <algorithm> // std::min
#include <climits>   // INT_MAX
using namespace std;
int sum(int freq[], int i, int j) {
   int total = 0;
   for (int k = i; k <= j; k++) {
      total += freq[k];
   }
   return total;
}
int optimalBST(int keys[], int freq[], int i, int j) {
   if (i > j) return 0; // Base case: no keys in this range
   if (i == j) return freq[i]; // Base case: only one key
   int minCost = INT_MAX;
   // The range sum is the same for every root, so compute it once
   int rangeSum = sum(freq, i, j);
   // Try each key as root
   for (int r = i; r <= j; r++) {
      // Cost of left subtree
      int leftCost = optimalBST(keys, freq, i, r - 1);
      // Cost of right subtree
      int rightCost = optimalBST(keys, freq, r + 1, j);
      // Total cost with current root
      int totalCost = leftCost + rightCost + rangeSum;
      minCost = min(minCost, totalCost);
   }
   return minCost;
}
int main() {
   int keys[] = {10, 12, 20};
   int freq[] = {34, 8, 50};
   int n = sizeof(keys) / sizeof(keys[0]);
   int minCost = optimalBST(keys, freq, 0, n - 1);
   cout << "Minimum cost: " << minCost << endl;
   return 0;
}
The output of the above code will be:
Minimum cost: 142
DP Memoization to Find Optimal Binary Search Tree
The recursive solution has overlapping sub-problems. For example, with three keys the cost of subtree (1, 1) is calculated while evaluating subtree (0, 1) and again while evaluating subtree (1, 2). To avoid this repeated work, we can use DP memoization to store the results of previously calculated sub-problems.
- Use a 2D array, dp, to store the minimum cost for each range of keys.
- Check if the value for the current range is already calculated. If yes, return it.
- If not, calculate it using the same logic as in the recursive approach and store the result in dp.
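To make the overlap concrete, the sketch below counts how often the plain recursion visits each non-empty range. The function `countCalls` is a hypothetical instrumentation helper, not part of the article's listings; it ignores costs and reproduces only the call pattern of the recursive algorithm.

```cpp
#include <map>
#include <utility>

// Hypothetical instrumentation: counts how often the plain recursion
// visits each non-empty range (i, j) of key positions.
void countCalls(int i, int j, std::map<std::pair<int, int>, int>& calls) {
   if (i > j) return;               // empty range: nothing to count
   calls[{i, j}]++;                 // one more visit of range (i, j)
   if (i == j) return;              // single key: no further recursion
   for (int r = i; r <= j; r++) {
      countCalls(i, r - 1, calls);  // left subtree range
      countCalls(r + 1, j, calls);  // right subtree range
   }
}
```

Calling `countCalls(0, 2, calls)` for three keys records two visits each for the single-key ranges (0, 0), (1, 1) and (2, 2). Memoization turns each repeated visit of a multi-key range into a table lookup.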
Example
Here is the C++ code implementation for finding the optimal binary search tree using dynamic programming memoization:
#include <iostream>
#include <algorithm> // std::min
#include <climits>   // INT_MAX
#include <cstring>   // memset
using namespace std;
int sum(int freq[], int i, int j) {
   int total = 0;
   for (int k = i; k <= j; k++) {
      total += freq[k];
   }
   return total;
}
int optimalBST(int keys[], int freq[], int i, int j, int dp[][100]) {
   if (i > j) return 0; // Base case: no keys in this range
   if (i == j) return freq[i]; // Base case: only one key
   if (dp[i][j] != -1) return dp[i][j]; // Return cached result
   int minCost = INT_MAX;
   // The range sum is the same for every root, so compute it once
   int rangeSum = sum(freq, i, j);
   // Try each key as root
   for (int r = i; r <= j; r++) {
      // Cost of left subtree
      int leftCost = optimalBST(keys, freq, i, r - 1, dp);
      // Cost of right subtree
      int rightCost = optimalBST(keys, freq, r + 1, j, dp);
      // Total cost with current root
      int totalCost = leftCost + rightCost + rangeSum;
      minCost = min(minCost, totalCost);
   }
   dp[i][j] = minCost; // Cache the result
   return minCost;
}
int main() {
   int keys[] = {10, 12, 20};
   int freq[] = {34, 8, 50};
   int n = sizeof(keys) / sizeof(keys[0]);
   int dp[100][100];
   memset(dp, -1, sizeof(dp)); // Initialize dp array
   int minCost = optimalBST(keys, freq, 0, n - 1, dp);
   cout << "Minimum cost: " << minCost << endl;
   return 0;
}
The output of the above code will be:
Minimum cost: 142
DP Tabulation to Find Optimal Binary Search Tree
In this approach, we will use a bottom-up dynamic programming technique to fill the 2D table that stores the minimum cost for each range of keys. Because the table is filled iteratively, this approach avoids the recursion stack that memoization needs.
- Create a 2D array, dp of size n x n, where dp[i][j] will store the minimum cost for keys from i to j.
- Initialize the diagonal elements (when i == j) with freq[i].
- Iterate over lengths of subarrays from 2 to n.
- For each subarray, calculate the minimum cost by trying each key as the root and updating the DP table.
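The steps above can also be sketched with prefix sums, so that the sum of frequencies over any range is available in constant time instead of by looping over the range. This is an assumed variant and not the listing used in this article; the name `optimalBSTCost`, the `std::vector` containers and the `long long` result type are illustrative choices.

```cpp
#include <algorithm>
#include <climits>
#include <vector>

// Sketch: bottom-up optimal BST cost with prefix sums.
// prefix[k] holds freq[0] + ... + freq[k - 1], so the sum over
// positions [i..j] is prefix[j + 1] - prefix[i].
long long optimalBSTCost(const std::vector<int>& freq) {
   int n = static_cast<int>(freq.size());
   if (n == 0) return 0; // no keys, no cost
   std::vector<long long> prefix(n + 1, 0);
   for (int k = 0; k < n; k++) prefix[k + 1] = prefix[k] + freq[k];

   // dp[i][j] = minimum cost for keys i..j
   std::vector<std::vector<long long>> dp(n, std::vector<long long>(n, 0));
   for (int i = 0; i < n; i++) dp[i][i] = freq[i]; // single-key ranges

   for (int len = 2; len <= n; len++) {           // range length
      for (int i = 0; i + len - 1 < n; i++) {
         int j = i + len - 1;
         long long rangeSum = prefix[j + 1] - prefix[i];
         long long best = LLONG_MAX;
         for (int r = i; r <= j; r++) {           // try each key as root
            long long left = (r > i) ? dp[i][r - 1] : 0;
            long long right = (r < j) ? dp[r + 1][j] : 0;
            best = std::min(best, left + right);
         }
         dp[i][j] = best + rangeSum;
      }
   }
   return dp[0][n - 1];
}
```

With frequencies {34, 8, 50} the function returns 142, matching the scenario at the start of the article.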
Example
Here is the C++ implementation for finding the optimal binary search tree using dynamic programming tabulation:
#include <iostream>
#include <algorithm> // std::min
#include <climits>   // INT_MAX
using namespace std;
int sum(int freq[], int i, int j) {
   int total = 0;
   for (int k = i; k <= j; k++) {
      total += freq[k];
   }
   return total;
}
void optimalBST(int keys[], int freq[], int n) {
   int dp[100][100] = {0};
   // Initialize the diagonal elements
   for (int i = 0; i < n; i++) {
      dp[i][i] = freq[i];
   }
   // Fill the dp table
   for (int len = 2; len <= n; len++) { // length of subarray
      for (int i = 0; i <= n - len; i++) {
         int j = i + len - 1;
         dp[i][j] = INT_MAX;
         // The range sum is the same for every root, so compute it once
         int rangeSum = sum(freq, i, j);
         // Try each key as root
         for (int r = i; r <= j; r++) {
            int leftCost = (r > i) ? dp[i][r - 1] : 0;
            int rightCost = (r < j) ? dp[r + 1][j] : 0;
            int totalCost = leftCost + rightCost + rangeSum;
            dp[i][j] = min(dp[i][j], totalCost);
         }
      }
   }
   cout << "Minimum cost: " << dp[0][n - 1] << endl;
}
int main() {
   int keys[] = {10, 12};
   int freq[] = {34, 50};
   int n = sizeof(keys) / sizeof(keys[0]);
   optimalBST(keys, freq, n);
   return 0;
}
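Note that this example uses only two keys (10 and 12, with frequencies 34 and 50), so the result differs from the previous examples. The output of the above code will be: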
Minimum cost: 118
Time Complexity and Space Complexity
Here is a comparison of the time complexity and space complexity of the above-mentioned approaches.
| Approach | Time Complexity | Space Complexity |
|---|---|---|
| Brute Force (Recursion) | Exponential, about O(3^N) | O(N) |
| DP (Memoization) | O(N^3) | O(N^2) table + O(N) recursion stack |
| DP (Tabulation) | O(N^3) | O(N^2) |
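Note that the memoization technique needs recursion stack space on top of its table, which is O(N) in the worst case; hence, of the two DP approaches, tabulation uses slightly less memory, although both are O(N^2) asymptotically. The brute-force recursion uses only O(N) stack space but pays for it with exponential running time. The O(N^3) time bounds assume that the sum of frequencies for a range is computed once per range (or looked up from prefix sums); recomputing it for every candidate root adds a further factor of N.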
