Design and Analysis of Algorithms
Home
Basics of Algorithms
DAA - Introduction to Algorithms
DAA - Analysis of Algorithms
DAA - Methodology of Analysis
DAA - Asymptotic Notations & Apriori Analysis
DAA - Time Complexity
DAA - Master's Theorem
DAA - Space Complexities
Divide & Conquer
DAA - Divide & Conquer Algorithm
DAA - Max-Min Problem
DAA - Merge Sort Algorithm
DAA - Strassen's Matrix Multiplication
DAA - Karatsuba Algorithm
DAA - Towers of Hanoi
Greedy Algorithms
DAA - Greedy Algorithms
DAA - Travelling Salesman Problem
DAA - Prim's Minimal Spanning Tree
DAA - Kruskal's Minimal Spanning Tree
DAA - Dijkstra's Shortest Path Algorithm
DAA - Map Colouring Algorithm
DAA - Fractional Knapsack
DAA - Job Sequencing with Deadline
DAA - Optimal Merge Pattern
Dynamic Programming
DAA - Dynamic Programming
DAA - Matrix Chain Multiplication
DAA - Floyd Warshall Algorithm
DAA - 0-1 Knapsack Problem
DAA - Longest Common Subsequence Algorithm
DAA - Travelling Salesman Problem using Dynamic Programming
Randomized Algorithms
DAA - Randomized Algorithms
DAA - Randomized Quick Sort Algorithm
DAA - Karger's Minimum Cut Algorithm
DAA - Fisher-Yates Shuffle Algorithm
Approximation Algorithms
DAA - Approximation Algorithms
DAA - Vertex Cover Problem
DAA - Set Cover Problem
DAA - Travelling Salesperson Approximation Algorithm
Sorting Techniques
DAA - Bubble Sort Algorithm
DAA - Insertion Sort Algorithm
DAA - Selection Sort Algorithm
DAA - Shell Sort Algorithm
DAA - Heap Sort Algorithm
DAA - Bucket Sort Algorithm
DAA - Counting Sort Algorithm
DAA - Radix Sort Algorithm
DAA - Quick Sort Algorithm
Searching Techniques
DAA - Searching Techniques Introduction
DAA - Linear Search
DAA - Binary Search
DAA - Interpolation Search
DAA - Jump Search
DAA - Exponential Search
DAA - Fibonacci Search
DAA - Sublist Search
DAA - Hash Table
Graph Theory
DAA - Shortest Paths
DAA - Multistage Graph
DAA - Optimal Cost Binary Search Trees
Heap Algorithms
DAA - Binary Heap
DAA - Insert Method
DAA - Heapify Method
DAA - Extract Method
Complexity Theory
DAA - Deterministic vs. Nondeterministic Computations
DAA - Max Cliques
DAA - Vertex Cover
DAA - P and NP Class
DAA - Cook's Theorem
DAA - NP Hard & NP-Complete Classes
DAA - Hill Climbing Algorithm
DAA Useful Resources
DAA - Quick Guide
DAA - Useful Resources
DAA - Discussion

Design and Analysis Quick Guide

Introduction to Algorithms

An algorithm is a set of steps of operations to solve a problem performing calculation, data processing, and automated reasoning tasks. An algorithm is an efficient method that can be expressed within finite amount of time and space.

An algorithm is the best way to represent the solution of a particular problem in a very simple and efficient way. If we have an algorithm for a specific problem, then we can implement it in any programming language, meaning that the algorithm is independent from any programming languages.

Algorithm Design

The important aspects of algorithm design include creating an efficient algorithm to solve a problem in an efficient way using minimum time and space.

To solve a problem, different approaches can be followed. Some of them can be efficient with respect to time consumption, whereas other approaches may be memory efficient. However, one has to keep in mind that both time consumption and memory usage cannot be optimized simultaneously. If we require an algorithm to run in lesser time, we have to invest in more memory and if we require an algorithm to run with lesser memory, we need to have more time.

Problem Development Steps

The following steps are involved in solving computational problems.

Problem definition
Development of a model
Specification of an Algorithm
Designing an Algorithm
Checking the correctness of an Algorithm
Analysis of an Algorithm
Implementation of an Algorithm
Program testing
Documentation

How to Write an Algorithm?

There are no well-defined standards for writing algorithms. Rather, it is problem and resource dependent. Algorithms are never written to support a particular programming code.

As we know that all programming languages share basic code constructs like loops (do, for, while), flow-control (if-else), etc. These common constructs can be used to write an algorithm.

We write algorithms in a step-by-step manner, but it is not always the case. Algorithm writing is a process and is executed after the problem domain is well-defined. That is, we should know the problem domain, for which we are designing a solution.

Example

Let's try to learn algorithm-writing by using an example.

Problem − Design an algorithm to add two numbers and display the result.

Step 1 − START
Step 2 − declare three integers a, b & c
Step 3 − define values of a & b
Step 4 − add values of a & b
Step 5 − store output of step 4 to c
Step 6 − print c
Step 7 − STOP

Algorithms tell the programmers how to code the program. Alternatively, the algorithm can be written as −

Step 1 − START ADD
Step 2 − get values of a & b
Step 3 − c ← a + b
Step 4 − display c
Step 5 − STOP

In design and analysis of algorithms, usually the second method is used to describe an algorithm. It makes it easy for the analyst to analyze the algorithm ignoring all unwanted definitions. He can observe what operations are being used and how the process is flowing.

Characteristics of Algorithms

Not all procedures can be called an algorithm. An algorithm should have the following characteristics −

Unambiguous − Algorithm should be clear and unambiguous. Each of its steps (or phases), and their inputs/outputs should be clear and must lead to only one meaning.
Input − An algorithm should have 0 or more well-defined inputs.
Output − An algorithm should have 1 or more well-defined outputs, and should match the desired output.
Finiteness − Algorithms must terminate after a finite number of steps.
Feasibility − Should be feasible with the available resources.
Independent − An algorithm should have step-by-step directions, which should be independent of any programming code.

Pseudocode

Pseudocode gives a high-level description of an algorithm without the ambiguity associated with plain text but also without the need to know the syntax of a particular programming language.

The running time can be estimated in a more general manner by using Pseudocode to represent the algorithm as a set of fundamental operations which can then be counted.

Difference between Algorithm and Pseudocode

An algorithm is a formal definition with some specific characteristics that describes a process, which could be executed by a Turing-complete computer machine to perform a specific task. Generally, the word "algorithm" can be used to describe any high level task in computer science.

On the other hand, pseudocode is an informal and (often rudimentary) human readable description of an algorithm leaving many granular details of it. Writing a pseudocode has no restriction of styles and its only objective is to describe the high level steps of algorithm in a much realistic manner in natural language.

For example, following is an algorithm for Insertion Sort.

Algorithm: Insertion-Sort 
Input: A list L of integers of length n  
Output: A sorted list L1 containing those integers present in L 
Step 1: Keep a sorted list L1 which starts off empty  
Step 2: Perform Step 3 for each element in the original list L  
Step 3: Insert it into the correct position in the sorted list L1.  
Step 4: Return the sorted list 
Step 5: Stop

Here is a pseudocode which describes how the high level abstract process mentioned above in the algorithm Insertion-Sort could be described in a more realistic way.

for i <- 1 to length(A) 
   x <- A[i] 
   j <- i 
   while j > 0 and A[j-1] > x 
      A[j] <- A[j-1] 
      j <- j - 1 
   A[j] <- x

In this tutorial, algorithms will be presented in the form of pseudocode, that is similar in many respects to C, C++, Java, Python, and other programming languages.

Example

C C++ Java Python

#include <stdio.h>
void insertionSort(int arr[], int n) {
    int i, j, key;
    for (i = 1; i < n; i++) {
        key = arr[i];
        j = i - 1;
        // Move elements of arr[0..i-1] that are greater than key,
        // to one position ahead of their current position.
        while (j >= 0 && arr[j] > key) {
            arr[j + 1] = arr[j];
            j = j - 1;
        }
        arr[j + 1] = key; // Insert the current element (key) in the correct position.
    }
}
int main() {
    int arr[] = {6, 4, 26, 14, 33, 64, 46};
    int n = sizeof(arr) / sizeof(arr[0]);
    insertionSort(arr, n);
    printf("Sorted array: ");
    for (int i = 0; i < n; i++) {
        printf("%d ", arr[i]);
    }
    printf("\n");
    return 0;
}

Output

Sorted array: 4 6 14 26 33 46 64

#include <iostream>
using namespace std;
void insertionSort(int arr[], int n) {
    int i, j, key;
    for (i = 1; i < n; i++) {
        key = arr[i];
        j = i - 1;

        // Move elements of arr[0..i-1] that are greater than key,
        // to one position ahead of their current position.
        while (j >= 0 && arr[j] > key) {
            arr[j + 1] = arr[j];
            j = j - 1;
        }
        arr[j + 1] = key; // Insert the current element (key) in the correct position.
    }
}
int main() {
    int arr[] = {6, 4, 26, 14, 33, 64, 46};
    int n = sizeof(arr) / sizeof(arr[0]);
    insertionSort(arr, n);

    cout << "Sorted array: ";
    for (int i = 0; i < n; i++) {
        cout << arr[i] << " ";
    }
    cout << endl;
    return 0;
}

Output

Sorted array: 4 6 14 26 33 46 64

import java.util.Arrays;
public class InsertionSort {
    public static void insertionSort(int arr[]) {
        int n = arr.length;
        for (int i = 1; i < n; i++) {
            int key = arr[i];
            int j = i - 1;
            // Move elements of arr[0..i-1] that are greater than key,
            // to one position ahead of their current position.
            while (j >= 0 && arr[j] > key) {
                arr[j + 1] = arr[j];
                j = j - 1;
            }
            arr[j + 1] = key; // Insert the current element (key) in the correct position.
        }
    }
    public static void main(String[] args) {
        int[] arr = {64, 34, 25, 12, 22, 11, 90};
        insertionSort(arr);
        System.out.println("Sorted array: " + Arrays.toString(arr));
    }
}

Output

Sorted array: [11, 12, 22, 25, 34, 64, 90]

def insertion_sort(arr):
    for i in range(1, len(arr)):
        key = arr[i]
        j = i - 1

        # Move elements of arr[0..i-1] that are greater than key,
        # to one position ahead of their current position.
        while j >= 0 and arr[j] > key:
            arr[j + 1] = arr[j]
            j -= 1
        arr[j + 1] = key  # Insert the current element (key) in the correct position.

arr = [64, 34, 25, 12, 22, 11, 90]
insertion_sort(arr)
print("Sorted array:", arr)

Output

Sorted array: [11, 12, 22, 25, 34, 64, 90]

Analysis of Algorithms

In theoretical analysis of algorithms, it is common to estimate their complexity in the asymptotic sense, i.e., to estimate the complexity function for arbitrarily large input. The term "analysis of algorithms" was coined by Donald Knuth.

Algorithm analysis is an important part of computational complexity theory, which provides theoretical estimation for the required resources of an algorithm to solve a specific computational problem. Most algorithms are designed to work with inputs of arbitrary length. Analysis of algorithms is the determination of the amount of time and space resources required to execute it.

Usually, the efficiency or running time of an algorithm is stated as a function relating the input length to the number of steps, known as time complexity, or volume of memory, known as space complexity.

The Need for Analysis

In this chapter, we will discuss the need for analysis of algorithms and how to choose a better algorithm for a particular problem as one computational problem can be solved by different algorithms.

By considering an algorithm for a specific problem, we can begin to develop pattern recognition so that similar types of problems can be solved by the help of this algorithm.

Algorithms are often quite different from one another, though the objective of these algorithms are the same. For example, we know that a set of numbers can be sorted using different algorithms. Number of comparisons performed by one algorithm may vary with others for the same input. Hence, time complexity of those algorithms may differ. At the same time, we need to calculate the memory space required by each algorithm.

Analysis of algorithm is the process of analyzing the problem-solving capability of the algorithm in terms of the time and size required (the size of memory for storage while implementation). However, the main concern of analysis of algorithms is the required time or performance. Generally, we perform the following types of analysis −

Worst-case − The maximum number of steps taken on any instance of size a.
Best-case − The minimum number of steps taken on any instance of size a.
Average case − An average number of steps taken on any instance of size a.
Amortized − A sequence of operations applied to the input of size a averaged over time.

To solve a problem, we need to consider time as well as space complexity as the program may run on a system where memory is limited but adequate space is available or may be vice-versa. In this context, if we compare bubble sort and merge sort. Bubble sort does not require additional memory, but merge sort requires additional space. Though time complexity of bubble sort is higher compared to merge sort, we may need to apply bubble sort if the program needs to run in an environment, where memory is very limited.

Rate of Growth

Rate of growth is defined as the rate at which the running time of the algorithm is increased when the input size is increased.

The growth rate could be categorized into two types: linear and exponential. If the algorithm is increased in a linear way with an increasing in input size, it is linear growth rate. And if the running time of the algorithm is increased exponentially with the increase in input size, it is exponential growth rate.

Proving Correctness of an Algorithm

Once an algorithm is designed to solve a problem, it becomes very important that the algorithm always returns the desired output for every input given. So, there is a need to prove the correctness of an algorithm designed. This can be done using various methods −

Proof by Counterexample

Identify a case for which the algorithm might not be true and apply. If the counterexample works for the algorithm, then the correctness is proved. Otherwise, another algorithm that solves this counterexample must be designed.

Proof by Induction

Using mathematical induction, we can prove an algorithm is correct for all the inputs by proving it is correct for a base case input, say 1, and assume it is correct for another input k, and then prove it is true for k+1.

Proof by Loop Invariant

Find a loop invariant k, prove that the base case holds true for the loop invariant in the algorithm. Then apply mathematical induction to prove the rest of algorithm true.

Methodology of Analysis

To measure resource consumption of an algorithm, different strategies are used as discussed in this chapter.

Asymptotic Analysis

The asymptotic behavior of a function f(n) refers to the growth of f(n) as n gets large.

We typically ignore small values of n, since we are usually interested in estimating how slow the program will be on large inputs.

A good rule of thumb is that the slower the asymptotic growth rate, the better the algorithm. Though it’s not always true.

For example, a linear algorithm $f(n) = d * n + k$ is always asymptotically better than a quadratic one, $f(n) = c.n^2 + q$.

Solving Recurrence Equations

A recurrence is an equation or inequality that describes a function in terms of its value on smaller inputs. Recurrences are generally used in divide-and-conquer paradigm.

Let us consider T(n) to be the running time on a problem of size n.

If the problem size is small enough, say n < c where c is a constant, the straightforward solution takes constant time, which is written as θ(1). If the division of the problem yields a number of sub-problems with size $\frac{n}{b}$.

To solve the problem, the required time is a.T(n/b). If we consider the time required for division is D(n) and the time required for combining the results of sub-problems is C(n), the recurrence relation can be represented as −

$$T(n)=\begin{cases}\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\theta(1) & if\:n\leqslant c\\a T(\frac{n}{b})+D(n)+C(n) & otherwise\end{cases}$$

A recurrence relation can be solved using the following methods −

Substitution Method − In this method, we guess a bound and using mathematical induction we prove that our assumption was correct.
Recursion Tree Method − In this method, a recurrence tree is formed where each node represents the cost.
Master’s Theorem − This is another important technique to find the complexity of a recurrence relation.

Amortized Analysis

Amortized analysis is generally used for certain algorithms where a sequence of similar operations are performed.

Amortized analysis provides a bound on the actual cost of the entire sequence, instead of bounding the cost of sequence of operations separately.
Amortized analysis differs from average-case analysis; probability is not involved in amortized analysis. Amortized analysis guarantees the average performance of each operation in the worst case.

It is not just a tool for analysis, it’s a way of thinking about the design, since designing and analysis are closely related.

Aggregate Method

The aggregate method gives a global view of a problem. In this method, if n operations takes worst-case time T(n) in total. Then the amortized cost of each operation is T(n)/n. Though different operations may take different time, in this method varying cost is neglected.

Accounting Method

In this method, different charges are assigned to different operations according to their actual cost. If the amortized cost of an operation exceeds its actual cost, the difference is assigned to the object as credit. This credit helps to pay for later operations for which the amortized cost less than actual cost.

If the actual cost and the amortized cost of i^th operation are $c_{i}$ and $\hat{c_{l}}$, then

$$\displaystyle\sum\limits_{i=1}^n \hat{c_{l}}\geqslant\displaystyle\sum\limits_{i=1}^n c_{i}$$

Potential Method

This method represents the prepaid work as potential energy, instead of considering prepaid work as credit. This energy can be released to pay for future operations.

If we perform n operations starting with an initial data structure D₀. Let us consider, c_i as the actual cost and D_i as data structure of i^th operation. The potential function Ф maps to a real number Ф(D_i), the associated potential of D_i. The amortized cost $\hat{c_{l}}$ can be defined by

$$\hat{c_{l}}=c_{i}+\Phi (D_{i})-\Phi (D_{i-1})$$

Hence, the total amortized cost is

$$\displaystyle\sum\limits_{i=1}^n \hat{c_{l}}=\displaystyle\sum\limits_{i=1}^n (c_{i}+\Phi (D_{i})-\Phi (D_{i-1}))=\displaystyle\sum\limits_{i=1}^n c_{i}+\Phi (D_{n})-\Phi (D_{0})$$

Dynamic Table

If the allocated space for the table is not enough, we must copy the table into larger size table. Similarly, if large number of members are erased from the table, it is a good idea to reallocate the table with a smaller size.

Using amortized analysis, we can show that the amortized cost of insertion and deletion is constant and unused space in a dynamic table never exceeds a constant fraction of the total space.

In the next chapter of this tutorial, we will discuss Asymptotic Notations in brief.

Asymptotic Notations & Apriori Analysis

In designing of Algorithm, complexity analysis of an algorithm is an essential aspect. Mainly, algorithmic complexity is concerned about its performance, how fast or slow it works.

The complexity of an algorithm describes the efficiency of the algorithm in terms of the amount of the memory required to process the data and the processing time.

Complexity of an algorithm is analyzed in two perspectives: Time and Space.

Time Complexity

It’s a function describing the amount of time required to run an algorithm in terms of the size of the input. "Time" can mean the number of memory accesses performed, the number of comparisons between integers, the number of times some inner loop is executed, or some other natural unit related to the amount of real time the algorithm will take.

Space Complexity

It’s a function describing the amount of memory an algorithm takes in terms of the size of input to the algorithm. We often speak of "extra" memory needed, not counting the memory needed to store the input itself. Again, we use natural (but fixed-length) units to measure this.

Space complexity is sometimes ignored because the space used is minimal and/or obvious, however sometimes it becomes as important an issue as time.

Asymptotic Analysis

Asymptotic analysis of an algorithm refers to defining the mathematical foundation/framing of its run-time performance. Using asymptotic analysis, we can very well conclude the best case, average case, and worst case scenario of an algorithm.

Asymptotic analysis is input bound i.e., if there's no input to the algorithm, it is concluded to work in a constant time. Other than the "input" all other factors are considered constant.

Asymptotic analysis refers to computing the running time of any operation in mathematical units of computation. For example, the running time of one operation is computed as f(n) and may be for another operation it is computed as g(n²). This means the first operation running time will increase linearly with the increase in n and the running time of the second operation will increase exponentially when n increases. Similarly, the running time of both operations will be nearly the same if n is significantly small.

Usually, the time required by an algorithm falls under three types −

Best Case − Minimum time required for program execution.
Average Case − Average time required for program execution.
Worst Case − Maximum time required for program execution.

Asymptotic Notations

Execution time of an algorithm depends on the instruction set, processor speed, disk I/O speed, etc. Hence, we estimate the efficiency of an algorithm asymptotically.

Time function of an algorithm is represented by T(n), where n is the input size.

Different types of asymptotic notations are used to represent the complexity of an algorithm. Following asymptotic notations are used to calculate the running time complexity of an algorithm.

O − Big Oh
Ω − Big omega
θ − Big theta
o − Little Oh
ω − Little omega

O: Asymptotic Upper Bound

‘O’ (Big Oh) is the most commonly used notation. A function f(n) can be represented is the order of g(n) that is O(g(n)), if there exists a value of positive integer n as n₀ and a positive constant c such that −

$f(n)\leqslant c.g(n)$ for $n > n_{0}$ in all case

Hence, function g(n) is an upper bound for function f(n), as g(n) grows faster than f(n).

Example

Let us consider a given function, $f(n) = 4.n^3 + 10.n^2 + 5.n + 1$

Considering $g(n) = n^3$,

$f(n)\leqslant 5.g(n)$ for all the values of $n > 2$

Hence, the complexity of f(n) can be represented as $O(g(n))$, i.e. $O(n^3)$

Ω: Asymptotic Lower Bound

We say that $f(n) = \Omega (g(n))$ when there exists constant c that $f(n)\geqslant c.g(n)$ for all sufficiently large value of n. Here n is a positive integer. It means function g is a lower bound for function f; after a certain value of n, f will never go below g.

Example

Let us consider a given function, $f(n) = 4.n^3 + 10.n^2 + 5.n + 1$.

Considering $g(n) = n^3$, $f(n)\geqslant 4.g(n)$ for all the values of $n > 0$.

Hence, the complexity of f(n) can be represented as $\Omega (g(n))$, i.e. $\Omega (n^3)$

θ: Asymptotic Tight Bound

We say that $f(n) = \theta(g(n))$ when there exist constants c₁ and c₂ that $c_{1}.g(n) \leqslant f(n) \leqslant c_{2}.g(n)$ for all sufficiently large value of n. Here n is a positive integer.

This means function g is a tight bound for function f.

Example

Let us consider a given function, $f(n) = 4.n^3 + 10.n^2 + 5.n + 1$

Considering $g(n) = n^3$, $4.g(n) \leqslant f(n) \leqslant 5.g(n)$ for all the large values of n.

Hence, the complexity of f(n) can be represented as $\theta (g(n))$, i.e. $\theta (n^3)$.

O - Notation

The asymptotic upper bound provided by O-notation may or may not be asymptotically tight. The bound $2.n^2 = O(n^2)$ is asymptotically tight, but the bound $2.n = O(n^2)$ is not.

We use o-notation to denote an upper bound that is not asymptotically tight.

We formally define o(g(n)) (little-oh of g of n) as the set f(n) = o(g(n)) for any positive constant $c > 0$ and there exists a value $n_{0} > 0$, such that $0 \leqslant f(n) \leqslant c.g(n)$.

Intuitively, in the o-notation, the function f(n) becomes insignificant relative to g(n) as n approaches infinity; that is,

$$\lim_{n \rightarrow \infty}\left(\frac{f(n)}{g(n)}\right) = 0$$

Example

Let us consider the same function, $f(n) = 4.n^3 + 10.n^2 + 5.n + 1$

Considering $g(n) = n^{4}$,

$$\lim_{n \rightarrow \infty}\left(\frac{4.n^3 + 10.n^2 + 5.n + 1}{n^4}\right) = 0$$

Hence, the complexity of f(n) can be represented as $o(g(n))$, i.e. $o(n^4)$.

ω – Notation

We use ω-notation to denote a lower bound that is not asymptotically tight. Formally, however, we define ω(g(n)) (little-omega of g of n) as the set f(n) = ω(g(n)) for any positive constant C > 0 and there exists a value $n_{0} > 0$, such that $0 \leqslant c.g(n) < f(n)$.

For example, $\frac{n^2}{2} = \omega (n)$, but $\frac{n^2}{2} \neq \omega (n^2)$. The relation $f(n) = \omega (g(n))$ implies that the following limit exists

$$\lim_{n \rightarrow \infty}\left(\frac{f(n)}{g(n)}\right) = \infty$$

That is, f(n) becomes arbitrarily large relative to g(n) as n approaches infinity.

Example

Let us consider same function, $f(n) = 4.n^3 + 10.n^2 + 5.n + 1$

Considering $g(n) = n^2$,

$$\lim_{n \rightarrow \infty}\left(\frac{4.n^3 + 10.n^2 + 5.n + 1}{n^2}\right) = \infty$$

Hence, the complexity of f(n) can be represented as $o(g(n))$, i.e. $\omega (n^2)$.

Apriori and Apostiari Analysis

Apriori analysis means, analysis is performed prior to running it on a specific system. This analysis is a stage where a function is defined using some theoretical model. Hence, we determine the time and space complexity of an algorithm by just looking at the algorithm rather than running it on a particular system with a different memory, processor, and compiler.

Apostiari analysis of an algorithm means we perform analysis of an algorithm only after running it on a system. It directly depends on the system and changes from system to system.

In an industry, we cannot perform Apostiari analysis as the software is generally made for an anonymous user, which runs it on a system different from those present in the industry.

In Apriori, it is the reason that we use asymptotic notations to determine time and space complexity as they change from computer to computer; however, asymptotically they are the same.

Time Complexity

In this chapter, let us discuss the time complexity of algorithms and the factors that influence it.

Time complexity of an algorithm, in general, is simply defined as the time taken by an algorithm to implement each statement in the code. It is not the execution time of an algorithm. This entity can be influenced by various factors like the input size, the methods used and the procedure. An algorithm is said to be the most efficient when the output is produced in the minimal time possible.

The most common way to find the time complexity for an algorithm is to deduce the algorithm into a recurrence relation. Let us look into it further below.

Solving Recurrence Relations

A recurrence relation is an equation (or an inequality) that is defined by the smaller inputs of itself. These relations are solved based on Mathematical Induction. In both of these processes, a condition allows the problem to be broken into smaller pieces that execute the same equation with lower valued inputs.

These recurrence relations can be solved using multiple methods; they are −

Substitution Method
Recurrence Tree Method
Iteration Method
Master Theorem

Substitution Method

The substitution method is a trial and error method; where the values that we might think could be the solution to the relation are substituted and check whether the equation is valid. If it is valid, the solution is found. Otherwise, another value is checked.

Procedure

The steps to solve recurrences using the substitution method are −

Guess the form of solution based on the trial and error method
Use Mathematical Induction to prove the solution is correct for all the cases.

Example

Let us look into an example to solve a recurrence using the substitution method,

T(n) = 2T(n/2) + n

Here, we assume that the time complexity for the equation is O(nlogn). So according the mathematical induction phenomenon, the time complexity for T(n/2) will be O(n/2logn/2); substitute the value into the given equation, and we need to prove that T(n) must be greater than or equal to nlogn.

≤ 2n/2Log(n/2) + n
= nLogn – nLog2 + n
= nLogn – n + n
≤ nLogn

Recurrence Tree Method

In the recurrence tree method, we draw a recurrence tree until the program cannot be divided into smaller parts further. Then we calculate the time taken in each level of the recurrence tree.

Procedure

Draw the recurrence tree for the program
Calculate the time complexity in every level and sum them up to find the total time complexity.

Example

Consider the binary search algorithm and construct a recursion tree for it −

Since the algorithm follows divide and conquer technique, the recursion tree is drawn until it reaches the smallest input level $\mathrm{T\left ( \frac{n}{2^{k}} \right )}$.

$$\mathrm{T\left ( \frac{n}{2^{k}} \right )=T\left ( 1 \right )}$$

$$\mathrm{n=2^{k}}$$

Applying logarithm on both sides of the equation,

$$\mathrm{log\: n=log\: 2^{k}}$$

$$\mathrm{k=log_{2}\:n}$$

Therefore, the time complexity of a binary search algorithm is O(log n).

Master’s Method

Master’s method or Master’s theorem is applied on decreasing or dividing recurrence relations to find the time complexity. It uses a set of formulae to deduce the time complexity of an algorithm.

To learn more about Master’s theorem, please click here

Master’s Theorem

Master’s theorem is one of the many methods that are applied to calculate time complexities of algorithms. In analysis, time complexities are calculated to find out the best optimal logic of an algorithm. Master’s theorem is applied on recurrence relations.

But before we get deep into the master’s theorem, let us first revise what recurrence relations are −

Recurrence relations are equations that define the sequence of elements in which a term is a function of its preceding term. In algorithm analysis, the recurrence relations are usually formed when loops are present in an algorithm.

Problem Statement

Master’s theorem can only be applied on decreasing and dividing recurring functions. If the relation is not decreasing or dividing, master’s theorem must not be applied.

Master’s Theorem for Dividing Functions

Consider a relation of type −

T(n) = aT(n/b) + f(n)

where, a >= 1 and b > 1,

n − size of the problem

a − number of sub-problems in the recursion

n/b − size of the sub problems based on the assumption that all sub-problems are of the same size.

f(n) − represents the cost of work done outside the recursion -> Θ(nk logn p) ,where k >= 0 and p is a real number;

If the recurrence relation is in the above given form, then there are three cases in the master theorem to determine the asymptotic notations −

If a > b^k , then T(n)= Θ (n^{logb a} ) [ log_b a = log a / log b. ]
If a = b^k
- If p > -1, then T(n) = Θ (n^{logb a} log^p+1 n)
- If p = -1, then T(n) = Θ (n ^{logb a} log log n)
- If p < -1, then T(n) = Θ (n ^{logb a})
If a < b^k,
- If p >= 0, then T(n) = Θ (n^k log^p n).
- If p < 0, then T(n) = Θ (n^k)

Master’s Theorem for Decreasing Functions

Consider a relation of type −

T(n) = aT(n-b) + f(n)
where, a >= 1 and b > 1, f(n) is asymptotically positive

Here,

n − size of the problem

a − number of sub-problems in the recursion

n-b − size of the sub problems based on the assumption that all sub-problems are of the same size.

f(n) − represents the cost of work done outside the recursion -> Θ(n^k), where k >= 0.

If the recurrence relation is in the above given form, then there are three cases in the master theorem to determine the asymptotic notations −

if a = 1, T(n) = O (n^k+1)
if a > 1, T(n) = O (a^n/b * n^k)
if a < 1, T(n) = O (n^k)

Examples

Few examples to apply master’s theorem on dividing recurrence relations −

Example 1

Consider a recurrence relation given as T(n) = 8T(n/2) + n²

In this problem, a = 8, b = 2 and f(n) = Θ(n^k log_n p) = n², giving us k = 2 and p = 0.
a = 8 > b^k = 22 = 4,
Hence, case 1 must be applied for this equation.
To calculate, T(n) = Θ (n^{logb a} )
   = n^log₂⁸
   = n^{( log 8 / log 2 )}
   = n³
Therefore, T(n) = Θ(n³) is the tight bound for this equation.

Example 2

Consider a recurrence relation given as T(n) = 4T(n/2) + n²

In this problem, a = 4, b = 2 and f(n) = Θ(n^k log_n p) = n², giving us k = 2 and p = 0.
a = 4 = b^k = 2² = 4, p > -1
Hence, case 2(i) must be applied for this equation.
To calculate, T(n) = Θ (n^{logb a} log^p+1 n)
   = n^log₂⁴ log⁰⁺¹n
   = n²logn
Therefore, T(n) = Θ(n²logn) is the tight bound for this equation.

Example 3

Consider a recurrence relation given as T(n) = 2T(n/2) + n/log n

In this problem, a = 2, b = 2 and f(n) = Θ(n^k log_n p) = n/log n, giving us k = 1 and p = -1.
a = 2 = b^k = 2¹ = 2, p = -1
Hence, case 2(ii) must be applied for this equation.
To calculate, T(n) = Θ (n ^{logb a} log log n)
   = n^log₄⁴ log logn
   = n.log(logn)
Therefore, T(n) = Θ(n.log(logn)) is the tight bound for this equation.

Example 4

Consider a recurrence relation given as T(n) = 16T(n/4) + n²/log²n

In this problem, a = 16, b = 4 and f(n) = Θ(n^k log_n p) = n²/log²n, giving us k = 2 and p = -2.
a = 16 = b^k = 4² = 16, p < -1
Hence, case 2(iii) must be applied for this equation.
To calculate, T(n) = Θ (n ^{logb a})
   = n^log₄¹⁶
   = n²
Therefore, T(n) = Θ(n²) is the tight bound for this equation.

Example 5

Consider a recurrence relation given as T(n) = 2T(n/2) + n²

In this problem, a = 2, b = 2 and f(n) = Θ(n^k log_n p) = n², giving us k = 2 and p = 0.
a = 2 < b^k = 2² = 4, p = 0
Hence, case 3(i) must be applied for this equation.
To calculate, T(n) = Θ (n^k log^p n)
   = n² log⁰n
   = n²
Therefore, T(n) = Θ(n²) is the tight bound for this equation.

Example 6

Consider a recurrence relation given as T(n) = 2T(n/2) + n³/log n

In this problem, a = 2, b = 2 and f(n) = Θ(n^k log_n p) = n³/log n, giving us k = 3 and p = -1.
a = 2 < b^k = 2³ = 8, p < 0
Hence, case 3(ii) must be applied for this equation.
To calculate, T(n) = Θ (n^k)
   = n³
   = n³
Therefore, T(n) = Θ(n³) is the tight bound for this equation.

Few examples to apply master’s theorem in decreasing recurrence relations −

Example 1

Consider a recurrence relation given as T(n) = T(n-1) + n²

In this problem, a = 1, b = 1 and f(n) = O(n^k) = n², giving us k = 2.
Since a = 1, case 1 must be applied for this equation.
To calculate, T(n) = O(n^k+1)
   = n²⁺¹
   = n³
Therefore, T(n) = O(n³) is the tight bound for this equation.

Example 2

Consider a recurrence relation given as T(n) = 2T(n-1) + n

In this problem, a = 2, b = 1 and f(n) = O(n^k) = n, giving us k = 1.
Since a > 1, case 2 must be applied for this equation.
To calculate, T(n) = O(a^n/b * n^k)
   = O(2^n/1 * n¹)
   = O(n2ⁿ)
Therefore, T(n) = O(n2ⁿ) is the tight bound for this equation.

Example 3

Consider a recurrence relation given as T(n) = n⁴

In this problem, a = 0 and f(n) = O(n^k) = n⁴, giving us k = 4
Since a < 1, case 3 must be applied for this equation.
To calculate, T(n) = O(n^k)
   = O(n⁴)
   = O(n⁴)
Therefore, T(n) = O(n⁴) is the tight bound for this equation.

Space Complexities

In this chapter, we will discuss the complexity of computational problems with respect to the amount of space an algorithm requires.

Space complexity shares many of the features of time complexity and serves as a further way of classifying problems according to their computational difficulties.

What is Space Complexity?

Space complexity is a function describing the amount of memory (space) an algorithm takes in terms of the amount of input to the algorithm.

We often speak of extra memory needed, not counting the memory needed to store the input itself. Again, we use natural (but fixed-length) units to measure this.

We can use bytes, but it's easier to use, say, the number of integers used, the number of fixed-sized structures, etc.

In the end, the function we come up with will be independent of the actual number of bytes needed to represent the unit.

Space complexity is sometimes ignored because the space used is minimal and/or obvious, however sometimes it becomes as important issue as time complexity

Definition

Let M be a deterministic Turing machine (TM) that halts on all inputs. The space complexity of M is the function $f \colon N \rightarrow N$, where f(n) is the maximum number of cells of tape and M scans any input of length M. If the space complexity of M is f(n), we can say that M runs in space f(n).

We estimate the space complexity of Turing machine by using asymptotic notation.

Let $f \colon N \rightarrow R^+$ be a function. The space complexity classes can be defined as follows −

SPACE = {L | L is a language decided by an O(f(n)) space deterministic TM}

SPACE = {L | L is a language decided by an O(f(n)) space non-deterministic TM}

PSPACE is the class of languages that are decidable in polynomial space on a deterministic Turing machine.

In other words, PSPACE = U_k SPACE (n^k)

Savitch’s Theorem

One of the earliest theorem related to space complexity is Savitch’s theorem. According to this theorem, a deterministic machine can simulate non-deterministic machines by using a small amount of space.

For time complexity, such a simulation seems to require an exponential increase in time. For space complexity, this theorem shows that any non-deterministic Turing machine that uses f(n) space can be converted to a deterministic TM that uses f²(n) space.

Hence, Savitch’s theorem states that, for any function, $f \colon N \rightarrow R^+$, where $f(n) \geqslant n$

NSPACE(f(n)) ⊆ SPACE(f(n))

Relationship Among Complexity Classes

The following diagram depicts the relationship among different complexity classes.

Till now, we have not discussed P and NP classes in this tutorial. These will be discussed later.

Divide & Conquer Algorithm

Using divide and conquer approach, the problem in hand, is divided into smaller sub-problems and then each problem is solved independently. When we keep dividing the sub-problems into even smaller sub-problems, we may eventually reach a stage where no more division is possible. Those smallest possible sub-problems are solved using original solution because it takes lesser time to compute. The solution of all sub-problems is finally merged in order to obtain the solution of the original problem.

Broadly, we can understand divide-and-conquer approach in a three-step process.

Divide/Break

This step involves breaking the problem into smaller sub-problems. Sub-problems should represent a part of the original problem. This step generally takes a recursive approach to divide the problem until no sub-problem is further divisible. At this stage, sub-problems become atomic in size but still represent some part of the actual problem.

Conquer/Solve

This step receives a lot of smaller sub-problems to be solved. Generally, at this level, the problems are considered 'solved' on their own.

Merge/Combine

When the smaller sub-problems are solved, this stage recursively combines them until they formulate a solution of the original problem. This algorithmic approach works recursively and conquer & merge steps works so close that they appear as one.

Arrays as Input

There are various ways in which various algorithms can take input such that they can be solved using the divide and conquer technique. Arrays are one of them. In algorithms that require input to be in the form of a list, like various sorting algorithms, array data structures are most commonly used.

In the input for a sorting algorithm below, the array input is divided into subproblems until they cannot be divided further.

Then, the subproblems are sorted (the conquer step) and are merged to form the solution of the original array back (the combine step).

Since arrays are indexed and linear data structures, sorting algorithms most popularly use array data structures to receive input.

Linked Lists as Input

Another data structure that can be used to take input for divide and conquer algorithms is a linked list (for example, merge sort using linked lists). Like arrays, linked lists are also linear data structures that store data sequentially.

Consider the merge sort algorithm on linked list; following the very popular tortoise and hare algorithm, the list is divided until it cannot be divided further.

Then, the nodes in the list are sorted (conquered). These nodes are then combined (or merged) in recursively until the final solution is achieved.

Various searching algorithms can also be performed on the linked list data structures with a slightly different technique as linked lists are not indexed linear data structures. They must be handled using the pointers available in the nodes of the list.

Pros and cons of Divide and Conquer Approach

Divide and conquer approach supports parallelism as sub-problems are independent. Hence, an algorithm, which is designed using this technique, can run on the multiprocessor system or in different machines simultaneously.

In this approach, most of the algorithms are designed using recursion, hence memory management is very high. For recursive function stack is used, where function state needs to be stored.

Examples of Divide and Conquer Approach

The following computer algorithms are based on divide-and-conquer programming approach −

Merge Sort
Quick Sort
Binary Search
Strassen's Matrix Multiplication
Closest pair (points)
Karatsuba

There are various ways available to solve any computer problem, but the mentioned are a good example of divide and conquer approach.

Max-Min Problem

Let us consider a simple problem that can be solved by divide and conquer technique.

Problem Statement

The Max-Min Problem in algorithm analysis is finding the maximum and minimum value in an array.

Solution

To find the maximum and minimum numbers in a given array numbers[] of size n, the following algorithm can be used. First we are representing the naive method and then we will present divide and conquer approach.

Naïve Method

Naïve method is a basic method to solve any problem. In this method, the maximum and minimum number can be found separately. To find the maximum and minimum numbers, the following straightforward algorithm can be used.

Algorithm: Max-Min-Element (numbers[]) 
max := numbers[1] 
min := numbers[1] 

for i = 2 to n do 
   if numbers[i] > max then  
      max := numbers[i] 
   if numbers[i] < min then  
      min := numbers[i] 
return (max, min)

Example

C C++ Java Python

#include <stdio.h>
struct Pair {
    int max;
    int min;
};
// Function to find maximum and minimum using the naive algorithm
struct Pair maxMinNaive(int arr[], int n) {
    struct Pair result;
    result.max = arr[0];
    result.min = arr[0];
    // Loop through the array to find the maximum and minimum values
    for (int i = 1; i < n; i++) {
        if (arr[i] > result.max) {
            result.max = arr[i]; // Update the maximum value if a larger element is found
        }
        if (arr[i] < result.min) {
            result.min = arr[i]; // Update the minimum value if a smaller element is found
        }
    }
    return result; // Return the pair of maximum and minimum values
}
int main() {
    int arr[] = {6, 4, 26, 14, 33, 64, 46};
    int n = sizeof(arr) / sizeof(arr[0]);
    struct Pair result = maxMinNaive(arr, n);
    printf("Maximum element is: %d\n", result.max);
    printf("Minimum element is: %d\n", result.min);
    return 0;
}

Output

Maximum element is: 64
Minimum element is: 4

#include <iostream>
using namespace std;
struct Pair {
    int max;
    int min;
};
// Function to find maximum and minimum using the naive algorithm
Pair maxMinNaive(int arr[], int n) {
    Pair result;
    result.max = arr[0];
    result.min = arr[0];
    // Loop through the array to find the maximum and minimum values
    for (int i = 1; i < n; i++) {
        if (arr[i] > result.max) {
            result.max = arr[i]; // Update the maximum value if a larger element is found
        }
        if (arr[i] < result.min) {
            result.min = arr[i]; // Update the minimum value if a smaller element is found
        }
    }
    return result; // Return the pair of maximum and minimum values
}
int main() {
    int arr[] = {6, 4, 26, 14, 33, 64, 46};
    int n = sizeof(arr) / sizeof(arr[0]);
    Pair result = maxMinNaive(arr, n);
    cout << "Maximum element is: " << result.max << endl;
    cout << "Minimum element is: " << result.min << endl;
    return 0;
}

Output

Maximum element is: 64
Minimum element is: 4

public class MaxMinNaive {
    static class Pair {
        int max;
        int min;
    }
    // Function to find maximum and minimum using the naive algorithm
    static Pair maxMinNaive(int[] arr) {
        Pair result = new Pair();
        result.max = arr[0];
        result.min = arr[0];
        // Loop through the array to find the maximum and minimum values
        for (int i = 1; i < arr.length; i++) {
            if (arr[i] > result.max) {
                result.max = arr[i]; // Update the maximum value if a larger element is found
            }
            if (arr[i] < result.min) {
                result.min = arr[i]; // Update the minimum value if a smaller element is found
            }
        }
        return result; // Return the pair of maximum and minimum values
    }
    public static void main(String[] args) {
        int[] arr = {6, 4, 26, 14, 33, 64, 46};
        Pair result = maxMinNaive(arr);
        System.out.println("Maximum element is: " + result.max);
        System.out.println("Minimum element is: " + result.min);
    }
}

Output

Maximum element is: 64
Minimum element is: 4

def max_min_naive(arr):
    max_val = arr[0]
    min_val = arr[0]
    # Loop through the array to find the maximum and minimum values
    for i in range(1, len(arr)):
        if arr[i] > max_val:
            max_val = arr[i]  # Update the maximum value if a larger element is found
        if arr[i] < min_val:
            min_val = arr[i]  # Update the minimum value if a smaller element is found
    return max_val, min_val  # Return the pair of maximum and minimum values
arr = [6, 4, 26, 14, 33, 64, 46]
max_val, min_val = max_min_naive(arr)
print("Maximum element is:", max_val)
print("Minimum element is:", min_val)

Output

Maximum element is: 64
Minimum element is: 4

Analysis

The number of comparison in Naive method is 2n - 2.

The number of comparisons can be reduced using the divide and conquer approach. Following is the technique.

Divide and Conquer Approach

In this approach, the array is divided into two halves. Then using recursive approach maximum and minimum numbers in each halves are found. Later, return the maximum of two maxima of each half and the minimum of two minima of each half.

In this given problem, the number of elements in an array is $y - x + 1$, where y is greater than or equal to x.

$\mathbf{\mathit{Max - Min(x, y)}}$ will return the maximum and minimum values of an array $\mathbf{\mathit{numbers[x...y]}}$.

Algorithm: Max - Min(x, y) 
if y – x ≤ 1 then  
   return (max(numbers[x], numbers[y]), min((numbers[x], numbers[y])) 
else 
   (max1, min1):= maxmin(x, ⌊((x + y)/2)⌋) 
   (max2, min2):= maxmin(⌊((x + y)/2) + 1)⌋,y) 
return (max(max1, max2), min(min1, min2))

Example

C C++ Java Python

#include <stdio.h>
// Structure to store both maximum and minimum elements
struct Pair {
    int max;
    int min;
};
struct Pair maxMinDivideConquer(int arr[], int low, int high) {
    struct Pair result;
    struct Pair left;
    struct Pair right;
    int mid;
    // If only one element in the array
    if (low == high) {
        result.max = arr[low];
        result.min = arr[low];
        return result;
    }
    // If there are two elements in the array
    if (high == low + 1) {
        if (arr[low] < arr[high]) {
            result.min = arr[low];
            result.max = arr[high];
        } else {
            result.min = arr[high];
            result.max = arr[low];
        }
        return result;
    }
    // If there are more than two elements in the array
    mid = (low + high) / 2;
    left = maxMinDivideConquer(arr, low, mid);
    right = maxMinDivideConquer(arr, mid + 1, high);
    // Compare and get the maximum of both parts
    result.max = (left.max > right.max) ? left.max : right.max;
    // Compare and get the minimum of both parts
    result.min = (left.min < right.min) ? left.min : right.min;
    return result;
}
int main() {
    int arr[] = {6, 4, 26, 14, 33, 64, 46};
    int n = sizeof(arr) / sizeof(arr[0]);
    struct Pair result = maxMinDivideConquer(arr, 0, n - 1);
    printf("Maximum element is: %d\n", result.max);
    printf("Minimum element is: %d\n", result.min);
    return 0;
}

Output

Maximum element is: 64
Minimum element is: 4

#include <iostream>
using namespace std;
// Structure to store both maximum and minimum elements
struct Pair {
    int max;
    int min;
};
Pair maxMinDivideConquer(int arr[], int low, int high) {
    Pair result, left, right;
    int mid;
    // If only one element in the array
    if (low == high) {
        result.max = arr[low];
        result.min = arr[low];
        return result;
    }
    // If there are two elements in the array
    if (high == low + 1) {
        if (arr[low] < arr[high]) {
            result.min = arr[low];
            result.max = arr[high];
        } else {
            result.min = arr[high];
            result.max = arr[low];
        }
        return result;
    }
    // If there are more than two elements in the array
    mid = (low + high) / 2;
    left = maxMinDivideConquer(arr, low, mid);
    right = maxMinDivideConquer(arr, mid + 1, high);
    // Compare and get the maximum of both parts
    result.max = (left.max > right.max) ? left.max : right.max;
    // Compare and get the minimum of both parts
    result.min = (left.min < right.min) ? left.min : right.min;
    return result;
}
int main() {
    int arr[] = {6, 4, 26, 14, 33, 64, 46};
    int n = sizeof(arr) / sizeof(arr[0]);
    Pair result = maxMinDivideConquer(arr, 0, n - 1);
    cout << "Maximum element is: " << result.max << endl;
    cout << "Minimum element is: " << result.min << endl;
    return 0;
}

Output

Maximum element is: 64
Minimum element is: 4

public class MaxMinDivideConquer {
    // Class to store both maximum and minimum elements
    static class Pair {
        int max;
        int min;
    }
    static Pair maxMinDivideConquer(int[] arr, int low, int high) {
        Pair result = new Pair();
        Pair left, right;
        int mid;
        // If only one element in the array
        if (low == high) {
            result.max = arr[low];
            result.min = arr[low];
            return result;
        }
        // If there are two elements in the array
        if (high == low + 1) {
            if (arr[low] < arr[high]) {
                result.min = arr[low];
                result.max = arr[high];
            } else {
                result.min = arr[high];
                result.max = arr[low];
            }
            return result;
        }
        // If there are more than two elements in the array
        mid = (low + high) / 2;
        left = maxMinDivideConquer(arr, low, mid);
        right = maxMinDivideConquer(arr, mid + 1, high);
        // Compare and get the maximum of both parts
        result.max = Math.max(left.max, right.max);
        // Compare and get the minimum of both parts
        result.min = Math.min(left.min, right.min);
        return result;
    }
    public static void main(String[] args) {
        int[] arr = {6, 4, 26, 14, 33, 64, 46};
        Pair result = maxMinDivideConquer(arr, 0, arr.length - 1);
        System.out.println("Maximum element is: " + result.max);
        System.out.println("Minimum element is: " + result.min);
    }
}

Output

Maximum element is: 64
Minimum element is: 4

def max_min_divide_conquer(arr, low, high):
    # Structure to store both maximum and minimum elements
    class Pair:
        def __init__(self):
            self.max = 0
            self.min = 0
    result = Pair()
    # If only one element in the array
    if low == high:
        result.max = arr[low]
        result.min = arr[low]
        return result
    # If there are two elements in the array
    if high == low + 1:
        if arr[low] < arr[high]:
            result.min = arr[low]
            result.max = arr[high]
        else:
            result.min = arr[high]
            result.max = arr[low]
        return result
    # If there are more than two elements in the array
    mid = (low + high) // 2
    left = max_min_divide_conquer(arr, low, mid)
    right = max_min_divide_conquer(arr, mid + 1, high)
    # Compare and get the maximum of both parts
    result.max = max(left.max, right.max)
    # Compare and get the minimum of both parts
    result.min = min(left.min, right.min)
    return result
arr = [6, 4, 26, 14, 33, 64, 46]
result = max_min_divide_conquer(arr, 0, len(arr) - 1)
print("Maximum element is:", result.max)
print("Minimum element is:", result.min)

Output

Maximum element is: 64
Minimum element is: 4

Analysis

Let T(n) be the number of comparisons made by $\mathbf{\mathit{Max - Min(x, y)}}$, where the number of elements $n = y - x + 1$.

If T(n) represents the numbers, then the recurrence relation can be represented as

$$T(n) = \begin{cases}T\left(\lfloor\frac{n}{2}\rfloor\right)+T\left(\lceil\frac{n}{2}\rceil\right)+2 & for\: n>2\\1 & for\:n = 2 \\0 & for\:n = 1\end{cases}$$

Let us assume that n is in the form of power of 2. Hence, n = 2^k where k is height of the recursion tree.

So,

$$T(n) = 2.T (\frac{n}{2}) + 2 = 2.\left(\begin{array}{c}2.T(\frac{n}{4}) + 2\end{array}\right) + 2 ..... = \frac{3n}{2} - 2$$

Compared to Naïve method, in divide and conquer approach, the number of comparisons is less. However, using the asymptotic notation both of the approaches are represented by O(n).

Merge Sort Algorithm

Merge sort is a sorting technique based on divide and conquer technique. With worst-case time complexity being Ο(n log n), it is one of the most used and approached algorithms.

Merge sort first divides the array into equal halves and then combines them in a sorted manner.

How Merge Sort Works?

To understand merge sort, we take an unsorted array as the following −

We know that merge sort first divides the whole array iteratively into equal halves unless the atomic values are achieved. We see here that an array of 8 items is divided into two arrays of size 4.

This does not change the sequence of appearance of items in the original. Now we divide these two arrays into halves.

We further divide these arrays and we achieve atomic value which can no more be divided.

Now, we combine them in exactly the same manner as they were broken down. Please note the color codes given to these lists.

We first compare the element for each list and then combine them into another list in a sorted manner. We see that 14 and 33 are in sorted positions. We compare 27 and 10 and in the target list of 2 values we put 10 first, followed by 27. We change the order of 19 and 35 whereas 42 and 44 are placed sequentially.

In the next iteration of the combining phase, we compare lists of two data values, and merge them into a list of found data values placing all in a sorted order.

After the final merging, the list becomes sorted and is considered the final solution.

Merge Sort Algorithm

Merge sort keeps on dividing the list into equal halves until it can no more be divided. By definition, if it is only one element in the list, it is considered sorted. Then, merge sort combines the smaller sorted lists keeping the new list sorted too.

Step 1 − if it is only one element in the list, consider it already sorted, so return.

Step 2 − divide the list recursively into two halves until it can no more be divided.

Step 3 − merge the smaller lists into new list in sorted order.

Pseudocode

We shall now see the pseudocodes for merge sort functions. As our algorithms point out two main functions − divide & merge.

Merge sort works with recursion and we shall see our implementation in the same way.

procedure mergesort( var a as array )
   if ( n == 1 ) return a
      var l1 as array = a[0] ... a[n/2]
      var l2 as array = a[n/2+1] ... a[n]
      l1 = mergesort( l1 )
      l2 = mergesort( l2 )
      return merge( l1, l2 )
end procedure
procedure merge( var a as array, var b as array )
   var c as array
   while ( a and b have elements )
      if ( a[0] > b[0] )
         add b[0] to the end of c
         remove b[0] from b
      else
         add a[0] to the end of c
         remove a[0] from a
      end if
   end while
   while ( a has elements )
      add a[0] to the end of c
      remove a[0] from a
   end while
   while ( b has elements )
      add b[0] to the end of c
      remove b[0] from b
   end while
   return c
end procedure

Example

In the following example, we have shown Merge-Sort algorithm step by step. First, every iteration array is divided into two sub-arrays, until the sub-array contains only one element. When these sub-arrays cannot be divided further, then merge operations are performed.

Analysis

Let us consider, the running time of Merge-Sort as T(n). Hence,

$$\mathrm{T\left ( n \right )=\left\{\begin{matrix} c & if\, n\leq 1 \\ 2\, xT\left ( \frac{n}{2} \right )+dxn &otherwise \\ \end{matrix}\right.}\:where\: c\: and\: d\: are\: constants$$

Therefore, using this recurrence relation,

$$T\left ( n \right )=2^{i}\, T\left ( n/2^{i} \right )+i\cdot d\cdot n$$

$$As,\:\: i=log\: n,\: T\left ( n \right )=2^{log\, n}T\left ( n/2^{log\, n} \right )+log\, n\cdot d\cdot n$$

$$=c\cdot n+d\cdot n\cdot log\: n$$

$$Therefore,\: \: T\left ( n \right ) = O(n\: log\: n ).$$

Example

Following are the implementations of this operation in various programming languages −

C C++ Java Python

#include <stdio.h>
#define max 10
int a[11] = { 10, 14, 19, 26, 27, 31, 33, 35, 42, 44, 0 };
int b[10];
void merging(int low, int mid, int high){
   int l1, l2, i;
   for(l1 = low, l2 = mid + 1, i = low; l1 <= mid && l2 <= high; i++) {
      if(a[l1] <= a[l2])
         b[i] = a[l1++];
      else
         b[i] = a[l2++];
   }
   while(l1 <= mid)
      b[i++] = a[l1++];
   while(l2 <= high)
      b[i++] = a[l2++];
   for(i = low; i <= high; i++)
      a[i] = b[i];
}
void sort(int low, int high){
   int mid;
   if(low < high) {
      mid = (low + high) / 2;
      sort(low, mid);
      sort(mid+1, high);
      merging(low, mid, high);
   } else {
      return;
   }
}
int main(){
   int i;
   printf("Array before sorting\n");
   for(i = 0; i <= max; i++)
      printf("%d ", a[i]);
   sort(0, max);
   printf("\nArray after sorting\n");
   for(i = 0; i <= max; i++)
      printf("%d ", a[i]);
}

Output

Array before sorting
10 14 19 26 27 31 33 35 42 44 0 
Array after sorting
0 10 14 19 26 27 31 33 35 42 44

#include <iostream>
using namespace std;
#define max 10
int a[11] = { 10, 14, 19, 26, 27, 31, 33, 35, 42, 44, 0 };
int b[10];
void merging(int low, int mid, int high){
   int l1, l2, i;
   for(l1 = low, l2 = mid + 1, i = low; l1 <= mid && l2 <= high; i++) {
      if(a[l1] <= a[l2])
         b[i] = a[l1++];
      else
         b[i] = a[l2++];
   }
   while(l1 <= mid)
      b[i++] = a[l1++];
   while(l2 <= high)
      b[i++] = a[l2++];
   for(i = low; i <= high; i++)
      a[i] = b[i];
}
void sort(int low, int high){
   int mid;
   if(low < high) {
      mid = (low + high) / 2;
      sort(low, mid);
      sort(mid+1, high);
      merging(low, mid, high);
   } else {
      return;
   }
}
int main(){
   int i;
   cout << "Array before sorting\n";
   for(i = 0; i <= max; i++)
      cout<<a[i]<<" ";
   sort(0, max);
   cout<< "\nArray after sorting\n";
   for(i = 0; i <= max; i++)
      cout<<a[i]<<" ";
}

Output

Array before sorting
10 14 19 26 27 31 33 35 42 44 0 
Array after sorting
0 10 14 19 26 27 31 33 35 42 44

public class Merge_Sort {
   static int a[] = { 10, 14, 19, 26, 27, 31, 33, 35, 42, 44, 0 };
   static int b[] = new int[a.length];
   static void merging(int low, int mid, int high) {
      int l1, l2, i;
      for(l1 = low, l2 = mid + 1, i = low; l1 <= mid && l2 <= high; i++) {
         if(a[l1] <= a[l2])
            b[i] = a[l1++];
         else
            b[i] = a[l2++];
      }
      while(l1 <= mid)
         b[i++] = a[l1++];
      while(l2 <= high)
         b[i++] = a[l2++];
      for(i = low; i <= high; i++)
         a[i] = b[i];
   }
   static void sort(int low, int high) {
      int mid;
      if(low < high) {
         mid = (low + high) / 2;
         sort(low, mid);
         sort(mid+1, high);
         merging(low, mid, high);
      } else {
         return;
      }
   }
   public static void main(String args[]) {
      int i;
      int n = a.length;
      System.out.println("Array before sorting");
      for(i = 0; i < n; i++)
         System.out.print(a[i] + " ");
      sort(0, n-1);
      System.out.println("\nArray after sorting");
      for(i = 0; i < n; i++)
         System.out.print(a[i]+" ");
   }
}

Output

Array before sorting
10 14 19 26 27 31 33 35 42 44 0 
Array after sorting
0 10 14 19 26 27 31 33 35 42 44

def merge_sort(a, n):
   if n > 1:
      m = n // 2
      #divide the list in two sub lists
      l1 = a[:m]
      n1 = len(l1)
      l2 = a[m:]
      n2 = len(l2)
      #recursively calling the function for sub lists
      merge_sort(l1, n1)
      merge_sort(l2, n2)
      i = j = k = 0
      while i < n1 and j < n2:
         if l1[i] <= l2[j]:
            a[k] = l1[i]
            i = i + 1
         else:
            a[k] = l2[j]
            j = j + 1
         k = k + 1
      while i < n1:
         a[k] = l1[i]
         i = i + 1
         k = k + 1    
      while j < n2:
         a[k]=l2[j]
         j = j + 1
         k = k + 1

a = [10, 14, 19, 26, 27, 31, 33, 35, 42, 44, 0]
n = len(a)
print("Array before Sorting")
print(a)
merge_sort(a, n)
print("Array after Sorting")
print(a)

Output

Array before Sorting
[10, 14, 19, 26, 27, 31, 33, 35, 42, 44, 0]
Array after Sorting
[0, 10, 14, 19, 26, 27, 31, 33, 35, 42, 44]

Strassen’s Matrix Multiplication

Strassen’s Matrix Multiplication is the divide and conquer approach to solve the matrix multiplication problems. The usual matrix multiplication method multiplies each row with each column to achieve the product matrix. The time complexity taken by this approach is O(n³), since it takes two loops to multiply. Strassen’s method was introduced to reduce the time complexity from O(n³) to O(n^{log 7}).

Naïve Method

First, we will discuss naïve method and its complexity. Here, we are calculating Z=𝑿X × Y. Using Naïve method, two matrices (X and Y) can be multiplied if the order of these matrices are p × q and q × r and the resultant matrix will be of order p × r. The following pseudocode describes the naïve multiplication −

Algorithm: Matrix-Multiplication (X, Y, Z) 
for i = 1 to p do 
   for j = 1 to r do 
      Z[i,j] := 0 
      for k = 1 to q do 
         Z[i,j] := Z[i,j] + X[i,k] × Y[k,j]

Complexity

Here, we assume that integer operations take O(1) time. There are three for loops in this algorithm and one is nested in other. Hence, the algorithm takes O(n³) time to execute.

Strassen’s Matrix Multiplication Algorithm

In this context, using Strassen’s Matrix multiplication algorithm, the time consumption can be improved a little bit.

Strassen’s Matrix multiplication can be performed only on square matrices where n is a power of 2. Order of both of the matrices are n × n.

Divide X, Y and Z into four (n/2)×(n/2) matrices as represented below −

$Z = \begin{bmatrix}I & J \\K & L \end{bmatrix}$ $X = \begin{bmatrix}A & B \\C & D \end{bmatrix}$ and $Y = \begin{bmatrix}E & F \\G & H \end{bmatrix}$

Using Strassen’s Algorithm compute the following −

$$M_{1} \: \colon= (A+C) \times (E+F)$$

$$M_{2} \: \colon= (B+D) \times (G+H)$$

$$M_{3} \: \colon= (A-D) \times (E+H)$$

$$M_{4} \: \colon= A \times (F-H)$$

$$M_{5} \: \colon= (C+D) \times (E)$$

$$M_{6} \: \colon= (A+B) \times (H)$$

$$M_{7} \: \colon= D \times (G-E)$$

Then,

$$I \: \colon= M_{2} + M_{3} - M_{6} - M_{7}$$

$$J \: \colon= M_{4} + M_{6}$$

$$K \: \colon= M_{5} + M_{7}$$

$$L \: \colon= M_{1} - M_{3} - M_{4} - M_{5}$$

Analysis

$$T(n)=\begin{cases}c & if\:n= 1\\7\:x\:T(\frac{n}{2})+d\:x\:n^2 & otherwise\end{cases} \:where\: c\: and \:d\:are\: constants$$

Using this recurrence relation, we get $T(n) = O(n^{log7})$

Hence, the complexity of Strassen’s matrix multiplication algorithm is $O(n^{log7})$.

Example

Let us look at the implementation of Strassen's Matrix Multiplication in various programming languages: C, C++, Java, Python.

C C++ Java Python

#include<stdio.h>
int main(){
   int z[2][2];
   int i, j;
   int m1, m2, m3, m4 , m5, m6, m7;
   int x[2][2] = {
       {12, 34}, 
       {22, 10}
       };
   int y[2][2] = {
       {3, 4}, 
       {2, 1}
   };
   printf("\nThe first matrix is\n");
   for(i = 0; i < 2; i++) {
      printf("\n");
      for(j = 0; j < 2; j++)
         printf("%d\t", x[i][j]);
   }
   printf("\nThe second matrix is\n");
   for(i = 0; i < 2; i++) {
      printf("\n");
      for(j = 0; j < 2; j++)
         printf("%d\t", y[i][j]);
   }
   m1= (x[0][0] + x[1][1]) * (y[0][0] + y[1][1]);
   m2= (x[1][0] + x[1][1]) * y[0][0];
   m3= x[0][0] * (y[0][1] - y[1][1]);
   m4= x[1][1] * (y[1][0] - y[0][0]);
   m5= (x[0][0] + x[0][1]) * y[1][1];
   m6= (x[1][0] - x[0][0]) * (y[0][0]+y[0][1]);
   m7= (x[0][1] - x[1][1]) * (y[1][0]+y[1][1]);
   z[0][0] = m1 + m4- m5 + m7;
   z[0][1] = m3 + m5;
   z[1][0] = m2 + m4;
   z[1][1] = m1 - m2 + m3 + m6;
   printf("\nProduct achieved using Strassen's algorithm \n");
   for(i = 0; i < 2 ; i++) {
      printf("\n");
      for(j = 0; j < 2; j++)
         printf("%d\t", z[i][j]);
   }
   return 0;
}

Output

The first matrix is

12	34	
22	10	
The second matrix is

3	4	
2	1	
Product achieved using Strassen's algorithm 

104	82	
86	98

#include<iostream>
using namespace std;
int main() {
   int z[2][2];
   int i, j;
   int m1, m2, m3, m4 , m5, m6, m7;
      int x[2][2] = {
         {12, 34}, 
         {22, 10}
      };
   int y[2][2] = {
      {3, 4}, 
      {2, 1}
   };
   cout<<"\nThe first matrix is\n";
   for(i = 0; i < 2; i++) {
      cout<<endl;
      for(j = 0; j < 2; j++)
         cout<<x[i][j]<<" ";
   }
   cout<<"\nThe second matrix is\n";
   for(i = 0;i < 2; i++){
      cout<<endl;
      for(j = 0;j < 2; j++)
         cout<<y[i][j]<<" ";
   }

   m1 = (x[0][0] + x[1][1]) * (y[0][0] + y[1][1]);
   m2 = (x[1][0] + x[1][1]) * y[0][0];
   m3 = x[0][0] * (y[0][1] - y[1][1]);
   m4 = x[1][1] * (y[1][0] - y[0][0]);
   m5 = (x[0][0] + x[0][1]) * y[1][1];
   m6 = (x[1][0] - x[0][0]) * (y[0][0]+y[0][1]);
   m7 = (x[0][1] - x[1][1]) * (y[1][0]+y[1][1]);

   z[0][0] = m1 + m4- m5 + m7;
   z[0][1] = m3 + m5;
   z[1][0] = m2 + m4;
   z[1][1] = m1 - m2 + m3 + m6;

   cout<<"\nProduct achieved using Strassen's algorithm \n";
   for(i = 0; i < 2 ; i++) {
      cout<<endl;
      for(j = 0; j < 2; j++)
         cout<<z[i][j]<<" ";
   }
   return 0;
}

Output

The first matrix is

12 34 
22 10 
The second matrix is

3 4 
2 1 
Product achieved using Strassen's algorithm 

104 82 
86 98

public class Strassens {
   public static void main(String[] args) {
      int[][] x = {{12, 34}, {22, 10}};
      int[][] y = {{3, 4}, {2, 1}};
      int z[][] = new int[2][2];
      int m1, m2, m3, m4 , m5, m6, m7;
      System.out.println("The first matrix is: ");
      for(int i = 0; i<2; i++) {
         System.out.println();//new line
         for(int j = 0; j<2; j++) {
            System.out.print(x[i][j] + "\t");
         }
      }
      System.out.println("\nThe second matrix is: ");
      for(int i = 0; i<2; i++) {
         System.out.println();//new line
         for(int j = 0; j<2; j++) {
            System.out.print(y[i][j] + "\t");
         }
      }
      m1 = (x[0][0] + x[1][1]) * (y[0][0] + y[1][1]);
      m2 = (x[1][0] + x[1][1]) * y[0][0];
      m3 = x[0][0] * (y[0][1] - y[1][1]);
      m4 = x[1][1] * (y[1][0] - y[0][0]);
      m5 = (x[0][0] + x[0][1]) * y[1][1];
      m6 = (x[1][0] - x[0][0]) * (y[0][0]+y[0][1]);
      m7 = (x[0][1] - x[1][1]) * (y[1][0]+y[1][1]);
      z[0][0] = m1 + m4- m5 + m7;
      z[0][1] = m3 + m5;
      z[1][0] = m2 + m4;
      z[1][1] = m1 - m2 + m3 + m6;
      System.out.println("\nProduct achieved using Strassen's algorithm: ");
      for(int i = 0; i<2; i++) {
         System.out.println();//new line
         for(int j = 0; j<2; j++) {
            System.out.print(z[i][j] + "\t");
         }
      }
   }
}

Output

The first matrix is: 

12	34	
22	10	
The second matrix is: 

3	4	
2	1	
Product achieved using Strassen's algorithm: 

104	82	
86	98

import numpy as np
x = np.array([[12, 34], [22, 10]])
y = np.array([[3, 4], [2, 1]])
z = np.zeros((2, 2))
m1, m2, m3, m4, m5, m6, m7 = 0, 0, 0, 0, 0, 0, 0
print("The first matrix is: ")
for i in range(2):
    print()
    for j in range(2):
        print(x[i][j], end="\t")
print("\nThe second matrix is: ")
for i in range(2):
    print()
    for j in range(2):
        print(y[i][j], end="\t")
m1 = (x[0][0] + x[1][1]) * (y[0][0] + y[1][1])
m2 = (x[1][0] + x[1][1]) * y[0][0]
m3 = x[0][0] * (y[0][1] - y[1][1])
m4 = x[1][1] * (y[1][0] - y[0][0])
m5 = (x[0][0] + x[0][1]) * y[1][1]
m6 = (x[1][0] - x[0][0]) * (y[0][0] + y[0][1])
m7 = (x[0][1] - x[1][1]) * (y[1][0] + y[1][1])

z[0][0] = m1 + m4 - m5 + m7
z[0][1] = m3 + m5
z[1][0] = m2 + m4
z[1][1] = m1 - m2 + m3 + m6

print("\nProduct achieved using Strassen's algorithm: ")
for i in range(2):
    print()
    for j in range(2):
        print(z[i][j], end="\t")

Output

The first matrix is: 

12	34	
22	10	
The second matrix is: 

3	4	
2	1	
Product achieved using Strassen's algorithm: 

104.0	82.0	
86.0	98.0

Karatsuba Algorithm

The Karatsuba algorithm is used by the system to perform fast multiplication on two n-digit numbers, i.e. the system compiler takes lesser time to compute the product than the time-taken by a normal multiplication.

The usual multiplication approach takes n² computations to achieve the final product, since the multiplication has to be performed between all digit combinations in both the numbers and then the sub-products are added to obtain the final product. This approach of multiplication is known as Naïve Multiplication.

To understand this multiplication better, let us consider two 4-digit integers: 1456 and 6533, and find the product using naïve approach.

So, 1456 × 6533 =?

In this method of naïve multiplication, given the number of digits in both numbers is 4, there are 16 single-digit × single-digit multiplications being performed. Thus, the time complexity of this approach is O(4²) since it takes 4² steps to calculate the final product.

But when the value of n keeps increasing, the time complexity of the problem also keeps increasing. Hence, Karatsuba algorithm is adopted to perform faster multiplications.

Karatsuba Algorithm

The main idea of the Karatsuba Algorithm is to reduce multiplication of multiple sub problems to multiplication of three sub problems. Arithmetic operations like additions and subtractions are performed for other computations.

For this algorithm, two n-digit numbers are taken as the input and the product of the two number is obtained as the output.

Step 1 − In this algorithm we assume that n is a power of 2.

Step 2 − If n = 1 then we use multiplication tables to find P = XY.

Step 3 − If n > 1, the n-digit numbers are split in half and represent the number using the formulae −

X = 10^n/2X₁ + X₂
Y = 10^n/2Y₁ + Y₂

where, X₁, X₂, Y₁, Y₂ each have n/2 digits.

Step 4 − Take a variable Z = W – (U + V),

where,

U = X₁Y₁, V = X₂Y₂
W = (X₁ + X₂) (Y₁ + Y₂), Z = X₁Y₂ + X₂Y₁.

Step 5 − Then, the product P is obtained after substituting the values in the formula −

P = 10ⁿ(U) + 10n/2(Z) + V
P = 10ⁿ (X₁Y₁) + 10n/2 (X₁Y₂ + X₂Y₁) + X₂Y₂.

Step 6 − Recursively call the algorithm by passing the sub problems (X₁, Y₁), (X₂, Y₂) and (X₁ + X₂, Y₁ + Y₂) separately. Store the returned values in variables U, V and W respectively.

Example

Let us solve the same problem given above using Karatsuba method, 1456 × 6533 −

The Karatsuba method takes the divide and conquer approach by dividing the problem into multiple sub-problems and applies recursion to make the multiplication simpler.

Step 1

Assuming that n is the power of 2, rewrite the n-digit numbers in the form of −

X = 10^n/2X₁ + X₂ Y = 10^n/2Y₁ + Y₂

That gives us,

1456 = 10²(14) + 56 
6533 = 10²(65) + 33

First let us try simplifying the mathematical expression, we get,

(1400 × 6500) + (56 × 33) + (1400 × 33) + (6500 × 56) = 10⁴ (14 × 65) + 10² [(14 × 33) + (56 × 65)] + (33 × 56)

The above expression is the simplified version of the given multiplication problem, since multiplying two double-digit numbers can be easier to solve rather than multiplying two four-digit numbers.

However, that holds true for the human mind. But for the system compiler, the above expression still takes the same time complexity as the normal naïve multiplication. Since it has 4 double-digit × double-digit multiplications, the time complexity taken would be −

14 × 65 → O(4)
14 × 33 → O(4)
65 × 56 → O(4)
56 × 33 → O(4)
= O (16)

Thus, the calculation needs to be simplified further.

Step 2

X = 1456 
Y = 6533

Since n is not equal to 1, the algorithm jumps to step 3.

X = 10^n/2X₁ + X₂ 
Y = 10^n/2Y₁ + Y₂

That gives us,

1456 = 10²(14) + 56 
6533 = 10²(65) + 33

Calculate Z = W – (U + V) −

Z = (X₁ + X₂) (Y₁ + Y₂) – (X₁Y₁ + X₂Y₂) 
Z = X₁Y₂ + X₂Y₁ 
Z = (14 × 33) + (65 × 56)

The final product,

P = 10ⁿ. U + 10^n/2. Z + V 
   = 10ⁿ (X₁Y₁) + 10^n/2 (X₁Y₂ + X₂Y₁) + X₂Y₂ 
   = 10⁴ (14 × 65) + 10² [(14 × 33) + (65 × 56)] + (56 × 33)

The sub-problems can be further divided into smaller problems; therefore, the algorithm is again called recursively.

Step 3

X₁ and Y₁ are passed as parameters X and Y.

So now, X = 14, Y = 65

X = 10^n/2X₁ + X₂ 
Y = 10^n/2Y₁ + Y₂ 
14 = 10(1) + 4 
65 = 10(6) + 5

Calculate Z = W – (U + V) −

Z = (X₁ + X₂) (Y₁ + Y₂) – (X₁Y₁ + X₂Y₂) 
Z = X₁Y₂ + X₂Y₁ 
Z = (1 × 5) + (6 × 4) = 29 

P = 10ⁿ (X₁Y₁) + 10^n/2 (X₁Y₂ + X₂Y₁) + X₂Y₂ 
   = 10² (1 × 6) + 101 (29) + (4 × 5) 
   = 910

Step 4

X₂ and Y₂ are passed as parameters X and Y.

So now, X = 56, Y = 33

X = 10^n/2X₁ + X₂ 
Y = 10^n/2Y₁ + Y₂ 
56 = 10(5) + 6 
33 = 10(3) + 3

Calculate Z = W – (U + V) −

Z = (X₁ + X₂) (Y₁ + Y₂) – (X₁Y₁ + X₂Y₂) 
Z = X₁Y₂ + X₂Y₁ 
Z = (5 × 3) + (6 × 3) = 33 

P = 10ⁿ (X₁Y₁) + 10^n/2 (X₁Y₂ + X₂Y₁) + X₂Y₂ 
= 10² (5 × 3) + 101 (33) + (6 × 3) 
= 1848

Step 5

X₁ + X₂ and Y₁ + Y₂ are passed as parameters X and Y.

So now, X = 70, Y = 98

X = 10^n/2X₁ + X₂ 
Y = 10^n/2Y₁ + Y₂ 
70 = 10(7) + 0 
98 = 10(9) + 8

Calculate Z = W – (U + V) −

Z = (X₁ + X₂) (Y₁ + Y₂) – (X₁Y₁ + X₂Y₂) 
Z = X₁Y₂ + X₂Y₁ 
Z = (7 × 8) + (0 × 9) = 56 

P = 10ⁿ (X₁Y₁) + 10^n/2 (X₁Y₂ + X₂Y₁) + X₂Y₂ 
= 10² (7 × 9) + 101 (56) + (0 × 8) 
=

Step 6

The final product,

P = 10ⁿ. U + 10^n/2. Z + V

U = 910 
V = 1848 
Z = W – (U + V) = 6860 – (1848 + 910) = 4102

Substituting the values in equation,

P = 10ⁿ. U + 10^n/2. Z + V 
P = 10⁴ (910) + 10² (4102) + 1848 
P = 91,00,000 + 4,10,200 + 1848 
P = 95,12,048

Analysis

The Karatsuba algorithm is a recursive algorithm; since it calls smaller instances of itself during execution.

According to the algorithm, it calls itself only thrice on n/2-digit numbers in order to achieve the final product of two n-digit numbers.

Now, if T(n) represents the number of digit multiplications required while performing the multiplication,

T(n) = 3T(n/2)

This equation is a simple recurrence relation which can be solved as −

Apply T(n/2) = 3T(n/4) in the above equation, we get:
T(n) = 9T(n/4)
T(n) = 27T(n/8)
T(n) = 81T(n/16)
.
.
.
.
T(n) = 3ⁱ T(n/2ⁱ) is the general form of the recurrence relation of Karatsuba algorithm.

Recurrence relations can be solved using the master’s theorem, since we have a dividing function in the form of −

T(n) = aT(n/b) + f(n), where, a = 3, b = 2 and f(n) = 0 which leads to k = 0.

Since f(n) represents work done outside the recursion, which are addition and subtraction arithmetic operations in Karatsuba, these arithmetic operations do not contribute to time complexity.

Check the relation between ‘a’ and ‘b^k’.

a > b^k = 3 > 2⁰

According to master’s theorem, apply case 1.

T(n) = O(n^{logb a})
T(n) = O(n^{log 3})

The time complexity of Karatsuba algorithm for fast multiplication is O(n^{log 3}).

Example

In the complete implementation of Karatsuba Algorithm, we are trying to multiply two higher-valued numbers. Here, since the long data type accepts decimals upto 18 places, we take the inputs as long values. The Karatsuba function is called recursively until the final product is obtained.

C C++ Java Python

#include <stdio.h>
#include <math.h>
int get_size(long);
long karatsuba(long X, long Y){
   
   // Base Case
   if (X < 10 && Y < 10)
      return X * Y;
   
   // determine the size of X and Y
   int size = fmax(get_size(X), get_size(Y));
   if(size < 10)
      return X * Y;
   
   // rounding up the max length
   size = (size/2) + (size%2);
   long multiplier = pow(10, size);
   long b = X/multiplier;
   long a = X - (b * multiplier);
   long d = Y / multiplier;
   long c = Y - (d * size);
   long u = karatsuba(a, c);
   long z = karatsuba(a + b, c + d);
   long v = karatsuba(b, d);
   return u + ((z - u - v) * multiplier) + (v * (long)(pow(10, 2 * size)));
}
int get_size(long value){
   int count = 0;
   while (value > 0) {
      count++;
      value /= 10;
   }
   return count;
}
int main(){

   // two numbers
   long x = 145623;
   long y = 653324;
   printf("The final product is: %ld\n", karatsuba(x, y));
   return 0;
}

Output

The final product is: 95139000852

#include <iostream>
#include <cmath>
using namespace std;
int get_size(long);
long karatsuba(long X, long Y){

   // Base Case
   if (X < 10 && Y < 10)
      return X * Y;

   // determine the size of X and Y
   int size = fmax(get_size(X), get_size(Y));
   if(size < 10)
      return X * Y;

   // rounding up the max length
   size = (size/2) + (size%2);
   long multiplier = pow(10, size);
   long b = X/multiplier;
   long a = X - (b * multiplier);
   long d = Y / multiplier;
   long c = Y - (d * size);
   long u = karatsuba(a, c);
   long z = karatsuba(a + b, c + d);
   long v = karatsuba(b, d);
   return u + ((z - u - v) * multiplier) + (v * (long)(pow(10, 2 * size)));
}
int get_size(long value){
   int count = 0;
   while (value > 0) {
      count++;
      value /= 10;
   }
   return count;
}
int main(){

   // two numbers
   long x = 145623;
   long y = 653324;
   cout << "The final product is: " << karatsuba(x, y) << endl;
   return 0;
}

Output

The final product is: 95139000852

import java.io.*;
public class Main {
   static long karatsuba(long X, long Y) {
      // Base Case
      if (X < 10 && Y < 10)
         return X * Y;
      // determine the size of X and Y
      int size = Math.max(get_size(X), get_size(Y));
      if(size < 10)
         return X * Y;
      // rounding up the max length
      size = (size/2) + (size%2);
      long multiplier = (long)Math.pow(10, size);
      long b = X/multiplier;
      long a = X - (b * multiplier);
      long d = Y / multiplier;
      long c = Y - (d * size);
      long u = karatsuba(a, c);
      long z = karatsuba(a + b, c + d);
      long v = karatsuba(b, d);
      return u + ((z - u - v) * multiplier) + (v * (long)(Math.pow(10, 2 * size)));
   }
   static int get_size(long value) {
      int count = 0;
      while (value > 0) {
         count++;
         value /= 10;
      }
      return count;
   }
   public static void main(String args[]) {
      // two numbers
      long x = 145623;
      long y = 653324;
      System.out.print("The final product is: ");
      long product = karatsuba(x, y);
      System.out.println(product);
   }
}

Output

The final product is: 95139000852

import math
def karatsuba(X, Y):
    if X < 10 and Y < 10:
        return X * Y
    size = max(get_size(X), get_size(Y))
    if size < 10:
        return X * Y
    size = (size // 2) + (size % 2)
    multiplier = 10 ** size
    b = X // multiplier
    a = X - (b * multiplier)
    d = Y // multiplier
    c = Y - (d * size)
    u = karatsuba(a, c)
    z = karatsuba(a + b, c + d)
    v = karatsuba(b, d)
    return u + ((z - u - v) * multiplier) + (v * (10 ** (2 * size)))
def get_size(value):
    count = 0
    while value > 0:
        count += 1
        value //= 10
    return count
x = 145623
y = 653324
print("The final product is: ", end="")
product = karatsuba(x, y)
print(product)

Output

The final product is: 95139000852

Towers of Hanoi

Tower of Hanoi, is a mathematical puzzle which consists of three towers (pegs/rods) and more than one rings is as depicted −

These rings are of different sizes and stacked upon in an ascending order, i.e. the smaller one sits over the larger one. There are other variations of the puzzle where the number of disks increase, but the tower count remains the same.

Rules in Towers of Hanoi

The mission is to move all the disks to some another tower without violating the sequence of arrangement. A few rules to be followed for Tower of Hanoi are −

Only one disk can be moved among the towers at any given time.
Only the "top" disk can be removed.
No large disk can sit over a small disk.

Following is an animated representation of solving a Tower of Hanoi puzzle with three disks.

Tower of Hanoi puzzle with n disks can be solved in minimum 2ⁿ−1 steps. This presentation shows that a puzzle with 3 disks has taken 2³−1 = 7 steps.

Towers of Hanoi Algorithm

To write an algorithm for Tower of Hanoi, first we need to learn how to solve this problem with lesser amount of disks, say → 1 or 2. We mark three towers with name, source, destination and aux (only to help moving the disks). If we have only one disk, then it can easily be moved from source to destination peg.

If we have 2 disks −

First, we move the smaller (top) disk to aux peg.
Then, we move the larger (bottom) disk to destination peg.
And finally, we move the smaller disk from aux to destination peg.

So now, we are in a position to design an algorithm for Tower of Hanoi with more than two disks. We divide the stack of disks in two parts. The largest disk (n^th disk) is in one part and all other (n-1) disks are in the second part.

Our ultimate aim is to move disk n from source to destination and then put all other (n-1) disks onto it. We can imagine to apply the same in a recursive way for all given set of disks.

The steps to follow are −

Step 1 − Move n-1 disks from source to aux
Step 2 − Move n^th disk from source to dest
Step 3 − Move n-1 disks from aux to dest

A recursive algorithm for Tower of Hanoi can be driven as follows −

START
Procedure Hanoi(disk, source, dest, aux)
   IF disk == 0, THEN
      move disk from source to dest
   ELSE
      Hanoi(disk - 1, source, aux, dest) // Step 1
      move disk from source to dest // Step 2
      Hanoi(disk - 1, aux, dest, source) // Step 3
   END IF
END Procedure
STOP

Example

Following is the iterative approach to implement Towers of Hanoi in various languages.

C C++ Java Python

#include <stdio.h>
#include <math.h>
#include <stdlib.h>
#include <limits.h>
// structure to store data of a stack
struct Stack {
   unsigned size;
   int top;
   int *arr;
};
// function to create a stack of given size.
struct Stack* stack_creation(unsigned size){
   struct Stack* stack = (struct Stack*) malloc(sizeof(struct Stack));
   stack -> size = size;
   stack -> top = -1;
   stack -> arr = (int*) malloc(stack -> size * sizeof(int));
   return stack;
}
// to check if stack is full
int isFull(struct Stack* stack){
   return (stack->top == stack->size - 1);
}
// to check if stack is empty
int isEmpty(struct Stack* stack){
   return (stack->top == -1);
}
// insertion in stack
void push(struct Stack *stack, int item){
   if (isFull(stack))
      return;
   stack -> arr[++stack -> top] = item;
}
// deletion in stack
int pop(struct Stack* stack){
   if (isEmpty(stack))
      return INT_MIN;
   return stack -> arr[stack -> top--];
}
//printing the movement of disks
void movement(char src, char dest, int disk){
   printf("Move the disk %d from \'%c\' to \'%c\'\n",disk, src, dest);
}
//Moving disks between two poles
void DiskMovement(struct Stack *src,
              struct Stack *dest, char s, char d){
   int pole1Disk1 = pop(src);
   int pole2Disk1 = pop(dest);
   if (pole1Disk1 == INT_MIN) {
      push(src, pole2Disk1);
      movement(d, s, pole2Disk1);
   } else if (pole2Disk1 == INT_MIN) {
      push(dest, pole1Disk1);
      movement(s, d, pole1Disk1);
   } else if (pole1Disk1 > pole2Disk1) {
      push(src, pole1Disk1);
      push(src, pole2Disk1);
      movement(d, s, pole2Disk1);
   } else {
      push(dest, pole2Disk1);
      push(dest, pole1Disk1);
      movement(s, d, pole1Disk1);
   }
}
//Towers of Hanoi implementation
void Iterative_TOH(int disk_count, struct Stack *src, struct Stack *aux, struct Stack *dest){
   int i, total_moves;
   char s = 'S', d = 'D', a = 'A';
   if (disk_count % 2 == 0) {
      char temp = d;
      d = a;
      a = temp;
   }
   total_moves = pow(2, disk_count) - 1;
   for (i = disk_count; i >= 1; i--)
      push(src, i);
   for (i = 1; i <= total_moves; i++) {
      if (i % 3 == 1)
         DiskMovement(src, dest, s, d);
      else if (i % 3 == 2)
         DiskMovement(src, aux, s, a);
      else if (i % 3 == 0)
         DiskMovement(aux, dest, a, d);
   }
}
int main(){
   unsigned disk_count = 3;
   struct Stack *src, *dest, *aux;

   // Three stacks are created with number of buckets equal to number of disks
   src = stack_creation(disk_count);
   aux = stack_creation(disk_count);
   dest = stack_creation(disk_count);
   Iterative_TOH(disk_count, src, aux, dest);
   return 0;
}

Output

Move the disk 1 from 'S' to 'D'
Move the disk 2 from 'S' to 'A'
Move the disk 1 from 'D' to 'A'
Move the disk 3 from 'S' to 'D'
Move the disk 1 from 'A' to 'S'
Move the disk 2 from 'A' to 'D'
Move the disk 1 from 'S' to 'D'

#include <iostream>
#include <cmath>
#include <climits>
using namespace std;
// structure to store data of a stack
struct Stack {
   unsigned size;
   int top;
   int *arr;
};
// function to create a stack of given size.
struct Stack* stack_creation(unsigned size){
   struct Stack* stack = (struct Stack*) malloc(sizeof(struct Stack));
   stack -> size = size;
   stack -> top = -1;
   stack -> arr = (int*) malloc(stack -> size * sizeof(int));
   return stack;
}
// to check if stack is full
int isFull(struct Stack* stack){
   return (stack->top == stack->size - 1);
}
// to check if stack is empty
int isEmpty(struct Stack* stack){
   return (stack->top == -1);
}
// insertion in stack
void push(struct Stack *stack, int item){
   if (isFull(stack))
      return;
   stack -> arr[++stack -> top] = item;
}
// deletion in stack
int pop(struct Stack* stack){
   if (isEmpty(stack))
      return INT_MIN;
   return stack -> arr[stack -> top--];
}
//printing the movement of disks
void movement(char src, char dest, int disk){
   cout << "Move the disk " << disk << " from " << src << " to " << dest <<endl;
}
//Moving disks between two poles
void DiskMovement(struct Stack *src,
              struct Stack *dest, char s, char d){
   int pole1Disk1 = pop(src);
   int pole2Disk1 = pop(dest);
   if (pole1Disk1 == INT_MIN) {
      push(src, pole2Disk1);
      movement(d, s, pole2Disk1);
   } else if (pole2Disk1 == INT_MIN) {
      push(dest, pole1Disk1);
      movement(s, d, pole1Disk1);
   } else if (pole1Disk1 > pole2Disk1) {
      push(src, pole1Disk1);
      push(src, pole2Disk1);
      movement(d, s, pole2Disk1);
   } else {
      push(dest, pole2Disk1);
      push(dest, pole1Disk1);
      movement(s, d, pole1Disk1);
   }
}
//Towers of Hanoi implementation
void Iterative_TOH(int disk_count, struct Stack *src, struct Stack *aux, struct Stack *dest){
   int i, total_moves;
   char s = 'S', d = 'D', a = 'A';
   if (disk_count % 2 == 0) {
      char temp = d;
      d = a;
      a = temp;
   }
   total_moves = pow(2, disk_count) - 1;
   for (i = disk_count; i >= 1; i--)
      push(src, i);
   for (i = 1; i <= total_moves; i++) {
      if (i % 3 == 1)
         DiskMovement(src, dest, s, d);
      else if (i % 3 == 2)
         DiskMovement(src, aux, s, a);
      else if (i % 3 == 0)
         DiskMovement(aux, dest, a, d);
   }
}
int main(){
   unsigned disk_count = 3;
   struct Stack *src, *dest, *aux;
// Three stacks are created with number of buckets equal to number of disks
   src = stack_creation(disk_count);
   aux = stack_creation(disk_count);
   dest = stack_creation(disk_count);
   Iterative_TOH(disk_count, src, aux, dest);
   return 0;
}

Output

Move the disk 1 from S to D
Move the disk 2 from S to A
Move the disk 1 from D to A
Move the disk 3 from S to D
Move the disk 1 from A to S
Move the disk 2 from A to D
Move the disk 1 from S to D

import java.util.*;
import java.lang.*;
import java.io.*;
// Tower of Hanoi
public class Iterative_TOH {
   //Stack
   class Stack {
      int size;
      int top;
      int arr[];
   }
   // Creating Stack
   Stack stack_creation(int size) {
      Stack stack = new Stack();
      stack.size = size;
      stack.top = -1;
      stack.arr = new int[size];
      return stack;
   }
   //to check if stack is full
   boolean isFull(Stack stack) {
      return (stack.top == stack.size - 1);
   }
   //to check if stack is empty
   boolean isEmpty(Stack stack) {
      return (stack.top == -1);
   }
   //Insertion in Stack
   void push(Stack stack, int item) {
      if (isFull(stack))
         return;
      stack.arr[++stack.top] = item;
   }
   //Deletion from Stack
   int pop(Stack stack) {
      if (isEmpty(stack))
         return Integer.MIN_VALUE;
      return stack.arr[stack.top--];
   }
   // Function to movement disks between the poles
   void Diskmovement(Stack src, Stack dest, char s, char d) {
      int pole1 = pop(src);
      int pole2 = pop(dest);
      // When pole 1 is empty
      if (pole1 == Integer.MIN_VALUE) {
         push(src, pole2);
         movement(d, s, pole2);
      }
      // When pole2 pole is empty
      else if (pole2 == Integer.MIN_VALUE) {
         push(dest, pole1);
         movement(s, d, pole1);
      }
      // When top disk of pole1 > top disk of pole2
      else if (pole1 > pole2) {
         push(src, pole1);
         push(src, pole2);
         movement(d, s, pole2);
      }
      // When top disk of pole1 < top disk of pole2
      else {
         push(dest, pole2);
         push(dest, pole1);
         movement(s, d, pole1);
      }
   }
   //Function to show the movementment of disks
   void movement(char source, char destination, int disk) {
      System.out.println("Move the disk " + disk + " from " + source + " to " + destination);
   }
   // Implementation
   void Iterative(int num, Stack src, Stack aux, Stack dest) {
      int i, total_count;
      char s = 'S', d = 'D', a = 'A';
      // Rules in algorithm will be followed
      if (num % 2 == 0) {
         char temp = d;
         d = a;
         a = temp;
      }
      total_count = (int)(Math.pow(2, num) - 1);
      // disks with large diameter are pushed first
      for (i = num; i >= 1; i--)
         push(src, i);
      for (i = 1; i <= total_count; i++) {
         if (i % 3 == 1)
            Diskmovement(src, dest, s, d);
         else if (i % 3 == 2)
            Diskmovement(src, aux, s, a);
         else if (i % 3 == 0)
            Diskmovement(aux, dest, a, d);
      }
   }
   // Main Function
   public static void main(String[] args) {
      // number of disks
      int num = 3;
      Iterative_TOH ob = new Iterative_TOH();
      Stack src, dest, aux;
      src = ob.stack_creation(num);
      dest = ob.stack_creation(num);
      aux = ob.stack_creation(num);
      ob.Iterative(num, src, aux, dest);
   }
}

Output

Move the disk 1 from S to D
Move the disk 2 from S to A
Move the disk 1 from D to A
Move the disk 3 from S to D
Move the disk 1 from A to S
Move the disk 2 from A to D
Move the disk 1 from S to D

#Iterative Towers of Hanoi
INT_MIN = -723489710
class Stack:
   def __init__(self, size):
      self.size = size
      self.top = -1
      self.arr = []
   # to check if the stack is full
   def isFull(self, stack):
      return stack.top == stack.size - 1
   # to check if the stack is empty
   def isEmpty(self, stack):
      return stack.top == -1
   # Insertion in Stack
   def push(self, stack, item):
      if self.isFull(stack):
         return
      stack.top+=1
      stack.arr.append(item)
   # Deletion from Stack
   def pop(self, stack):
      if self.isEmpty(stack):
         return INT_MIN
      stack.top-=1
      return stack.arr.pop()
   def DiskMovement(self, src, dest, s, d):
      pole1 = self.pop(src);
      pole2 = self.pop(dest);
      # When pole 1 is empty
      if(pole1 == INT_MIN):
         self.push(src, pole2)
         self.Movement(d, s, pole2)
      # When pole2 pole is empty
      elif (pole2 == INT_MIN):
         self.push(dest, pole1)
         self.Movement(s, d, pole1)
      # When top disk of pole1 > top disk of pole2
      elif (pole1 > pole2):
         self.push(src, pole1)
         self.push(src, pole2)
         self.Movement(d, s, pole2)
      # When top disk of pole1 < top disk of pole2
      else:
         self.push(dest, pole2)
         self.push(dest, pole1)
         self.Movement(s, d, pole1)
   # Function to show the Movementment of disks
   def Movement(self, source, destination, disk):
      print("Move the disk "+str(disk)+" from "+source+" to " + destination)
   # Implementation
   def Iterative(self, num, src, aux, dest):
      s, d, a = 'S', 'D', 'A'
      # Rules in algorithm will be followed
      if num % 2 == 0:
         temp = d
         d = a
         a = temp
      total_count = int(pow(2, num) - 1)
      # disks with large diameter are pushed first
      i = num
      while(i>=1):
         self.push(src, i)
         i-=1
      i = 1
      while(i <= total_count):
         if (i % 3 == 1):
            self.DiskMovement(src, dest, s, d)
         elif (i % 3 == 2):
            self.DiskMovement(src, aux, s, a)
         elif (i % 3 == 0):
            self.DiskMovement(aux, dest, a, d)
         i+=1

# number of disks
num = 3
# stacks created for src , dest, aux
src = Stack(num)
dest = Stack(num)
aux = Stack(num)
# solution for 3 disks
sol = Stack(0)
sol.Iterative(num, src, aux, dest)

Output

Move the disk 1 from S to D
Move the disk 2 from S to A
Move the disk 1 from D to A
Move the disk 3 from S to D
Move the disk 1 from A to S
Move the disk 2 from A to D
Move the disk 1 from S to D

Greedy Algorithms

Among all the algorithmic approaches, the simplest and straightforward approach is the Greedy method. In this approach, the decision is taken on the basis of current available information without worrying about the effect of the current decision in future.

Greedy algorithms build a solution part by part, choosing the next part in such a way, that it gives an immediate benefit. This approach never reconsiders the choices taken previously. This approach is mainly used to solve optimization problems. Greedy method is easy to implement and quite efficient in most of the cases. Hence, we can say that Greedy algorithm is an algorithmic paradigm based on heuristic that follows local optimal choice at each step with the hope of finding global optimal solution.

In many problems, it does not produce an optimal solution though it gives an approximate (near optimal) solution in a reasonable time.

Components of Greedy Algorithm

Greedy algorithms have the following five components −

A candidate set − A solution is created from this set.
A selection function − Used to choose the best candidate to be added to the solution.
A feasibility function − Used to determine whether a candidate can be used to contribute to the solution.
An objective function − Used to assign a value to a solution or a partial solution.
A solution function − Used to indicate whether a complete solution has been reached.

Areas of Application

Greedy approach is used to solve many problems, such as

Finding the shortest path between two vertices using Dijkstra’s algorithm.
Finding the minimal spanning tree in a graph using Prim’s /Kruskal’s algorithm, etc.

Counting Coins Problem

The Counting Coins problem is to count to a desired value by choosing the least possible coins and the greedy approach forces the algorithm to pick the largest possible coin. If we are provided coins of € 1, 2, 5 and 10 and we are asked to count € 18 then the greedy procedure will be −

1 − Select one € 10 coin, the remaining count is 8
2 − Then select one € 5 coin, the remaining count is 3
3 − Then select one € 2 coin, the remaining count is 1
4 − And finally, the selection of one € 1 coins solves the problem

Though, it seems to be working fine, for this count we need to pick only 4 coins. But if we slightly change the problem then the same approach may not be able to produce the same optimum result.

For the currency system, where we have coins of 1, 7, 10 value, counting coins for value 18 will be absolutely optimum but for count like 15, it may use more coins than necessary. For example, the greedy approach will use 10 + 1 + 1 + 1 + 1 + 1, total 6 coins. Whereas the same problem could be solved by using only 3 coins (7 + 7 + 1)

Hence, we may conclude that the greedy approach picks an immediate optimized solution and may fail where global optimization is a major concern.

Where Greedy Approach Fails

In many problems, Greedy algorithm fails to find an optimal solution, moreover it may produce a worst solution. Problems like Travelling Salesman and Knapsack cannot be solved using this approach.

Examples

Most networking algorithms use the greedy approach. Here is a list of few of them −

Travelling Salesman Problem
Prim's Minimal Spanning Tree Algorithm
Kruskal's Minimal Spanning Tree Algorithm
Dijkstra's Minimal Spanning Tree Algorithm
Graph − Map Coloring
Graph − Vertex Cover
Knapsack Problem
Job Scheduling Problem

We will discuss these examples elaborately in the further chapters of this tutorial.

Travelling Salesman Problem

The travelling salesman problem is a graph computational problem where the salesman needs to visit all cities (represented using nodes in a graph) in a list just once and the distances (represented using edges in the graph) between all these cities are known. The solution that is needed to be found for this problem is the shortest possible route in which the salesman visits all the cities and returns to the origin city.

If you look at the graph below, considering that the salesman starts from the vertex ‘a’, they need to travel through all the remaining vertices b, c, d, e, f and get back to ‘a’ while making sure that the cost taken is minimum.

There are various approaches to find the solution to the travelling salesman problem: naïve approach, greedy approach, dynamic programming approach, etc. In this tutorial we will be learning about solving travelling salesman problem using greedy approach.

Travelling Salesperson Algorithm

As the definition for greedy approach states, we need to find the best optimal solution locally to figure out the global optimal solution. The inputs taken by the algorithm are the graph G {V, E}, where V is the set of vertices and E is the set of edges. The shortest path of graph G starting from one vertex returning to the same vertex is obtained as the output.

Algorithm

Travelling salesman problem takes a graph G {V, E} as an input and declare another graph as the output (say G’) which will record the path the salesman is going to take from one node to another.
The algorithm begins by sorting all the edges in the input graph G from the least distance to the largest distance.
The first edge selected is the edge with least distance, and one of the two vertices (say A and B) being the origin node (say A).
Then among the adjacent edges of the node other than the origin node (B), find the least cost edge and add it onto the output graph.
Continue the process with further nodes making sure there are no cycles in the output graph and the path reaches back to the origin node A.
However, if the origin is mentioned in the given problem, then the solution must always start from that node only. Let us look at some example problems to understand this better.

Examples

Consider the following graph with six cities and the distances between them −

From the given graph, since the origin is already mentioned, the solution must always start from that node. Among the edges leading from A, A → B has the shortest distance.

Then, B → C has the shortest and only edge between, therefore it is included in the output graph.

There’s only one edge between C → D, therefore it is added to the output graph.

There’s two outward edges from D. Even though, D → B has lower distance than D → E, B is already visited once and it would form a cycle if added to the output graph. Therefore, D → E is added into the output graph.

There’s only one edge from e, that is E → F. Therefore, it is added into the output graph.

Again, even though F → C has lower distance than F → A, F → A is added into the output graph in order to avoid the cycle that would form and C is already visited once.

The shortest path that originates and ends at A is A → B → C → D → E → F → A

The cost of the path is: 16 + 21 + 12 + 15 + 16 + 34 = 114.

Even though, the cost of path could be decreased if it originates from other nodes but the question is not raised with respect to that.

Example

The complete implementation of Travelling Salesman Problem using Greedy Approach is given below −

C C++ Java Python

#include <stdio.h>
int tsp_g[10][10] = {
   {12, 30, 33, 10, 45},
   {56, 22, 9, 15, 18},
   {29, 13, 8, 5, 12},
   {33, 28, 16, 10, 3},
   {1, 4, 30, 24, 20}
};
int visited[10], n, cost = 0;

/* creating a function to generate the shortest path */
void travellingsalesman(int c){
   int k, adj_vertex = 999;
   int min = 999;
   
   /* marking the vertices visited in an assigned array */
   visited[c] = 1;
   
   /* displaying the shortest path */
   printf("%d ", c + 1);
   
   /* checking the minimum cost edge in the graph */
   for(k = 0; k < n; k++) {
      if((tsp_g[c][k] != 0) && (visited[k] == 0)) {
         if(tsp_g[c][k] < min) {
            min = tsp_g[c][k];
         }
         adj_vertex = k;
      }
   }
   if(min != 999) {
      cost = cost + min;
   }
   if(adj_vertex == 999) {
      adj_vertex = 0;
      printf("%d", adj_vertex + 1);
      cost = cost + tsp_g[c][adj_vertex];
      return;
   }
   travellingsalesman(adj_vertex);
}

/* main function */
int main(){
   int i, j;
   n = 5;
   for(i = 0; i < n; i++) {
      visited[i] = 0;
   }
   printf("\nShortest Path: ");
   travellingsalesman(0);
   printf("\nMinimum Cost: ");
   printf("%d\n", cost);
   return 0;
}

Output

Shortest Path: 1 5 4 3 2 1
Minimum Cost: 99

#include <iostream>
using namespace std;
int tsp_g[10][10] = {{12, 30, 33, 10, 45},
{56, 22, 9, 15, 18},
{29, 13, 8, 5, 12},
{33, 28, 16, 10, 3},
{1, 4, 30, 24, 20}
};
int visited[10], n, cost = 0;

/* creating a function to generate the shortest path */
void travellingsalesman(int c){
   int k, adj_vertex = 999;
   int min = 999;
   
   /* marking the vertices visited in an assigned array */
   visited[c] = 1;
   
   /* displaying the shortest path */
   cout<<c + 1<<" ";
   
   /* checking the minimum cost edge in the graph */
   for(k = 0; k < n; k++) {
      if((tsp_g[c][k] != 0) && (visited[k] == 0)) {
         if(tsp_g[c][k] < min) {
            min = tsp_g[c][k];
         }
      adj_vertex = k;
      }
   }
   if(min != 999) {
      cost = cost + min;
   }
   if(adj_vertex == 999) {
      adj_vertex = 0;
      cout<<adj_vertex + 1;
      cost = cost + tsp_g[c][adj_vertex];
      return;
   }
   travellingsalesman(adj_vertex);
}

/* main function */
int main(){
   int i, j;
   n = 5;
   for(i = 0; i < n; i++) {
      visited[i] = 0;
   }
   cout<<endl;
   cout<<"Shortest Path: ";
   travellingsalesman(0);
   cout<<endl;
   cout<<"Minimum Cost: ";
   cout<<cost;
   return 0;
}

Output

Shortest Path: 1 5 4 3 2 1
Minimum Cost: 99

import java.util.*;
public class Main {
    static int[][] tsp_g = {{12, 30, 33, 10, 45},
                            {56, 22, 9, 15, 18},
                            {29, 13, 8, 5, 12},
                            {33, 28, 16, 10, 3},
                            {1, 4, 30, 24, 20}};
    static int[] visited;
    static int n, cost;
    public static void travellingsalesman(int c) {
        int k, adj_vertex = 999;
        int min = 999;
        visited[c] = 1;
        System.out.print((c + 1) + " ");
        for (k = 0; k < n; k++) {
            if ((tsp_g[c][k] != 0) && (visited[k] == 0)) {
                if (tsp_g[c][k] < min) {
                    min = tsp_g[c][k];
                }
                adj_vertex = k;
            }
        }
        if (min != 999) {
            cost = cost + min;
        }
        if (adj_vertex == 999) {
            adj_vertex = 0;
            System.out.print((adj_vertex + 1));
            cost = cost + tsp_g[c][adj_vertex];
            return;
        }
        travellingsalesman(adj_vertex);
    }
    public static void main(String[] args) {
        int i, j;
        n = 5;
        visited = new int[n];
        Arrays.fill(visited, 0);
        System.out.println();
        System.out.print("Shortest Path: ");
        travellingsalesman(0);
        System.out.println();
        System.out.print("Minimum Cost: ");
        System.out.print(cost);
    }
}

Output

Shortest Path: 1 5 4 3 2 1
Minimum Cost: 99

import numpy as np
def travellingsalesman(c):
    global cost
    adj_vertex = 999
    min_val = 999
    visited[c] = 1
    print((c + 1), end=" ")
    for k in range(n):
        if (tsp_g[c][k] != 0) and (visited[k] == 0):
            if tsp_g[c][k] < min_val:
                min_val = tsp_g[c][k]
                adj_vertex = k
    if min_val != 999:
        cost = cost + min_val
    if adj_vertex == 999:
        adj_vertex = 0
        print((adj_vertex + 1), end=" ")
        cost = cost + tsp_g[c][adj_vertex]
        return
    travellingsalesman(adj_vertex)
n = 5
cost = 0
visited = np.zeros(n, dtype=int)
tsp_g = np.array([[12, 30, 33, 10, 45],
                  [56, 22, 9, 15, 18],
                  [29, 13, 8, 5, 12],
                  [33, 28, 16, 10, 3],
                  [1, 4, 30, 24, 20]])
print("Shortest Path:", end=" ")
travellingsalesman(0)
print()
print("Minimum Cost:", end=" ")
print(cost)

Output

Shortest Path: 1 4 5 2 3 1 
Minimum Cost: 55

Prim’s Minimal Spanning Tree

Prim’s minimal spanning tree algorithm is one of the efficient methods to find the minimum spanning tree of a graph. A minimum spanning tree is a subgraph that connects all the vertices present in the main graph with the least possible edges and minimum cost (sum of the weights assigned to each edge).

The algorithm, similar to any shortest path algorithm, begins from a vertex that is set as a root and walks through all the vertices in the graph by determining the least cost adjacent edges.

Prim’s Algorithm

To execute the prim’s algorithm, the inputs taken by the algorithm are the graph G {V, E}, where V is the set of vertices and E is the set of edges, and the source vertex S. A minimum spanning tree of graph G is obtained as an output.

Algorithm

Declare an array visited[] to store the visited vertices and firstly, add the arbitrary root, say S, to the visited array.
Check whether the adjacent vertices of the last visited vertex are present in the visited[] array or not.
If the vertices are not in the visited[] array, compare the cost of edges and add the least cost edge to the output spanning tree.
The adjacent unvisited vertex with the least cost edge is added into the visited[] array and the least cost edge is added to the minimum spanning tree output.
Steps 2 and 4 are repeated for all the unvisited vertices in the graph to obtain the full minimum spanning tree output for the given graph.
Calculate the cost of the minimum spanning tree obtained.

Examples

Find the minimum spanning tree using prim’s method (greedy approach) for the graph given below with S as the arbitrary root.

Solution

Step 1

Create a visited array to store all the visited vertices into it.

V = { }

The arbitrary root is mentioned to be S, so among all the edges that are connected to S we need to find the least cost edge.

S → B = 8
V = {S, B}

Step 2

Since B is the last visited, check for the least cost edge that is connected to the vertex B.

B → A = 9
B → C = 16
B → E = 14

Hence, B → A is the edge added to the spanning tree.

V = {S, B, A}

Step 3

Since A is the last visited, check for the least cost edge that is connected to the vertex A.

A → C = 22
A → B = 9
A → E = 11

But A → B is already in the spanning tree, check for the next least cost edge. Hence, A → E is added to the spanning tree.

V = {S, B, A, E}

Step 4

Since E is the last visited, check for the least cost edge that is connected to the vertex E.

E → C = 18
E → D = 3

Therefore, E → D is added to the spanning tree.

V = {S, B, A, E, D}

Step 5

Since D is the last visited, check for the least cost edge that is connected to the vertex D.

D → C = 15
E → D = 3

Therefore, D → C is added to the spanning tree.

V = {S, B, A, E, D, C}

The minimum spanning tree is obtained with the minimum cost = 46

Example

The final program implements Prim’s minimum spanning tree problem that takes the cost adjacency matrix as the input and prints the spanning tree as the output along with the minimum cost.

C C++ Java Python

#include<stdio.h>
#include<stdlib.h>
#define inf 99999
#define MAX 10
int G[MAX][MAX] = {
   {0, 19, 8},
   {21, 0, 13},
   {15, 18, 0}
};
int S[MAX][MAX], n;
int prims();
int main(){
   int i, j, cost;
   n = 3;
   cost=prims();
   printf("Spanning tree:");
   for(i=0; i<n; i++) {
      printf("\n");
      for(j=0; j<n; j++)
         printf("%d\t",S[i][j]);
   }
   printf("\nMinimum cost = %d", cost);
   return 0;
}
int prims(){
   int C[MAX][MAX];
   int u, v, min_dist, dist[MAX], from[MAX];
   int visited[MAX],ne,i,min_cost,j;

   //create cost[][] matrix,spanning[][]
   for(i=0; i<n; i++)
      for(j=0; j<n; j++) {
         if(G[i][j]==0)
            C[i][j]=inf;
         else
            C[i][j]=G[i][j];
         S[i][j]=0;
      }

   //initialise visited[],distance[] and from[]
   dist[0]=0;
   visited[0]=1;
   for(i=1; i<n; i++) {
      dist[i] = C[0][i];
      from[i] = 0;
      visited[i] = 0;
   }
   min_cost = 0; //cost of spanning tree
   ne = n-1; //no. of edges to be added
   while(ne > 0) {

      //find the vertex at minimum distance from the tree
      min_dist = inf;
      for(i=1; i<n; i++)
         if(visited[i] == 0 && dist[i] < min_dist) {
            v = i;
            min_dist = dist[i];
         }
      u = from[v];

      //insert the edge in spanning tree
      S[u][v] = dist[v];
      S[v][u] = dist[v];
      ne--;
      visited[v]=1;

      //updated the distance[] array
      for(i=1; i<n; i++)
         if(visited[i] == 0 && C[i][v] < dist[i]) {
            dist[i] = C[i][v];
            from[i] = v;
         }
      min_cost = min_cost + C[u][v];
   }
   return(min_cost);
}

Output

Spanning tree:
0	0	8	
0	0	13	
8	13	0	
Minimum cost = 26

#include<iostream>
#define inf 999999
#define MAX 10
using namespace std;
int G[MAX][MAX] = {
   {0, 19, 8},
   {21, 0, 13},
   {15, 18, 0}
};
int S[MAX][MAX], n;
int prims();
int main(){
   int i, j, cost;
   n = 3;
   cost=prims();
   cout <<"Spanning tree:";
   for(i=0; i<n; i++) {
      cout << endl;
      for(j=0; j<n; j++)
         cout << S[i][j] << " ";
   }
   cout << "\nMinimum cost = " << cost;
   return 0;
}
int prims(){
   int C[MAX][MAX];
   int u, v, min_dist, dist[MAX], from[MAX];
   int visited[MAX],ne,i,min_cost,j;

   //create cost matrix and spanning tree
   for(i=0; i<n; i++)
      for(j=0; j<n; j++) {
         if(G[i][j]==0)
            C[i][j]=inf;
         else
            C[i][j]=G[i][j];
         S[i][j]=0;
      }

   //initialise visited[],distance[] and from[]
   dist[0]=0;
   visited[0]=1;
   for(i=1; i<n; i++) {
      dist[i] = C[0][i];
      from[i] = 0;
      visited[i] = 0;
   }
   min_cost = 0; //cost of spanning tree
   ne = n-1; //no. of edges to be added
   while(ne > 0) {

      //find the vertex at minimum distance from the tree
      min_dist = inf;
      for(i=1; i<n; i++)
         if(visited[i] == 0 && dist[i] < min_dist) {
            v = i;
            min_dist = dist[i];
         }
      u = from[v];

      //insert the edge in spanning tree
      S[u][v] = dist[v];
      S[v][u] = dist[v];
      ne--;
      visited[v]=1;

      //updated the distance[] array
      for(i=1; i<n; i++)
         if(visited[i] == 0 && C[i][v] < dist[i]) {
            dist[i] = C[i][v];
            from[i] = v;
         }
      min_cost = min_cost + C[u][v];
   }
   return(min_cost);
}

Output

Spanning tree:
0 0 8 
0 0 13 
8 13 0 
Minimum cost = 26

public class prims {
   static int inf = 999999;
   static int MAX = 10;
   static int G[][] = {
       {0, 19, 8},
       {21, 0, 13},
       {15, 18, 0}
   };
   static int S[][] = new int[MAX][MAX];
   static int n;
   public static void main(String args[]) {
      int i, j, cost;
      n = 3;
      cost=prims();
      System.out.println("Spanning tree: ");
      for(i=0; i<n; i++) {
         System.out.println();
         for(j=0; j<n; j++)
            System.out.print(S[i][j] + " ");
      }
      System.out.println("\nMinimum cost = " + cost);
   }
   static int prims() {
      int C[][] = new int[MAX][MAX];
      int u, v = 0, min_dist;
      int dist[] = new int[MAX];
      int from[] = new int[MAX];
      int visited[] = new int[MAX];
      int ne,i,min_cost,j;
      //create cost matrix and spanning tree
      for(i=0; i<n; i++)
         for(j=0; j<n; j++) {
            if(G[i][j]==0)
               C[i][j]=inf;
            else
               C[i][j]=G[i][j];
            S[i][j]=0;
         }
      //initialise visited[],distance[] and from[]
      dist[0]=0;
      visited[0]=1;
      for(i=1; i<n; i++) {
         dist[i] = C[0][i];
         from[i] = 0;
         visited[i] = 0;
      }
      min_cost = 0; //cost of spanning tree
      ne = n-1; //no. of edges to be added
      while(ne > 0) {
         //find the vertex at minimum distance from the tree
         min_dist = inf;
         for(i=1; i<n; i++)
            if(visited[i] == 0 && dist[i] < min_dist) {
               v = i;
               min_dist = dist[i];
            }
         u = from[v];
         //insert the edge in spanning tree
         S[u][v] = dist[v];
         S[v][u] = dist[v];
         ne--;
         visited[v]=1;
         //updated the distance[] array
         for(i=1; i<n; i++)
            if(visited[i] == 0 && C[i][v] < dist[i]) {
               dist[i] = C[i][v];
               from[i] = v;
            }
         min_cost = min_cost + C[u][v];
      }
      return(min_cost);
   }
}

Output

Spanning tree: 

0 0 8 
0 0 13 
8 13 0 
Minimum cost = 26

inf = 999999
MAX = 10
G = [
    [0, 19, 8],
    [21, 0, 13],
    [15, 18, 0]
]
S = [[0 for _ in range(MAX)] for _ in range(MAX)]
n = 3
def main():
    global n
    cost = prims()
    print("Spanning tree: ")
    for i in range(n):
        print()
        for j in range(n):
            print(S[i][j], end=" ")
    print("\nMinimum cost =", cost)
def prims():
    global n
    C = [[0 for _ in range(MAX)] for _ in range(MAX)]
    u, v = 0, 0
    min_dist = 0
    dist = [0 for _ in range(MAX)]
    from_ = [0 for _ in range(MAX)]
    visited = [0 for _ in range(MAX)]
    ne = 0
    min_cost = 0
    i = 0
    j = 0
    for i in range(n):
        for j in range(n):
            if G[i][j] == 0:
                C[i][j] = inf
            else:
                C[i][j] = G[i][j]
            S[i][j] = 0
    dist[0] = 0
    visited[0] = 1
    for i in range(1, n):
        dist[i] = C[0][i]
        from_[i] = 0
        visited[i] = 0
    min_cost = 0
    ne = n - 1
    while ne > 0:
        min_dist = inf
        for i in range(1, n):
            if visited[i] == 0 and dist[i] < min_dist:
                v = i
                min_dist = dist[i]
        u = from_[v]
        S[u][v] = dist[v]
        S[v][u] = dist[v]
        ne -= 1
        visited[v] = 1
        for i in range(n):
            if visited[i] == 0 and C[i][v] < dist[i]:
                dist[i] = C[i][v]
                from_[i] = v
        min_cost += C[u][v]
    return min_cost
#calling  the main() method
main()

Output

Spanning tree: 

0 0 8 
0 0 13 
8 13 0 
Minimum cost = 26

Kruskal’s Minimal Spanning Tree

Kruskal’s minimal spanning tree algorithm is one of the efficient methods to find the minimum spanning tree of a graph. A minimum spanning tree is a subgraph that connects all the vertices present in the main graph with the least possible edges and minimum cost (sum of the weights assigned to each edge).

The algorithm first starts from the forest – which is defined as a subgraph containing only vertices of the main graph – of the graph, adding the least cost edges later until the minimum spanning tree is created without forming cycles in the graph.

Kruskal’s algorithm has easier implementation than prim’s algorithm, but has higher complexity.

Kruskal’s Algorithm

The inputs taken by the kruskal’s algorithm are the graph G {V, E}, where V is the set of vertices and E is the set of edges, and the source vertex S and the minimum spanning tree of graph G is obtained as an output.

Algorithm

Sort all the edges in the graph in an ascending order and store it in an array edge[].

Edge
Cost

Construct the forest of the graph on a plane with all the vertices in it.
Select the least cost edge from the edge[] array and add it into the forest of the graph. Mark the vertices visited by adding them into the visited[] array.
Repeat the steps 2 and 3 until all the vertices are visited without having any cycles forming in the graph
When all the vertices are visited, the minimum spanning tree is formed.
Calculate the minimum cost of the output spanning tree formed.

Examples

Construct a minimum spanning tree using kruskal’s algorithm for the graph given below −

Solution

As the first step, sort all the edges in the given graph in an ascending order and store the values in an array.

Edge	B→D	A→B	C→F	F→E	B→C	G→F	A→G	C→D	D→E	C→G
Cost	5	6	9	10	11	12	15	17	22	25

Then, construct a forest of the given graph on a single plane.

From the list of sorted edge costs, select the least cost edge and add it onto the forest in output graph.

B → D = 5
Minimum cost = 5
Visited array, v = {B, D}

Similarly, the next least cost edge is B → A = 6; so we add it onto the output graph.

Minimum cost = 5 + 6 = 11
Visited array, v = {B, D, A}

The next least cost edge is C → F = 9; add it onto the output graph.

Minimum Cost = 5 + 6 + 9 = 20
Visited array, v = {B, D, A, C, F}

The next edge to be added onto the output graph is F → E = 10.

Minimum Cost = 5 + 6 + 9 + 10 = 30
Visited array, v = {B, D, A, C, F, E}

The next edge from the least cost array is B → C = 11, hence we add it in the output graph.

Minimum cost = 5 + 6 + 9 + 10 + 11 = 41
Visited array, v = {B, D, A, C, F, E}

The last edge from the least cost array to be added in the output graph is F → G = 12.

Minimum cost = 5 + 6 + 9 + 10 + 11 + 12 = 53
Visited array, v = {B, D, A, C, F, E, G}

The obtained result is the minimum spanning tree of the given graph with cost = 53.

Example

The final program implements the Kruskal’s minimum spanning tree problem that takes the cost adjacency matrix as the input and prints the shortest path as the output along with the minimum cost.

C C++ Java Python

#include <stdio.h>
#include <stdlib.h>
const int inf = 999999;
int k, a, b, u, v, n, ne = 1;
int mincost = 0;
int cost[3][3] = {{0, 10, 20},{12, 0,15},{16, 18, 0}};
int  p[9] = {0};
int applyfind(int i)
{
    while(p[i] != 0)
        i=p[i];
    return i;
}
int applyunion(int i,int j)
{
    if(i!=j) {
        p[j]=i;
        return 1;
    }
    return 0;
}
int main()
{
    n = 3;
    int i, j;
    for (int i = 0; i < n; i++) {
        for (int j = 0; j < n; j++) {
            if (cost[i][j] == 0) {
                cost[i][j] = inf;
            }
        }
    }
    printf("Minimum Cost Spanning Tree: \n");
    while(ne < n) {
        int min_val = inf;
        for(i=0; i<n; i++) {
            for(j=0; j <n; j++) {
                if(cost[i][j] < min_val) {
                    min_val = cost[i][j];
                    a = u = i;
                    b = v = j;
                }
            }
        }
        u = applyfind(u);
        v = applyfind(v);
        if(applyunion(u, v) != 0) {
            printf("%d -> %d\n", a, b);
            mincost +=min_val;
        }
        cost[a][b] = cost[b][a] = 999;
        ne++;
    }
    printf("Minimum cost = %d",mincost);
    return 0;
}

Output

Minimum Cost Spanning Tree: 
0 -> 1
1 -> 2
Minimum cost = 25

#include <iostream>
using namespace std;
const int inf = 999999;
int k, a, b, u, v, n, ne = 1;
int mincost = 0;
int cost[3][3] = {{0, 10, 20}, {12, 0, 15}, {16, 18, 0}};
int p[9] = {0};
int applyfind(int i)
{
    while (p[i] != 0) {
        i = p[i];
    }
    return i;
}
int applyunion(int i, int j)
{
    if (i != j) {
        p[j] = i;
        return 1;
    }
    return 0;
}
int main()
{
    n = 3;
    for (int i = 0; i < n; i++) {
        for (int j = 0; j < n; j++) {
            if (cost[i][j] == 0) {
                cost[i][j] = inf;
            }
        }
    }
    cout << "Minimum Cost Spanning Tree:\n";
    while (ne < n) {
        int min_val = inf;
        for (int i = 0; i < n; i++) {
            for (int j = 0;
                    j < n; j++) {
                if (cost[i][j] < min_val) {
                    min_val = cost[i][j];
                    a = u = i;
                    b = v = j;
                }
            }
        }
        u = applyfind(u);
        v = applyfind(v);
        if (applyunion(u, v) != 0) {
            cout << a << " -> " << b << "\n";
            mincost += min_val;
        }
        cost[a][b] = cost[b][a] = 999;
        ne++;
    }
    cout << "Minimum cost = " << mincost << endl;
    return 0;
}

Output

Minimum Cost Spanning Tree:
0 -> 1
1 -> 2
Minimum cost = 25

import java.util.*;
public class Main {
   static int k, a, b, u, v, n, ne = 1, min, mincost = 0;
   static int cost[][] = {{0, 10, 20},{12, 0, 15},{16, 18, 0}};
   static int p[] = new int[9];
   static int inf = 999999;
   static int applyfind(int i) {
      while(p[i] != 0)
      i=p[i];
      return i;
   }
   static int applyunion(int i,int j) {
      if(i!=j) {
         p[j]=i;
         return 1;
      }
      return 0;
   }
   public static void main(String args[]) {
      int i, j;
      n = 3;
      for(i=0; i<n; i++)
      for(j=0; j<n; j++) {
         if(cost[i][j]==0)
         cost[i][j]= inf;
      }
      System.out.println("Minimum Cost Spanning Tree: ");
      while(ne < n) {
         min = inf;
         for(i=0; i<n; i++) {
            for(j=0; j<n; j++) {
               if(cost[i][j] < min) {
                  min=cost[i][j];
                  a=u=i;
                  b=v=j;
               }
            }
         }
         u=applyfind(u);
         v=applyfind(v);
         if(applyunion(u,v) != 0) {
            System.out.println(a + " -> " + b);
            mincost += min;
         }
         cost[a][b]=cost[b][a]=999;
         ne +=1;
      }
      System.out.println("Minimum cost = " + mincost);
   }
}

Output

Minimum Cost Spanning Tree: 
0 -> 1
1 -> 2
Minimum cost = 25

inf = 999999
k, a, b, u, v, n, ne = 0, 0, 0, 0, 0, 0, 1
mincost = 0
cost = [[0, 10, 20], [12, 0, 15], [16, 18, 0]]
p = [0] * 9
def applyfind(i):
    while p[i] != 0:
        i = p[i]
    return i
def applyunion(i, j):
    if i != j:
        p[j] = i
        return 1
    return 0
n = 3
for i in range(n):
    for j in range(n):
        if cost[i][j] == 0:
            cost[i][j] = inf
print("Minimum Cost Spanning Tree:\n")
while ne < n:
    min_val = inf
    for i in range(n):
        for j in range(n):
            if cost[i][j] < min_val:
                min_val = cost[i][j]
                a = u = i
                b = v = j
    u = applyfind(u)
    v = applyfind(v)
    if applyunion(u, v) != 0:
        print(f"{a} -> {b}")
        mincost += min_val
    cost[a][b] = cost[b][a] = 999
    ne += 1
print(f"\n\tMinimum cost = \n{mincost}")

Output

Minimum Cost Spanning Tree: 
0 -> 1
1 -> 2
Minimum cost = 25

Dijkstra’s Shortest Path Algorithm

Dijkstra’s shortest path algorithm is similar to that of Prim’s algorithm as they both rely on finding the shortest path locally to achieve the global solution. However, unlike prim’s algorithm, the dijkstra’s algorithm does not find the minimum spanning tree; it is designed to find the shortest path in the graph from one vertex to other remaining vertices in the graph. Dijkstra’s algorithm can be performed on both directed and undirected graphs.

Since the shortest path can be calculated from single source vertex to all the other vertices in the graph, Dijkstra’s algorithm is also called single-source shortest path algorithm. The output obtained is called shortest path spanning tree.

In this chapter, we will learn about the greedy approach of the dijkstra’s algorithm.

Dijkstra’s Algorithm

The dijkstra’s algorithm is designed to find the shortest path between two vertices of a graph. These two vertices could either be adjacent or the farthest points in the graph. The algorithm starts from the source. The inputs taken by the algorithm are the graph G {V, E}, where V is the set of vertices and E is the set of edges, and the source vertex S. And the output is the shortest path spanning tree.

Algorithm

Declare two arrays − distance[] to store the distances from the source vertex to the other vertices in graph and visited[] to store the visited vertices.
Set distance[S] to ‘0’ and distance[v] = ∞, where v represents all the other vertices in the graph.
Add S to the visited[] array and find the adjacent vertices of S with the minimum distance.
The adjacent vertex to S, say A, has the minimum distance and is not in the visited array yet. A is picked and added to the visited array and the distance of A is changed from ∞ to the assigned distance of A, say d₁, where d₁ < ∞.
Repeat the process for the adjacent vertices of the visited vertices until the shortest path spanning tree is formed.

Examples

To understand the dijkstra’s concept better, let us analyze the algorithm with the help of an example graph −

Step 1

Initialize the distances of all the vertices as ∞, except the source node S.

Vertex	S	A	B	C	D	E
Distance	0	∞	∞	∞	∞	∞

Now that the source vertex S is visited, add it into the visited array.

visited = {S}

Step 2

The vertex S has three adjacent vertices with various distances and the vertex with minimum distance among them all is A. Hence, A is visited and the dist[A] is changed from ∞ to 6.

S → A = 6
S → D = 8
S → E = 7

Vertex	S	A	B	C	D	E
Distance	0	6	∞	∞	8	7

Visited = {S, A}

Step 3

There are two vertices visited in the visited array, therefore, the adjacent vertices must be checked for both the visited vertices.

Vertex S has two more adjacent vertices to be visited yet: D and E. Vertex A has one adjacent vertex B.

Calculate the distances from S to D, E, B and select the minimum distance −

S → D = 8 and S → E = 7.
S → B = S → A + A → B = 6 + 9 = 15

Vertex	S	A	B	C	D	E
Distance	0	6	15	∞	8	7

Visited = {S, A, E}

Step 4

Calculate the distances of the adjacent vertices – S, A, E – of all the visited arrays and select the vertex with minimum distance.

S → D = 8
S → B = 15
S → C = S → E + E → C = 7 + 5 = 12

Vertex	S	A	B	C	D	E
Distance	0	6	15	12	8	7

Visited = {S, A, E, D}

Step 5

Recalculate the distances of unvisited vertices and if the distances minimum than existing distance is found, replace the value in the distance array.

S → C = S → E + E → C = 7 + 5 = 12
S → C = S → D + D → C = 8 + 3 = 11

dist[C] = minimum (12, 11) = 11

S → B = S → A + A → B = 6 + 9 = 15
S → B = S → D + D → C + C → B = 8 + 3 + 12 = 23

dist[B] = minimum (15,23) = 15

Vertex	S	A	B	C	D	E
Distance	0	6	15	11	8	7

Visited = { S, A, E, D, C}

Step 6

The remaining unvisited vertex in the graph is B with the minimum distance 15, is added to the output spanning tree.

Visited = {S, A, E, D, C, B}

The shortest path spanning tree is obtained as an output using the dijkstra’s algorithm.

Example

The program implements the dijkstra’s shortest path problem that takes the cost adjacency matrix as the input and prints the shortest path as the output along with the minimum cost.

C C++ Java Python

#include<stdio.h>
#include<limits.h>
#include<stdbool.h>
int min_dist(int[], bool[]);
void greedy_dijsktra(int[][6],int);
int min_dist(int dist[], bool visited[]){ // finding minimum dist
   int minimum=INT_MAX,ind;
   for(int k=0; k<6; k++) {
      if(visited[k]==false && dist[k]<=minimum) {
         minimum=dist[k];
         ind=k;
      }
   }
   return ind;
}
void greedy_dijsktra(int graph[6][6],int src){
   int dist[6];
   bool visited[6];
   for(int k = 0; k<6; k++) {
      dist[k] = INT_MAX;
      visited[k] = false;
   }
   dist[src] = 0; // Source vertex dist is set 0
   for(int k = 0; k<6; k++) {
      int m=min_dist(dist,visited);
      visited[m]=true;
      for(int k = 0; k<6; k++) {

         // updating the dist of neighbouring vertex
         if(!visited[k] && graph[m][k] && dist[m]!=INT_MAX && dist[m]+graph[m][k]<dist[k])
            dist[k]=dist[m]+graph[m][k];
      }
   }
   printf("Vertex\t\tdist from source vertex\n");
   for(int k = 0; k<6; k++) {
      char str=65+k;
      printf("%c\t\t\t%d\n", str, dist[k]);
   }
}
int main(){
   int graph[6][6]= {
      {0, 1, 2, 0, 0, 0},
      {1, 0, 0, 5, 1, 0},
      {2, 0, 0, 2, 3, 0},
      {0, 5, 2, 0, 2, 2},
      {0, 1, 3, 2, 0, 1},
      {0, 0, 0, 2, 1, 0}
   };
   greedy_dijsktra(graph,0);
   return 0;
}

Output

Vertex		dist from source vertex
A			   0
B			   1
C			   2
D			   4
E			   2
F			   3

#include<iostream>
#include<climits>
using namespace std;
int min_dist(int dist[], bool visited[]){ // finding minimum dist
   int minimum=INT_MAX,ind;
   for(int k=0; k<6; k++) {
      if(visited[k]==false && dist[k]<=minimum) {
         minimum=dist[k];
         ind=k;
      }
   }
   return ind;
}
void greedy_dijsktra(int graph[6][6],int src){
   int dist[6];
   bool visited[6];
   for(int k = 0; k<6; k++) {
      dist[k] = INT_MAX;
      visited[k] = false;
   }
   dist[src] = 0; // Source vertex dist is set 0
   for(int k = 0; k<6; k++) {
      int m=min_dist(dist,visited);
      visited[m]=true;
      for(int k = 0; k<6; k++) {

         // updating the dist of neighbouring vertex
         if(!visited[k] && graph[m][k] && dist[m]!=INT_MAX && dist[m]+graph[m][k]<dist[k])
            dist[k]=dist[m]+graph[m][k];
      }
   }
   cout<<"Vertex\t\tdist from source vertex"<<endl;
   for(int k = 0; k<6; k++) {
      char str=65+k;
      cout<<str<<"\t\t\t"<<dist[k]<<endl;
   }
}
int main(){
   int graph[6][6]= {
      {0, 1, 2, 0, 0, 0},
      {1, 0, 0, 5, 1, 0},
      {2, 0, 0, 2, 3, 0},
      {0, 5, 2, 0, 2, 2},
      {0, 1, 3, 2, 0, 1},
      {0, 0, 0, 2, 1, 0}
   };
   greedy_dijsktra(graph,0);
   return 0;
}

Output

Vertex		dist from source vertex
A			   0
B			   1
C			   2
D			   4
E			   2
F			   3

public class Main {
   static int min_dist(int dist[], boolean visited[]) { // finding minimum dist
      int minimum = Integer.MAX_VALUE;
      int ind = -1;
      for (int k = 0; k < 6; k++) {
         if (!visited[k] && dist[k] <= minimum) {
            minimum = dist[k];
            ind = k;
         }
      }
      return ind;
   }
   static void greedy_dijkstra(int graph[][], int src) {
      int dist[] = new int[6];
      boolean visited[] = new boolean[6];
      for (int k = 0; k < 6; k++) {
         dist[k] = Integer.MAX_VALUE;
         visited[k] = false;
      }
      dist[src] = 0; // Source vertex dist is set 0
      for (int k = 0; k < 6; k++) {
         int m = min_dist(dist, visited);
         visited[m] = true;
         for (int j = 0; j < 6; j++) {
            // updating the dist of neighboring vertex
            if (!visited[j] && graph[m][j] != 0 && dist[m] != Integer.MAX_VALUE
                  && dist[m] + graph[m][j] < dist[j])
               dist[j] = dist[m] + graph[m][j];
         }
      }
      System.out.println("Vertex\t\tdist from source vertex");
      for (int k = 0; k < 6; k++) {
         char str = (char) (65 + k);
         System.out.println(str + "\t\t\t" + dist[k]);
      }
   }
   public static void main(String args[]) {
      int graph[][] = { { 0, 1, 2, 0, 0, 0 }, { 1, 0, 0, 5, 1, 0 }, { 2, 0, 0, 2, 3, 0 },
            { 0, 5, 2, 0, 2, 2 }, { 0, 1, 3, 2, 0, 1 }, { 0, 0, 0, 2, 1, 0 } };
      greedy_dijkstra(graph, 0);
   }
}

Output

Vertex		dist from source vertex
A			0
B			1
C			2
D			4
E			2
F			3

import sys
def min_dist(dist, visited):  # finding minimum dist
    minimum = sys.maxsize
    ind = -1
    for k in range(6):
        if not visited[k] and dist[k] <= minimum:
            minimum = dist[k]
            ind = k
    return ind
def greedy_dijkstra(graph, src):
    dist = [sys.maxsize] * 6
    visited = [False] * 6
    dist[src] = 0  # Source vertex dist is set 0
    for _ in range(6):
        m = min_dist(dist, visited)
        visited[m] = True
        for k in range(6):
            #  updating the dist of neighbouring vertex
            if not visited[k] and graph[m][k] and dist[m] != sys.maxsize and dist[m] + graph[m][k] < dist[k]:
                dist[k] = dist[m] + graph[m][k]
    print("Vertex\t\tdist from source vertex")
    for k in range(6):
        str_val = chr(65 + k)  # Convert index to corresponding character
        print(str_val, "\t\t\t", dist[k])
# Main code
graph = [
    [0, 1, 2, 0, 0, 0],
    [1, 0, 0, 5, 1, 0],
    [2, 0, 0, 2, 3, 0],
    [0, 5, 2, 0, 2, 2],
    [0, 1, 3, 2, 0, 1],
    [0, 0, 0, 2, 1, 0]
]
greedy_dijkstra(graph, 0)

Output

Vertex		dist from source vertex
A 			 0
B 			 1
C 			 2
D 			 4
E 			 2
F 			 3

Map Colouring Algorithm

Map colouring problem states that given a graph G {V, E} where V and E are the set of vertices and edges of the graph, all vertices in in V need to be coloured in such a way that no two adjacent vertices must have the same colour.

The real-world applications of this algorithm are – assigning mobile radio frequencies, making schedules, designing Sudoku, allocating registers etc.

Map Colouring Algorithm

With the map colouring algorithm, a graph G and the colours to be added to the graph are taken as an input and a coloured graph with no two adjacent vertices having the same colour is achieved.

Algorithm

Initiate all the vertices in the graph.
Select the node with the highest degree to colour it with any colour.
Choose the colour to be used on the graph with the help of the selection colour function so that no adjacent vertex is having the same colour.
Check if the colour can be added and if it does, add it to the solution set.
Repeat the process from step 2 until the output set is ready.

Examples

Step 1

Find degrees of all the vertices −

A – 4
B – 2
C – 2
D – 3
E – 3

Step 2

Choose the vertex with the highest degree to colour first, i.e., A and choose a colour using selection colour function. Check if the colour can be added to the vertex and if yes, add it to the solution set.

Step 3

Select any vertex with the next highest degree from the remaining vertices and colour it using selection colour function.

D and E both have the next highest degree 3, so choose any one between them, say D.

D is adjacent to A, therefore it cannot be coloured in the same colour as A. Hence, choose a different colour using selection colour function.

Step 4

The next highest degree vertex is E, hence choose E.

E is adjacent to both A and D, therefore it cannot be coloured in the same colours as A and D. Choose a different colour using selection colour function.

Step 5

The next highest degree vertices are B and C. Thus, choose any one randomly.

B is adjacent to both A and E, thus not allowing to be coloured in the colours of A and E but it is not adjacent to D, so it can be coloured with D’s colour.

Step 6

The next and the last vertex remaining is C, which is adjacent to both A and D, not allowing it to be coloured using the colours of A and D. But it is not adjacent to E, so it can be coloured in E’s colour.

Example

Following is the complete implementation of Map Colouring Algorithm in various programming languages where a graph is coloured in such a way that no two adjacent vertices have same colour.

C C++ Java Python

#include<stdio.h>
#include<stdbool.h>
#define V 4
bool graph[V][V] = {
   {0, 1, 1, 0},
   {1, 0, 1, 1},
   {1, 1, 0, 1},
   {0, 1, 1, 0},
};
bool isValid(int v,int color[], int c){   //check whether putting a color valid for v
   for (int i = 0; i < V; i++)
      if (graph[v][i] && c == color[i])
         return false;
   return true;
}
bool mColoring(int colors, int color[], int vertex){
   if (vertex == V) //when all vertices are considered
      return true;
   for (int col = 1; col <= colors; col++) {
      if (isValid(vertex,color, col)) { //check whether color col is valid or not
         color[vertex] = col;
         if (mColoring (colors, color, vertex+1) == true) //go for additional vertices
            return true;
         color[vertex] = 0;
      }
   }
   return false; //when no colors can be assigned
}
int main(){
   int colors = 3; // Number of colors
   int color[V]; //make color matrix for each vertex
   for (int i = 0; i < V; i++)
      color[i] = 0; //initially set to 0
   if (mColoring(colors, color, 0) == false) { //for vertex 0 check graph coloring
      printf("Solution does not exist.");
   }
   printf("Assigned Colors are: \n");
   for (int i = 0; i < V; i++)
      printf("%d ", color[i]);
   return 0;
}

Output

Assigned Colors are:
1 2 3 1

#include<iostream>
using namespace std;
#define V 4
bool graph[V][V] = {
   {0, 1, 1, 0},
   {1, 0, 1, 1},
   {1, 1, 0, 1},
   {0, 1, 1, 0},
};
bool isValid(int v,int color[], int c){   //check whether putting a color valid for v
   for (int i = 0; i < V; i++)
      if (graph[v][i] && c == color[i])
         return false;
   return true;
}
bool mColoring(int colors, int color[], int vertex){
   if (vertex == V) //when all vertices are considered
      return true;
   for (int col = 1; col <= colors; col++) {
      if (isValid(vertex,color, col)) { //check whether color col is valid or not
         color[vertex] = col;
         if (mColoring (colors, color, vertex+1) == true) //go for additional vertices
            return true;
         color[vertex] = 0;
      }
   }
   return false; //when no colors can be assigned
}
int main(){
   int colors = 3; // Number of colors
   int color[V]; //make color matrix for each vertex
   for (int i = 0; i < V; i++)
      color[i] = 0; //initially set to 0
   if (mColoring(colors, color, 0) == false) { //for vertex 0 check graph coloring
      cout << "Solution does not exist.";
   }
   cout << "Assigned Colors are: \n";
   for (int i = 0; i < V; i++)
      cout << color[i] << " ";
   return 0;
}

Output

Assigned Colors are: 
1 2 3 1

public class mcolouring {
   static int V = 4;
   static int graph[][] = {
      {0, 1, 1, 0},
      {1, 0, 1, 1},
      {1, 1, 0, 1},
      {0, 1, 1, 0},
   };
   static boolean isValid(int v,int color[], int c) { //check whether putting a color valid for v
      for (int i = 0; i < V; i++)
         if (graph[v][i] != 0 && c == color[i])
            return false;
      return true;
   }
   static boolean mColoring(int colors, int color[], int vertex) {
      if (vertex == V) //when all vertices are considered
         return true;
      for (int col = 1; col <= colors; col++) {
         if (isValid(vertex,color, col)) { //check whether color col is valid or not
            color[vertex] = col;
            if (mColoring (colors, color, vertex+1) == true) //go for additional vertices
               return true;
            color[vertex] = 0;
         }
      }
      return false; //when no colors can be assigned
   }
   public static void main(String args[]) {
      int colors = 3; // Number of colors
      int color[] = new int[V]; //make color matrix for each vertex
      for (int i = 0; i < V; i++)
         color[i] = 0; //initially set to 0
      if (mColoring(colors, color, 0) == false) { //for vertex 0 check graph coloring
         System.out.println("Solution does not exist.");
      }
      System.out.println("Assigned Colors are: \n");
      for (int i = 0; i < V; i++)
         System.out.print(color[i] + " ");
   }
}

Output

Assigned Colors are:
1 2 3 1

V = 4
graph = [[0, 1, 1, 0], [1, 0, 1, 1], [1, 1, 0, 1], [0, 1, 1, 0]]
def isValid(v, color, c):  # check whether putting a color valid for v
    for i in range(V):
        if graph[v][i] and c == color[i]:
            return False
    return True
def mColoring(colors, color, vertex):
    if vertex == V:  # when all vertices are considered
        return True
    for col in range(1, colors + 1):
        if isValid(vertex, color,
                   col):  # check whether color col is valid or not
            color[vertex] = col
            if mColoring(colors, color, vertex + 1):
                return True  # go for additional vertices
            color[vertex] = 0
    return False  # when no colors can be assigned
colors = 3  # Number of colors
color = [0] * V  # make color matrix for each vertex
if not mColoring(
        colors, color,
        0):  # initially set to 0 and for Vertex 0 check graph coloring
    print("Solution does not exist.")
else:
    print("Assigned Colors are:")
    for i in range(V):
        print(color[i], end=" ")

Output

Assigned Colors are:
1 2 3 1

Fractional Knapsack Algorithm

The knapsack problem states that − given a set of items, holding weights and profit values, one must determine the subset of the items to be added in a knapsack such that, the total weight of the items must not exceed the limit of the knapsack and its total profit value is maximum.

It is one of the most popular problems that take greedy approach to be solved. It is called as the Fractional Knapsack Problem.

To explain this problem a little easier, consider a test with 12 questions, 10 marks each, out of which only 10 should be attempted to get the maximum mark of 100. The test taker now must calculate the highest profitable questions – the one that he’s confident in – to achieve the maximum mark. However, he cannot attempt all the 12 questions since there will not be any extra marks awarded for those attempted answers. This is the most basic real-world application of the knapsack problem.

Knapsack Algorithm

The weights (Wi) and profit values (Pi) of the items to be added in the knapsack are taken as an input for the fractional knapsack algorithm and the subset of the items added in the knapsack without exceeding the limit and with maximum profit is achieved as the output.

Algorithm

Consider all the items with their weights and profits mentioned respectively.
Calculate P_i/W_i of all the items and sort the items in descending order based on their P_i/W_i values.
Without exceeding the limit, add the items into the knapsack.
If the knapsack can still store some weight, but the weights of other items exceed the limit, the fractional part of the next time can be added.
Hence, giving it the name fractional knapsack problem.

Examples

For the given set of items and the knapsack capacity of 10 kg, find the subset of the items to be added in the knapsack such that the profit is maximum.

Items	1	2	3	4	5
Weights (in kg)	3	3	2	5	1
Profits	10	15	10	12	8

Solution

Step 1

Given, n = 5

W_i = {3, 3, 2, 5, 1}
P_i = {10, 15, 10, 12, 8}

Calculate P_i/W_i for all the items

Items	1	2	3	4	5
Weights (in kg)	3	3	2	5	1
Profits	10	15	10	20	8
P_i/W_i	3.3	5	5	4	8

Step 2

Arrange all the items in descending order based on P_i/W_i

Items	5	2	3	4	1
Weights (in kg)	1	3	2	5	3
Profits	8	15	10	20	10
P_i/W_i	8	5	5	4	3.3

Step 3

Without exceeding the knapsack capacity, insert the items in the knapsack with maximum profit.

Knapsack = {5, 2, 3}

However, the knapsack can still hold 4 kg weight, but the next item having 5 kg weight will exceed the capacity. Therefore, only 4 kg weight of the 5 kg will be added in the knapsack.

Items	5	2	3	4	1
Weights (in kg)	1	3	2	5	3
Profits	8	15	10	20	10
Knapsack	1	1	1	4/5	0

Hence, the knapsack holds the weights = [(1 * 1) + (1 * 3) + (1 * 2) + (4/5 * 5)] = 10, with maximum profit of [(1 * 8) + (1 * 15) + (1 * 10) + (4/5 * 20)] = 37.

Example

Following is the final implementation of Fractional Knapsack Algorithm using Greedy Approach −

C C++ Java Python

#include <stdio.h>
int n = 5;
int p[10] = {3, 3, 2, 5, 1};
int w[10] = {10, 15, 10, 12, 8};
int W = 10;
int main(){
   int cur_w;
   float tot_v;
   int i, maxi;
   int used[10];
   for (i = 0; i < n; ++i)
      used[i] = 0;
   cur_w = W;
   while (cur_w > 0) {
      maxi = -1;
      for (i = 0; i < n; ++i)
         if ((used[i] == 0) &&
               ((maxi == -1) || ((float)w[i]/p[i] > (float)w[maxi]/p[maxi])))
            maxi = i;
      used[maxi] = 1;
      cur_w -= p[maxi];
      tot_v += w[maxi];
      if (cur_w >= 0)
         printf("Added object %d (%d, %d) completely in the bag. Space left: %d.\n", maxi + 1, w[maxi], p[maxi], cur_w);
      else {
         printf("Added %d%% (%d, %d) of object %d in the bag.\n", (int)((1 + (float)cur_w/p[maxi]) * 100), w[maxi], p[maxi], maxi + 1);
         tot_v -= w[maxi];
         tot_v += (1 + (float)cur_w/p[maxi]) * w[maxi];
      }
   }
   printf("Filled the bag with objects worth %.2f.\n", tot_v);
   return 0;
}

Output

Added object 5 (8, 1) completely in the bag. Space left: 9.
Added object 2 (15, 3) completely in the bag. Space left: 6.
Added object 3 (10, 2) completely in the bag. Space left: 4.
Added object 1 (10, 3) completely in the bag. Space left: 1.
Added 19% (12, 5) of object 4 in the bag.
Filled the bag with objects worth 45.40.

#include <iostream>
int n = 5;
int p[10] = {3, 3, 2, 5, 1};
int w[10] = {10, 15, 10, 12, 8};
int W = 10;
int main(){
   int cur_w;
   float tot_v;
   int i, maxi;
   int used[10];
   for (i = 0; i < n; ++i)
      used[i] = 0;
   cur_w = W;
   while (cur_w > 0) {
      maxi = -1;
      for (i = 0; i < n; ++i)
         if ((used[i] == 0) &&
               ((maxi == -1) || ((float)w[i]/p[i] > (float)w[maxi]/p[maxi])))
            maxi = i;
      used[maxi] = 1;
      cur_w -= p[maxi];
      tot_v += w[maxi];
      if (cur_w >= 0)
         printf("Added object %d (%d, %d) completely in the bag. Space left: %d.\n", maxi + 1, w[maxi], p[maxi], cur_w);
      else {
         printf("Added %d%% (%d, %d) of object %d in the bag.\n", (int)((1 + (float)cur_w/p[maxi]) * 100), w[maxi], p[maxi], maxi + 1);
         tot_v -= w[maxi];
         tot_v += (1 + (float)cur_w/p[maxi]) * w[maxi];
      }
   }
   printf("Filled the bag with objects worth %.2f.\n", tot_v);
   return 0;
}

Output

Added object 5 (8, 1) completely in the bag. Space left: 9.
Added object 2 (15, 3) completely in the bag. Space left: 6.
Added object 3 (10, 2) completely in the bag. Space left: 4.
Added object 1 (10, 3) completely in the bag. Space left: 1.
Added 19% (12, 5) of object 4 in the bag.
Filled the bag with objects worth 45.40.

public class Main {
   static int n = 5;
   static int p[] = {3, 3, 2, 5, 1};
   static int w[] = {10, 15, 10, 12, 8};
   static int W = 10;
   public static void main(String args[]) {
      int cur_w;
      float tot_v = 0;
      int i, maxi;
      int used[] = new int[10];
      for (i = 0; i < n; ++i)
         used[i] = 0;
      cur_w = W;
      while (cur_w > 0) {
         maxi = -1;
         for (i = 0; i < n; ++i)
            if ((used[i] == 0) &&
                  ((maxi == -1) || ((float)w[i]/p[i] > (float)w[maxi]/p[maxi])))
               maxi = i;
         used[maxi] = 1;
         cur_w -= p[maxi];
         tot_v += w[maxi];
         if (cur_w >= 0)
            System.out.println("Added object " + maxi + 1 + " (" + w[maxi] + "," + p[maxi] + ") completely in the bag. Space left: " + cur_w);
         else {
            System.out.println("Added " + ((int)((1 + (float)cur_w/p[maxi]) * 100)) + "% (" + w[maxi] + "," + p[maxi] + ") of object " + (maxi + 1) + " in the bag.");
            tot_v -= w[maxi];
            tot_v += (1 + (float)cur_w/p[maxi]) * w[maxi];
         }
      }
      System.out.println("Filled the bag with objects worth " + tot_v);
   }
}

Output

Added object 41 (8,1) completely in the bag. Space left: 9
Added object 11 (15,3) completely in the bag. Space left: 6
Added object 21 (10,2) completely in the bag. Space left: 4
Added object 01 (10,3) completely in the bag. Space left: 1
Added 19% (12,5) of object 4 in the bag.
Filled the bag with objects worth 45.4

n = 5
p = [3, 3, 2, 5, 1]
w = [10, 15, 10, 12, 8]
W = 10
cur_w = W
tot_v = 0
used = [0] * 10
for i in range(n):
    used[i] = 0
while cur_w > 0:
    maxi = -1
    for i in range(n):
        if (used[i] == 0) and ((maxi == -1) or ((w[i] / p[i]) > (w[maxi] / p[maxi]))):
            maxi = i
    used[maxi] = 1
    cur_w -= p[maxi]
    tot_v += w[maxi]
    if cur_w >= 0:
        print(f"Added object {maxi + 1} ({w[maxi]}, {p[maxi]}) completely in the bag. Space left: {cur_w}.")
    else:
        percent_added = int((1 + (cur_w / p[maxi])) * 100)
        print(f"Added {percent_added}% ({w[maxi]}, {p[maxi]}) of object {maxi + 1} in the bag.")
        tot_v -= w[maxi]
        tot_v += (1 + (cur_w / p[maxi])) * w[maxi]
print(f"Filled the bag with objects worth {tot_v:.2f}.")

Output

Added object 5 (8, 1) completely in the bag. Space left: 9.
Added object 2 (15, 3) completely in the bag. Space left: 6.
Added object 3 (10, 2) completely in the bag. Space left: 4.
Added object 1 (10, 3) completely in the bag. Space left: 1.
Added 19% (12, 5) of object 4 in the bag.
Filled the bag with objects worth 45.40.

Applications

Few of the many real-world applications of the knapsack problem are −

Cutting raw materials without losing too much material
Picking through the investments and portfolios
Selecting assets of asset-backed securitization
Generating keys for the Merkle-Hellman algorithm
Cognitive Radio Networks
Power Allocation
Network selection for mobile nodes

Cooperative wireless communication

Job Sequencing with Deadline

Job scheduling algorithm is applied to schedule the jobs on a single processor to maximize the profits.

The greedy approach of the job scheduling algorithm states that, “Given ‘n’ number of jobs with a starting time and ending time, they need to be scheduled in such a way that maximum profit is received within the maximum deadline”.

Job Scheduling Algorithm

Set of jobs with deadlines and profits are taken as an input with the job scheduling algorithm and scheduled subset of jobs with maximum profit are obtained as the final output.

Algorithm

Find the maximum deadline value from the input set of jobs.
Once, the deadline is decided, arrange the jobs in descending order of their profits.
Selects the jobs with highest profits, their time periods not exceeding the maximum deadline.
The selected set of jobs are the output.

Examples

Consider the following tasks with their deadlines and profits. Schedule the tasks in such a way that they produce maximum profit after being executed −

S. No.	1	2	3	4	5
Jobs	J1	J2	J3	J4	J5
Deadlines	2	2	1	3	4
Profits	20	60	40	100	80

Step 1

Find the maximum deadline value, dm, from the deadlines given.

d_m = 4.

Step 2

Arrange the jobs in descending order of their profits.

S. No.	1	2	3	4	5
Jobs	J4	J5	J2	J3	J1
Deadlines	3	4	2	1	2
Profits	100	80	60	40	20

The maximum deadline, d_m, is 4. Therefore, all the tasks must end before 4.

Choose the job with highest profit, J4. It takes up 3 parts of the maximum deadline.

Therefore, the next job must have the time period 1.

Total Profit = 100.

Step 3

The next job with highest profit is J5. But the time taken by J5 is 4, which exceeds the deadline by 3. Therefore, it cannot be added to the output set.

Step 4

The next job with highest profit is J2. The time taken by J5 is 2, which also exceeds the deadline by 1. Therefore, it cannot be added to the output set.

Step 5

The next job with higher profit is J3. The time taken by J3 is 1, which does not exceed the given deadline. Therefore, J3 is added to the output set.

Total Profit: 100 + 40 = 140

Step 6

Since, the maximum deadline is met, the algorithm comes to an end. The output set of jobs scheduled within the deadline are {J4, J3} with the maximum profit of 140.

Example

Following is the final implementation of Job sequencing Algorithm using Greedy Approach −

C C++ Java Python

#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

// A structure to represent a Jobs
typedef struct Jobs {
   char id; // Jobs Id
   int dead; // Deadline of Jobs
   int profit; // Profit if Jobs is over before or on deadline
} Jobs;

// This function is used for sorting all Jobss according to
// profit
int compare(const void* a, const void* b){
   Jobs* temp1 = (Jobs*)a;
   Jobs* temp2 = (Jobs*)b;
   return (temp2->profit - temp1->profit);
}

// Find minimum between two numbers.
int min(int num1, int num2){
   return (num1 > num2) ? num2 : num1;
}
int main(){
   Jobs arr[] = { 
      { 'a', 2, 100 },
      { 'b', 2, 20 },
      { 'c', 1, 40 },
      { 'd', 3, 35 },
      { 'e', 1, 25 }
   };
   int n = sizeof(arr) / sizeof(arr[0]);
   printf("Following is maximum profit sequence of Jobs: \n");
   qsort(arr, n, sizeof(Jobs), compare);
   int result[n]; // To store result sequence of Jobs
   bool slot[n]; // To keep track of free time slots

   // Initialize all slots to be free
   for (int i = 0; i < n; i++)
      slot[i] = false;

   // Iterate through all given Jobs
   for (int i = 0; i < n; i++) {

      // Find a free slot for this Job
      for (int j = min(n, arr[i].dead) - 1; j >= 0; j--) {

         // Free slot found
         if (slot[j] == false) {
            result[j] = i;
            slot[j] = true;
            break;
         }
      }
   }

   // Print the result
   for (int i = 0; i < n; i++)
      if (slot[i])
         printf("%c ", arr[result[i]].id);
   return 0;
}

Output

Following is maximum profit sequence of Jobs: 
c a d

#include<iostream>
#include<algorithm>
using namespace std;
struct Job {
   char id;
   int deadLine;
   int profit;
};
bool comp(Job j1, Job j2){
   return (j1.profit > j2.profit); //compare jobs based on profit
}
int min(int a, int b){
   return (a<b)?a:b;
}
int main(){
   Job jobs[] = { { 'a', 2, 100 },
      { 'b', 2, 20 },
      { 'c', 1, 40 },
      { 'd', 3, 35 },
      { 'e', 1, 25 }
	  };
   int n = 5;
   cout << "Following is maximum profit sequence of Jobs: "<<"\n";
   sort(jobs, jobs+n, comp); //sort jobs on profit
   int jobSeq[n]; // To store result (Sequence of jobs)
   bool slot[n]; // To keep track of free time slots
   for (int i=0; i<n; i++)
     slot[i] = false; //initially all slots are free
   for (int i=0; i<n; i++) { //for all given jobs
     for (int j=min(n, jobs[i].deadLine)-1; j>=0; j--) { //search from last free slot
       if (slot[j]==false) {
         jobSeq[j] = i; // Add this job to job sequence
         slot[j] = true; // mark this slot as occupied
         break;
       }
     }
   }
   for (int i=0; i<n; i++)
     if (slot[i])
       cout << jobs[jobSeq[i]].id << " "; //display the sequence
}

Output

Following is maximum profit sequence of Jobs: 
c a d

import java.util.*;
public class Job {
   // Each job has a unique-id,profit and deadline
   char id;
   int deadline, profit;
   // Constructors
   public Job() {}
   public Job(char id, int deadline, int profit) {
      this.id = id;
      this.deadline = deadline;
      this.profit = profit;
   } 
   // Function to schedule the jobs take 2 arguments
   // arraylist and no of jobs to schedule
   void printJobScheduling(ArrayList<Job> arr, int t) {
      // Length of array
      int n = arr.size(); 
      // Sort all jobs according to decreasing order of
      // profit
      Collections.sort(arr,(a, b) -> b.profit - a.profit);   
      // To keep track of free time slots
      boolean result[] = new boolean[t];
      // To store result (Sequence of jobs)
      char job[] = new char[t]; 
      // Iterate through all given jobs
      for (int i = 0; i < n; i++) {     
         // Find a free slot for this job (Note that we
         // start from the last possible slot)
         for (int j = Math.min(t - 1, arr.get(i).deadline - 1); j >= 0; j--) {     
            // Free slot found
            if (result[j] == false) {
               result[j] = true;
               job[j] = arr.get(i).id;
               break;
            }
         }
      }
      // Print the sequence
      for (char jb : job)
      System.out.print(jb + " ");
      System.out.println();
   }
   // Driver code
   public static void main(String args[]) {
      ArrayList<Job> arr = new ArrayList<Job>();
      arr.add(new Job('a', 2, 100));
      arr.add(new Job('b', 2, 20));
      arr.add(new Job('c', 1, 40));
      arr.add(new Job('d', 3, 35));
      arr.add(new Job('e', 1, 25));     
      // Function call
      System.out.println("Following is maximum profit sequence of Jobs: ");
      Job job = new Job();     
      // Calling function
      job.printJobScheduling(arr, 3);
   }
}

Output

Following is maximum profit sequence of Jobs: 
c a d

arr = [
    ['a', 2, 100], 
    ['b', 2, 20], 
    ['c', 1, 40], 
    ['d', 3, 35], 
    ['e', 1, 25]
    ]
print("Following is maximum profit sequence of Jobs: ")
# length of array
n = len(arr)
t = 3
# Sort all jobs according to
# decreasing order of profit
for i in range(n):
   for j in range(n - 1 - i):
     if arr[j][2] < arr[j + 1][2]:
       arr[j], arr[j + 1] = arr[j + 1], arr[j]

# To keep track of free time slots
result = [False] * t

# To store result (Sequence of jobs)
job = ['-1'] * t

# Iterate through all given jobs
for i in range(len(arr)):

   # Find a free slot for this job
   # (Note that we start from the
   # last possible slot)
   for j in range(min(t - 1, arr[i][1] - 1), -1, -1):

     # Free slot found
     if result[j] is False:
       result[j] = True
       job[j] = arr[i][0]
       break

# print the sequence
print(job)

Output

Following is maximum profit sequence of Jobs: 
['c', 'a', 'd']

Optimal Merge Pattern Algorithm

Merge a set of sorted files of different length into a single sorted file. We need to find an optimal solution, where the resultant file will be generated in minimum time.

If the number of sorted files are given, there are many ways to merge them into a single sorted file. This merge can be performed pair wise. Hence, this type of merging is called as 2-way merge patterns.

As, different pairings require different amounts of time, in this strategy we want to determine an optimal way of merging many files together. At each step, two shortest sequences are merged.

To merge a p-record file and a q-record file requires possibly p + q record moves, the obvious choice being, merge the two smallest files together at each step.

Two-way merge patterns can be represented by binary merge trees. Let us consider a set of n sorted files {f₁, f₂, f₃, …, f_n}. Initially, each element of this is considered as a single node binary tree. To find this optimal solution, the following algorithm is used.

Algorithm: TREE (n)  
for i := 1 to n – 1 do  
   declare new node  
   node.leftchild := least (list) 
   node.rightchild := least (list) 
   node.weight) := ((node.leftchild).weight) + ((node.rightchild).weight)  
   insert (list, node);  
return least (list);

At the end of this algorithm, the weight of the root node represents the optimal cost.

Example

Let us consider the given files, f₁, f₂, f₃, f₄ and f₅ with 20, 30, 10, 5 and 30 number of elements respectively.

If merge operations are performed according to the provided sequence, then

M₁ = merge f₁ and f₂ => 20 + 30 = 50

M₂ = merge M₁ and f₃ => 50 + 10 = 60

M₃ = merge M₂ and f₄ => 60 + 5 = 65

M₄ = merge M₃ and f₅ => 65 + 30 = 95

Hence, the total number of operations is

50 + 60 + 65 + 95 = 270

Now, the question arises is there any better solution?

Sorting the numbers according to their size in an ascending order, we get the following sequence −

f₄, f₃, f₁, f₂, f₅

Hence, merge operations can be performed on this sequence

M₁ = merge f₄ and f₃ => 5 + 10 = 15

M₂ = merge M₁ and f₁ => 15 + 20 = 35

M₃ = merge M₂ and f₂ => 35 + 30 = 65

M₄ = merge M₃ and f₅ => 65 + 30 = 95

Therefore, the total number of operations is

15 + 35 + 65 + 95 = 210

Obviously, this is better than the previous one.

In this context, we are now going to solve the problem using this algorithm.

Initial Set

Step 1

Step 2

Step 3

Step 4

Hence, the solution takes 15 + 35 + 60 + 95 = 205 number of comparisons.

Example

C C++ Java Python

#include <stdio.h>
#include <stdlib.h>
int optimalMerge(int files[], int n)
{
    // Sort the files in ascending order
    for (int i = 0; i < n - 1; i++) {
        for (int j = 0; j < n - i - 1; j++) {
            if (files[j] > files[j + 1]) {
                int temp = files[j];
                files[j] = files[j + 1];
                files[j + 1] = temp;
            }
        }
    }
    int cost = 0;
    while (n > 1) {
        // Merge the smallest two files
        int mergedFileSize = files[0] + files[1];
        cost += mergedFileSize;
        // Replace the first file with the merged file size
        files[0] = mergedFileSize;
        // Shift the remaining files to the left
        for (int i = 1; i < n - 1; i++) {
            files[i] = files[i + 1];
        }
        n--; // Reduce the number of files
        // Sort the files again
        for (int i = 0; i < n - 1; i++) {
            for (int j = 0; j < n - i - 1; j++) {
                if (files[j] > files[j + 1]) {
                    int temp = files[j];
                    files[j] = files[j + 1];
                    files[j + 1] = temp;
                }
            }
        }
    }
    return cost;
}
int main()
{
    int files[] = {5, 10, 20, 30, 30};
    int n = sizeof(files) / sizeof(files[0]);
    int minCost = optimalMerge(files, n);
    printf("Minimum cost of merging is: %d Comparisons\n", minCost);
    return 0;
}

Output

Minimum cost of merging is: 205 Comparisons

#include <iostream>
#include <algorithm>
int optimalMerge(int files[], int n) {
    // Sort the files in ascending order
    for (int i = 0; i < n - 1; i++) {
        for (int j = 0; j < n - i - 1; j++) {
            if (files[j] > files[j + 1]) {
                std::swap(files[j], files[j + 1]);
            }
        }
    }
    int cost = 0;
    while (n > 1) {
        // Merge the smallest two files
        int mergedFileSize = files[0] + files[1];
        cost += mergedFileSize;
        // Replace the first file with the merged file size
        files[0] = mergedFileSize;
        // Shift the remaining files to the left
        for (int i = 1; i < n - 1; i++) {
            files[i] = files[i + 1];
        }
        n--; // Reduce the number of files
        // Sort the files again
        for (int i = 0; i < n - 1; i++) {
            for (int j = 0; j < n - i - 1; j++) {
                if (files[j] > files[j + 1]) {
                    std::swap(files[j], files[j + 1]);
                }
            }
        }
    }
    return cost;
}
int main() {
    int files[] = {5, 10, 20, 30, 30};
    int n = sizeof(files) / sizeof(files[0]);
    int minCost = optimalMerge(files, n);
    std::cout << "Minimum cost of merging is: " << minCost << " Comparisons\n";
    return 0;
}

Output

Minimum cost of merging is: 205 Comparisons

import java.util.Arrays;
public class Main {
    public static int optimalMerge(int[] files, int n) {
        // Sort the files in ascending order
        for (int i = 0; i < n - 1; i++) {
            for (int j = 0; j < n - i - 1; j++) {
                if (files[j] > files[j + 1]) {
                    // Swap files[j] and files[j + 1]
                    int temp = files[j];
                    files[j] = files[j + 1];
                    files[j + 1] = temp;
                }
            }
        }
        int cost = 0;
        while (n > 1) {
            // Merge the smallest two files
            int mergedFileSize = files[0] + files[1];
            cost += mergedFileSize;
            // Replace the first file with the merged file size
            files[0] = mergedFileSize;
            // Shift the remaining files to the left
            for (int i = 1; i < n - 1; i++) {
                files[i] = files[i + 1];
            }
            n--; // Reduce the number of files
            // Sort the files again
            for (int i = 0; i < n - 1; i++) {
                for (int j = 0; j < n - i - 1; j++) {
                    if (files[j] > files[j + 1]) {
                        // Swap files[j] and files[j + 1]
                        int temp = files[j];
                        files[j] = files[j + 1];
                        files[j + 1] = temp;
                    }
                }
            }
        }
        return cost;
    }
    public static void main(String[] args) {
        int[] files = {5, 10, 20, 30, 30};
        int n = files.length;
        int minCost = optimalMerge(files, n);
        System.out.println("Minimum cost of merging is: " + minCost + " Comparisons");
    }
}

Output

Minimum cost of merging is: 205 Comparison

def optimal_merge(files):
    # Sort the files in ascending order
    files.sort()
    cost = 0
    while len(files) > 1:
        # Merge the smallest two files
        merged_file_size = files[0] + files[1]
        cost += merged_file_size
        # Replace the first file with the merged file size
        files[0] = merged_file_size
        # Remove the second file
        files.pop(1)
        # Sort the files again
        files.sort()
    return cost
files = [5, 10, 20, 30, 30]
min_cost = optimal_merge(files)
print("Minimum cost of merging is:", min_cost, "Comparisons")

Output

Minimum cost of merging is: 205 Comparisons

Dynamic Programming

Dynamic programming approach is similar to divide and conquer in breaking down the problem into smaller and yet smaller possible sub-problems. But unlike divide and conquer, these sub-problems are not solved independently. Rather, results of these smaller sub-problems are remembered and used for similar or overlapping sub-problems.

Mostly, dynamic programming algorithms are used for solving optimization problems. Before solving the in-hand sub-problem, dynamic algorithm will try to examine the results of the previously solved sub-problems. The solutions of sub-problems are combined in order to achieve the best optimal final solution. This paradigm is thus said to be using Bottom-up approach.

So we can conclude that −

The problem should be able to be divided into smaller overlapping sub-problem.
Final optimum solution can be achieved by using an optimum solution of smaller sub-problems.
Dynamic algorithms use memorization.

However, in a problem, two main properties can suggest that the given problem can be solved using Dynamic Programming. They are −

Overlapping Sub-Problems

Similar to Divide-and-Conquer approach, Dynamic Programming also combines solutions to sub-problems. It is mainly used where the solution of one sub-problem is needed repeatedly. The computed solutions are stored in a table, so that these don’t have to be re-computed. Hence, this technique is needed where overlapping sub-problem exists.

For example, Binary Search does not have overlapping sub-problem. Whereas recursive program of Fibonacci numbers have many overlapping sub-problems.

Optimal Sub-Structure

A given problem has Optimal Substructure Property, if the optimal solution of the given problem can be obtained using optimal solutions of its sub-problems.

For example, the Shortest Path problem has the following optimal substructure property −

If a node x lies in the shortest path from a source node u to destination node v, then the shortest path from u to v is the combination of the shortest path from u to x, and the shortest path from x to v.

The standard All Pair Shortest Path algorithms like Floyd-Warshall and Bellman-Ford are typical examples of Dynamic Programming.

Steps of Dynamic Programming Approach

Dynamic Programming algorithm is designed using the following four steps −

Characterize the structure of an optimal solution.
Recursively define the value of an optimal solution.
Compute the value of an optimal solution, typically in a bottom-up fashion.
Construct an optimal solution from the computed information.

Dynamic Programming vs. Greedy vs. Divide and Conquer

In contrast to greedy algorithms, where local optimization is addressed, dynamic algorithms are motivated for an overall optimization of the problem.

In contrast to divide and conquer algorithms, where solutions are combined to achieve an overall solution, dynamic algorithms use the output of a smaller sub-problem and then try to optimize a bigger sub-problem. Dynamic algorithms use memorization to remember the output of already solved sub-problems.

Examples

The following computer problems can be solved using dynamic programming approach −

Fibonacci number series
Knapsack problem
Tower of Hanoi
All pair shortest path by Floyd-Warshall and Bellman Ford
Shortest path by Dijkstra
Project scheduling
Matrix Chain Multiplication

Dynamic programming can be used in both top-down and bottom-up manner. And of course, most of the times, referring to the previous solution output is cheaper than re-computing in terms of CPU cycles.

Matrix Chain Multiplication

Matrix Chain Multiplication is an algorithm that is applied to determine the lowest cost way for multiplying matrices. The actual multiplication is done using the standard way of multiplying the matrices, i.e., it follows the basic rule that the number of rows in one matrix must be equal to the number of columns in another matrix. Hence, multiple scalar multiplications must be done to achieve the product.

To brief it further, consider matrices A, B, C, and D, to be multiplied; hence, the multiplication is done using the standard matrix multiplication. There are multiple combinations of the matrices found while using the standard approach since matrix multiplication is associative. For instance, there are five ways to multiply the four matrices given above −

(A(B(CD)))
(A((BC)D))
((AB)(CD))
((A(BC))D)
(((AB)C)D)

Now, if the size of matrices A, B, C, and D are l × m, m × n, n × p, p × q respectively, then the number of scalar multiplications performed will be lmnpq. But the cost of the matrices change based on the rows and columns present in it. Suppose, the values of l, m, n, p, q are 5, 10, 15, 20, 25 respectively, the cost of (A(B(CD))) is 5 × 100 × 25 = 12,500; however, the cost of (A((BC)D)) is 10 × 25 × 37 = 9,250.

So, dynamic programming approach of the matrix chain multiplication is adopted in order to find the combination with the lowest cost.

Matrix Chain Multiplication Algorithm

Matrix chain multiplication algorithm is only applied to find the minimum cost way to multiply a sequence of matrices. Therefore, the input taken by the algorithm is the sequence of matrices while the output achieved is the lowest cost parenthesization.

Algorithm

Count the number of parenthesizations. Find the number of ways in which the input matrices can be multiplied using the formulae −

$$P(n)=\left\{\begin{matrix} 1 & if\: n=1\\ \sum_{k=1}^{n-1} P(k)P(n-k)& if\: n\geq 2\\ \end{matrix}\right.$$

(or)

$$P(n)=\left\{\begin{matrix} \frac{2(n-1)C_{n-1}}{n} & if\: n\geq 2 \\ 1 & if\: n= 1\\ \end{matrix}\right.$$

Once the parenthesization is done, the optimal substructure must be devised as the first step of dynamic programming approach so the final product achieved is optimal. In matrix chain multiplication, the optimal substructure is found by dividing the sequence of matrices A[i….j] into two parts A[i,k] and A[k+1,j]. It must be ensured that the parts are divided in such a way that optimal solution is achieved.
Using the formula, $C[i,j]=\left\{\begin{matrix} 0 & if \: i=j\\ \displaystyle \min_{ i\leq k< j}\begin{cases} C [i,k]+C[k+1,j]+d_{i-1}d_{k}d_{j} \end{cases} &if \: i< j \\ \end{matrix}\right.$ find the lowest cost parenthesization of the sequence of matrices by constructing cost tables and corresponding k values table.
Once the lowest cost is found, print the corresponding parenthesization as the output.

Pseudocode

Pseudocode to find the lowest cost of all the possible parenthesizations −

MATRIX-CHAIN-MULTIPLICATION(p)
   n = p.length ─ 1
   let m[1…n, 1…n] and s[1…n ─ 1, 2…n] be new matrices
   for i = 1 to n
      m[i, i] = 0
   for l = 2 to n // l is the chain length
      for i = 1 to n - l + 1
         j = i + l - 1
         m[i, j] = ∞
         for k = i to j - 1
            q = m[i, k] + m[k + 1, j] + pi-1pkpj
            if q < m[i, j]
               m[i, j] = q
               s[i, j] = k
return m and s

Pseudocode to print the optimal output parenthesization −

PRINT-OPTIMAL-OUTPUT(s, i, j )
if i == j
print “A”i
else print “(”
PRINT-OPTIMAL-OUTPUT(s, i, s[i, j])
PRINT-OPTIMAL-OUTPUT(s, s[i, j] + 1, j)
print “)”

Example

The application of dynamic programming formula is slightly different from the theory; to understand it better let us look at few examples below.

A sequence of matrices A, B, C, D with dimensions 5 × 10, 10 × 15, 15 × 20, 20 × 25 are set to be multiplied. Find the lowest cost parenthesization to multiply the given matrices using matrix chain multiplication.

Solution

Given matrices and their corresponding dimensions are −

A_5×10×B_10×15×C_15×20×D_20×25

Find the count of parenthesization of the 4 matrices, i.e. n = 4.

Using the formula, $P\left ( n \right )=\left\{\begin{matrix} 1 & if\: n=1\\ \sum_{k=1}^{n-1}P(k)P(n-k) & if\: n\geq 2 \\ \end{matrix}\right.$

Since n = 4 ≥ 2, apply the second case of the formula −

$$P\left ( n \right )=\sum_{k=1}^{n-1}P(k)P(n-k)$$

$$P\left ( 4 \right )=\sum_{k=1}^{3}P(k)P(4-k)$$

$$P\left ( 4 \right )=P(1)P(3)+P(2)P(2)+P(3)P(1)$$

If P(1) = 1 and P(2) is also equal to 1, P(4) will be calculated based on the P(3) value. Therefore, P(3) needs to determined first.

$$P\left ( 3 \right )=P(1)P(2)+P(2)P(1)$$

$$=1+1=2$$

Therefore,

$$P\left ( 4 \right )=P(1)P(3)+P(2)P(2)+P(3)P(1)$$

$$=2+1+2=5$$

Among these 5 combinations of parenthesis, the matrix chain multiplicatiion algorithm must find the lowest cost parenthesis.

Step 1

The table above is known as a cost table, where all the cost values calculated from the different combinations of parenthesis are stored.

Another table is also created to store the k values obtained at the minimum cost of each combination.

Step 2

Applying the dynamic programming approach formula find the costs of various parenthesizations,

$$C[i,j]=\left\{\begin{matrix} 0 & if \: i=j\\ \displaystyle \min_{ i\leq k< j}\begin{cases} C [i,k]+C\left [ k+1,j \right ]+d_{i-1}d_{k}d_{j} \end{cases} &if \: i< j \\ \end{matrix}\right.$$

$C\left [ 1,1 \right ]=0$

$C\left [ 2,2 \right ]=0$

$C\left [ 3,3 \right ]=0$

$C\left [ 4,4 \right ]=0$

Step 3

Applying the dynamic approach formula only in the upper triangular values of the cost table, since i < j always.

$C[1,2]=\displaystyle \min_{ 1\leq k< 2}\begin{Bmatrix} C[1,1]+C[2,2]+d_{0}d_{1}d_{2} \end{Bmatrix}$

$C[1,2]=0+0+\left ( 5\times 10\times 15 \right )$
$C[1,2]=750$

$C[2,3]=\displaystyle \min_{ 2\leq k< 3}\begin{Bmatrix} C[2,2]+C[3,3]+d_{1}d_{2}d_{3} \end{Bmatrix}$

$C[2,3]=0+0+\left ( 10\times 15\times 20 \right )$
$C[2,3]=3000$

$C[3,4]=\displaystyle \min_{ 3\leq k< 4}\begin{Bmatrix} C[3,3]+C[4,4]+d_{2}d_{3}d_{4} \end{Bmatrix}$

$C[3,4]=0+0+\left ( 15\times 20\times 25 \right )$
$C[3,4]=7500$

Step 4

Find the values of [1, 3] and [2, 4] in this step. The cost table is always filled diagonally step-wise.

$C[2,4]=\displaystyle \min_{ 2\leq k< 4}\begin{Bmatrix} C[2,2]+C[3,4]+d_{1}d_{2}d_{4},C[2,3] +C[4,4]+d_{1}d_{3}d_{4}\end{Bmatrix}$

$C[2,4]=\displaystyle min\left\{ ( 0 + 7500 + (10 \times 15 \times 20)), (3000 + 5000)\right\}$
$C[2,4]=8000$

$C[1,3]=\displaystyle \min_{ 1\leq k< 3}\begin{Bmatrix} C[1,1]+C[2,3]+d_{0}d_{1}d_{3},C[1,2] +C[3,3]+d_{0}d_{2}d_{3}\end{Bmatrix}$

$C[1,3]=min\left\{ ( 0 + 3000 + 1000), (1500+0+750)\right\}$
$C[1,3]=2250$

Step 5

Now compute the final element of the cost table to compare the lowest cost parenthesization.

$C[1,4]=\displaystyle \min_{ 1\leq k< 4}\begin{Bmatrix} C[1,1]+C[2,4]+d_{0}d_{1}d_{4},C[1,2] +C[3,4]+d_{1}d_{2}d_{4},C[1,3]+C[4,4] +d_{1}d_{3}d_{4}\end{Bmatrix}$

$C[1,4]=min\left\{0+8000+1250,750+7500+1875,2200+0+2500\right\}$
$C[1,4]=4700$

Now that all the values in cost table are computed, the final step is to parethesize the sequence of matrices. For that, k table needs to be constructed with the minimum value of ‘k’ corresponding to every parenthesis.

Parenthesization

Based on the lowest cost values from the cost table and their corresponding k values, let us add parenthesis on the sequence of matrices.

The lowest cost value at [1, 4] is achieved when k = 3, therefore, the first parenthesization must be done at 3.

                  (ABC)(D)

The lowest cost value at [1, 3] is achieved when k = 2, therefore the next parenthesization is done at 2.

                  ((AB)C)(D)

The lowest cost value at [1, 2] is achieved when k = 1, therefore the next parenthesization is done at 1. But the parenthesization needs at least two matrices to be multiplied so we do not divide further.

                  ((AB)(C))(D)

Since, the sequence cannot be parenthesized further, the final solution of matrix chain multiplication is ((AB)C)(D).

Example

Following is the final implementation of Matrix Chain Multiplication Algorithm to calculate the minimum number of ways several matrices can be multiplied using dynamic programming −

C C++ Java Python

#include <stdio.h>
#include <string.h>
#define INT_MAX 999999
int mc[50][50];
int min(int a, int b){
   if(a < b)
      return a;
   else
      return b;
}
int DynamicProgramming(int c[], int i, int j){
   if (i == j) {
      return 0;
   }
   if (mc[i][j] != -1) {
      return
         mc[i][j];
   }
   mc[i][j] = INT_MAX;
   for (int k = i; k < j; k++) {
      mc[i][j] = min(mc[i][j], DynamicProgramming(c, i, k) + DynamicProgramming(c, k + 1, j) + c[i - 1] * c[k] * c[j]);
   }
   return mc[i][j];
}
int Matrix(int c[], int n){
   int i = 1, j = n - 1;
   return DynamicProgramming(c, i, j);
}
int main(){
   int arr[] = { 23, 26, 27, 20 };
   int n = sizeof(arr) / sizeof(arr[0]);
   memset(mc, -1, sizeof mc);
   printf("Minimum number of multiplications is: %d", Matrix(arr, n));
}

Output

Minimum number of multiplications is: 26000

#include <bits/stdc++.h>
using namespace std;
int mc[50][50];
int DynamicProgramming(int* c, int i, int j){
   if (i == j) {
      return 0;
   }
   if (mc[i][j] != -1) {
      return
         mc[i][j];
   }
   mc[i][j] = INT_MAX;
   for (int k = i; k < j; k++) {
      mc[i][j] = min(mc[i][j], DynamicProgramming(c, i, k) + DynamicProgramming(c, k + 1, j) + c[i - 1] * c[k] * c[j]);
   }
   return mc[i][j];
}
int Matrix(int* c, int n){
   int i = 1, j = n - 1;
   return DynamicProgramming(c, i, j);
}
int main(){
   int arr[] = { 23, 26, 27, 20 };
   int n = sizeof(arr) / sizeof(arr[0]);
   memset(mc, -1, sizeof mc);
   cout << "Minimum number of multiplications is: " << Matrix(arr, n);
}

Output

Minimum number of multiplications is: 26000

import java.io.*;
import java.util.*;
public class Main {
   static int[][] mc = new int[50][50];
   public static int DynamicProgramming(int c[], int i, int j) {
      if (i == j) {
         return 0;
      }
      if (mc[i][j] != -1) {
         return mc[i][j];
      }
      mc[i][j] = Integer.MAX_VALUE;
      for (int k = i; k < j; k++) {
         mc[i][j] = Math.min(mc[i][j], DynamicProgramming(c, i, k) + DynamicProgramming(c, k + 1, j) + c[i - 1] * c[k] * c[j]);
      }
      return mc[i][j];
   }
   public static int Matrix(int c[], int n) {
      int i = 1, j = n - 1;
      return DynamicProgramming(c, i, j);
   }
   public static void main(String args[]) {
      int arr[] = { 23, 26, 27, 20 };
      int n = arr.length;
      for (int[] row : mc)
         Arrays.fill(row, -1);
      System.out.println("Minimum number of multiplications is: " + Matrix(arr, n));
   }
}

Output

Minimum number of multiplications is: 26000

mc = [[-1 for n in range(50)] for m in range(50)]
def DynamicProgramming(c, i, j):
   if (i == j):
      return 0
   if (mc[i][j] != -1):
      return mc[i][j]
   mc[i][j] = 999999
   for k in range (i, j):
      mc[i][j] = min(mc[i][j], DynamicProgramming(c, i, k) + DynamicProgramming(c, k + 1, j) + c[i - 1] * c[k] * c[j]);
   return mc[i][j]

def Matrix(c, n):
   i = 1
   j = n - 1
   return DynamicProgramming(c, i, j);

arr = [ 23, 26, 27, 20 ]
n = len(arr)
print("Minimum number of multiplications is: ")
print(Matrix(arr, n))

Output

Minimum number of multiplications is: 
26000

Floyd Warshall Algorithm

The Floyd-Warshall algorithm is a graph algorithm that is deployed to find the shortest path between all the vertices present in a weighted graph. This algorithm is different from other shortest path algorithms; to describe it simply, this algorithm uses each vertex in the graph as a pivot to check if it provides the shortest way to travel from one point to another.

Floyd-Warshall algorithm works on both directed and undirected weighted graphs unless these graphs do not contain any negative cycles in them. By negative cycles, it is meant that the sum of all the edges in the graph must not lead to a negative number.

Since, the algorithm deals with overlapping sub-problems – the path found by the vertices acting as pivot are stored for solving the next steps – it uses the dynamic programming approach.

Floyd-Warshall algorithm is one of the methods in All-pairs shortest path algorithms and it is solved using the Adjacency Matrix representation of graphs.

Floyd-Warshall Algorithm

Consider a graph, G = {V, E} where V is the set of all vertices present in the graph and E is the set of all the edges in the graph. The graph, G, is represented in the form of an adjacency matrix, A, that contains all the weights of every edge connecting two vertices.

Algorithm

Step 1 − Construct an adjacency matrix A with all the costs of edges present in the graph. If there is no path between two vertices, mark the value as ∞.

Step 2 − Derive another adjacency matrix A₁ from A keeping the first row and first column of the original adjacency matrix intact in A₁. And for the remaining values, say A₁[i,j], if A[i,j]>A[i,k]+A[k,j] then replace A₁[i,j] with A[i,k]+A[k,j]. Otherwise, do not change the values. Here, in this step, k = 1 (first vertex acting as pivot).

Step 3 − Repeat Step 2 for all the vertices in the graph by changing the k value for every pivot vertex until the final matrix is achieved.

Step 4 − The final adjacency matrix obtained is the final solution with all the shortest paths.

Pseudocode

Floyd-Warshall(w, n){ // w: weights, n: number of vertices
   for i = 1 to n do // initialize, D (0) = [wij]
      for j = 1 to n do{
         d[i, j] = w[i, j];
      }
      for k = 1 to n do // Compute D (k) from D (k-1)
         for i = 1 to n do
            for j = 1 to n do
               if (d[i, k] + d[k, j] < d[i, j]){
                  d[i, j] = d[i, k] + d[k, j];
               }
      return d[1..n, 1..n];
}

Example

Consider the following directed weighted graph G = {V, E}. Find the shortest paths between all the vertices of the graphs using the Floyd-Warshall algorithm.

Solution

Step 1

Construct an adjacency matrix A with all the distances as values.

$$A=\begin{matrix} 0 & 5& \infty & 6& \infty \\ \infty & 0& 1& \infty& 7\\ 3 & \infty& 0& 4& \infty\\ \infty & \infty& 2& 0& 3\\ 2& \infty& \infty& 5& 0\\ \end{matrix}$$

Step 2