Design and Analysis - Set Cover Problem

The set cover algorithm provides solution to many real-world resource allocating problems. For instance, consider an airline assigning crew members to each of their airplanes such that they have enough people to fulfill the requirements for the journey. They take into account the flight timings, the duration, the pit-stops, availability of the crew to assign them to the flights. This is where set cover algorithm comes into picture.

Given a universal set U, containing few elements which are all divided into subsets. Considering the collection of these subsets as S = {S1, S2, S3, S4... Sn}, the set cover algorithm finds the minimum number of subsets such that they cover all the elements present in the universal set.

As shown in the above diagram, the dots represent the elements present in the universal set U that are divided into different sets, S = {S1, S2, S3, S4, S5, S6}. The minimum number of sets that need to be selected to cover all the elements will be the optimal output = {S1, S2, S3}.

Set Cover Algorithm

The set cover takes the collection of sets as an input and and returns the minimum number of sets required to include all the universal elements.

The set cover algorithm is an NP-Hard problem and a 2-approximation greedy algorithm.

Algorithm

Step 1 − Initialize Output = {} where Output represents the output set of elements.

Step 2 − While the Output set does not include all the elements in the universal set, do the following −

• Find the cost-effectiveness of every subset present in the universal set using the formula, $\frac{Cost\left ( S_{i} \right )}{S_{i}-Output}$

• Find the subset with minimum cost effectiveness for each iteration performed. Add the subset to the Output set.

Step 3 − Repeat Step 2 until there is no elements left in the universe. The output achieved is the final Output set.

Pseudocode

APPROX-GREEDY-SET_COVER(X, S)
U = X
OUTPUT = ф
while U ≠ ф
select Si Є S which has maximum |Si∩U|
U = U – S
OUTPUT = OUTPUT∪ {Si}
return OUTPUT


Analysis

assuming the overall number of elements equals the overall number of sets (|X| = |S|), the code runs in time O(|X|3)

Example

Let us look at an example that describes the approximation algorithm for the set covering problem in more detail

S1 = {1, 2, 3, 4}                cost(S1) = 5
S2 = {2, 4, 5, 8, 10}            cost(S2) = 10
S3 = {1, 3, 5, 7, 9, 11, 13}     cost(S3) = 20
S4 = {4, 8, 12, 16, 20}          cost(S4) = 12
S5 = {5, 6, 7, 8, 9}             cost(S5) = 15


Step 1

The output set, Output = ф

Find the cost effectiveness of each set for no elements in the output set,

S1 = cost(S1) / (S1 – Output) = 5 / (4 – 0)
S2 = cost(S2) / (S2 – Output) = 10 / (5 – 0)
S3 = cost(S3) / (S3 – Output) = 20 / (7 – 0)
S4 = cost(S4) / (S4 – Output) = 12 / (5 – 0)
S5 = cost(S5) / (S5 – Output) = 15 / (5 – 0)


The minimum cost effectiveness in this iteration is achieved at S1, therefore, the subset added to the output set, Output = {S1} with elements {1, 2, 3, 4}

Step 2

Find the cost effectiveness of each set for the new elements in the output set,

S2 = cost(S2) / (S2 – Output) = 10 / (5 – 4)
S3 = cost(S3) / (S3 – Output) = 20 / (7 – 4)
S4 = cost(S4) / (S4 – Output) = 12 / (5 – 4)
S5 = cost(S5) / (S5 – Output) = 15 / (5 – 4)


The minimum cost effectiveness in this iteration is achieved at S3, therefore, the subset added to the output set, Output = {S1, S3} with elements {1, 2, 3, 4, 5, 7, 9, 11, 13}.

Step 3

Find the cost effectiveness of each set for the new elements in the output set,

S2 = cost(S2) / (S2 – Output) = 10 / |(5 – 9)|
S4 = cost(S4) / (S4 – Output) = 12 / |(5 – 9)|
S5 = cost(S5) / (S5 – Output) = 15 / |(5 – 9)|


The minimum cost effectiveness in this iteration is achieved at S2, therefore, the subset added to the output set, Output = {S1, S3, S2} with elements {1, 2, 3, 4, 5, 7, 8, 9, 10, 11, 13}

Step 4

Find the cost effectiveness of each set for the new elements in the output set,

S4 = cost(S4) / (S4 – Output) = 12 / |(5 – 11)|
S5 = cost(S5) / (S5 – Output) = 15 / |(5 – 11)|


The minimum cost effectiveness in this iteration is achieved at S4, therefore, the subset added to the output set, Output = {S1, S3, S2, S4} with elements {1, 2, 3, 4, 5, 7, 8, 9, 10, 11, 12, 13, 16, 20}

Step 5

Find the cost effectiveness of each set for the new elements in the output set,

S5 = cost(S5) / (S5 – Output) = 15 / |(5 – 14)|


The minimum cost effectiveness in this iteration is achieved at S5, therefore, the subset added to the output set, Output = {S1, S3, S2, S4, S5} with elements {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 16, 20}

The final output that covers all the elements present in the universal finite set is, Output = {S1, S3, S2, S4, S5}.