Classify strings from an array using Custom Hash Function


In this article, we look at a problem that combines strings, hashing, and classification in C++: classifying the strings in an array using a custom hash function. Working through it shows how a custom hash function is written and how its values can be used to group data.

Problem Statement

Given an array of strings, the task is to classify the strings into different categories using a custom hash function.

Custom Hash Function

A hash function maps data of arbitrary size to a value of fixed size. In our case, we create a custom hash function that maps each string to a number and use that number as the string's category. Equal strings always produce the same hash value. Different strings usually produce different values, but this is not guaranteed: two distinct strings can collide on the same value.
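To make the mapping concrete, here is a minimal sketch of a base-31 polynomial hash, the same scheme used in the solution later in this article. The names polyHash and collides are hypothetical and chosen only for illustration; the sketch assumes ASCII input.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: polynomial rolling hash with base 31.
std::size_t polyHash(const std::string& s) {
    std::size_t h = 0;
    for (unsigned char c : s) {
        h = h * 31 + c;  // unsigned arithmetic wraps around on overflow
    }
    return h;
}

// Hypothetical helper: true when two different strings share a hash value.
bool collides(const std::string& a, const std::string& b) {
    return a != b && polyHash(a) == polyHash(b);
}
```

With ASCII input, polyHash("Aa") is 65 * 31 + 97 = 2112 and polyHash("BB") is 66 * 31 + 66 = 2112, so collides("Aa", "BB") is true. A hash value identifies a class of strings, not necessarily a single string.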

C++ Solution Approach

Our approach is to write a custom hash function that returns the same value for equal strings and, in most cases, different values for different strings. We then use this value as a key to group the strings into classes.
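If a fixed number of categories is wanted instead of one class per hash value, one possible variation is to reduce the hash value modulo the number of categories. The sketch below is an assumption layered on top of the approach, not part of the solution that follows; categoryOf is a hypothetical name, and customHash is repeated here only so the snippet compiles on its own.

```cpp
#include <cstddef>
#include <string>

// Base-31 polynomial hash, repeated so this snippet is self-contained.
std::size_t customHash(const std::string& s) {
    std::size_t h = 0;
    for (unsigned char c : s) {
        h = h * 31 + c;
    }
    return h;
}

// Hypothetical helper: map a string to a category in the range
// 0 .. categories - 1. Assumes categories > 0.
std::size_t categoryOf(const std::string& s, std::size_t categories) {
    return customHash(s) % categories;
}
```

Equal strings always land in the same category. Different strings may share one, which is expected whenever there are fewer categories than distinct strings.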

Example

Here's the C++ code that implements this solution −

#include <iostream>
#include <string>
#include <unordered_map>
#include <vector>
using namespace std;

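// Polynomial rolling hash: h = h * 31 + c for each character.
// size_t is unsigned, so overflow wraps around instead of being undefined.
size_t customHash(string const& s) {
   size_t h = 0;
   for (char c : s) {
      // Cast through unsigned char so non-ASCII bytes are not sign-extended.
      h = h * 31 + static_cast<unsigned char>(c);
   }
   return h;
}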

// Groups the strings by hash value and prints each group.
void classifyStrings(const vector<string>& strings) {
   unordered_map<size_t, vector<string>> classes;
   for (const string& s : strings) {
      size_t h = customHash(s);
      classes[h].push_back(s);
   }
   
   // unordered_map iterates in an unspecified order.
   for (const auto& kv : classes) {
      cout << "Class " << kv.first << ":\n";
      for (const string& s : kv.second) {
         cout << "  " << s << '\n';
      }
   }
}

int main() {
   vector<string> strings = {"apple", "banana", "apple", "orange", "banana"};
   classifyStrings(strings);
   return 0;
}

Output

Class 2898612069:
  banana
  banana
Class 3286115886:
  orange
Class 93029210:
  apple
  apple

Explanation with a Test Case

Let's consider an array of strings: {"apple", "banana", "apple", "orange", "banana"}.

When we pass this array to the classifyStrings function, it computes a hash value for each string using the custom hash function. Equal strings always hash to the same value, so the function groups them under the same key, effectively placing them in the same class.

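The output shows three classes, one per distinct string. The classes for "apple" and "banana" each contain two entries because those strings appear twice in the array, while the class for "orange" contains a single entry. The order in which the classes are printed may differ between compilers and standard library implementations, because unordered_map does not guarantee any iteration order.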
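Grouping by the raw hash value would merge two different strings into one class if they happened to collide; under the base-31 scheme, for example, "Aa" and "BB" both hash to 2112. One way to keep the classes exact, sketched below under that assumption, is to key the map by the string itself and supply the custom hash as the map's hasher. CustomHasher, Classes, and classifyExact are hypothetical names, not part of the solution above.

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical functor wrapping the base-31 hash so that
// std::unordered_map can use it as its Hash template argument.
struct CustomHasher {
    std::size_t operator()(const std::string& s) const {
        std::size_t h = 0;
        for (unsigned char c : s) {
            h = h * 31 + c;
        }
        return h;
    }
};

using Classes =
    std::unordered_map<std::string, std::vector<std::string>, CustomHasher>;

// Groups equal strings together. The map compares keys with operator==,
// so different strings with the same hash still get separate classes.
Classes classifyExact(const std::vector<std::string>& strings) {
    Classes classes;
    for (const std::string& s : strings) {
        classes[s].push_back(s);
    }
    return classes;
}
```

For the input {"Aa", "BB", "Aa"}, this produces two classes: one holding both copies of "Aa" and one holding "BB", even though both keys have the same hash value.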

Conclusion

This problem is a compact introduction to writing and using custom hash functions in C++. The same pattern, hashing each item and grouping items by hash value, applies to many data classification and string manipulation tasks.

Updated on: 17-May-2023
