Golang Program to Implement Rabin Karp


The Rabin-Karp algorithm is a string searching algorithm that uses hashing to locate a pattern within a larger text. In this article, we implement the Rabin-Karp algorithm in Go (Golang) and look at two ways to structure the code: a single function approach and a modular approach.

Pattern Matching

Suppose we have the text "ABCABCDABCABC" and the pattern "ABC". By implementing the Rabin-Karp algorithm in Go, we can find how many times, and at which positions, the pattern occurs in the text. The examples below work through this.

Single Function Approach

This approach uses a single function to implement the Rabin-Karp algorithm in Go. The function calculates the hash value of the pattern and the hash values of sliding windows of the text. When the hash values match, a character-by-character verification confirms the match. Although straightforward, this method might not be optimal for very large texts if the hash of every window is recomputed from scratch.
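The single function approach can be sketched as follows. This is a minimal illustration rather than a definitive implementation: the name rabinKarpRolling, the base of 256, and the prime modulus 1000000007 are assumptions chosen for the example, and an empty pattern is treated as having no matches.

```go
package main

import "fmt"

// rabinKarpRolling is a hypothetical name; base and mod are assumed constants.
func rabinKarpRolling(pattern, text string) []int {
	const base = 256
	const mod = 1000000007

	var indices []int
	m, n := len(pattern), len(text)
	if m == 0 || m > n {
		return indices
	}

	// high = base^(m-1) % mod, the weight of the leading byte of a window.
	var high uint64 = 1
	for i := 0; i < m-1; i++ {
		high = (high * base) % mod
	}

	// Hash the pattern and the first window of the text.
	var patternHash, windowHash uint64
	for i := 0; i < m; i++ {
		patternHash = (patternHash*base + uint64(pattern[i])) % mod
		windowHash = (windowHash*base + uint64(text[i])) % mod
	}

	for i := 0; i <= n-m; i++ {
		// On equal hashes, compare the bytes to rule out a collision.
		if patternHash == windowHash && text[i:i+m] == pattern {
			indices = append(indices, i)
		}
		if i < n-m {
			// Drop the leading byte, shift the window, and add the next byte.
			windowHash = (windowHash + mod - (uint64(text[i])*high)%mod) % mod
			windowHash = (windowHash*base + uint64(text[i+m])) % mod
		}
	}
	return indices
}

func main() {
	fmt.Println(rabinKarpRolling("ABC", "ABCABCDABCABC")) // [0 3 7 10]
}
```

Updating the window hash from the previous one takes constant time per position, so the pattern length only affects the cost of the initial hashes and of the verifications.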

Modular Approach

The modular approach divides the algorithm into separate functions. These functions handle the initial hash calculation, the hash update as the window slides, and the character comparison when hash values match. This structure is easier to test and reuse, and isolating the hash update makes it straightforward to use a rolling update, which matters for long texts.
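One possible shape for the modular approach is sketched below. The function names polyHash, roll, sameAt, and searchModular, as well as the constants base and mod, are hypothetical and chosen only for this illustration.

```go
package main

import "fmt"

// Assumed constants for the polynomial hash.
const (
	base uint64 = 256
	mod  uint64 = 1000000007
)

// polyHash computes the polynomial hash of s from scratch.
func polyHash(s string) uint64 {
	var h uint64
	for i := 0; i < len(s); i++ {
		h = (h*base + uint64(s[i])) % mod
	}
	return h
}

// roll updates h when byte out leaves the window and byte in enters it.
// high must be base^(m-1) % mod for a window of length m.
func roll(h uint64, out, in byte, high uint64) uint64 {
	h = (h + mod - (uint64(out)*high)%mod) % mod
	return (h*base + uint64(in)) % mod
}

// sameAt compares pattern with text at offset i, byte by byte.
func sameAt(text, pattern string, i int) bool {
	for j := 0; j < len(pattern); j++ {
		if text[i+j] != pattern[j] {
			return false
		}
	}
	return true
}

// searchModular ties the helpers together.
func searchModular(pattern, text string) []int {
	var indices []int
	m, n := len(pattern), len(text)
	if m == 0 || m > n {
		return indices
	}
	high := uint64(1)
	for i := 0; i < m-1; i++ {
		high = (high * base) % mod
	}
	target := polyHash(pattern)
	window := polyHash(text[:m])
	for i := 0; ; i++ {
		if window == target && sameAt(text, pattern, i) {
			indices = append(indices, i)
		}
		if i == n-m {
			break
		}
		window = roll(window, text[i], text[i+m], high)
	}
	return indices
}

func main() {
	fmt.Println(searchModular("ABC", "ABCABCDABCABC")) // [0 3 7 10]
}
```

Each helper can be tested on its own; for example, rolling the hash of "AB" forward by one byte should give the same value as hashing "BC" directly.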

Algorithm

  • Initialise an empty slice to store the indices where the pattern is found, and calculate the lengths of the pattern and the text.

  • Compute the hash value of the pattern using a suitable hashing function.

  • Iterate through the text from index 0 to textLen - patternLen, calculating the hash value of the current substring at each position.

  • If the hash value of the substring matches the hash value of the pattern, perform a character-by-character comparison between the substring and the pattern to validate the match. If the match is confirmed, append the current index to the indices slice.

  • After all substrings have been checked, return the indices slice containing the indices where the pattern is found.
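The steps above can be followed literally in a short sketch. The names sumHash and searchBySteps are hypothetical, and the sum of byte values stands in for the "suitable hashing function"; that choice is an assumption made for brevity, not a recommendation.

```go
package main

import "fmt"

// sumHash is an assumed stand-in hash: the sum of the byte values of s.
func sumHash(s string) uint64 {
	var h uint64
	for i := 0; i < len(s); i++ {
		h += uint64(s[i])
	}
	return h
}

// searchBySteps mirrors the listed steps one by one.
func searchBySteps(pattern, text string) []int {
	// Step 1: result slice and lengths.
	var indices []int
	patternLen, textLen := len(pattern), len(text)

	// Step 2: hash of the pattern.
	patternHash := sumHash(pattern)

	// Step 3: visit every window and hash it.
	for i := 0; i <= textLen-patternLen; i++ {
		window := text[i : i+patternLen]
		if sumHash(window) != patternHash {
			continue
		}
		// Step 4: hashes agree, so verify the bytes before recording.
		if window == pattern {
			indices = append(indices, i)
		}
	}

	// Step 5: return the collected indices.
	return indices
}

func main() {
	fmt.Println(searchBySteps("ABC", "ABCABCDABCABC")) // [0 3 7 10]
}
```

Because this version rehashes every window from scratch, each position costs time proportional to the pattern length. A rolling hash that updates the previous window's value avoids that repeated work.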

Syntax

func rabinKarp(pattern, text string) []int

The syntax func rabinKarp(pattern, text string) []int defines a function named rabinKarp that takes two string parameters, pattern and text. The function returns a slice of integers ([]int), representing the indices where the pattern is found in the text.

func hash(str string) uint64

The syntax func hash(str string) uint64 declares a function named hash that accepts a string parameter str. The function is designed to return an unsigned 64-bit integer (uint64), denoting the computed hash value.

Example

In this example, the rabinKarp function takes the pattern and the text as input: pattern is the string we want to search for, and text is the string in which we search. Note that this version does not compute any hash values. It compares the pattern with the text directly at every position, which is the verification step of Rabin-Karp applied to each window. The function returns a slice of integers, []int, containing the indices at which the pattern occurs in the text.

package main

import (
	"fmt"
)

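// rabinKarp returns every index of text at which pattern occurs.
// This version compares characters directly and does not hash.
func rabinKarp(pattern, text string) []int {
	var indices []int
	patternLen := len(pattern)
	textLen := len(text)

	// Try every position where the pattern could still fit.
	for i := 0; i <= textLen-patternLen; i++ {
		match := true
		// Compare the window with the pattern character by character.
		for j := 0; j < patternLen; j++ {
			if text[i+j] != pattern[j] {
				match = false
				break
			}
		}
		if match {
			indices = append(indices, i)
		}
	}

	return indices
}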

func main() {
	text := "ABCABCDABCABC"
	pattern := "ABC"

	indices := rabinKarp(pattern, text)
	fmt.Println("Pattern found at indices:", indices)
}

Output

Pattern found at indices: [0 3 7 10]

Example

In this example, the hash function takes a string parameter str and returns an unsigned 64-bit integer (uint64) representing the hash value of the input. The function adds up the byte values of the characters in the string, accumulates the total in the hashValue variable, and returns it. This additive hash is simple, but it ignores character order, so strings made of the same characters in a different order produce the same value.

package main

import (
	"fmt"
)

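// hash returns the sum of the byte values of str.
func hash(str string) uint64 {
	var hashValue uint64

	// Add each byte of the string to the running total.
	for i := 0; i < len(str); i++ {
		hashValue += uint64(str[i])
	}

	return hashValue
}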

func main() {
	input := "example"

	hashValue := hash(input)
	fmt.Println("Hash value:", hashValue)
}

Output

Hash value: 748
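To see why a hash match still needs character-by-character verification, the sketch below reuses the same additive hash and counts how many windows share the pattern's hash value versus how many are real occurrences. The function name countHits and the sample text "BCAABCCAB" are made up for this illustration.

```go
package main

import "fmt"

// hash is the additive hash from the example above.
func hash(str string) uint64 {
	var hashValue uint64
	for i := 0; i < len(str); i++ {
		hashValue += uint64(str[i])
	}
	return hashValue
}

// countHits reports how many windows of text have the same hash as pattern
// (hashHits) and how many of those are real occurrences (matches).
func countHits(pattern, text string) (hashHits, matches int) {
	m := len(pattern)
	target := hash(pattern)
	for i := 0; i+m <= len(text); i++ {
		window := text[i : i+m]
		if hash(window) == target {
			hashHits++
			if window == pattern {
				matches++
			}
		}
	}
	return hashHits, matches
}

func main() {
	fmt.Println(hash("ABC") == hash("CAB")) // true: same bytes, different order

	hits, matches := countHits("ABC", "BCAABCCAB")
	fmt.Println("hash hits:", hits, "matches:", matches) // hash hits: 3 matches: 1
}
```

Here the windows "BCA" and "CAB" collide with "ABC", so two of the three hash hits are false positives that only the direct comparison rejects. A position-sensitive hash, such as a polynomial hash, would reduce such collisions but not eliminate them.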

Real-Life Applications

Plagiarism Detection

The Rabin-Karp algorithm can be used to detect plagiarism in documents. By treating each document as a sequence of characters and using the algorithm to efficiently search for matching patterns between documents, you can identify instances of copied content or similarities between texts.
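As a rough illustration of this idea, the sketch below hashes every window of a fixed length k in one document and then looks up the windows of a second document, verifying each candidate by direct comparison. The names windowHashes and sharedWindows, the constants, the window length, and the sample sentences are all assumptions for this example; a real plagiarism detector involves much more, such as text normalisation.

```go
package main

import "fmt"

// Assumed constants for the polynomial rolling hash.
const (
	base uint64 = 256
	mod  uint64 = 1000000007
)

// windowHashes returns the rolling hash of every k-byte window of s.
func windowHashes(s string, k int) []uint64 {
	if k <= 0 || k > len(s) {
		return nil
	}
	high := uint64(1) // base^(k-1) % mod
	for i := 0; i < k-1; i++ {
		high = (high * base) % mod
	}
	var h uint64
	for i := 0; i < k; i++ {
		h = (h*base + uint64(s[i])) % mod
	}
	hashes := []uint64{h}
	for i := k; i < len(s); i++ {
		h = (h + mod - (uint64(s[i-k])*high)%mod) % mod
		h = (h*base + uint64(s[i])) % mod
		hashes = append(hashes, h)
	}
	return hashes
}

// sharedWindows lists the distinct k-byte substrings found in both a and b,
// in order of first appearance in b.
func sharedWindows(a, b string, k int) []string {
	starts := make(map[uint64][]int) // hash -> window start positions in a
	for i, h := range windowHashes(a, k) {
		starts[h] = append(starts[h], i)
	}
	seen := make(map[string]bool)
	var shared []string
	for j, h := range windowHashes(b, k) {
		window := b[j : j+k]
		if seen[window] {
			continue
		}
		for _, i := range starts[h] {
			if a[i:i+k] == window { // verify to rule out collisions
				seen[window] = true
				shared = append(shared, window)
				break
			}
		}
	}
	return shared
}

func main() {
	a := "the quick brown fox"
	b := "a quick brown dog"
	fmt.Printf("%q\n", sharedWindows(a, b, 12))
}
```

With these inputs, the two 12-byte windows inside the common phrase " quick brown " are reported. Longer shared runs show up as several overlapping windows.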

Data Deduplication

In data storage systems, the hashing idea behind the Rabin-Karp algorithm can help identify duplicate files or chunks of data. By hashing sections of data and comparing the hash values, you can quickly find pieces of data that are likely identical, and then confirm each candidate with a direct comparison.
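A much simplified sketch of this idea is shown below. It splits the data into fixed-size chunks, hashes each chunk, and reports chunks that repeat an earlier one. The names chunkHash and duplicateChunks, the chunk size, and the hash constants are assumptions for illustration only.

```go
package main

import (
	"bytes"
	"fmt"
)

// chunkHash computes a polynomial hash of a chunk (assumed constants).
func chunkHash(chunk []byte) uint64 {
	const base, mod = 256, 1000000007
	var h uint64
	for _, b := range chunk {
		h = (h*base + uint64(b)) % mod
	}
	return h
}

// duplicateChunks splits data into chunks of the given size and returns
// pairs {duplicate, original} of chunk indices with identical content.
func duplicateChunks(data []byte, size int) [][2]int {
	if size <= 0 {
		return nil
	}
	var chunks [][]byte
	for start := 0; start < len(data); start += size {
		end := start + size
		if end > len(data) {
			end = len(data)
		}
		chunks = append(chunks, data[start:end])
	}

	seen := make(map[uint64][]int) // hash -> indices of distinct chunks
	var dups [][2]int
	for i, c := range chunks {
		h := chunkHash(c)
		matched := false
		for _, j := range seen[h] {
			// Equal hashes only suggest equality, so compare the bytes.
			if bytes.Equal(chunks[j], c) {
				dups = append(dups, [2]int{i, j})
				matched = true
				break
			}
		}
		if !matched {
			seen[h] = append(seen[h], i)
		}
	}
	return dups
}

func main() {
	data := []byte("ABCDXYZWABCDABCD")
	fmt.Println(duplicateChunks(data, 4)) // [[2 0] [3 0]]
}
```

In this sample, chunks 2 and 3 both repeat chunk 0, so only one copy of "ABCD" would need to be stored.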

Conclusion

Rabin-Karp is a string searching algorithm that can be used for tasks such as plagiarism detection and data deduplication. In this article, we looked at how to implement it in Go through two examples: a direct pattern matching function and a separate hashing function.

Updated on: 07-Sep-2023