PHP Program for Rabin-Karp Algorithm for Pattern Searching
The Rabin-Karp algorithm is a string pattern matching algorithm that efficiently searches for occurrences of a pattern within a larger text. It was developed by Michael O. Rabin and Richard M. Karp in 1987.
How It Works
The algorithm uses hashing to compare the pattern against substrings (windows) of the text. It works as follows:
1. Calculate the hash value of the pattern and of the first window of the text.
2. Slide the pattern over the text one position at a time and compare the hash values.
3. If the hash values match, compare the characters of the pattern and the current window of the text to confirm the match.
4. If there is a match, record the position/index of the match.
5. Calculate the hash value for the next window of the text using a rolling hash function.
6. Repeat steps 3 to 5 until all positions of the text have been checked.
The rolling hash function updates the hash value for each new window in constant time: it removes the contribution of the first character of the previous window, multiplies the remainder by the base to shift it one position, and adds the next character of the text. This avoids recalculating the hash from scratch for each window, which is what makes the algorithm efficient.
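The rolling update can be checked against a from-scratch computation. The following is a minimal sketch, not part of the article's implementation: the helper names `fullHash` and `rollHash` are hypothetical, and the base 256 and modulus 101 are assumed to match the values used in the implementation in this article.

```php
<?php
// Hypothetical helper: hash of a whole string, computed from scratch.
function fullHash(string $s, int $d = 256, int $q = 101): int
{
    $hash = 0;
    for ($i = 0, $len = strlen($s); $i < $len; $i++) {
        $hash = ($d * $hash + ord($s[$i])) % $q;
    }
    return $hash;
}

// Hypothetical helper: derive the next window's hash from the previous one.
// $h must equal pow($d, $m - 1) % $q for a window of length $m.
function rollHash(int $prev, string $out, string $in, int $h, int $d = 256, int $q = 101): int
{
    // Drop the outgoing character, shift by the base, add the incoming one.
    $next = ($d * ($prev - ord($out) * $h) + ord($in)) % $q;
    // PHP's % keeps the sign of the dividend, so normalise negatives.
    return $next < 0 ? $next + $q : $next;
}

$text = "ABCABC";
$m = 3; // window length

// h = d^(m-1) mod q, the weight of the leading character in a window
$h = 1;
for ($i = 0; $i < $m - 1; $i++) {
    $h = ($h * 256) % 101;
}

// Roll across the text and compare each result with a full recomputation.
$rolled = fullHash(substr($text, 0, $m));
$allMatch = true;
for ($i = 0; $i + $m < strlen($text); $i++) {
    $rolled = rollHash($rolled, $text[$i], $text[$i + $m], $h);
    $allMatch = $allMatch && ($rolled === fullHash(substr($text, $i + 1, $m)));
}
var_dump($allMatch); // bool(true)
```

Each call to `rollHash` does a fixed amount of work regardless of the window length, whereas `fullHash` loops over every character in the window.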
PHP Implementation
Here's how to implement the Rabin-Karp algorithm in PHP:
<?php
function rabinKarp($pattern, $text)
{
    $d = 256; // Number of characters in the input alphabet
    $q = 101; // A prime number used as the modulus
    $M = strlen($pattern);
    $N = strlen($text);
    $p = 0; // Hash value for pattern
    $t = 0; // Hash value for the current window of text
    $h = 1; // Will hold pow($d, $M - 1) % $q

    // Nothing to search for, or the pattern cannot fit in the text
    if ($M == 0 || $M > $N) {
        return;
    }

    // Compute h = d^(M-1) mod q, the weight of a window's first character
    for ($i = 0; $i < $M - 1; $i++) {
        $h = ($h * $d) % $q;
    }

    // Calculate the hash value of pattern and first window of text
    for ($i = 0; $i < $M; $i++) {
        $p = ($d * $p + ord($pattern[$i])) % $q;
        $t = ($d * $t + ord($text[$i])) % $q;
    }

    // Slide the pattern over text one by one
    for ($i = 0; $i <= $N - $M; $i++) {
        // Only when the hash values match, check the characters one by one
        if ($p == $t) {
            $match = true;
            for ($j = 0; $j < $M; $j++) {
                if ($text[$i + $j] != $pattern[$j]) {
                    $match = false;
                    break;
                }
            }
            // Pattern found
            if ($match) {
                echo "Pattern found at index " . $i . "<br>";
            }
        }

        // Calculate the hash value for the next window of text
        if ($i < $N - $M) {
            $t = ($d * ($t - ord($text[$i]) * $h) + ord($text[$i + $M])) % $q;
            // PHP's % can return a negative value, so make it positive
            if ($t < 0) {
                $t = $t + $q;
            }
        }
    }
}

// Example usage
$text = "ABCABCABCABCABC";
$pattern = "BC";
rabinKarp($pattern, $text);
?>
Output
Pattern found at index 1
Pattern found at index 4
Pattern found at index 7
Pattern found at index 10
Pattern found at index 13
Key Points
Rolling Hash: The algorithm uses a rolling hash function to efficiently update hash values for sliding windows.
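Hash Collision Handling: When hash values match, a character-by-character comparison confirms whether the match is real, because different strings can share the same hash value.
Prime Modulus: Using a prime number (101) as the modulus helps spread hash values evenly across the available range. A larger prime makes collisions rarer still.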
Time Complexity: Average case O(n+m), worst case O(nm) where n is text length and m is pattern length.
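To make the collision and modulus points concrete, the sketch below counts how many windows trigger a hash match and how many of those are real matches. It is an illustration under stated assumptions, not part of the article's implementation: `countHashHits` is a hypothetical helper, the text "BCCrBC" was hand-picked because "Cr" happens to share a hash with "BC" when the modulus is 101, and the larger prime 1000000007 assumes 64-bit PHP integers.

```php
<?php
// Hypothetical helper: counts windows whose hash equals the pattern's hash
// ('hits') and how many of those are genuine matches ('matches').
function countHashHits(string $pattern, string $text, int $q, int $d = 256): array
{
    $m = strlen($pattern);
    $n = strlen($text);
    if ($m === 0 || $m > $n) {
        return ['hits' => 0, 'matches' => 0];
    }

    // h = d^(m-1) mod q
    $h = 1;
    for ($i = 0; $i < $m - 1; $i++) {
        $h = ($h * $d) % $q;
    }

    // Hashes of the pattern and of the first window
    $p = 0;
    $t = 0;
    for ($i = 0; $i < $m; $i++) {
        $p = ($d * $p + ord($pattern[$i])) % $q;
        $t = ($d * $t + ord($text[$i])) % $q;
    }

    $hits = 0;
    $matches = 0;
    for ($i = 0; $i <= $n - $m; $i++) {
        if ($p === $t) {
            $hits++; // hash match: needs verification
            if (substr($text, $i, $m) === $pattern) {
                $matches++; // verified match
            }
        }
        if ($i < $n - $m) {
            $t = ($d * ($t - ord($text[$i]) * $h) + ord($text[$i + $m])) % $q;
            if ($t < 0) {
                $t += $q;
            }
        }
    }
    return ['hits' => $hits, 'matches' => $matches];
}

// "Cr" and "BC" both hash to 96 modulo 101, so "Cr" is a spurious hit.
$small = countHashHits("BC", "BCCrBC", 101);
$large = countHashHits("BC", "BCCrBC", 1000000007);
printf("q=101: %d hits, %d matches\n", $small['hits'], $small['matches']);
// q=101: 3 hits, 2 matches
printf("q=1000000007: %d hits, %d matches\n", $large['hits'], $large['matches']);
// q=1000000007: 2 hits, 2 matches
```

Every spurious hit costs an extra character-by-character comparison. When most windows collide, those comparisons dominate and the running time approaches the O(nm) worst case.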
Conclusion
The Rabin-Karp algorithm provides an efficient approach to pattern searching using hashing. It is particularly useful when searching for multiple patterns simultaneously, which makes it valuable for applications like plagiarism detection and data mining.
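The multi-pattern use mentioned above can be sketched as follows. This is one possible approach under simplifying assumptions, not a definitive implementation: `rabinKarpMulti` is a hypothetical function, all patterns are assumed to have the same length, and the base and modulus are the same illustrative values (256 and 101) used earlier in this article.

```php
<?php
// Hypothetical helper: finds all occurrences of several equal-length patterns
// in one pass. Returns an array mapping each found pattern to its indices.
function rabinKarpMulti(array $patterns, string $text, int $d = 256, int $q = 101): array
{
    $found = [];
    $patterns = array_values($patterns);
    if ($patterns === []) {
        return $found;
    }
    $m = strlen($patterns[0]);
    $n = strlen($text);
    if ($m === 0 || $m > $n) {
        return $found;
    }

    // Group the patterns by hash so each window needs only one lookup.
    $byHash = [];
    foreach ($patterns as $pattern) {
        if (strlen($pattern) !== $m) {
            throw new InvalidArgumentException('All patterns must have the same length');
        }
        $p = 0;
        for ($i = 0; $i < $m; $i++) {
            $p = ($d * $p + ord($pattern[$i])) % $q;
        }
        $byHash[$p][] = $pattern;
    }

    // h = d^(m-1) mod q, and the hash of the first window
    $h = 1;
    for ($i = 0; $i < $m - 1; $i++) {
        $h = ($h * $d) % $q;
    }
    $t = 0;
    for ($i = 0; $i < $m; $i++) {
        $t = ($d * $t + ord($text[$i])) % $q;
    }

    for ($i = 0; $i <= $n - $m; $i++) {
        if (isset($byHash[$t])) {
            // Hash hit: verify the window against the patterns in this bucket.
            $window = substr($text, $i, $m);
            if (in_array($window, $byHash[$t], true)) {
                $found[$window][] = $i;
            }
        }
        if ($i < $n - $m) {
            $t = ($d * ($t - ord($text[$i]) * $h) + ord($text[$i + $m])) % $q;
            if ($t < 0) {
                $t += $q;
            }
        }
    }
    return $found;
}

print_r(rabinKarpMulti(["AB", "CA"], "ABCABCABCABCABC"));
// "AB" is found at 0, 3, 6, 9, 12 and "CA" at 2, 5, 8, 11
```

The text is hashed only once no matter how many patterns are searched for. Patterns of different lengths would need one pass per distinct length in this sketch.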
