PHP Program for Rabin-Karp Algorithm for Pattern Searching

The Rabin-Karp algorithm is a string pattern matching algorithm that efficiently searches for occurrences of a pattern within a larger text. It was developed by Michael O. Rabin and Richard M. Karp in 1987.

The algorithm utilizes a hashing technique to compare the hash values of the pattern and substrings of the text. It works as follows:

  • Calculate the hash value of the pattern and the first window of the text.

  • Slide the pattern over the text one position at a time and compare the hash values.

  • If the hash values match, compare the characters of the pattern and the current window of the text to confirm the match.

  • If there is a match, record the position/index of the match.

  • Calculate the hash value for the next window of the text using a rolling hash function.

  • Repeat the comparison, verification, and rolling-hash steps until every window of the text has been checked.

The rolling hash function efficiently updates the hash value for each new window by subtracting the contribution of the first character in the previous window and adding the contribution of the next character in the new window. This helps avoid recalculating the hash value from scratch for each window, making the algorithm more efficient.
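
To make the update concrete, here is a minimal sketch that checks a rolled hash against one computed from scratch. The helper names windowHash and rollHash are hypothetical, and the base 256 and modulus 101 are assumed values chosen to mirror the implementation shown later in this article.

```php
<?php

// Hypothetical helper: polynomial hash of the $m characters starting at $start.
function windowHash(string $s, int $start, int $m, int $d = 256, int $q = 101): int
{
    $hash = 0;
    for ($i = 0; $i < $m; $i++) {
        $hash = ($d * $hash + ord($s[$start + $i])) % $q;
    }
    return $hash;
}

// Hypothetical helper: derive the next window's hash from the previous one.
// $h must equal pow($d, $m - 1) % $q, the weight of the outgoing character.
function rollHash(int $prev, string $out, string $in, int $h, int $d = 256, int $q = 101): int
{
    $next = ($d * ($prev - ord($out) * $h) + ord($in)) % $q;
    return $next < 0 ? $next + $q : $next; // PHP's % can return a negative value
}

$text = "ABCABC";
$m = 2;
$h = 256 % 101; // pow(256, $m - 1) % 101 for $m = 2

$hash = windowHash($text, 0, $m);
for ($i = 0; $i + $m < strlen($text); $i++) {
    $hash = rollHash($hash, $text[$i], $text[$i + $m], $h);
    // The rolled value should equal a from-scratch hash of the new window
    var_dump($hash === windowHash($text, $i + 1, $m));
}
```

Each comparison should print bool(true). The rolled update does a constant amount of work per window, whereas the from-scratch version loops over all m characters every time.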

How It Works

Text:    A B C A B C A B C
Pattern: B C   (BC, Hash: 456)

Step 1: Hash pattern and first window
   AB   Hash: 123

Step 2: Slide and compare hashes
   BC   Hash: 456   ✓ Match!

Step 3: Character-by-character verification
   B = B ✓, C = C ✓

Pattern found at index 1

PHP Implementation

Here's how to implement the Rabin-Karp algorithm in PHP:

<?php

function rabinKarp($pattern, $text)
{
   $d = 256; // Number of characters in the input alphabet
   $q = 101; // A prime number

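   $M = strlen($pattern);
   $N = strlen($text);

   // An empty pattern, or one longer than the text, cannot be searched for
   if ($M == 0 || $M > $N)
      return;

   $p = 0; // Hash value for pattern
   $t = 0; // Hash value for the current window of text
   $h = 1; // Becomes pow($d, $M - 1) % $q, the weight of a window's first character

   // Precompute $h, which the rolling hash uses to remove the outgoing character
   for ($i = 0; $i < $M - 1; $i++)
      $h = ($h * $d) % $q;

   // Calculate the hash value of pattern and first window of text
   for ($i = 0; $i < $M; $i++) {
      $p = ($d * $p + ord($pattern[$i])) % $q;
      $t = ($d * $t + ord($text[$i])) % $q;
   }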

   // Slide the pattern over text one by one
   for ($i = 0; $i <= $N - $M; $i++) {

      // Check the hash values of current window of text and pattern
      // If the hash values match, then only check for characters one by one
      if ($p == $t) {
         $match = true;

         // Check for characters one by one
         for ($j = 0; $j < $M; $j++) {
            if ($text[$i + $j] != $pattern[$j]) {
               $match = false;
               break;
            }
         }

         // Pattern found
         if ($match)
            echo "Pattern found at index " . $i . "<br>";
      }

      // Calculate the hash value for the next window of text
      if ($i < $N - $M) {
         $t = ($d * ($t - ord($text[$i]) * $h) + ord($text[$i + $M])) % $q;

         // If the calculated hash value is negative, make it positive
         if ($t < 0)
            $t = $t + $q;
      }
   }
}

// Example usage
$text = "ABCABCABCABCABC";
$pattern = "BC";
rabinKarp($pattern, $text);
?>

Output

Pattern found at index 1
Pattern found at index 4
Pattern found at index 7
Pattern found at index 10
Pattern found at index 13

Key Points

  • Rolling Hash: The algorithm uses a rolling hash function to efficiently update hash values for sliding windows.

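  • Hash Collision Handling: When hash values match, a character-by-character comparison confirms that the match is real and not a collision.

  • Prime Modulus: Taking hash values modulo a prime (101 here) spreads them more evenly and so reduces collisions. A modulus this small still collides often, so larger primes are typical in practice.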

  • Time Complexity: Average case O(n+m), worst case O(nm) where n is text length and m is pattern length.
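
The collision point can be seen directly with a small sketch. The helper simpleHash is hypothetical and only assumes the same base (256) and modulus (101) as the implementation above; the two strings were picked by hand because they happen to collide under those values.

```php
<?php

// Hypothetical helper: polynomial hash of a whole string, using the same
// base (256) and modulus (101) as the implementation above.
function simpleHash(string $s, int $d = 256, int $q = 101): int
{
    $hash = 0;
    for ($i = 0; $i < strlen($s); $i++) {
        $hash = ($d * $hash + ord($s[$i])) % $q;
    }
    return $hash;
}

$a = "AB";
$b = "Bq";

echo simpleHash($a), PHP_EOL; // 41
echo simpleHash($b), PHP_EOL; // 41 as well, although the strings differ
var_dump($a === $b);          // bool(false): only a character comparison rules this out
```

With only 101 possible hash values, collisions like this one are common, which is why the verification step cannot be skipped.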

Conclusion

The Rabin-Karp algorithm provides an efficient approach to pattern searching based on hashing. It is particularly useful when searching for multiple patterns simultaneously, which makes it valuable for applications like plagiarism detection and data mining.
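
As a rough illustration of the multi-pattern case, the sketch below groups patterns by hash so that each text window needs only one lookup. The function name rabinKarpMulti is hypothetical, and the sketch assumes a plain list of non-empty patterns that all share the same length; it is one possible arrangement, not a definitive implementation.

```php
<?php

// Hypothetical helper: find every occurrence of several equal-length patterns.
// Assumes $patterns is a list of non-empty strings that all share one length.
function rabinKarpMulti(array $patterns, string $text, int $d = 256, int $q = 101): array
{
    $matches = [];
    if ($patterns === []) {
        return $matches;
    }
    $m = strlen($patterns[0]);
    $n = strlen($text);
    if ($m === 0 || $m > $n) {
        return $matches;
    }

    // Group the patterns by hash so one lookup per window covers all of them
    $byHash = [];
    foreach ($patterns as $pattern) {
        $hash = 0;
        for ($i = 0; $i < $m; $i++) {
            $hash = ($d * $hash + ord($pattern[$i])) % $q;
        }
        $byHash[$hash][] = $pattern;
    }

    // Weight of a window's leading character, then the hash of the first window
    $h = 1;
    for ($i = 0; $i < $m - 1; $i++) {
        $h = ($h * $d) % $q;
    }
    $t = 0;
    for ($i = 0; $i < $m; $i++) {
        $t = ($d * $t + ord($text[$i])) % $q;
    }

    for ($i = 0; $i <= $n - $m; $i++) {
        if (isset($byHash[$t])) {
            // Confirm each candidate against the actual window to rule out collisions
            $window = substr($text, $i, $m);
            foreach ($byHash[$t] as $pattern) {
                if ($window === $pattern) {
                    $matches[] = [$i, $pattern];
                }
            }
        }
        if ($i < $n - $m) {
            $t = ($d * ($t - ord($text[$i]) * $h) + ord($text[$i + $m])) % $q;
            if ($t < 0) {
                $t += $q;
            }
        }
    }
    return $matches;
}

print_r(rabinKarpMulti(["BC", "CA"], "ABCABC"));
```

For this input the function returns the pairs [1, "BC"], [2, "CA"], and [4, "BC"]. The text is scanned once regardless of how many patterns are supplied, which is where the advantage over running a separate search per pattern comes from.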

Updated on: 2026-03-15T10:35:35+05:30
