Regular Expression Matching in Python


Suppose we have an input string s and another input string p. Here s is the main string and p is the pattern. We have to define a method that checks whether the pattern matches the string, using a simplified regular expression syntax that supports only ‘.’ and ‘*’. The match must cover the entire string s, not just a part of it.

  • Dot ‘.’ matches any single character.

  • Star ‘*’ matches zero or more of the preceding element.

For example, if the input is s = “aa” and p = “a.”, then the result is true, because ‘.’ matches the second ‘a’. For the same input string, if the pattern is “.*”, the result is also true, because “.*” matches any sequence of characters.
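To make the two operators concrete, the examples above can be checked against Python's built-in re module, whose ‘.’ and ‘*’ behave the same way for patterns like these (by default, ‘.’ in re does not match a newline). This is only a reference check, not the method developed in this article. The helper name reference_match is a hypothetical one chosen for illustration; re.fullmatch is used because the match has to cover the whole string.

```python
import re

def reference_match(s, p):
    # re.fullmatch succeeds only when the pattern matches the entire string
    return re.fullmatch(p, s) is not None

print(reference_match("aa", "a."))  # True: '.' matches the second 'a'
print(reference_match("aa", ".*"))  # True: '.*' matches any sequence
print(reference_match("aa", "a"))   # False: the pattern covers only one character
```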

To solve this, we will follow these steps −

  • ss := size of s and ps := size of p

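  • make dp a matrix of size (ss + 1) x (ps + 1), and fill this using false value

  • dp[0, 0] := true, since an empty pattern matches an empty string

  • Update p and s by adding one blank space before these, so that both become 1-indexed

  • for i in range 2 to ps

    • dp[0, i] := dp[0, i - 2] when p[i] is star, otherwise False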

  • for i in range 1 to ss

    • for j in range 1 to ps

      • if s[i] is p[j], or p[j] is dot, then

        • dp[i, j] := dp[i - 1, j - 1]

      • otherwise when p[j] is star, then

        • dp[i, j] := dp[i, j - 2]

        • if s[i] is p[j - 1] or p[j - 1] is dot, then

          • dp[i, j] := dp[i, j] or dp[i - 1, j]

  • return dp[ss, ps]
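The first-row step in the list above (the empty string against each prefix of the pattern) is easy to get wrong, so here is a minimal sketch of that step in isolation. It assumes the entry for the empty string and the empty pattern starts out as true; the function name empty_string_row is a hypothetical one used only for this illustration.

```python
def empty_string_row(p):
    # row[j] is True when the first j characters of p can match the empty string
    ps = len(p)
    row = [False] * (ps + 1)
    row[0] = True  # assumption: empty pattern matches empty string
    p = " " + p    # pad so that p[j] is the j-th pattern character
    for j in range(2, ps + 1):
        if p[j] == '*':
            # "x*" may match zero characters, so drop those two pattern characters
            row[j] = row[j - 2]
    return row

print(empty_string_row("a*b*c"))
# [True, False, True, False, True, False]
```

Only prefixes made up entirely of "x*" pairs can match the empty string, which is why the true entries sit at positions 0, 2 and 4.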


Let us see the following implementation to get a better understanding −


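class Solution(object):
    def isMatch(self, s, p):
        ss = len(s)
        ps = len(p)
        # dp[i][j] is True when the first i characters of s match the first j characters of p
        dp = [[False for _ in range(ps + 1)] for _ in range(ss + 1)]
        # An empty pattern matches an empty string
        dp[0][0] = True
        # Pad both strings so that they are 1-indexed, like dp
        p = " " + p
        s = " " + s
        # Empty string against pattern prefixes: only "x*" pairs can match nothing
        for i in range(2, ps + 1):
            dp[0][i] = dp[0][i - 2] if p[i] == '*' else False
        for i in range(1, ss + 1):
            for j in range(1, ps + 1):
                if s[i] == p[j] or p[j] == '.':
                    # Current characters match, so extend the previous match
                    dp[i][j] = dp[i - 1][j - 1]
                elif p[j] == '*':
                    # Zero occurrences of the preceding element
                    dp[i][j] = dp[i][j - 2]
                    if s[i] == p[j - 1] or p[j - 1] == '.':
                        # One more occurrence of the preceding element
                        dp[i][j] = dp[i][j] or dp[i - 1][j]
        return dp[ss][ps]

ob = Solution()
print(ob.isMatch("aa", "a."))      # True
print(ob.isMatch("aaaaaa", "a*"))  # True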


Input

"aa", "a."
"aaaaaa", "a*"

Expected output

True
True
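The same matching rules can also be read top-down. The sketch below is a different technique from the table-filling approach in this article: plain recursion with memoization through functools.lru_cache. The names is_match_top_down and match are hypothetical, and the sketch assumes that s contains no ‘.’ or ‘*’ characters and that every ‘*’ in p follows some element.

```python
from functools import lru_cache

def is_match_top_down(s, p):
    @lru_cache(maxsize=None)
    def match(i, j):
        # True when the suffix s[i:] matches the pattern suffix p[j:]
        if j == len(p):
            return i == len(s)
        # Does the current pattern element match the current character?
        first = i < len(s) and p[j] in (s[i], '.')
        if j + 1 < len(p) and p[j + 1] == '*':
            # Skip "x*" entirely, or consume one character and stay on "x*"
            return match(i, j + 2) or (first and match(i + 1, j))
        return first and match(i + 1, j + 1)
    return match(0, 0)

print(is_match_top_down("aa", "a."))      # True
print(is_match_top_down("aaaaaa", "a*"))  # True
print(is_match_top_down("aa", "a"))       # False
```

Memoization keeps the number of distinct (i, j) calls bounded by (ss + 1) x (ps + 1), the same number of states as the table has cells.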


Published on 26-May-2020 14:49:21