Lua pattern matching vs regular expressions


Lua's pattern matching is designed very differently from regular expressions, which generally follow the POSIX design.

The two have very little in common. POSIX-style regular expressions are the more popular approach, because they hold up well as patterns become more complex and they can handle a wide variety of cases. This does not mean that Lua's pattern matching is bad. In fact, it is easier to understand and works well too.

Instead of regular expressions, the Lua string library uses its own set of special characters for matching. Simple patterns can look similar in both systems, but Lua pattern matching is more limited (for example, it has no alternation operator) and uses a different syntax.
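As a minimal sketch of what this looks like in practice, the snippet below uses the standard string.match and string.find functions. The sample strings and variable names are made up for illustration only.

```lua
-- %d is the digit class and %a the letter class; each pair of
-- parentheses marks one capture.
local stamp = "Published on 19-Jul-2021"
local day, month, year = string.match(stamp, "(%d+)-(%a+)-(%d+)")
print(day, month, year)  -- day = "19", month = "Jul", year = "2021"

-- Lua patterns have no alternation, so "|" is an ordinary character.
-- The pattern below looks for the literal text "cat|dog".
local no_hit = string.find("cat", "cat|dog")
local from, to = string.find("cat|dog", "cat|dog")
print(no_hit)    -- nil
print(from, to)  -- from = 1, to = 7
```

When a task really needs alternation, one common workaround is to try several patterns in turn.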

Although the POSIX design is clearly the better choice in some cases, in the following points I'll discuss why Lua's pattern matching design has advantages of its own.

Consider the points mentioned below −

  • Quoting is extremely simple and regular. The quoting character in patterns is %, so it is always distinct from the backslash (\) that Lua uses for escapes inside string literals. This makes Lua patterns much easier to read than POSIX regular expressions when quoting is necessary.
  • Lua has a "shortest match" modifier, -, to go along with the "longest match" operator, *. For example, s:find '%s(%S-)%:' finds the shortest sequence of non-space characters that is preceded by a space character and followed by a colon.
  • Lua offers "captures" and can return multiple captures as the result of a single match call. This interface is much better than capturing substrings through side effects or through hidden state that has to be interrogated afterwards. You mark a capture by enclosing part of the pattern in parentheses.
  • The syntax for pattern matching is very lightweight. Common character classes include uppercase letters (%u), decimal digits (%d), space characters (%s) and so on. Any character class can be complemented by using the corresponding uppercase letter, so the pattern %S matches any non-space character.
  • There is also %bxy, which matches a balanced pair of delimiters x and y, such as parentheses or braces. Matching balanced delimiters nested to an arbitrary depth cannot be done in POSIX regular expressions.
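The points above can be sketched in a few lines of Lua. This is only an illustration: the sample strings and variable names are invented here, and the expected values in the comments assume the standard string library.

```lua
-- Quoting: % escapes the magic character "." inside the pattern.
local price = string.match("total: 3.50 USD", "%d+%.%d+")
print(price)  -- "3.50"

-- Shortest match (-) versus longest match (*).
local html = "<b>one</b> <b>two</b>"
local lazy = string.match(html, "<b>(.-)</b>")
local greedy = string.match(html, "<b>(.*)</b>")
print(lazy)    -- "one"
print(greedy)  -- "one</b> <b>two"

-- The pattern quoted in the list: shortest run of non-space
-- characters preceded by a space and followed by a colon.
local s = "see key: value"
local first, last, word = s:find('%s(%S-)%:')
print(first, last, word)  -- first = 4, last = 8, word = "key"

-- Captures: one call returns every captured substring.
local key, value = string.match("name = Lua", "(%w+)%s*=%s*(%w+)")
print(key, value)  -- key = "name", value = "Lua"

-- Character classes and complements: %S is "not %s", %D is "not %d".
local first_word = string.match("hello world", "%S+")
local digits, removed = string.gsub("a1b22c", "%D", "")
print(first_word)       -- "hello"
print(digits, removed)  -- digits = "122", removed = 3

-- %b() matches from an opening parenthesis to its balancing close.
local balanced = string.match("f(a, g(b, c)) + h(d)", "%b()")
print(balanced)  -- "(a, g(b, c))"
```

Note that every result comes back as an ordinary return value, so nothing has to be read from hidden match state after the call.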

Taken together, these points show that Lua's pattern matching model is well worth using.

raja
Published on 19-Jul-2021 11:42:40