MySQL - Regular Expressions



MySQL supports various types of pattern matching operations to retrieve filtered result-sets from huge database tables. In previous chapters, we have already learned about the LIKE operator for pattern matching. In this chapter, we will see another pattern matching operation based on regular expressions.

MySQL Regular Expressions

A regular expression is loosely defined as a sequence of characters that represent a pattern in an input text. It is used to locate or replace text strings using some patterns; this pattern can either be a single character, multiple characters or words, etc.

MySQL implements regular expression support using International Components for Unicode (ICU), which provides full Unicode support and is multi-byte safe.

In MySQL, it is a powerful way to perform a complex search operations in a database to retrieve desired content. And unlike the LIKE operator, the regular expressions are not restricted on search patterns (like % and _) as they use several other meta characters to expand the flexibility and control during pattern matching. This is performed using the REGEXP operator.

Syntax

Following is the basic syntax of the REGEXP operator in MySQL −

expression REGEXP pattern

Patterns used with REGEXP

Following is the table of pattern, which can be used along with the REGEXP operator.

Pattern What the pattern matches
^ Beginning of string
$ End of string
. Any single character
[...] Any character listed between the square brackets
[^...] Any character not listed between the square brackets
p1|p2|p3 Alternation; matches any of the patterns p1, p2, or p3
* Zero or more instances of preceding element
+ One or more instances of preceding element
{n} n instances of preceding element
{m,n} m through n instances of preceding element
[A-Z] Any uppercase letter
[a-z] Any lowercase letter
[0-9] Any digit (from 0 to 9)
[[:<:]] Beginning of words
[[:>:]] Ending of words
[:class:] A character class, i.e. use [:alpha:] to match letters from the alphabet

Examples

The following example demonstrates the usage of some patterns mentioned in the table above, along with the REGEXP operator. For that, we are first creating a database table to perform the search on.

Assume we are creating a table called CUSTOMERS using the following query −

CREATE TABLE CUSTOMERS (
   ID INT AUTO_INCREMENT,
   NAME VARCHAR(20) NOT NULL,
   AGE INT NOT NULL,
   ADDRESS CHAR (25),
   SALARY DECIMAL (18, 2),
   PRIMARY KEY (ID)
);

Now, insert some values into it using the INSERT statements given below −

INSERT INTO CUSTOMERS (ID,NAME,AGE,ADDRESS,SALARY) VALUES 
(1, 'Ramesh', 32, 'Ahmedabad', 2000.00 ),
(2, 'Khilan', 25, 'Delhi', 1500.00 ),
(3, 'Kaushik', 23, 'Kota', 2000.00 ),
(4, 'Chaitali', 25, 'Mumbai', 6500.00 ),
(5, 'Hardik', 27, 'Bhopal', 8500.00 ),
(6, 'Komal', 22, 'Hyderabad', 4500.00 ),
(7, 'Muffy', 24, 'Indore', 10000.00 );

Execute the following query to display all the records present in above created table −

SELECT * FROM CUSTOMERS;

Following are the records present in CUSTOMERS table −

ID NAME AGE ADDRESS SALARY
1 Ramesh 32 Ahmedabad 2000.00
2 Khilan 25 Delhi 1500.00
3 Kaushik 23 Kota 2000.00
4 Chaitali 25 Mumbai 6500.00
5 Hardik 27 Bhopal 8500.00
6 Komal 22 Hyderabad 4500.00
7 Muffy 24 Indore 10000.00

REGEXP with Patterns −

Now, we are finding all the records in the CUSTOMERS table whose name starts with 'k'

SELECT * FROM CUSTOMERS WHERE NAME REGEXP '^k';

Executing the query above will produce the following output −

ID NAME AGE ADDRESS SALARY
2 Khilan 25 Delhi 1500.00
3 Kaushik 23 Kota 2000.00
6 Komal 22 Hyderabad 4500.00

The following query retrieves all records in CUSTOMERS table whose name ends with 'sh'

SELECT * FROM CUSTOMERS WHERE NAME REGEXP 'sh$';

Executing the query above will produce the following output −

ID NAME AGE ADDRESS SALARY
1 Ramesh 32 Ahmedabad 2000.00

Here, we are retrieving all the records whose name contain 'sh'

SELECT * FROM CUSTOMERS WHERE NAME REGEXP 'sh';

As we can see the output, there are only two names that contain 'sh'.

ID NAME AGE ADDRESS SALARY
1 Ramesh 32 Ahmedabad 2000.00
3 Kaushik 23 Kota 2000.00

In the following query, we are finding all the names starting with a vowel and ending with 'ol'

SELECT * FROM CUSTOMERS WHERE NAME REGEXP '^[aeiou].*ol$';

It returned an empty set because the CUSTOMERS table do not have any names who starts with vowel and ends with 'ol'

Empty set (0.00 sec)

The following query finds all the names in the CUSTOMERS table whose name starts with a consonant −

SELECT * FROM CUSTOMERS WHERE NAME REGEXP '^[^aeiou]';

Executing the query above will produce the following output −

ID NAME AGE ADDRESS SALARY
1 Ramesh 32 Ahmedabad 2000.00
2 Khilan 25 Delhi 1500.00
3 Kaushik 23 Kota 2000.00
4 Chaitali 25 Mumbai 6500.00
5 Hardik 27 Bhopal 8500.00
6 Komal 22 Hyderabad 4500.00
7 Muffy 24 Indore 10000.00

Regular Expression Functions and Operators

Following is the table of functions and operators of regular expressions.

S. No Function or Operator
1

NOT REGEXP

Negation of REGEXP

2

REGEXP

Checks whether the string matches regular expression or not

3

REGEXP_INSTR()

Returns the starting index of substring matching regular expression

4

REGEXP_LIKE()

Returns whether the string matches the regular expression

5

REGEXP_REPLACE()

Replaces substrings matching the regular expression

6

REGEXP_SUBSTR()

Returns substrings matching the regular expression

7

RLIKE

Checks whether the string matches regular expression or not

Advertisements