What is Raw String Notation in Python regular expression?


Introduction

The term regular expression is frequently abbreviated as regex. A regex is a sequence of characters that specifies a search pattern, and it is mostly used in text processors and search engines to perform find and replace operations.

When a string literal in Python is prefixed with the letter r or R, as in r'...' and R'...', it becomes a raw string. In contrast to a conventional string, a raw string treats backslashes (\) as literal characters. Raw strings are helpful when working with strings that include many backslashes, such as regular expressions or directory paths on Windows.

The standard Python string notation "\n" does not produce the two-character string made up of a backslash and the letter "n". Instead, it produces a one-character string containing a newline character. According to the Python 2.4.1 documentation on string literals, the backslash (\) character is used to escape characters that would otherwise have a special meaning, such as newline, the backslash itself, or the quote character.

Syntax Used

For regular expression patterns, the solution is to use Python's raw string notation: backslashes are not handled in any special way in a string literal prefixed with "r".

Therefore, r"\n" is a two-character string made up of the characters "\" and "n", whereas "\n" is a one-character string containing a newline.

s = r'lang\tver\nPython\t3'
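A quick way to check this difference is to compare string lengths. The sketch below uses only built-in functions; the names raw_newline and real_newline are illustrative and do not come from the article.

```python
# Compare a raw string literal with a regular one.
raw_newline = r"\n"    # a backslash followed by the letter n
real_newline = "\n"    # a single newline character

print(len(raw_newline))   # 2
print(len(real_newline))  # 1

# A raw literal is equivalent to doubling every backslash in a regular literal.
s = r'lang\tver\nPython\t3'
print(s == 'lang\\tver\\nPython\\t3')  # True
print(s)                               # lang\tver\nPython\t3
```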

Algorithm

  • Import the re module.
  • Initialize a string.
  • Use the prefix r or R to write the string in raw string notation.
  • Print the string and get the complete string, with no escape sequences processed.

Understanding Python Raw Strings

Example 1

import re

s = r"Hello\tfrom TutorialsPoint\nHi"
print(s)

Output

Hello\tfrom TutorialsPoint\nHi

Code Explanation

To understand exactly what a raw string means, consider the string below, which contains the sequences “\t” and “\n”.

s = "Hello\tfrom TutorialsPoint\nHi"
print(s)

The sequences "\t" and "\n" are now treated as escape sequences, since s is a regular string literal. Therefore, printing the string produces the corresponding characters (a tab and a newline).

Hello	from TutorialsPoint
Hi

What happens if we make s a raw string?

# s is now a raw string
# Here, the backslashes will NOT start escape sequences.
s = r"Hello\tfrom TutorialsPoint\nHi"
print(s)

Here, neither backslash is treated as the start of an escape sequence, so Python will not print a tab and a newline.

Rather, it will print “\t” and “\n” literally.

Hello\tfrom TutorialsPoint\nHi

As you can see, since no characters are escaped, the output is the same as the input!
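The same behaviour is what makes raw strings convenient for regular expression patterns, where the backslash also has a special meaning to the re module. The following is a minimal sketch using re.findall and re.split; the sample text and variable names are assumptions made for illustration.

```python
import re

text = r"C:\temp\notes.txt"   # raw literal: the backslashes stay in the string

# To match one literal backslash, the regex engine needs the pattern \\ .
raw_pattern = r"\\"       # raw string: written exactly as the regex engine reads it
plain_pattern = "\\\\"    # regular string: every backslash must be doubled again

print(raw_pattern == plain_pattern)    # True
print(re.findall(raw_pattern, text))   # ['\\', '\\']
print(re.split(raw_pattern, text))     # ['C:', 'temp', 'notes.txt']

# \b means "word boundary" to re, but "\b" in a regular literal is a backspace.
print(re.findall(r"\bHi\b", "Hi there, Hi"))  # ['Hi', 'Hi']
print(re.findall("\bHi\b", "Hi there, Hi"))   # []
```

Without the r prefix, the pattern has to be escaped once for Python and once more for the regex engine, which is why raw strings are the usual choice for patterns.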

When Python Strings don’t work.

Example 2

import re

s = "Hello\xfrom TutorialsPoint"
print(s)

Output

Python rejects this literal with a SyntaxError that reports a truncated \xXX escape, so the script does not run.

Use this instead

import re

s = r"Hello\xfrom TutorialsPoint"
print(s)

Output

Hello\xfrom TutorialsPoint

Code Explanation

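In a regular string literal, the sequence "\x" must be followed by exactly two hexadecimal digits. In "\xfrom" it is followed by "f" and "r", and "r" is not a hexadecimal digit, so Python cannot decode the escape and raises a SyntaxError. As a result, we cannot even include this sequence in a regular string literal. Now, what can we do?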

When it comes to this, the raw string is useful.

By writing the value as a raw string literal, we can assign it to a variable directly!

s = r"Hello\xfrom TutorialsPoint"
print(s)

Now that the issue is resolved, we can use this unprocessed text like any ordinary string object!

Hello\xfrom TutorialsPoint
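To see both outcomes in a single script without it failing, the two literals can be handed to the built-in compile() function as source text. This is only a sketch: the helper name compiles and the file name "<example>" are hypothetical and chosen for illustration.

```python
# Source text for two assignments, written so that this script itself stays valid.
plain_source = 's = "Hello\\xfrom TutorialsPoint"'   # contains a regular literal
raw_source = 's = r"Hello\\xfrom TutorialsPoint"'    # contains a raw literal


def compiles(source):
    """Return True if Python accepts the source text, False on a SyntaxError."""
    try:
        compile(source, "<example>", "exec")
        return True
    except SyntaxError:
        return False


print(compiles(plain_source))  # False
print(compiles(raw_source))    # True
```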

Conclusion

Python raw strings are string literals prefixed with "r" or "R". For example, r"Hello" is a raw string. Raw strings treat the backslash ("\") as a literal character rather than as the start of an escape sequence, so it is printed exactly as written. This lets us write string literals that cannot be decoded in the normal way, such as one containing the sequence "\x" without two hexadecimal digits after it. In regular string literals, by contrast, Python uses the backslash (\) to signal the start of an escape sequence that represents special characters such as tabs and newlines.

Updated on: 02-Nov-2023
