Python Internet String Preparation


To identify entities on the internet, it is often necessary to compare two identifiers for equality. How the comparison is done depends on the application domain; for example, some identifiers are case-insensitive. The stringprep module provides the tables needed to prepare such strings for comparison.

RFC 3454 defines a procedure for preparing Unicode strings before they are sent over the wire. After preparation, the strings are in a normalized form.

The RFC defines a set of tables, which can be combined into profiles. One example of a stringprep profile is nameprep, which is used for internationalized domain names.
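To make the idea of a profile more concrete, here is a minimal sketch of how a few tables might be combined into map, normalize, and prohibit steps. The function name prepare_sketch and the particular choice of tables are assumptions made for illustration; this is not nameprep or any other registered profile.

```python
import stringprep
from unicodedata import ucd_3_2_0  # Unicode 3.2 database, the version RFC 3454 is based on

def prepare_sketch(text):
    """Hypothetical, simplified profile: map, normalize, then prohibit."""
    mapped = []
    for ch in text:
        if stringprep.in_table_b1(ch):
            continue  # table B.1: characters commonly mapped to nothing
        mapped.append(stringprep.map_table_b2(ch))  # table B.2: case-folding for NFKC
    normalized = ucd_3_2_0.normalize("NFKC", "".join(mapped))
    for ch in normalized:
        # Assumed prohibition set: non-ASCII spaces and all control characters.
        if stringprep.in_table_c12(ch) or stringprep.in_table_c21_c22(ch):
            raise ValueError(f"prohibited character: {ch!r}")
    return normalized

# The soft hyphen (U+00AD) is dropped and the letters are case-folded.
print(prepare_sketch("Ex\u00adAmple"))
```

A real profile also specifies how unassigned code points and bidirectional text are handled, which this sketch leaves out.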

There are two kinds of tables: sets and mappings. For a set table, the module provides a function that returns True if a character is in the set and False otherwise. For a mapping table, the function takes the key and returns the associated value.
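The difference between the two kinds of tables can be seen in a short sketch. It uses in_table_b1 as a set table and map_table_b2 as a mapping table; the variable names are chosen only for this illustration.

```python
import stringprep

# Set table: the function answers a membership question with a bool.
soft_hyphen_in_b1 = stringprep.in_table_b1("\u00ad")  # SOFT HYPHEN is in table B.1
letter_in_b1 = stringprep.in_table_b1("a")            # an ordinary letter is not

# Mapping table: the function returns the mapped string for the character.
mapped_upper = stringprep.map_table_b2("A")           # case-folded to 'a'
mapped_sharp_s = stringprep.map_table_b2("\u00df")    # sharp s folds to 'ss'

print(soft_hyphen_in_b1, letter_in_b1)
print(mapped_upper, mapped_sharp_s)
```

Note that a mapping function may return more than one character for a single input character.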

To use this module, import stringprep in the code.

import stringprep

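The stringprep module provides the following functions, one per table. Each takes a single character as code.

Sr.No. Function & Description

1. stringprep.in_table_a1(code): Checks whether code is in table A.1 (unassigned code points in Unicode 3.2).

2. stringprep.in_table_b1(code): Checks whether code is in table B.1 (characters commonly mapped to nothing).

3. stringprep.map_table_b2(code): Returns the mapped value for code according to table B.2 (case-folding mapping used with NFKC).

4. stringprep.map_table_b3(code): Returns the mapped value for code according to table B.3 (case-folding mapping with no normalization).

5. stringprep.in_table_c11(code): Checks whether code is in table C.1.1 (ASCII space characters).

6. stringprep.in_table_c12(code): Checks whether code is in table C.1.2 (non-ASCII space characters).

7. stringprep.in_table_c11_c12(code): Checks whether code is in table C.1 (the union of C.1.1 and C.1.2, all space characters).

8. stringprep.in_table_c21(code): Checks whether code is in table C.2.1 (ASCII control characters).

9. stringprep.in_table_c22(code): Checks whether code is in table C.2.2 (non-ASCII control characters).

10. stringprep.in_table_c21_c22(code): Checks whether code is in table C.2 (the union of C.2.1 and C.2.2, all control characters).

11. stringprep.in_table_c3(code): Checks whether code is in table C.3 (private use characters).

12. stringprep.in_table_c4(code): Checks whether code is in table C.4 (non-character code points).

13. stringprep.in_table_c5(code): Checks whether code is in table C.5 (surrogate codes).

14. stringprep.in_table_c6(code): Checks whether code is in table C.6 (characters inappropriate for plain text).

15. stringprep.in_table_c7(code): Checks whether code is in table C.7 (characters inappropriate for canonical representation).

16. stringprep.in_table_c8(code): Checks whether code is in table C.8 (change display properties or deprecated characters).

17. stringprep.in_table_c9(code): Checks whether code is in table C.9 (tagging characters).

18. stringprep.in_table_d1(code): Checks whether code is in table D.1 (characters with bidirectional property 'R' or 'AL').

19. stringprep.in_table_d2(code): Checks whether code is in table D.2 (characters with bidirectional property 'L').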
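As a rough illustration of how the membership functions can be used together, the following sketch reports which of a few prohibition tables contain a given character. The CHECKS dictionary and the matching_tables helper are hypothetical names introduced here, and only a subset of the C tables is included.

```python
import stringprep

# Hypothetical lookup from a table label to the function that tests membership.
CHECKS = {
    "C.1.1": stringprep.in_table_c11,
    "C.1.2": stringprep.in_table_c12,
    "C.2.1": stringprep.in_table_c21,
    "C.2.2": stringprep.in_table_c22,
    "C.3": stringprep.in_table_c3,
}

def matching_tables(ch):
    """Return the labels of the tables in CHECKS that contain ch."""
    return [label for label, check in CHECKS.items() if check(ch)]

print(matching_tables(" "))       # ASCII space
print(matching_tables("\u00a0"))  # NO-BREAK SPACE
print(matching_tables("\ue000"))  # a private use code point
print(matching_tables("a"))       # an ordinary letter matches none of them
```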

Example Code

import stringprep as sp
print(repr('\u0020'))  # The space character, shown with repr() so that it is visible
print(sp.in_table_c11('\u0020'))  # The space is in table C.1.1 (ASCII space characters)
print(sp.in_table_d2('L'))  # The letter L has the left-to-right ('L') bidirectional property
print(sp.in_table_d1('L'))  # The letter L does not have the 'R' or 'AL' bidirectional property

Output

' '
True
True
False
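Tables D.1 and D.2 are typically used together for the bidirectional check described in section 6 of RFC 3454: if a string contains any 'R' or 'AL' character, it must not contain any 'L' character, and its first and last characters must both be 'R' or 'AL'. A minimal sketch of that check, with passes_bidi_check as a hypothetical helper name, could look like this:

```python
import stringprep

def passes_bidi_check(text):
    """Sketch of the RFC 3454 section 6 rule for a non-empty string."""
    has_r_al = any(stringprep.in_table_d1(ch) for ch in text)
    if not has_r_al:
        return True  # no right-to-left characters, nothing to check
    if any(stringprep.in_table_d2(ch) for ch in text):
        return False  # mixing 'R'/'AL' characters with 'L' characters is not allowed
    # The first and last characters must both be 'R' or 'AL'.
    return stringprep.in_table_d1(text[0]) and stringprep.in_table_d1(text[-1])

print(passes_bidi_check("abc"))           # only left-to-right letters
print(passes_bidi_check("\u05d0\u05d1"))  # two Hebrew letters
print(passes_bidi_check("a\u05d0"))       # Latin letter mixed with a Hebrew letter
```

The sketch leaves out the prohibition of the characters in table C.8, which the RFC also requires as part of bidirectional handling.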

Updated on: 30-Jul-2019
