Article Categories

Selected Reading

Python Program to Determine the Unicode Code Point at a given index

Python Server Side Programming Programming

A Unicode code point is a unique number that represents a character in the Unicode standard. Unicode supports over 130,000 characters including letters, symbols, and emojis. Python provides several methods to determine the Unicode code point at a specific index: ord() function, codecs module, unicodedata module, and array module.

What is a Unicode Code Point?

Every Unicode character has a unique numeric identifier called a code point. Code points are represented in hexadecimal notation with a "U+" prefix followed by a four or more digit hexadecimal number (e.g., U+0065 for 'e').

Method 1: Using ord() Function

The ord() function is the simplest way to get the Unicode code point of a character at a given index ?

Syntax

code_point = ord(string[index])

Example

def get_unicode_code_point(string, index):
    char = string[index]
    code_point = ord(char)
    return code_point

# Test the function
string = "Hello, World!"
index = 1
code_point = get_unicode_code_point(string, index)
print(f"Character '{string[index]}' at index {index} has Unicode code point U+{code_point:04X}")

Character 'e' at index 1 has Unicode code point U+0065

Method 2: Using codecs Module

The codecs module provides encoding/decoding functionality that can help extract Unicode code points ?

Example

import codecs

string = "Hello, World!"
index = 1
char = string[index]

# Get the code point using ord() (most straightforward with codecs)
code_point = ord(char)
print(f"Character '{char}' at index {index} has Unicode code point U+{code_point:04X}")

# Alternative: Using UTF-8 encoding
byte_string = char.encode('utf-8')
print(f"UTF-8 bytes: {byte_string}")

Character 'e' at index 1 has Unicode code point U+0065
UTF-8 bytes: b'e'

Method 3: Using unicodedata Module

The unicodedata module provides additional Unicode character information ?

Example

import unicodedata

string = "Hello, World! ?"
index = 14  # Star emoji

char = string[index]
code_point = ord(char)

try:
    name = unicodedata.name(char)
    print(f"Character '{char}' at index {index}")
    print(f"Unicode code point: U+{code_point:04X}")
    print(f"Character name: {name}")
except ValueError:
    print(f"Character '{char}' has no Unicode name")

Character '?' at index 14
Unicode code point: U+1F31F
Character name: GLOWING STAR

Method 4: Using array Module

The array module can store Unicode code points efficiently for multiple characters ?

Example

import array

string = "Hello, World!"
index = 1

# Create array of Unicode code points for all characters
code_points = array.array('I', [ord(char) for char in string])
code_point = code_points[index]

print(f"Character '{string[index]}' at index {index} has Unicode code point U+{code_point:04X}")
print(f"All code points: {[f'U+{cp:04X}' for cp in code_points[:5]]}...")

Character 'e' at index 1 has Unicode code point U+0065
All code points: ['U+0048', 'U+0065', 'U+006C', 'U+006C', 'U+006F']...

Comparison

Method	Complexity	Best For
`ord()`	Simple	Single character lookup
codecs module	Complex	Encoding/decoding operations
unicodedata module	Medium	Character names and properties
array module	Medium	Bulk operations on multiple characters

Conclusion

The ord() function is the most direct method to get Unicode code points at a given index. Use unicodedata for character names and properties, and array module for processing multiple characters efficiently.

Rohan Singh

Updated on: 2026-03-27T07:28:36+05:30

5K+ Views

Previous Next

Article Categories

Python Program to Determine the Unicode Code Point at a given index

What is a Unicode Code Point?

Method 1: Using ord() Function

Syntax

Example

Method 2: Using codecs Module

Example

Method 3: Using unicodedata Module

Example

Method 4: Using array Module

Example

Comparison

Conclusion

Learn More in Our Tutorials