
- Python Basics
- Python - Home
- Python - Overview
- Python - History
- Python - Features
- Python vs C++
- Python - Hello World Program
- Python - Application Areas
- Python - Interpreter
- Python - Environment Setup
- Python - Virtual Environment
- Python - Basic Syntax
- Python - Variables
- Python - Data Types
- Python - Type Casting
- Python - Unicode System
- Python - Literals
- Python - Operators
- Python - Arithmetic Operators
- Python - Assignment Operators
- Python - Augmented Addition Operator (+=)
- Python - Comparison Operators
- Python - Logical Operators
- Python - Bitwise Operators
- Python - Membership Operators
- Python - Identity Operators
- Python - Comments
- Python - User Input
- Python - Numbers
- Python - Booleans
- Python Control Statements
- Python - Control Flow
- Python - Decision Making
- Python - If else
- Python - Match-Case Statement
- Python - The for Loop
- Python - The for-else Loop
- Python - While Loops
- Python - The break Statement
- Python - The continue Statement
- Python - The pass Statement
- Python Functions & Modules
- Python - Functions
- Python - Default Arguments
- Python - Keyword Arguments
- Python - Keyword-Only Arguments
- Python - Positional Arguments
- Python - Positional-Only Arguments
- Python - Arbitrary Arguments
- Python - Variables Scope
- Python - Function Annotations
- Python - Modules
- Python - Built in Functions
- Python Strings
- Python - Strings
- Python - Slicing Strings
- Python - Modify Strings
- Python - String Concatenation
- Python - String Formatting
- Python - Escape Characters
- Python - String Methods
- Python - String Exercises
- Python Lists
- Python - Lists
- Python - Access List Items
- Python - Change List Items
- Python - Add List Items
- Python - Remove List Items
- Python - Loop Lists
- Python - List Comprehension
- Python - Sort Lists
- Python - Copy Lists
- Python - Join Lists
- Python - List Methods
- Python - List Exercises
- Python Tuples
- Python - Tuples
- Python - Access Tuple Items
- Python - Update Tuples
- Python - Unpack Tuples
- Python - Loop Tuples
- Python - Join Tuples
- Python - Tuple Methods
- Python - Tuple Exercises
- Python Sets
- Python - Sets
- Python - Access Set Items
- Python - Add Set Items
- Python - Remove Set Items
- Python - Loop Sets
- Python - Join Sets
- Python - Copy Sets
- Python - Set Operators
- Python - Set Methods
- Python - Set Exercises
- Python Dictionaries
- Python - Dictionaries
- Python - Access Dictionary Items
- Python - Change Dictionary Items
- Python - Add Dictionary Items
- Python - Remove Dictionary Items
- Python - Dictionary View Objects
- Python - Loop Dictionaries
- Python - Copy Dictionaries
- Python - Nested Dictionaries
- Python - Dictionary Methods
- Python - Dictionary Exercises
- Python Arrays
- Python - Arrays
- Python - Access Array Items
- Python - Add Array Items
- Python - Remove Array Items
- Python - Loop Arrays
- Python - Copy Arrays
- Python - Reverse Arrays
- Python - Sort Arrays
- Python - Join Arrays
- Python - Array Methods
- Python - Array Exercises
- Python File Handling
- Python - File Handling
- Python - Write to File
- Python - Read Files
- Python - Renaming and Deleting Files
- Python - Directories
- Python - File Methods
- Python - OS File/Directory Methods
- Object Oriented Programming
- Python - OOPs Concepts
- Python - Object & Classes
- Python - Class Attributes
- Python - Class Methods
- Python - Static Methods
- Python - Constructors
- Python - Access Modifiers
- Python - Inheritance
- Python - Polymorphism
- Python - Method Overriding
- Python - Method Overloading
- Python - Dynamic Binding
- Python - Dynamic Typing
- Python - Abstraction
- Python - Encapsulation
- Python - Interfaces
- Python - Packages
- Python - Inner Classes
- Python - Anonymous Class and Objects
- Python - Singleton Class
- Python - Wrapper Classes
- Python - Enums
- Python - Reflection
- Python Errors & Exceptions
- Python - Syntax Errors
- Python - Exceptions
- Python - try-except Block
- Python - try-finally Block
- Python - Raising Exceptions
- Python - Exception Chaining
- Python - Nested try Block
- Python - User-defined Exception
- Python - Logging
- Python - Assertions
- Python - Built-in Exceptions
- Python Multithreading
- Python - Multithreading
- Python - Thread Life Cycle
- Python - Creating a Thread
- Python - Starting a Thread
- Python - Joining Threads
- Python - Naming Thread
- Python - Thread Scheduling
- Python - Thread Pools
- Python - Main Thread
- Python - Thread Priority
- Python - Daemon Threads
- Python - Synchronizing Threads
- Python Synchronization
- Python - Inter-thread Communication
- Python - Thread Deadlock
- Python - Interrupting a Thread
- Python Networking
- Python - Networking
- Python - Socket Programming
- Python - URL Processing
- Python - Generics
- Python Miscellenous
- Python - Date & Time
- Python - Maths
- Python - Iterators
- Python - Generators
- Python - Closures
- Python - Decorators
- Python - Recursion
- Python - Reg Expressions
- Python - PIP
- Python - Database Access
- Python - Weak References
- Python - Serialization
- Python - Templating
- Python - Output Formatting
- Python - Performance Measurement
- Python - Data Compression
- Python - CGI Programming
- Python - XML Processing
- Python - GUI Programming
- Python - Command-Line Arguments
- Python - Docstrings
- Python - JSON
- Python - Sending Email
- Python - Further Extensions
- Python - Tools/Utilities
- Python - GUIs
- Python Questions and Answers
- Python - Programming Examples
- Python - Quick Guide
- Python - Useful Resources
- Python - Discussion
Python - Unicode System
Software applications often require to display messages output in a variety in different languages such as in English, French, Japanese, Hebrew, or Hindi. Pythonâs string type uses the Unicode Standard for representing characters. It makes the program possible to work with all these different possible characters.
A character is the smallest possible component of a text. âAâ, âBâ, âCâ, etc., are all different characters. So are âÃâ and âÃâ.
According to The Unicode standard, characters are represented by code points. A code point value is an integer in the range 0 to 0x10FFFF.
A sequence of code points is represented in memory as a set of code units, mapped to 8-bit bytes. The rules for translating a Unicode string into a sequence of bytes are called a character encoding.
Three types of encodings are present, UTF-8, UTF-16 and UTF-32. UTF stands for Unicode Transformation Format.
Python 3.0 onwards has built-in support for Unicode. The str type contains Unicode characters, hence any string created using single, double or the triple-quoted string syntax is stored as Unicode. The default encoding for Python source code is UTF-8.
Hence, string may contain literal representation of a Unicode character (¾) or its Unicode value (\u00BE).
var = "¾" print (var) var = "\u00BE" print (var)
This above code will produce the following output −
'¾' ¾
In the following example, a string â10â is stored using the Unicode values of 1 and 0 which are \u0031 and u0030 respectively.
var = "\u0031\u0030" print (var)
It will produce the following output −
10
Strings display the text in a human-readable format, and bytes store the characters as binary data. Encoding converts data from a character string to a series of bytes. Decoding translates the bytes back to human-readable characters and symbols. It is important not
to confuse these two methods. encode is a string method, while decode is a method of the Python byte object.
In the following example, we have a string variable that consists of ASCII characters. ASCII is a subset of Unicode character set. The encode() method is used to convert it into a bytes object.
string = "Hello" tobytes = string.encode('utf-8') print (tobytes) string = tobytes.decode('utf-8') print (string)
The decode() method converts byte object back to the str object. The encodeing method used is utf-8.
b'Hello' Hello
In the following example, the Rupee symbol (â¹) is stored in the variable using its Unicode value. We convert the string to bytes and back to str.
string = "\u20B9" print (string) tobytes = string.encode('utf-8') print (tobytes) string = tobytes.decode('utf-8') print (string)
When you execute the above code, it will produce the following output −
â¹ b'\xe2\x82\xb9' â¹