
What are Character Literals in C++?
In C++, character literals are constant values that can be assigned to variables of a character data type. Each one is written as a character enclosed within single quotation marks.
There are five main types of character literals:
- Narrow-character literals
- Wide-character literals
- UTF-8 character literals
- UTF-16 character literals
- UTF-32 character literals
Narrow-character Literals
These character literals are of type char, which represents a single-byte character. They typically store characters from the ASCII table, whose values range from 0 to 127.
char ch = 'A';
Here, the character 'A' has an ASCII value of 65. Each character has its own numeric code, so when a character is stored in a char variable, the variable actually holds that numeric (ASCII) value internally.
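The point about a char holding a numeric value can be checked with a cast. The following is a minimal sketch that assumes an ASCII-compatible platform; the helper names code_of and print_with_code are hypothetical and used only for illustration.

```cpp
#include <iostream>

// Hypothetical helper: returns the numeric code stored in a char.
int code_of(char c) {
    // Going through unsigned char avoids negative results on platforms
    // where plain char is signed.
    return static_cast<int>(static_cast<unsigned char>(c));
}

// Hypothetical demo function: prints a character next to its numeric code.
void print_with_code(char c) {
    std::cout << c << " -> " << code_of(c) << '\n';
}
```

On an ASCII-based platform, calling print_with_code('A') from main prints A -> 65.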
Wide-character Literals
These character literals are of type wchar_t, which represents characters from larger character sets, such as Unicode. Unlike char, which occupies only 1 byte, wchar_t occupies 2 or 4 bytes depending on the platform, which allows it to store a much wider range of characters beyond the basic ASCII table.
// here 'L' prefix denotes a wide-character literal
wchar_t ch = L'A';
Here, L'A' still represents the value 65, but wchar_t also supports characters outside the standard ASCII range, such as characters from other languages or special symbols.
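Because the size of wchar_t depends on the platform, it can help to query it rather than assume it. The sketch below uses hypothetical helper names (wide_char_size, wide_code) and assumes a wide character set that is compatible with ASCII.

```cpp
#include <cstddef>

// Hypothetical helper: size of wchar_t in bytes on the current platform.
constexpr std::size_t wide_char_size() {
    return sizeof(wchar_t);
}

// Hypothetical helper: numeric value of a wide character.
constexpr unsigned long wide_code(wchar_t c) {
    return static_cast<unsigned long>(c);
}

// A wide literal for an ASCII character keeps the same numeric value
// (assumption: the wide character set is ASCII-compatible).
static_assert(wide_code(L'A') == 65, "assumes an ASCII-compatible wide character set");
```

On common desktop platforms wide_char_size() returns 2 on Windows and 4 on Linux and macOS.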
UTF-8 Character Literals
These character literals are of type char in C++17 and char8_t in C++20 and later. They represent characters encoded in UTF-8, a widely used encoding for Unicode text, especially on the web and in cross-platform systems. A UTF-8 character literal must fit in a single UTF-8 code unit, so in practice it holds an ASCII character.
// here u8 indicates that the character is encoded in UTF-8 format
char8_t ch = u8'A';
// you can also use auto, which deduces the type as char8_t in C++20 and later
auto ch2 = u8'A';
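Since the type of a u8 character literal changed between language versions, code that must compile under both can let the compiler pick the type. This is a sketch, assuming a compiler in C++17 mode or later; the alias Utf8Char and the helper utf8_code are hypothetical names.

```cpp
#include <type_traits>

// u8 character literals require C++17 or later.
// decltype picks up whichever type the active standard assigns.
using Utf8Char = decltype(u8'A');   // hypothetical alias name

#if defined(__cpp_char8_t)
static_assert(std::is_same<Utf8Char, char8_t>::value, "C++20 and later: char8_t");
#else
static_assert(std::is_same<Utf8Char, char>::value, "C++17: char");
#endif

// Either way, the literal holds exactly one UTF-8 code unit.
constexpr unsigned utf8_code(Utf8Char c) {
    return static_cast<unsigned char>(c);
}
```

The feature-test macro __cpp_char8_t is defined by compilers that provide the char8_t type, which makes it a convenient switch here.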
UTF-16 Character Literals
UTF-16 character literals are of type char16_t, which was introduced in C++11 to handle characters encoded in the UTF-16 format. A single char16_t can represent a much larger set of Unicode characters than ASCII. UTF-16 is used in Windows APIs and for Unicode text such as Chinese characters.
// here u specifies UTF-16 character literal
char16_t ch = u'A';
// you can also use 'auto' to automatically deduce the type
auto ch2 = u'A';
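To make the range of a UTF-16 character literal concrete, the sketch below stores a Chinese character in one char16_t and contrasts it with a character that needs two code units. The names utf16_code, zhong, and grinning are hypothetical, chosen only for this illustration.

```cpp
// Hypothetical helper: numeric value of a UTF-16 code unit.
constexpr unsigned utf16_code(char16_t c) {
    return static_cast<unsigned>(c);
}

// U+4E2D (a common Chinese character) fits in one 16-bit code unit.
constexpr char16_t zhong = u'\u4E2D';
static_assert(utf16_code(zhong) == 0x4E2D, "one code unit is enough here");

// Characters above U+FFFF need two code units (a surrogate pair), so they
// can appear in a UTF-16 string literal but not in a UTF-16 character literal.
constexpr char16_t grinning[] = u"\U0001F600";
static_assert(sizeof(grinning) / sizeof(char16_t) == 3,
              "two code units plus the terminating null");
```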
UTF-32 Character Literals
UTF-32 character literals are of type char32_t, which was introduced in C++11 to represent characters encoded in the UTF-32 format. A single char32_t can hold any Unicode code point, including those that do not fit in one char16_t, such as emojis.
// here U specifies UTF-32 character literal
char32_t ch = U'A';
// you can also use 'auto' to automatically deduce the type
auto ch2 = U'A';
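As an illustration of the wider range, the sketch below stores an emoji code point in a single char32_t. The names code_point and emoji are hypothetical and exist only for this example.

```cpp
// Hypothetical helper: the Unicode code point held by a UTF-32 character.
constexpr unsigned long code_point(char32_t c) {
    return static_cast<unsigned long>(c);
}

// U+1F600 (grinning face emoji) is above U+FFFF, yet fits in one char32_t.
constexpr char32_t emoji = U'\U0001F600';
static_assert(code_point(emoji) == 0x1F600UL, "one code unit per code point");
static_assert(code_point(emoji) > 0xFFFFUL, "too large for a single char16_t");
```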
Example of Character Literals
The following example prints a narrow character literal and a UTF-8 encoded character:
#include <iostream>
using namespace std;

int main() {
    // Narrow character literal (ASCII-based)
    char ch1 = 'H';
    cout << "Narrow character: " << ch1 << endl;

    // The inverted exclamation mark used in Spanish needs two bytes in UTF-8,
    // so it cannot be a u8 character literal; a u8 string literal is used instead.
    // The cast keeps the text printable in C++20, where u8 strings use char8_t.
    const char* ch3 = reinterpret_cast<const char*>(u8"¡");
    cout << "UTF-8 character: " << ch3 << endl;
    return 0;
}
Output
For the second line to display correctly, make sure the source file is saved as UTF-8 and your terminal supports UTF-8 output.
Narrow character: H
UTF-8 character: ¡
Comparison Table
Literal Type | Prefix | Type | Encoding | Typical Size |
Narrow-character | None | char | ASCII/UTF-8 | 1 byte |
Wide-character | L | wchar_t | UTF-16/32 | 2 or 4 bytes |
UTF-8 | u8 | char (C++17), char8_t (C++20) | UTF-8 | 1 byte |
UTF-16 | u | char16_t | UTF-16 | 2 bytes |
UTF-32 | U | char32_t | UTF-32 | 4 bytes |
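The sizes in the table can be checked on a given platform with sizeof. The following sketch uses a hypothetical struct LiteralSizes and function literal_sizes; the u8 row is left out so that the code also compiles before C++17.

```cpp
#include <cstddef>

// Hypothetical struct collecting the literal sizes listed in the table.
struct LiteralSizes {
    std::size_t narrow;
    std::size_t wide;
    std::size_t utf16;
    std::size_t utf32;
};

// sizeof applied to a literal reports the size of the literal's type.
constexpr LiteralSizes literal_sizes() {
    return { sizeof('A'), sizeof(L'A'), sizeof(u'A'), sizeof(U'A') };
}
```

The narrow size is always 1, while the wide size is the value that varies between platforms.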