mbrtowc() function in C/C++ program

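The mbrtowc() function converts the next multibyte character in a byte sequence to a single wide character. It is part of the C standard library and is declared in the <wchar.h> header file (<cwchar> in C++). The "r" in the name stands for restartable: the conversion state is kept in a caller-supplied mbstate_t object, so a character that is split across several buffers can be decoded over several calls. The multibyte encoding (UTF-8, for example) is determined by the LC_CTYPE category of the current locale.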

Syntax

size_t mbrtowc(wchar_t* pwc, const char* s, size_t n, mbstate_t* ps);

Parameters

The function accepts the following parameters −

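pwc − Pointer to the location where the resulting wide character will be stored; may be NULL if the converted value is not needed
s − Pointer to the first byte of the multibyte character to be converted; if NULL, the call is equivalent to mbrtowc(NULL, "", 1, ps)
n − Maximum number of bytes to examine, starting at s
ps − Pointer to the conversion state object; if NULL, the function uses an internal static state object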

Return Value

The function returns different values based on the conversion result −

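0 − The converted character is the null wide character; the conversion state is reset to the initial state
1 to n − Number of bytes consumed to complete the converted multibyte character
(size_t)-2 − The next n bytes form an incomplete but potentially valid multibyte character; all n bytes are consumed and nothing is stored in *pwc
(size_t)-1 − An encoding error occurred; errno is set to EILSEQ, nothing is stored, and the conversion state is unspecified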

Example 1: Basic Conversion

Here's a simple example demonstrating the basic usage of mbrtowc() −

#include <stdio.h>
#include <wchar.h>
#include <locale.h>
#include <string.h>

void convertString(const char* str) {
    mbstate_t state = {0};              // initial conversion state
    wchar_t wc;
    size_t result;
    const char* ptr = str;
    size_t remaining = strlen(str);     // bytes left to convert
    
    printf("Converting: %s\n", str);
    
    while (remaining > 0) {
        result = mbrtowc(&wc, ptr, remaining, &state);
        
        if (result == 0) {
            break;                      // reached a null character
        } else if (result == (size_t)-1) {
            printf("Encoding error\n");
            break;
        } else if (result == (size_t)-2) {
            printf("Incomplete character\n");
            break;
        } else {
            // %lc expects a wint_t argument
            printf("Converted %zu bytes to wide character: %lc\n", result, (wint_t)wc);
            ptr += result;
            remaining -= result;
        }
    }
}

int main() {
    setlocale(LC_ALL, "");
    
    const char* text = "Hello";
    convertString(text);
    
    return 0;
}

The program produces the following output −

Converting: Hello
Converted 1 bytes to wide character: H
Converted 1 bytes to wide character: e
Converted 1 bytes to wide character: l
Converted 1 bytes to wide character: l
Converted 1 bytes to wide character: o

Example 2: UTF-8 Multibyte Conversion

This example demonstrates conversion of UTF-8 encoded characters −

#include <stdio.h>
#include <wchar.h>
#include <locale.h>
#include <string.h>

int main() {
    setlocale(LC_ALL, "");
    
    // "Hello ✓": the check mark U+2713 is encoded in UTF-8 as E2 9C 93
    const char utf8_str[] = "Hello \xE2\x9C\x93";
    
    mbstate_t state = {0};              // initial conversion state
    wchar_t wc;
    size_t result;
    const char* ptr = utf8_str;
    size_t remaining = strlen(utf8_str); // bytes left to convert
    
    printf("Converting UTF-8 string:\n");
    
    while (remaining > 0) {
        result = mbrtowc(&wc, ptr, remaining, &state);
        
        if (result == 0) {
            break;                      // reached a null character
        } else if (result == (size_t)-1 || result == (size_t)-2) {
            printf("Error or incomplete sequence\n");
            break;
        } else {
            printf("Converted %zu bytes, wide char code: %d\n", result, (int)wc);
            ptr += result;
            remaining -= result;
        }
    }
    
    return 0;
}

When run under a UTF-8 locale, the program produces the following output −

Converting UTF-8 string:
Converted 1 bytes, wide char code: 72
Converted 1 bytes, wide char code: 101
Converted 1 bytes, wide char code: 108
Converted 1 bytes, wide char code: 108
Converted 1 bytes, wide char code: 111
Converted 1 bytes, wide char code: 32
Converted 3 bytes, wide char code: 10003

Key Points

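Initialize the mbstate_t object to zero before the first call, and reset it after an encoding error, because the state is then unspecified
Call setlocale() before converting; the multibyte encoding is taken from the LC_CTYPE category of the current locale
Check the return value of every call to handle encoding errors and incomplete sequences
The function can be called safely from multiple threads when each thread passes its own mbstate_t object; with ps set to NULL it falls back to a shared internal state object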

Conclusion

The mbrtowc() function converts multibyte character sequences to wide characters one character at a time. It works with whatever multibyte encoding the current locale defines, and its return value tells the caller how many bytes were consumed, whether the input ended in the middle of a character, or whether the input was invalid.

Sunidhi Bansal

Updated on: 2026-03-15T12:48:16+05:30
