strlen() PHP function giving the wrong length of Unicode characters?

The PHP strlen() function counts bytes, not characters, which causes incorrect results with Unicode characters that use multiple bytes. For accurate character counting with Unicode strings, use mb_strlen() instead.

Why strlen() Fails with Unicode

Every non-ASCII character takes 2 to 4 bytes in UTF-8 encoding. Letters such as é, ñ, and ı take 2 bytes each, and strlen() counts those bytes rather than the visible characters:

<?php
    $unicodeString = 'JohnSmıth';
    echo "String: " . $unicodeString . "\n";
    echo "strlen() result: " . strlen($unicodeString) . "\n";
    echo "mb_strlen() result: " . mb_strlen($unicodeString, 'UTF-8') . "\n";
?>

Output:
String: JohnSmıth
strlen() result: 10
mb_strlen() result: 9
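
To make the byte widths visible, here is a minimal sketch that lists each character of a string next to the number of bytes it occupies. The helper name byteWidths() is hypothetical (it is not a PHP built-in), and the sketch assumes PHP 7.4 or later with the mbstring extension enabled and a source file saved as UTF-8.

```php
<?php
// Hypothetical helper (not a PHP built-in): pairs each character of a
// UTF-8 string with the number of bytes it occupies.
// Assumes PHP 7.4+ (for mb_str_split) and the mbstring extension.
function byteWidths(string $text): array
{
    $widths = [];
    foreach (mb_str_split($text, 1, 'UTF-8') as $char) {
        // strlen() on a single character returns its byte count.
        $widths[] = [$char, strlen($char)];
    }
    return $widths;
}

// One character from each UTF-8 width class: 1, 2, 3, and 4 bytes.
foreach (byteWidths('aé€😀') as [$char, $bytes]) {
    echo $char . ": " . $bytes . " byte(s)\n";
}
```

For this four-character input the widths add up to 10 bytes, which is exactly the gap between what mb_strlen() and strlen() report.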

Comparison of Methods

| Function | What It Counts | Unicode Support | Result for 'JohnSmıth' |
|---|---|---|---|
| strlen() | Bytes | No | 10 |
| mb_strlen() | Characters | Yes | 9 |

Best Practice Example

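Always pass the encoding argument to mb_strlen() so the result does not depend on server configuration. When the argument is omitted, mb_strlen() does not detect the encoding; it uses the internal encoding returned by mb_internal_encoding(), which is UTF-8 by default in PHP 5.6 and later:

<?php
    $text = 'Café résumé';

    // Byte count: each é takes 2 bytes in UTF-8
    echo "strlen(): " . strlen($text) . "\n";
    // Character count with an explicit encoding
    echo "mb_strlen(): " . mb_strlen($text, 'UTF-8') . "\n";
    // Without the argument, mb_internal_encoding() decides the result
    echo "mb_strlen() (default encoding): " . mb_strlen($text) . "\n";
?>

Output:
strlen(): 14
mb_strlen(): 11
mb_strlen() (default encoding): 11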
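
The encoding argument matters because the same bytes produce different counts under different encodings. The sketch below illustrates this and validates the input before counting. The helper name utf8Length() is an assumption made for illustration, not a PHP built-in; mb_check_encoding() and mb_strlen() are standard mbstring functions. It assumes the source file is saved as UTF-8.

```php
<?php
// Hypothetical helper (not a PHP built-in): counts characters only after
// confirming that the input is valid UTF-8.
function utf8Length(string $text): int
{
    if (!mb_check_encoding($text, 'UTF-8')) {
        throw new InvalidArgumentException('Input is not valid UTF-8');
    }
    return mb_strlen($text, 'UTF-8');
}

$text = 'Café résumé';

// Interpreted as UTF-8: 11 characters.
echo utf8Length($text) . "\n";

// Interpreted as a single-byte encoding, every byte counts as one
// character, so the result equals the byte count: 14.
echo mb_strlen($text, 'ISO-8859-1') . "\n";
```

Rejecting invalid input is one possible design choice; depending on the application, logging the problem or converting the string with mb_convert_encoding() may be more appropriate.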

Conclusion

Use mb_strlen() with UTF-8 encoding for accurate character counting in Unicode strings. The strlen() function should only be used when you specifically need byte count rather than character count.
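
As an illustration of a case where the byte count is the right measure, the sketch below checks a string against a storage limit expressed in bytes. The helper name fitsInBytes() and the 12-byte limit are assumptions chosen for the example, and the source file is assumed to be saved as UTF-8.

```php
<?php
// Hypothetical helper (not a PHP built-in): checks a string against a
// limit expressed in bytes, which is what strlen() measures.
function fitsInBytes(string $text, int $maxBytes): bool
{
    return strlen($text) <= $maxBytes;
}

$text = 'Café résumé'; // 11 characters, 14 bytes in UTF-8

// A character-based check passes for a limit of 12...
var_dump(mb_strlen($text, 'UTF-8') <= 12); // bool(true)

// ...but the string does not fit into 12 bytes.
var_dump(fitsInBytes($text, 12));          // bool(false)
```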

Updated on: 2026-03-15T09:38:02+05:30
