PHP – mb_strcut() function
The mb_strcut() function in PHP is used to extract a substring from a string based on byte positions rather than character positions. This is particularly useful when working with multi-byte character encodings like UTF-8, where a single character might use multiple bytes.
Syntax
mb_strcut(string $str, int $start, ?int $length = null, ?string $encoding = null): string
Parameters
The mb_strcut() function accepts the following parameters −
str − The input string to be cut.
start − The starting byte position. If positive, counting starts from the beginning (0-indexed). If negative, counting starts from the end of the string.
length − The maximum number of bytes to extract. If omitted or null, extracts from the start position to the end of the string.
encoding − The character encoding. If omitted, uses the internal character encoding.
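The optional parameters can be illustrated with a minimal sketch. It assumes PHP 8 with the mbstring extension enabled and an internal encoding of UTF-8 (the default); the variable names are illustrative only:

```php
<?php
// Illustrative input; any string works the same way.
$text = "Hello World";

// $length omitted: cut from byte 6 to the end of the string,
// using the internal character encoding.
$tail = mb_strcut($text, 6);

// $length passed as null so the encoding can be named explicitly.
$tailUtf8 = mb_strcut($text, 6, null, "UTF-8");

echo $tail, "\n";     // World
echo $tailUtf8, "\n"; // World
```

Naming the encoding explicitly keeps the result independent of the server's mbstring configuration.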
Return Value
Returns the extracted portion of the string as a string, or an empty string if the start position is beyond the string length.
Basic Example
Here's a simple example demonstrating how to extract a substring −
<?php
$string = "Hello World";
$result = mb_strcut($string, 6, 5, "UTF-8");
echo $result;
?>
This produces the following output −
World
Working with Multi-byte Characters
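The function never splits a multi-byte character. If the start position falls inside a character, the cut begins at that character's first byte, and if the requested length would end partway through a character, that character is left out of the result −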
<?php
$string = "こんにちは世界"; // Japanese text, 3 bytes per character in UTF-8
$result = mb_strcut($string, 0, 9, "UTF-8");
echo $result;
echo "<br>";
echo "Bytes extracted: " . strlen($result);
?>
This produces the following output −
こんに
Bytes extracted: 9
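To see why the boundary handling matters, the following sketch compares mb_strcut() with substr() and mb_substr() on the same input. The sample string and variable names are assumptions chosen for illustration; the byte limit of 4 deliberately falls in the middle of the second character:

```php
<?php
// Illustrative input: three characters, 3 bytes each in UTF-8 (9 bytes total).
$text = "日本語";

// Plain byte cut: 4 bytes is one full character plus a dangling lead byte.
$raw = substr($text, 0, 4);

// Boundary-aware byte cut: the partial second character is dropped.
$safe = mb_strcut($text, 0, 4, "UTF-8");

// Character-based cut: here 4 is a character count, so the whole string fits.
$chars = mb_substr($text, 0, 4, "UTF-8");

var_dump(mb_check_encoding($raw, "UTF-8"));  // bool(false)
var_dump(mb_check_encoding($safe, "UTF-8")); // bool(true)
echo strlen($safe), "\n";                    // 3
echo $chars, "\n";                           // 日本語
```

In short, substr() honors the byte limit but can return invalid UTF-8, mb_substr() returns valid text but counts characters rather than bytes, and mb_strcut() honors the byte limit while keeping the text valid.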
Negative Start Position
Using negative start positions to extract from the end of the string −
<?php
$string = "TutorialsPoint";
$result = mb_strcut($string, -5, 5, "UTF-8");
echo $result;
?>
This produces the following output −
Point
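Negative start positions are counted in bytes as well, which is easy to overlook with multi-byte text. The sketch below uses an illustrative three-character string and offsets that land exactly on character boundaries:

```php
<?php
// Illustrative input: 3 characters, 9 bytes in UTF-8.
$text = "日本語";

// -3 bytes from the end is the start of the last character.
$last = mb_strcut($text, -3, null, "UTF-8");

// -6 bytes from the end is the start of the middle character;
// a length of 3 bytes covers exactly that one character.
$middle = mb_strcut($text, -6, 3, "UTF-8");

echo $last, "\n";   // 語
echo $middle, "\n"; // 本
```

With single-byte text such as "TutorialsPoint", byte offsets and character offsets coincide, so the difference only shows up with multi-byte input.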
Conclusion
The mb_strcut() function is essential for byte-level string manipulation in multi-byte character environments. It ensures proper handling of character boundaries, making it safer than simple byte-cutting operations when working with UTF-8 and other multi-byte encodings.
