PHP: Iterate UTF-8 String Character by Character

Question by Guest | 2014-03-13 at 20:28

I would like to iterate through a PHP string character by character. So far, I have been using the following loop for this purpose:

$s = 'abc'; // works
$s = 'äüö'; // does not work
 
for ($i = 0; $i < strlen($s); $i++) {
  $c = $s[$i];
}

Unfortunately, this only works for ASCII characters such as "abc". As soon as the string contains non-ASCII letters such as umlauts ("äöü"), it no longer works. Apparently, strlen and the index access $s[$i] are not aware of multibyte characters and operate on single bytes (ä consists of 2 bytes in UTF-8 encoding).
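To make the byte/character mismatch visible, here is a small sketch. It assumes the source file is saved as UTF-8 and that the mbstring extension is available; the variable names are only for illustration.

```php
<?php
$s = 'äüö';

// strlen() counts bytes: each of these umlauts takes 2 bytes in UTF-8.
$byteCount = strlen($s);

// mb_strlen() counts characters when given the encoding.
$charCount = mb_strlen($s, 'UTF-8');

// $s[0] is only the first byte of "ä", which is not a valid character
// on its own. bin2hex() shows the raw byte value.
$firstByteHex = bin2hex($s[0]);

echo $byteCount, ' bytes, ', $charCount, ' characters, first byte: ', $firstByteHex, "\n";
```

So the loop above runs once per byte and hands back half of a character each time.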

How can I still iterate through the string character by character in a loop?


Progger99

Best Answer, 3 Votes

Indeed, most of PHP's standard string functions work on bytes and are not multibyte-aware.

But you can do it this way:

$s = 'abcäüö';
$arr = preg_split('//u', $s, -1, PREG_SPLIT_NO_EMPTY);
 
foreach ($arr as $v) {
  echo $v;
}

Using preg_split, you split the string into single characters and store them in an array. After that, you can easily loop through the array to access the individual characters.

By the way, we are using the modifier "u" so that the pattern and the string are treated as UTF-8.
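If you need this in several places, the split can be wrapped in a small helper. This is only a sketch: the function name utf8_chars is made up for illustration, and the error handling is one possible choice. It relies on preg_split returning false when the "u" modifier is used on a string that is not valid UTF-8.

```php
<?php
// Hypothetical helper: split a UTF-8 string into an array of characters.
function utf8_chars(string $s): array
{
    $chars = preg_split('//u', $s, -1, PREG_SPLIT_NO_EMPTY);

    // With the "u" modifier, preg_split fails on invalid UTF-8 input.
    if ($chars === false) {
        throw new InvalidArgumentException('Input is not valid UTF-8.');
    }

    return $chars;
}

foreach (utf8_chars('abcäüö') as $c) {
    echo $c, "\n";
}
```

Note that this splits by Unicode code point, so a letter written as a base character plus a separate combining accent would come out as two array entries.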
2014-03-13 at 23:32


Guest

0 Votes

I think the most efficient way to process each character in a UTF-8 (or similarly encoded) string would be to work through the string using mb_substr. In each iteration of the processing loop, mb_substr would be called twice: once to get the next character and once to get the remaining string. Only the remaining string would be passed on to the next iteration. This way, the main overhead in each iteration would be finding the next character (done twice), which takes only one to five or so operations, depending on the byte length of the character.
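A minimal sketch of the approach described above could look like this. It assumes the mbstring extension is available; the function name each_char_mb and the callback style are made up for illustration and are not from the original post.

```php
<?php
// Hypothetical helper: visit each character by repeatedly taking the
// first character and keeping only the rest of the string.
function each_char_mb(string $s, callable $fn): void
{
    while ($s !== '') {
        $c = mb_substr($s, 0, 1, 'UTF-8');    // next character
        $s = mb_substr($s, 1, null, 'UTF-8'); // remaining string
        $fn($c);
    }
}

// Usage: collect the characters into an array.
$out = [];
each_char_mb('abcäüö', function (string $c) use (&$out) {
    $out[] = $c;
});
```

Keep in mind that the second mb_substr call builds a new string for the remainder in every iteration, so for long strings it is worth measuring this against the preg_split version rather than assuming which one is faster.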

If this description is not clear, let me know and I'll provide a working PHP function.
2016-02-29 at 23:56


Guest

0 Votes

Yes, of course. A working example would be very good to see what you mean. Thank you very much for it!
2016-03-01 at 00:31
