PHP: strlen - Wrong result for Diacritics, Accents and Unicode Characters

Question by Guest | 2016-06-13 at 15:40

I am using the PHP function strlen() to check the length of user input. Unfortunately, the function only returns the expected result for strings that contain no diacritics, accents, umlauts or other non-ASCII Unicode characters.

echo strlen("abc");  // result: 3
echo strlen("äbc");  // result: 4

In my script, I need the exact number of characters, so I would like to get the result 3 for both "abc" and "äbc".

Is this a PHP bug, and what can I do to solve it?


NetLabel

Best Answer

Depending on the character encoding, a single character can take up more than one byte. A typical example is UTF-8, the encoding used by most websites. In UTF-8, the letters a-z are encoded with one byte each, whereas characters like ä, ü or ö are encoded with two bytes each.
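To make the byte counts concrete, here is a small sketch that inspects the UTF-8 bytes of ä. It assumes PHP 7.0 or later, because it uses the "\u{...}" escape so that the result does not depend on the encoding the script file is saved in. The variable name $aUmlaut is only illustrative.

```php
<?php
// "\u{00E4}" is the precomposed character ä (U+00E4), written as UTF-8.
$aUmlaut = "\u{00E4}";

echo strlen("a"), "\n";        // 1 (one byte)
echo strlen($aUmlaut), "\n";   // 2 (two bytes)
echo bin2hex($aUmlaut), "\n";  // c3a4 (the two bytes in hexadecimal)
```

This is why strlen("äbc") reports 4: two bytes for ä plus one byte each for b and c.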

The function strlen() only counts the number of bytes; it does not take the meaning or encoding of those bytes into account. You should therefore use mb_strlen() instead of strlen():

echo mb_strlen("abc", "utf-8");  // result: 3
echo mb_strlen("äbc", "utf-8");  // result: 3

Unlike strlen(), the multibyte function mb_strlen() takes the encoding into account and returns the number of characters rather than the number of bytes.

The first parameter is the string you want to check; the encoding is passed as the second parameter. If your website is UTF-8 encoded, like most websites, specify "utf-8" here. If you omit the second parameter, mb_strlen() uses the internal encoding, which can be set with mb_internal_encoding().
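For checking user input, as in the question, the call could be wrapped in a small helper. This is only a sketch: the function name hasValidLength and its parameters are made up for illustration and are not part of PHP. It assumes the mbstring extension is loaded, the input is expected to be UTF-8, and PHP 7.0 or later is used (for type declarations and the "\u{...}" escape).

```php
<?php
// Hypothetical helper, not a PHP built-in: checks whether the number of
// characters in $input lies between $min and $max (inclusive).
function hasValidLength(string $input, int $min, int $max): bool
{
    // Reject byte sequences that are not valid UTF-8 before counting.
    if (!mb_check_encoding($input, 'UTF-8')) {
        return false;
    }

    // Count characters, not bytes.
    $length = mb_strlen($input, 'UTF-8');

    return $length >= $min && $length <= $max;
}

var_dump(hasValidLength("\u{00E4}bc", 3, 3));   // bool(true):  "äbc" has 3 characters
var_dump(hasValidLength("\u{00E4}bc", 4, 10));  // bool(false): 3 is below the minimum of 4
```

Note that mb_strlen() counts Unicode code points. An ä that is written as the letter a followed by a combining diaeresis (U+0308) is therefore counted as two characters.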
2016-06-13 at 23:16
