strtolower and UTF-8

Posted on September 27, 2010
by
Charset issues is something that always made me go mad. And since I’m french and I’ve designed many french websites, it’s something I couldn’t escape, thanks to all these special chars we have in our language :)

Well, today an issue came up with the strtolower function. Look at what follows:

$t1 = 'Expérience';
$t2 = strtolower($t1);
echo $t2; // echos 'expience'

See? It drops the two letters “ér”. No matter why and how it processes (for more details about UTF-8/ISO issues, please use google), what matters is that it totally screwed up my beautiful string.

On PHP.net, you can read this function uses the charset defined in the current locale. It means that whatever the encodage of your string is (UTF-8, ISO, …), even if you work with UTF-8 all along (database tables, database connection, page chars …), it will use the current locale charset anyway.

To this, I can see two options.