PHP: UTF8 vs. ISO-8859-1
Perhaps some of you know the problem of mixed encodings and are looking for a PHP function that detects automatically whether utf8_encode() has to be called or not.
Searching the web for this issue leads to a solution from the Joomla community, which solved it this way:
<?php
$s = 'öäüß'; // string to check: UTF-8 or not
$is_utf = 1;
// The pattern matches only if the whole string consists of valid UTF-8 sequences.
if (!preg_match('%^(?:
      [\x09\x0A\x0D\x20-\x7E]            # ASCII
    | [\xC2-\xDF][\x80-\xBF]             # non-overlong 2-byte
    | \xE0[\xA0-\xBF][\x80-\xBF]         # excluding overlongs
    | [\xE1-\xEC\xEE\xEF][\x80-\xBF]{2}  # straight 3-byte
    | \xED[\x80-\x9F][\x80-\xBF]         # excluding surrogates
    | \xF0[\x90-\xBF][\x80-\xBF]{2}      # planes 1-3
    | [\xF1-\xF3][\x80-\xBF]{3}          # planes 4-15
    | \xF4[\x80-\x8F][\x80-\xBF]{2}      # plane 16
)*$%xs', $s)) {
    $is_utf = 0; // no match: at least one byte sequence is not valid UTF-8
}
?>
With short $s strings this code runs properly.
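To see why the check matters at all: utf8_encode() treats every input byte as one ISO-8859-1 character, whether or not the string is already UTF-8. The sketch below mimics that on the byte level with a helper called latin1ToUtf8(). The name is my own invention for illustration and is not part of PHP.

```php
<?php
// Hypothetical helper that does by hand what utf8_encode() does:
// every byte >= 0x80 becomes a two-byte UTF-8 sequence.
function latin1ToUtf8($s)
{
    $out = '';
    for ($i = 0, $n = strlen($s); $i < $n; $i++) {
        $c = ord($s[$i]);
        if ($c < 0x80) {
            $out .= chr($c); // ASCII stays as it is
        } else {
            $out .= chr(0xC0 | ($c >> 6)) . chr(0x80 | ($c & 0x3F));
        }
    }
    return $out;
}

$latin1 = "\xF6";                 // "ö" in ISO-8859-1 (one byte)
$utf8   = latin1ToUtf8($latin1);  // "ö" in UTF-8 (two bytes)
$double = latin1ToUtf8($utf8);    // encoded twice, shows up as "Ã¶"

echo bin2hex($utf8), "\n";   // c3b6
echo bin2hex($double), "\n"; // c383c2b6
```

Encoding a string that is already UTF-8 a second time is exactly the damage the detection is meant to prevent.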
But it did not take long until something went wrong: parsing a bigger string fails on some PHP 5.2 servers. I tried increasing the memory limit and other settings, but it did not help.
The Apache error.log says: [notice] child pid 5264 exit signal Segmentation fault (11)
The PHP script itself gives no error message or comment at all, just "Internal Error"…
You can see the bug report here, and since no solution has been found yet, we have to work on split strings instead: http://bugs.php.net/bug.php?id=37793
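A related trap: when PCRE gives up without crashing (for example because a backtrack or recursion limit was hit), preg_match() returns false rather than 0, and a plain !preg_match() check would read that as "not UTF-8". The wrapper below is a minimal sketch that keeps the three outcomes apart. The name pregMatchOrNull() is hypothetical. On builds where the crash really comes from deep recursion inside PCRE, lowering the pcre.recursion_limit ini setting may turn the segfault into such an ordinary error, but whether that applies to the server described above is an assumption on my part.

```php
<?php
// Hypothetical wrapper around preg_match():
// true = match, false = no match, null = PCRE reported an error.
function pregMatchOrNull($pattern, $subject)
{
    $result = preg_match($pattern, $subject);
    if ($result === false) {
        // preg_last_error() would tell which error occurred, e.g.
        // PREG_BACKTRACK_LIMIT_ERROR or PREG_RECURSION_LIMIT_ERROR.
        return null;
    }
    return $result === 1;
}

var_dump(pregMatchOrNull('/^a+$/', 'aaa')); // match
var_dump(pregMatchOrNull('/^a+$/', 'b'));   // no match
var_dump(pregMatchOrNull('/./u', "\xF6"));  // error: subject is not valid UTF-8
```

With a result of null the caller can decide to retry on smaller chunks instead of encoding the string by mistake.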
In the end I split the string into smaller pieces, and now preg_match works reliably again:
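<?php
// Returns $s unchanged if it is already valid UTF-8; otherwise treats it
// as ISO-8859-1 and encodes it.
function autoencoder($s)
{
    $is_utf = 1;
    $ss = SplitByLength($s, 5000); // with 10000+ => segfault (apache/php)
    foreach ($ss as $s_) {
        if (!preg_match('%^(?:
              [\x09\x0A\x0D\x20-\x7E]            # ASCII
            | [\xC2-\xDF][\x80-\xBF]             # non-overlong 2-byte
            | \xE0[\xA0-\xBF][\x80-\xBF]         # excluding overlongs
            | [\xE1-\xEC\xEE\xEF][\x80-\xBF]{2}  # straight 3-byte
            | \xED[\x80-\x9F][\x80-\xBF]         # excluding surrogates
            | \xF0[\x90-\xBF][\x80-\xBF]{2}      # planes 1-3
            | [\xF1-\xF3][\x80-\xBF]{3}          # planes 4-15
            | \xF4[\x80-\x8F][\x80-\xBF]{2}      # plane 16
        )*$%xs', $s_)) {
            $is_utf = 0; // this chunk is not valid UTF-8
            break;
        }
    }
    if (!$is_utf) {
        $s = utf8_encode($s);
    }
    return $s;
}

// Splits $string into chunks of at most $chunkLength bytes.
function SplitByLength($string, $chunkLength = 1)
{
    $Result = array();
    $length = strlen($string);
    $pos = 0;
    while ($pos < $length) {
        $end = min($pos + $chunkLength, $length);
        // Do not cut inside a multi-byte character: step back over at most
        // three continuation bytes (0x80-0xBF), otherwise a valid UTF-8
        // string could be split mid-character and wrongly fail the check.
        $back = 0;
        while ($end < $length && $back < 3 && $end - 1 > $pos
            && (ord($string[$end]) & 0xC0) === 0x80) {
            $end--;
            $back++;
        }
        $Result[] = substr($string, $pos, $end - $pos);
        $pos = $end;
    }
    return $Result;
}
?>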
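If the mbstring extension is available, the regular expression can be avoided altogether, and with it the long-string problem. The following is only a sketch of that alternative, not the Joomla solution: isValidUtf8() and ensureUtf8() are names I made up, and I assume the non-UTF-8 input is ISO-8859-1, as in the function above.

```php
<?php
// Hypothetical helper: true if $s is well-formed UTF-8.
function isValidUtf8($s)
{
    if (function_exists('mb_check_encoding')) {
        return mb_check_encoding($s, 'UTF-8');
    }
    // Fallback without mbstring: with the /u modifier PCRE validates the
    // subject first, and preg_match() returns false for malformed UTF-8.
    return preg_match('//u', $s) === 1;
}

// Hypothetical counterpart to autoencoder(): encode only if necessary.
function ensureUtf8($s)
{
    if (isValidUtf8($s)) {
        return $s;
    }
    // Assumption: anything that is not UTF-8 is ISO-8859-1.
    return function_exists('mb_convert_encoding')
        ? mb_convert_encoding($s, 'UTF-8', 'ISO-8859-1')
        : utf8_encode($s);
}

echo bin2hex(ensureUtf8("\xF6")), "\n";     // c3b6 (was ISO-8859-1, now encoded)
echo bin2hex(ensureUtf8("\xC3\xB6")), "\n"; // c3b6 (already UTF-8, untouched)
```

Note that utf8_encode() is deprecated as of PHP 8.2, which is another reason to prefer mb_convert_encoding() on current versions.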
Good luck!



