PHP: UTF8 vs. ISO-8859-1
Perhaps some of you know the problem of mixed encodings and are looking for a PHP function that automatically detects whether utf8_encode() has to be called or not.
Searching the web for this issue leads to a solution from the Joomla community, which solved it this way:
<?php
$s = 'öäüß'; // string to check: UTF-8 or not?
$is_utf = 1;
// The pattern matches only if every byte sequence in $s is well-formed UTF-8.
if (!preg_match('%^(?:
  [\x09\x0A\x0D\x20-\x7E]            # ASCII
| [\xC2-\xDF][\x80-\xBF]             # non-overlong 2-byte
| \xE0[\xA0-\xBF][\x80-\xBF]         # excluding overlongs
| [\xE1-\xEC\xEE\xEF][\x80-\xBF]{2}  # straight 3-byte
| \xED[\x80-\x9F][\x80-\xBF]         # excluding surrogates
| \xF0[\x90-\xBF][\x80-\xBF]{2}      # planes 1-3
| [\xF1-\xF3][\x80-\xBF]{3}          # planes 4-15
| \xF4[\x80-\x8F][\x80-\xBF]{2}      # plane 16
)*$%xs', $s)) {
    $is_utf = 0;
}
?>
With short $s strings this code runs fine.
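One caveat with the test string: a literal like 'öäüß' is only UTF-8 if the script file itself is saved as UTF-8. A minimal sketch with explicit byte escapes makes the check independent of the file encoding. The wrapper name looks_like_utf8() is hypothetical; the pattern is the one quoted above.

```php
<?php
// Hypothetical wrapper around the pattern quoted above.
// Returns true when $s consists only of well-formed UTF-8 sequences.
function looks_like_utf8($s)
{
    return preg_match('%^(?:
          [\x09\x0A\x0D\x20-\x7E]            # ASCII
        | [\xC2-\xDF][\x80-\xBF]             # non-overlong 2-byte
        | \xE0[\xA0-\xBF][\x80-\xBF]         # excluding overlongs
        | [\xE1-\xEC\xEE\xEF][\x80-\xBF]{2}  # straight 3-byte
        | \xED[\x80-\x9F][\x80-\xBF]         # excluding surrogates
        | \xF0[\x90-\xBF][\x80-\xBF]{2}      # planes 1-3
        | [\xF1-\xF3][\x80-\xBF]{3}          # planes 4-15
        | \xF4[\x80-\x8F][\x80-\xBF]{2}      # plane 16
    )*$%xs', $s) === 1;
}

// "ö" as UTF-8 is the two bytes C3 B6; as ISO-8859-1 it is the single byte F6.
var_dump(looks_like_utf8("\xC3\xB6")); // bool(true)
var_dump(looks_like_utf8("\xF6"));     // bool(false)
```

Pure ASCII and the empty string also pass, since both are valid UTF-8.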
But it did not take long until there was an accident: parsing a bigger string failed on a PHP 5.2 server. I tried increasing memory_limit and other settings, but it did not help.
The apache error.log says: [notice] child pid 5264 exit signal Segmentation fault (11)
The PHP script itself gives no error message or hint at all, just “Internal Error”.
You can see the bug report here; as no fix has been found yet, we have to work with split strings instead: http://bugs.php.net/bug.php?id=37793
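Since the script dies without a message, it can help to make failures of preg_match() itself visible. preg_match() returns false (not 0) when PCRE gives up, and preg_last_error() reports the reason. The sketch below uses a hypothetical helper, preg_match_status(), and lowers pcre.recursion_limit to an arbitrary value. Whether that prevents the segfault on an affected PHP 5.2 build is an assumption and has not been verified here.

```php
<?php
// Assumption: a lower recursion limit lets PCRE stop with an error
// before the process stack overflows. The value 10000 is arbitrary.
ini_set('pcre.recursion_limit', '10000');

// Hypothetical helper: reports how preg_match() ended,
// not just whether the pattern matched.
function preg_match_status($pattern, $subject)
{
    $result = preg_match($pattern, $subject);
    if ($result === false) {
        // false means PCRE failed; preg_last_error() returns one of the
        // PREG_*_ERROR constants (e.g. PREG_RECURSION_LIMIT_ERROR).
        return 'error ' . preg_last_error();
    }
    return $result === 1 ? 'match' : 'no match';
}

echo preg_match_status('/^a+$/', 'aaa'), "\n";
echo preg_match_status('/^a+$/', 'abc'), "\n";
// An invalid UTF-8 subject under the /u modifier is a reliable way
// to see the error path.
echo preg_match_status('/./u', "\xF6"), "\n";
```

Treating false and 0 as the same thing would silently classify a failed check as "not UTF-8", which is why the helper keeps them apart.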
In the end I split the string into smaller pieces, and now preg_match works reliably again:
<?php
function autoencoder($s) // encode to UTF-8 if necessary
{
    $is_utf = 1;
    $ss = SplitByLength($s, 5000); // with 10000+ => segfault (apache/php)
    foreach ($ss as $s_) {
        // A chunk that fails the pattern contains bytes that are not well-formed UTF-8.
        if (!preg_match('%^(?:
  [\x09\x0A\x0D\x20-\x7E]            # ASCII
| [\xC2-\xDF][\x80-\xBF]             # non-overlong 2-byte
| \xE0[\xA0-\xBF][\x80-\xBF]         # excluding overlongs
| [\xE1-\xEC\xEE\xEF][\x80-\xBF]{2}  # straight 3-byte
| \xED[\x80-\x9F][\x80-\xBF]         # excluding surrogates
| \xF0[\x90-\xBF][\x80-\xBF]{2}      # planes 1-3
| [\xF1-\xF3][\x80-\xBF]{3}          # planes 4-15
| \xF4[\x80-\x8F][\x80-\xBF]{2}      # plane 16
)*$%xs', $s_)) {
            $is_utf = 0;
            break; // one bad chunk is enough
        }
    }
    if (!$is_utf) {
        $s = utf8_encode($s); // treats $s as ISO-8859-1
    }
    return $s;
}
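function SplitByLength($string, $chunkLength = 1)
{
    $Result = array();
    $length = strlen($string);
    $offset = 0;
    while ($offset < $length) {
        $end = min($offset + max(1, $chunkLength), $length);
        // Do not cut inside a multi-byte character: UTF-8 continuation bytes
        // look like 10xxxxxx, so step back (at most 3 bytes) to a lead byte.
        $back = 0;
        while ($back < 3 && $end < $length && $end - 1 > $offset
            && (ord($string[$end]) & 0xC0) == 0x80) {
            $end--;
            $back++;
        }
        $Result[] = substr($string, $offset, $end - $offset);
        $offset = $end;
    }
    return $Result;
}
?>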
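Splitting is one way around the crash. Another is to skip the hand-written pattern and let mb_check_encoding() validate the string, sketched here under the assumption that the mbstring extension is available. The name autoencode_mb() is hypothetical, and the fallback relies on PCRE rejecting invalid UTF-8 subjects under the /u modifier. Note that utf8_encode() is deprecated as of PHP 8.2, so the sketch prefers mb_convert_encoding() when it exists.

```php
<?php
// Hypothetical alternative to autoencoder(): no chunking, no big regex.
function autoencode_mb($s)
{
    // Assumption: mbstring is loaded. If not, fall back to PCRE's own
    // UTF-8 validation: with /u, preg_match() returns false for
    // subjects that are not valid UTF-8.
    $isUtf8 = function_exists('mb_check_encoding')
        ? mb_check_encoding($s, 'UTF-8')
        : preg_match('//u', $s) === 1;

    if ($isUtf8) {
        return $s; // already UTF-8, leave untouched
    }

    // Everything else is treated as ISO-8859-1, as utf8_encode() does.
    return function_exists('mb_convert_encoding')
        ? mb_convert_encoding($s, 'UTF-8', 'ISO-8859-1')
        : utf8_encode($s);
}

// "ö" in ISO-8859-1 (byte F6) becomes the UTF-8 bytes C3 B6.
var_dump(autoencode_mb("\xF6") === "\xC3\xB6");     // bool(true)
// A string that is already UTF-8 is returned unchanged.
var_dump(autoencode_mb("\xC3\xB6") === "\xC3\xB6"); // bool(true)
```

Like the original, this cannot tell ISO-8859-1 apart from other single-byte encodings; it only decides "valid UTF-8 or not".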
Good luck!



