I had a similar problem with Arabic characters in article titles.
I finally fixed it myself.
Here's my hack.
In the file
includes/phpInputFilter/class.inputfilter.php,
replace the function "decode" (near line 450):
Code:
function decode($source)
{
    // url decode
    $source = html_entity_decode($source, ENT_QUOTES, "ISO-8859-1");
    // convert decimal
    $source = preg_replace('/&#(\d+);/me', "chr(\\1)", $source); // decimal notation
    // convert hex
    $source = preg_replace('/&#x([a-f0-9]+);/mei', "chr(0x\\1)", $source); // hex notation
    return $source;
}
with this hacked version:
Code:
function decode($source)
{
    // protect decimal NCRs (2 to 5 digits) behind a placeholder
    $source = preg_replace('/&#(\d{2,5});/', '#@!\1!@#', $source);
    // url decode
    $source = html_entity_decode($source, ENT_QUOTES, "ISO-8859-1");
    // convert decimal
    $source = preg_replace('/&#(\d+);/me', "chr(\\1)", $source); // decimal notation
    // convert hex
    $source = preg_replace('/&#x([a-f0-9]+);/mei', "chr(0x\\1)", $source); // hex notation
    // restore the protected NCRs
    $source = preg_replace('/#@!(\d{2,5})!@#/', '&#\1;', $source);
    return $source;
}
The idea is to let the decode function do its work without affecting these special NCR (numeric character reference) characters: they are swapped for a placeholder before decoding and put back afterwards.
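If you want to try the placeholder trick outside the CMS, here is a minimal standalone sketch. The function name decode_preserving_ncr is made up for this example and is not part of phpInputFilter. The "e" modifier no longer exists in current PHP, so I am assuming preg_replace_callback as a stand-in for the two chr() steps (PHP 7.4 or later for the arrow functions). Treat it as an illustration of the idea, not a drop-in replacement.

```php
<?php
// Hypothetical standalone version of the protect / decode / restore idea.
function decode_preserving_ncr(string $source): string
{
    // 1. Hide decimal NCRs (2 to 5 digits) behind a placeholder.
    $source = preg_replace('/&#(\d{2,5});/', '#@!$1!@#', $source);

    // 2. Decode named entities as the original function does.
    $source = html_entity_decode($source, ENT_QUOTES, 'ISO-8859-1');

    // 3. Stand-ins for the original /e-based chr() conversions.
    //    chr() only yields a single byte, which is why code points
    //    above 255 must not reach these two steps.
    $source = preg_replace_callback(
        '/&#(\d+);/',
        fn(array $m): string => chr((int) $m[1]),
        $source
    );
    $source = preg_replace_callback(
        '/&#x([a-f0-9]+);/i',
        fn(array $m): string => chr((int) hexdec($m[1])),
        $source
    );

    // 4. Put the protected NCRs back.
    return preg_replace('/#@!(\d{2,5})!@#/', '&#$1;', $source);
}

// Named entities are decoded, the Arabic NCRs come back untouched.
echo decode_preserving_ncr('&lt;b&gt; &#1605;&#1585;'), "\n";
```

Note that, like the hack itself, this only protects decimal references. A hex reference such as &#x645; still goes through the chr() step.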