Solution to "Special Characters" problem in mospagebreak

For Joomla! 1.0 Coding related discussions.
energumeno
Joomla! Apprentice
Posts: 5
Joined: Sat Oct 22, 2005 1:14 am
Location: Puerto Rico

Solution to "Special Characters" problem in mospagebreak

Post by energumeno » Sat Oct 22, 2005 3:08 am

Hi,

I wrote a post on the Mambo forum some time ago and have since seen several related posts about the problem with mospagebreak and "special characters".

Let me explain.

I am working on a site in Spanish. When I write a static content article, I want to add a page break with a title so that it is shown in the Article Index. If the title includes Spanish characters such as ñ, á, é, í, ó, ú or ¿ (to mention some), the title gets truncated. For example, when I write a title like "Este es el título" (note the accented í in título) it renders "Este es el t". A more critical example: when I write "¿Qué es esto?" it renders "Page 1", or whatever page number it has.

Some have suggested charset workarounds, which clearly don't work, and other posts have simply gone unanswered.

The thing is that I have found the source of the problem. It is related to the use of the PHP function parse_str in the createTOC function in JOOMLA_FOLDER/mambots/content/mospaging.php, lines 145 to 185. I looked at the PHP documentation for parse_str to see exactly how it works: it parses a string as if it were the query string of a URL, so when it reads:

title=something

it stores the results of the parsing in an array named $args2 and then uses $args2['title'] which would contain the string 'something'.
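
A minimal sketch of that behaviour, for readers who have not used parse_str before. The second parameter, "page", is hypothetical and is there only to show that "&" separates parameters:

```php
<?php
// parse_str() reads a query-string style input into the array passed
// as its second argument.
// "page" is a made-up parameter, used only to illustrate the "&" separator.
parse_str('title=something&page=2', $args2);

echo $args2['title'], "\n";   // the value that followed "title="
echo count($args2), "\n";     // how many parameters were found
```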

So far so good.

Now, when we use something like {mospagebreak title=Apples & Bananas}, the editor converts it to {mospagebreak title=Apples &amp; Bananas} (it converts the ampersand (&) to its HTML entity (&amp;)). The function mentioned above then replaces &amp; with & (str_replace( '&amp;', '&', $matches[0][2] )) and parses the string with parse_str. By the time the string is parsed it looks like this:

title=Apples & Bananas

so it will parse it to be 2 variables:
title = 'Apples'
Bananas = ''

so anything after the ampersand on the title is lost!!!!
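
The truncation can be reproduced outside Joomla. This is only a sketch: $raw is a hypothetical stand-in for the tag argument the plugin receives, not code taken from mospaging.php.

```php
<?php
// What the plugin effectively receives once the editor has encoded the ampersand.
// $raw is a stand-in value for illustration.
$raw = 'title=Apples &amp; Bananas';

// The same replace-then-parse step described above.
parse_str(str_replace('&amp;', '&', $raw), $args2);

// The title stops at the ampersand; "Bananas" ends up as a separate, empty parameter.
var_dump($args2);
```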

Now, back to the "special characters" issue. When we enter, for example, ¿ the editor stores &iquest;, when we enter ñ it stores &ntilde;, when we enter á it stores &aacute;, and so on. Then, when createTOC() generates the Article Index, parse_str treats the & that starts each entity as a separator, so all the text after the first & is dropped and the title is truncated.

Example:

{mospagebreak title=Este es el Título}

is converted to

{mospagebreak title=Este es el T&iacute;tulo}

which results in parse_str finding 2 variables:
title = 'Este es el T'
iacute;tulo= ''

Example2:

{mospagebreak title=¿qué es esto?}

is converted to

{mospagebreak title=&iquest;qu&eacute; es esto?}

which results in parse_str finding 3 variables:
title = ''
iquest;qu = ''
eacute; es esto? = ''

and since title is "empty" it renders Page 1 instead of the actual title.
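
Both examples can be checked with a short script. It is a sketch that assumes the stored strings are exactly the entity-encoded forms shown above; the variable names are illustrative.

```php
<?php
// Tag arguments as the editor is assumed to store them (from the two examples above).
$stored = array(
    'title=Este es el T&iacute;tulo',
    'title=&iquest;qu&eacute; es esto?',
);

$titles = array();
foreach ($stored as $raw) {
    $args2 = array();
    parse_str(str_replace('&amp;', '&', $raw), $args2);
    // Every "&" is a separator for parse_str, so the entity cuts the value short.
    $titles[] = isset($args2['title']) ? $args2['title'] : null;
}

// First title is cut before the entity; second title is an empty string.
var_dump($titles);
```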


A workaround I used for my site was to substitute the replace-and-parse line:

Code: Select all

parse_str( str_replace( '&amp;', '&', $bot[2] ), $args2 );
with

Code: Select all

parse_str( 
    // the calls run from the innermost (last line) outwards
    str_replace( '&amp;', '&', 
    str_replace( '&iquest;', '¿', 
    str_replace( '&iexcl;', '¡', 
    str_replace( '&Aacute;', 'Á',
    str_replace( '&aacute;', 'á', 
    str_replace( '&Eacute;', 'É', 
    str_replace( '&eacute;', 'é',
    str_replace( '&Iacute;', 'Í',
    str_replace( '&iacute;', 'í',
    str_replace( '&Oacute;', 'Ó', 
    str_replace( '&oacute;', 'ó',
    str_replace( '&Uacute;', 'Ú', 
    str_replace( '&uacute;', 'ú', 
    str_replace( '&Uuml;', 'Ü', 
    str_replace( '&uuml;', 'ü', 
    str_replace( '&Ntilde;', 'Ñ', 
    str_replace( '&ntilde;', 'ñ', 
    str_replace( '&amp;', '&', 
    $bot[2] ))))))))))))))))))
    , $args2 
);

This solves my problem with Spanish characters in the Article Index, but the & itself still isn't rendered in the TOC, and anything after it is truncated.

My solution works for Spanish sites and can be extended to other languages by adding the appropriate str_replace calls, but I would prefer to find another (simpler) solution.

From every document I've read, and from what I can infer from the createTOC() function, mospagebreak has only one parameter: title. So a better solution would be to just use whatever is placed after "title=" as the title in the TOC, unless I'm wrong and mospagebreak can take another parameter I don't know of. Any other suggestions?

hope this is helpful for some people.
Last edited by energumeno on Sat Oct 22, 2005 5:54 pm, edited 1 time in total.
Energúmeno
PE, MCSD, MCDBA, MCT

"Aquel que no esta orgulloso de su origen, no valdrá nunca nada, pues empieza por despreciarse a sí mismo"
-- Pedro Albizu Campos

Found it !!!! (except for &)

Post by energumeno » Sun Oct 23, 2005 2:13 am

instead of str_replace():

Code: Select all

parse_str( str_replace( '&amp;', '&', $bot[2] ), $args2 );
we can use html_entity_decode():

Code: Select all

parse_str( html_entity_decode( $bot[2] ), $args2 );
which will convert every HTML entity back to the character it represents.
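
A quick way to see the effect, and its limit, outside Joomla. This is a sketch: the explicit ENT_QUOTES flag and the UTF-8 charset are assumptions (the one-argument call above relies on PHP's defaults), so use whatever charset the site actually runs on.

```php
<?php
// Hypothetical stored tag argument, using the example title from this thread.
$raw = 'title=&iquest;qu&eacute; es esto?';

// Decode entities to real characters before parsing.
// Flags and charset are assumptions; adjust them to the site's encoding.
parse_str(html_entity_decode($raw, ENT_QUOTES, 'UTF-8'), $args2);
echo $args2['title'], "\n";   // the full title, accents included

// An encoded ampersand is decoded back to "&", so parse_str still splits there.
parse_str(html_entity_decode('title=Apples &amp; Bananas', ENT_QUOTES, 'UTF-8'), $args3);
echo $args3['title'], "\n";   // still cut off at the ampersand
```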

I think this is an important contribution to the internationalization of Joomla!. This problem annoyed me for the past few months, until I decided yesterday to dig into the code and find the source of the problem and an acceptable solution. I'm sure there are a lot of people out there wondering how to work around this problem in an easy and effective way.

The only remaining detail is the &, which is impossible to work around while using parse_str(), as explained in the original post. It would require a different approach to setting $args2, one that doesn't involve parse_str().
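
One possible shape for such an approach, offered only as a sketch and not as the fix that went into Joomla: split the argument string only where "&" is directly followed by a known parameter name, so any other ampersand stays inside the value. The function name parsePageBreakArgs and the list of known names are hypothetical, and the UTF-8 charset is an assumption.

```php
<?php
/**
 * Sketch: parse mospagebreak arguments without parse_str().
 * The function name and the $known list are hypothetical.
 */
function parsePageBreakArgs($raw, $known = array('title', 'heading'))
{
    // Turn entities back into characters first (charset is an assumption).
    $text = html_entity_decode($raw, ENT_QUOTES, 'UTF-8');

    // Split only on an "&" that is immediately followed by "knownname=".
    $names = implode('|', array_map('preg_quote', $known));
    $parts = preg_split('/&(?=(?:' . $names . ')=)/', $text);

    $args = array();
    foreach ($parts as $part) {
        if (strpos($part, '=') === false) {
            continue; // not a name=value pair, ignore it
        }
        list($key, $value) = explode('=', $part, 2);
        $args[trim($key)] = trim($value);
    }
    return $args;
}

$args2 = parsePageBreakArgs('title=Apples &amp; Bananas');
echo $args2['title'], "\n";
```

Because the value now contains raw characters such as &, it would presumably need to go through htmlspecialchars() again before being written into the page.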
Last edited by energumeno on Sun Oct 23, 2005 2:21 am, edited 1 time in total.

lpkb
Joomla! Enthusiast
Posts: 159
Joined: Mon Aug 22, 2005 9:44 pm

Re: Solution to "Special Characters" problem in mospagebreak

Post by lpkb » Thu Oct 27, 2005 3:25 pm

Thank you for this--I'm going to test it over the next few days, but it seems to be exactly what I needed.

FYI, mospagebreak can also take a "heading" parameter, which changes the name of the first item in the index. I think heading needs to come before title in the first use, but I'm not sure (e.g. {mospagebreak heading=Page one title&title=Page two title}).

Re: Solution to "Special Characters" problem in mospagebreak

Post by energumeno » Thu Oct 27, 2005 11:16 pm

please read:

http://forum.joomla.org/index.php/topic,14617.0.html

I posted the problem on the bug tracker, and hopefully it will be taken care of:

http://developer.joomla.org/sf/go/artf1809?nav=1

grisha
Joomla! Apprentice
Posts: 9
Joined: Thu Nov 01, 2007 4:37 pm

Re: Solution to

Post by grisha » Wed Aug 27, 2008 3:16 pm

Special characters for math pages
Postby grisha on Sun Aug 24, 2008 5:09 pm
When trying to insert a special symbol within the text, for example a Greek letter, one sees it displayed correctly in the editor at first. However, after hitting the "Apply" button, a question mark appears instead of the symbol. Joomla just wipes out the HTML code that contains ampersands (&). You enter &omega; but get just "?". The same happens, of course, when pasting HTML source code containing math symbols. Does anyone know what is wrong? Are some settings missing?

Thanks.

