I have been reading quite a few posts, but I could not figure out what the final status is.
1. It seems we can change the charset in the language file from iso to utf-8
2. Joomla 1.0.x does not support utf-8
What kind of problems can we expect if we still decide to go with utf-8 on the 1.0.x series? Actually, I have seen quite a few live sites using Joomla 1.0.x and utf-8 with no apparent problems.
Looking for a definitive answer. Thanks.
charset: utf-8 with 1.0x series
Moderator: wendhausen
- 55thinking
- Joomla! Enthusiast
- Posts: 183
- Joined: Mon Sep 05, 2005 8:58 am
- Location: Madrid
charset: utf-8 with 1.0x series
55 Thinking - Strategy Design Technology
Good looking, Fast and Usable web solutions
http://www.55thinking.com/
- eyesofkids
- Joomla! Enthusiast
- Posts: 238
- Joined: Tue Aug 23, 2005 6:04 am
- Location: Taipei , Taiwan
Re: charset: utf-8 with 1.0x series
55thinking wrote: 1. It seems we can change the charset in the language file from iso to utf-8
2. Joomla 1.0.x does not support utf-8
Yes, you can do it. But you need to take care of the database charset and the IE utf-8 bugs.
Some PHP functions, such as substr and strlen, and some of the regular expression functions do not support utf-8 strings.
So there are some minor problems when Joomla! 1.0.x handles these non-English strings.
IMO, the utf-8 support in Joomla 1.0.x is better than in the old Mambo.
Eddy Chang
All day long the superior man is creatively active. At nightfall his mind is still beset with cares. ~ I-CHING
Eddy Chang (Taipei, Taiwan)
Member of the Traditional Chinese Joomla Translation Team
http://www.joomla.org.tw
- davidgal
- Joomla! Guru
- Posts: 963
- Joined: Sat Aug 20, 2005 9:19 am
- Location: Israel
Re: charset: utf-8 with 1.0x series
55thinking wrote: I have been reading quite a few posts, but I could not figure out what the final status is.
1. It seems we can change the charset in the language file from iso to utf-8
2. Joomla 1.0.x does not support utf-8
What kind of problems can we expect if we still decide to go with utf-8 on the 1.0.x series? Actually, I have seen quite a few live sites using Joomla 1.0.x and utf-8 with no apparent problems.
Looking for a definitive answer. Thanks.
Hi there,
I'll do my best to point out the issues with utf-8 in the Joomla 1.0.x series.
To be fully utf-8 compatible, the following requirements must be met:
- The database needs to be utf-8 compliant; otherwise there is a danger of data truncation. A 20-character string in utf-8 may be up to 60 bytes long (utf8 in MySQL uses up to 3 bytes per character). A varchar field defined as utf-8 with a length of 20 can safely store 20 utf-8 characters, because the field adapts to the byte length. In a non-utf-8 database, the same varchar(20) field will truncate the string after 20 bytes.
- The connection between the database and the PHP application needs to use utf-8 encoding; otherwise unwanted conversions will occur and data corruption will result.
- Multibyte string functions need to be used when the data is encoded as utf-8. Unfortunately, PHP's native string functions are not utf-8 aware and can seriously corrupt data (see http://www.phpwact.org/php/i18n/utf-8). There is an extension for PHP 4 and 5 that provides utf-8 aware string functions ('mbstring'). However, this extension is not always loaded or installed, and the PHP code needs to be modified to call the appropriate mb_ versions of the string functions. (PHP 6 is planned to be fully Unicode and utf-8 aware.)
- The HTML page encoding needs to be set to utf-8 (by setting the charset in the language file).
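To make the truncation and string function points above concrete, here is a minimal sketch of how PHP's byte-oriented functions see utf-8 text. It is not taken from the Joomla code; the sample strings are assumptions chosen for illustration, written with explicit byte escapes so the result does not depend on the encoding of the source file. The mb_ calls only run if the mbstring extension is loaded.

```php
<?php
// Illustration only: the sample strings are assumptions, not Joomla data.

// 20 euro signs; each one is the three-byte utf-8 sequence E2 82 AC.
$price_tag = str_repeat("\xE2\x82\xAC", 20);
echo strlen($price_tag), "\n";        // 60: strlen() counts bytes

// "Zürich" in utf-8; the u-umlaut is the two-byte sequence C3 BC.
$city = "Z\xC3\xBCrich";
echo strlen($city), "\n";             // 7 bytes for 6 characters

// substr() cuts at a byte offset, so it can split a character in half.
$broken = substr($city, 0, 2);
echo bin2hex($broken), "\n";          // 5ac3: "Z" plus a dangling lead byte

// The mbstring versions count characters, when the extension is available.
if (function_exists('mb_strlen')) {
    echo mb_strlen($city, 'UTF-8'), "\n";                   // 6
    echo bin2hex(mb_substr($city, 0, 2, 'UTF-8')), "\n";    // 5ac3bc
}
```

The dangling C3 byte left behind by substr() is exactly the kind of corruption that shows up later as a broken character in the page or in the database.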
Why does Joomla 1.0.x seem to work fine when only the charset is set to utf-8?
This happens only when pure English is used. The reason is that all English characters are in lower (7-bit) ASCII and do not include any extended ASCII characters. In this special case utf-8 is byte-for-byte identical to iso-8859-1, as all characters are single-byte characters. The problems begin with European languages that use Latin characters with diacritics (umlauts, accents etc.) and with non-Latin languages. If you are only going to use English, you might as well stay with iso-8859-1. If you are going to use other languages, please check out the workaround below.
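A small sketch of that special case, assuming nothing about Joomla itself: the helper name is_pure_ascii is hypothetical, and the sample strings are illustrations written as explicit bytes. A string that passes the check has the same bytes in iso-8859-1 and utf-8; anything else does not.

```php
<?php
// Hypothetical helper: true when every byte is in the 7-bit ASCII range,
// in which case the string is byte-identical in iso-8859-1 and utf-8.
function is_pure_ascii($text)
{
    return preg_match('/^[\x00-\x7F]*\z/', $text) === 1;
}

$english = "Read more";
$latin1  = "caf\xE9";        // "café" as iso-8859-1: é is the single byte E9
$utf8    = "caf\xC3\xA9";    // "café" as utf-8: é is the two bytes C3 A9

var_dump(is_pure_ascii($english));  // bool(true)
var_dump(is_pure_ascii($latin1));   // bool(false)
var_dump(is_pure_ascii($utf8));     // bool(false)
var_dump($latin1 === $utf8);        // bool(false): same word, different bytes
```

This is why an English-only site can appear to work after changing only the charset, while the first accented character exposes the mismatch.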
How is all this solved for Joomla 1.5?
See: http://dev.joomla.org/component/option, ... d,33/p,16/
Is there a workaround to apply utf-8 in Joomla 1.0.x series?
Yes. Here is a quick guideline for getting Joomla 1.0.x to work with utf-8:
- Use MySQL version 4.1.2 or newer (older versions don't support utf-8).
- Create an empty database manually before installing Joomla. Set the character set to utf8 when creating it, with some collation (utf8_general_ci is the default and should be OK).
- Convert the language files to utf-8 (all language files, including those for editors, components etc.).
- Install Joomla using the pre-existing database. After installation, check that the database has utf8 encoding for all text fields (just in case Joomla created a new database and is not working on the pre-created one).
- Set 'charset=utf-8' in the _ISO define in the language file.
- Uncomment one line of code in the includes/database.php file at about line 102 (the second line below).
Code:
$this->_table_prefix = $table_prefix;
//@mysql_query("SET NAMES 'utf8'", $this->_resource); // THIS IS THE LINE TO UNCOMMENT
$this->_ticker = 0;
$this->_log = array();
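For the language file conversion step in the list above, any editor or tool that can re-save a file as utf-8 will do. As one possible sketch, assuming the files really are iso-8859-1 and that PHP's iconv extension is available, a small script could handle it. The function name and the example paths are hypothetical and not part of Joomla.

```php
<?php
// Hypothetical helper, not part of Joomla: re-encodes one file from
// iso-8859-1 to utf-8. Assumes the input really is iso-8859-1.
function convert_latin1_file_to_utf8($source_path, $target_path)
{
    $latin1 = file_get_contents($source_path);
    if ($latin1 === false) {
        return false;                       // could not read the source
    }

    $utf8 = iconv('ISO-8859-1', 'UTF-8', $latin1);
    if ($utf8 === false) {
        return false;                       // conversion failed
    }

    return file_put_contents($target_path, $utf8) !== false;
}

// Example call with hypothetical paths; writes a converted copy and
// leaves the original untouched.
// convert_latin1_file_to_utf8('language/english.php', 'language/english.utf8.php');
```

Running this on a file that is already utf-8 would encode it twice and garble it, so it is worth keeping a backup. The _ISO define inside the language file still has to be changed by hand.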
I sincerely hope that this is definitive enough.
Last edited by davidgal on Wed Apr 12, 2006 10:27 pm, edited 1 time in total.
David Gal
- eyesofkids
- Joomla! Enthusiast
- Posts: 238
- Joined: Tue Aug 23, 2005 6:04 am
- Location: Taipei , Taiwan
Re: charset: utf-8 with 1.0x series
A very clear description from davidgal.
It's a good article for all translators and non-English Joomla! users.
I will translate it into Chinese and post it on my site.
Thank you, davidgal!
All day long the superior man is creatively active. At nightfall his mind is still beset with cares. ~ I-CHING
Eddy Chang (Taipei, Taiwan)
Member of the Traditional Chinese Joomla Translation Team
http://www.joomla.org.tw
- 55thinking
- Joomla! Enthusiast
- Posts: 183
- Joined: Mon Sep 05, 2005 8:58 am
- Location: Madrid
Re: charset: utf-8 with 1.0x series
Thanks a lot, David.
You've provided an excellent, definitive answer!
55 Thinking - Strategy Design Technology
Good looking, Fast and Usable web solutions
http://www.55thinking.com/