1.0.13 with UTF-8 by mistake

haakonmm
Joomla! Fledgling
Posts: 3
Joined: Tue Sep 30, 2008 7:17 pm

Post by haakonmm » Tue Sep 30, 2008 7:46 pm

I installed Joomla about a year ago and all went fine. I didn't think much about charsets, collations etc. and didn't have much trouble with them either. But as I gradually learned more, I noticed that using 1.0.x with UTF-8 is considered a bad idea, as this sticky clearly states: http://forum.joomla.org/viewtopic.php?f=309&t=98880

The thing is that my Joomla installation runs from a database with "utf8_unicode_ci" tables only, which is the server default for new databases. All the Norwegian characters seem fine both on the site and in the database. I have installed some components like CB, Docman and RSform. These also appear to be just fine, and have created their own utf8 tables automatically. What worries me is the future of my site, which has about 50 members, when adding more components, content etc. On my test site I noticed that most of the gallery components available created "latin1_swedish_ci" tables instead of utf8. I would therefore very much appreciate help with the following questions:

- Should I try to convert the database to iso/latin_general?
- Can I do this simply by dumping the db and converting it in the process? How is this done?
- Is it ok to mix both "utf" and "latin" tables in the same db?
- Is there anything else I should consider?

Thanks in advance.

infograf768
Joomla! Master
Posts: 19069
Joined: Fri Aug 12, 2005 3:47 pm
Location: **Translation Matters**

Post by infograf768 » Wed Oct 01, 2008 8:18 am

Indeed, J 1.0.x is not FULLY utf8-aware and some problems may appear with multibyte languages, but it is nevertheless quite stable with utf8 when configured correctly.

Some 3pd language files are not utf8. Solution: edit these php files and save them as utf8 without BOM.
Some extensions do create iso_swedish tables. If this is the case, edit the package before install to take off from the table creation the mysam ISO to get a correct collation.
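
For the first point, the re-saving can also be scripted instead of done by hand in an editor. A minimal Python sketch, assuming the file is currently iso-8859-1; the function name and the sample language constant are made up for illustration and are not part of Joomla:

```python
import codecs


def to_utf8_without_bom(raw, source_encoding="iso-8859-1"):
    """Return the file's bytes as UTF-8 with no byte order mark."""
    # Drop a UTF-8 BOM if an editor added one.
    if raw.startswith(codecs.BOM_UTF8):
        raw = raw[len(codecs.BOM_UTF8):]
    try:
        # Already valid UTF-8 (or plain ASCII): leave the bytes alone.
        raw.decode("utf-8")
        return raw
    except UnicodeDecodeError:
        # Otherwise assume the legacy encoding and convert.
        return raw.decode(source_encoding).encode("utf-8")


# Hypothetical line from a language file, stored as iso-8859-1.
legacy_line = "DEFINE('_SEARCH_TITLE','Søk');".encode("iso-8859-1")
print(to_utf8_without_bom(legacy_line))
```

Running the function twice on the same bytes gives the same result, so a file that is already utf8 is not double-encoded.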

Now the best solution is to move to J 1.5.7 where all is utf8.
CB, Docman, phocagallery are utf8 aware, some using the legacy layer.
Jean-Marie Simonet / infograf · http://www.info-graf.fr
---------------------------------
ex-Joomla Translation Coordination Team • ex-Joomla! Production Working Group

Post by haakonmm » Wed Oct 01, 2008 1:54 pm

Thanks for the reply. I played around with my test site yesterday and noticed a thing. As previously stated, my site runs with a clean utf-8 database, but when I checked the site in my browser it clearly states that it's running with iso-8859-1 character encoding. When I tried to change the "charset=iso-8859-1" to "charset=utf-8" in my language file and save it as utf-8 with Notepad++, to mirror my database, I just got funny-looking Norwegian characters. Shouldn't running both the language.php file and the database with utf-8 work just fine?

I have to admit I'm not too keen on upgrading to 1.5 just yet, even if it is the best way to go. I spent a lot of time configuring my current site to meet my needs. Also, my current template doesn't have a 1.5 version. Therefore, as previously asked, is it possible to dump the database as iso-8859-1 and upload it again to a new database?

I didn't quite get what you meant by "Some extensions do create iso_swedish tables. If this is the case, edit the package before install to take off from the table creation the mysam ISO to get a correct collation.". Could you please rephrase?

Again, thanks!

Post by infograf768 » Wed Oct 01, 2008 3:53 pm

haakonmm wrote: Thanks for the reply. I played around with my test site yesterday and noticed a thing. As previously stated, my site runs with a clean utf-8 database, but when I checked the site in my browser it clearly states that it's running with iso-8859-1 character encoding. When I tried to change the "charset=iso-8859-1" to "charset=utf-8" in my language file and save it as utf-8 with Notepad++, to mirror my database, I just got funny-looking Norwegian characters. Shouldn't running both the language.php file and the database with utf-8 work just fine?
There are a few things to do to get utf8 in joomla 1.0.x:
http://help.joomla.org/component/option ... temid,268/
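
The funny-looking characters in the quote above are typical of a mismatch somewhere between the stored bytes and the charset used to read them. A small Python illustration of the effect, not a diagnosis of this particular site; the sample word is my own:

```python
# Illustration only: what a charset mismatch does to Norwegian letters.
text = "blåbærsyltetøy"  # hypothetical sample string

utf8_bytes = text.encode("utf-8")

# UTF-8 bytes read as iso-8859-1: each Norwegian letter turns into two characters.
misread = utf8_bytes.decode("iso-8859-1")
print(misread)

# Same charset on both sides: the text comes back unchanged.
print(utf8_bytes.decode("utf-8"))

# The opposite mismatch: iso-8859-1 bytes are not valid UTF-8, so the
# decoder substitutes a replacement character for each Norwegian letter.
latin1_bytes = text.encode("iso-8859-1")
repaired = latin1_bytes.decode("utf-8", errors="replace")
print(repaired)
```

Which of the two patterns shows up on the page can hint at which side (file, database connection or declared charset) is out of step.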
haakonmm wrote: I have to admit I'm not too keen on upgrading to 1.5 just yet, even if it is the best way to go. I spent a lot of time configuring my current site to meet my needs. Also, my current template doesn't have a 1.5 version. Therefore, as previously asked, is it possible to dump the database as iso-8859-1 and upload it again to a new database?
This is possible. The new empty database obviously has to use a latin1 charset and collation. I would check the contents, though, both in the original tables (in phpMyAdmin) and in the dump, to be sure nothing is wrong.
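
One way to check the dump is to look for characters that iso-8859-1 cannot represent before loading it into the new database. A rough Python sketch of such a check; the function name and the sample lines are invented for illustration, and a real dump would be read from a file opened as utf-8:

```python
def find_non_latin1(lines):
    """Return (line_number, character) pairs that iso-8859-1 cannot encode."""
    problems = []
    for number, line in enumerate(lines, start=1):
        for char in line:
            try:
                char.encode("iso-8859-1")
            except UnicodeEncodeError:
                problems.append((number, char))
    return problems


# Hypothetical dump excerpt: the Norwegian letters fit in iso-8859-1,
# the Polish letter on the second line does not.
sample_dump = [
    "INSERT INTO jos_content (title) VALUES ('Blåbær og rømme');",
    "INSERT INTO jos_content (title) VALUES ('Pris: 10 zł');",
]

for number, char in find_non_latin1(sample_dump):
    print(f"line {number}: {char!r} has no iso-8859-1 equivalent")
```

An empty result suggests the text itself can survive the conversion; it does not prove the dump was written with the right charset.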
haakonmm wrote: I didn't quite get what you meant by "Some extensions do create iso_swedish tables. If this is the case, edit the package before install to take off from the table creation the mysam ISO to get a correct collation.". Could you please rephrase?
Every extension which creates tables in the db contains some code to do so. This code may state the charset of the table.
In that case, one has to edit the file to remove the default charset, so that the table inherits the database default, and then repackage the extension.

) ENGINE=MyISAM DEFAULT CHARSET=latin1
should be edited to

) ENGINE=MyISAM 
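
If an extension ships many such queries, the same edit can be applied with a small script rather than by hand. A minimal Python sketch, assuming the install queries are available as plain text; the table name and the helper are hypothetical, and the pattern only covers the table-level `DEFAULT CHARSET` / `DEFAULT CHARACTER SET` forms:

```python
import re

# Matches a table-level clause such as "DEFAULT CHARSET=latin1" or
# "DEFAULT CHARACTER SET latin1 COLLATE latin1_swedish_ci".
CHARSET_CLAUSE = re.compile(
    r"\s+DEFAULT\s+(?:CHARSET|CHARACTER\s+SET)\s*=?\s*\w+"
    r"(?:\s+COLLATE\s*=?\s*\w+)?",
    re.IGNORECASE,
)


def strip_table_charset(sql):
    """Remove explicit table charset clauses so the database default applies."""
    return CHARSET_CLAUSE.sub("", sql)


# Hypothetical install query, as it might appear in an extension package.
install_sql = (
    "CREATE TABLE `jos_example` (\n"
    "  `id` int(11) NOT NULL auto_increment,\n"
    "  `title` varchar(255) NOT NULL,\n"
    "  PRIMARY KEY (`id`)\n"
    ") ENGINE=MyISAM DEFAULT CHARSET=latin1;"
)

print(strip_table_charset(install_sql))
```

Column-level charset declarations, if any, are left untouched by this pattern and would still need a manual look.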

Post by haakonmm » Wed Oct 01, 2008 4:23 pm

Ok, just one more question before I start. How about this setting in phpMyAdmin: "MySQL connection collation: utf8_unicode_ci"? Do I have to change it if I try to use the iso-8859-1 charset instead? And I suppose it affects all my databases?

Post by infograf768 » Thu Oct 02, 2008 5:15 am

The collation of the database should take precedence over the general connection collation, as far as I know.