Joomla! Discussion Forums



It is currently Tue Feb 09, 2010 9:21 pm (All times are UTC )

 


Forum rules

Forum Rules
Absolute Beginner's Guide to Joomla! <-- please read before posting, this means YOU.
Forum Post Assistant - If you are serious about wanting help, you will use this tool to help you post.



Post new topic Reply to topic  [ 16 posts ] 
Author Message
Posted: Wed Nov 26, 2008 4:40 pm 
Joomla! Apprentice
Joomla! Apprentice
Offline

Joined: Thu Sep 14, 2006 7:51 pm
Posts: 16
I run a Vietnamese web site and am migrating from version 1.0.13 to 1.5.8. In version 1.0.13, I made a mistake by setting the database collation to latin1_swedish_gi. As a result, this database stores the Unicode characters in an unreadable format, for example, the string (readable):

Tôi không hiểu ý bạn muốn nói gì

is stored as (unreadable):

Tôi không hiểu ý bạn muốn nói gì

The old site, however, has no problem displaying all the text as readable format. When I create the new site, I set the new database collation to utf8_general_ci. With the migrating tool, I have successfully migrated and converted the text into readable format. But this tool only handles the core tables. I have the extension mXcomment and I must migrate the data manually. I need to convert the text into readable format because the new site cannot display it correctly. So I need to do one of the followings:

1) Convert the text in the database into readable format. Or:
2) Reconfigure the php files on the new site to make it display the text in the desired format.

Can someone give me some direction? Thanks.

Note: All the text entered after the migration is stored as readable format, there's no problem with that.


Top
   
 
Posted: Thu Nov 27, 2008 1:10 am 
User avatar
Joomla! Ace
Joomla! Ace
Offline

Joined: Thu Aug 18, 2005 9:27 am
Posts: 1279
Location: Toowoomba, Queensland, Australia
You can write a migration plugin to handle the extra tables, its not too hard to do in practice.

http://blog.joomlatools.org/2008/04/rol ... ugins.html

Alternatively you can use iconv or a similar tool to convert the SQL file from latin1 to utf8, but if you've got enough PHP experience (you don't need much) building a migrator plugin is pretty easy and will get it in the main migration SQL file which will be converted with everthing else.

_________________
Sam Moffatt
Dev WG: Update and Install Technologies, Authentication Systems, Migrator Component, Summer of Code Mentor
S&I WG: Package Builder Maintainer, JoomlaCode Backend Systems
Core Team: Collector of Titles


Top
   
 
Posted: Thu Nov 27, 2008 5:31 am 
User avatar
Joomla! Champion
Joomla! Champion
Offline

Joined: Mon Aug 29, 2005 10:17 am
Posts: 7459
Location: Netherlands/S'pore/Bali/North America
the people at Phoca have made this very practical tool to change your existing site's collation in less than a minute. Works flawless and is free! http://www.phoca.cz/phoca-changing-collation/

Leo 8)

_________________
For Specialized & Individual Support:: http://gws-desk.com
Joomla Specialized Development Solutions:: http://gws-studio.com
Professional & Dedicated Joomla Hosting :: gws-host.com
NEW: Specialized K2 Development Support at GWS-Desk


Top
   
 
Posted: Thu Nov 27, 2008 7:40 am 
User avatar
Joomla! Enthusiast
Joomla! Enthusiast
Offline

Joined: Fri May 12, 2006 8:37 am
Posts: 119
Location: Sofia
Using iconv does not produce very consistent results. There were at least three symbols that remained corrupted. Instead, use the following procedure:

1. Export your DB
2. Open it with Notepad++
3. Convert it to UTF-8 w/o BOM
4. Check that your text is readable
5. Replace all your present collation with utf8_general_ci
6. Save as UTF-8
7. Copy your online DB to a new DB and change to it
8. Delete all tables of the new DB
9. Go to Operations and select utf8_general_ci for collation
10. Import the converted dump
11. Change the access in your config to point to the new DB

_________________
Prof. Dr. Plamen Gradinarov - CEO and Founder, Eurasia Ltd


Top
  E-mail  
 
Posted: Thu Nov 27, 2008 8:39 am 
User avatar
Joomla! Champion
Joomla! Champion
Offline

Joined: Mon Aug 29, 2005 10:17 am
Posts: 7459
Location: Netherlands/S'pore/Bali/North America
Absolute no need Imago. The tool i pointed you at does exactly what you do manually with a lot of work and a lot of error chances. It is a full automated script which works very well. Why doing things manually when you can have the script do this for you?
I consider this an advise which is useful for experienced users but still not needed. You think I will spend 10 minutes on doing this when i can do it in 10 seconds automated without errors?

Attachment:
collatool.jpg


You do not have the required permissions to view the files attached to this post.

_________________
For Specialized & Individual Support:: http://gws-desk.com
Joomla Specialized Development Solutions:: http://gws-studio.com
Professional & Dedicated Joomla Hosting :: gws-host.com
NEW: Specialized K2 Development Support at GWS-Desk


Top
   
 
Posted: Thu Nov 27, 2008 3:18 pm 
User avatar
Joomla! Enthusiast
Joomla! Enthusiast
Offline

Joined: Fri May 12, 2006 8:37 am
Posts: 119
Location: Sofia
Changing the collation of DB, tables and columns is only a small portion of the task. The most important thing is to convert the text, say, from cp1251 to utf8. Since there is no option to define the starting encoding of the text in the DB, I doubt the script does this kind of conversion. Would be happy if you prove me wrong in this disbelief.

_________________
Prof. Dr. Plamen Gradinarov - CEO and Founder, Eurasia Ltd


Top
  E-mail  
 
Posted: Thu Nov 27, 2008 4:01 pm 
User avatar
Joomla! Champion
Joomla! Champion
Offline

Joined: Mon Aug 29, 2005 10:17 am
Posts: 7459
Location: Netherlands/S'pore/Bali/North America
Load and try before commenting. So go ahead as expected, setup a default Joomla site (if convenient for you as all beginners with Fantastico) and run the script. For me the script runs well. I can give you access to one of our 25 development sites if you have yourself no options to test this?

Leo

_________________
For Specialized & Individual Support:: http://gws-desk.com
Joomla Specialized Development Solutions:: http://gws-studio.com
Professional & Dedicated Joomla Hosting :: gws-host.com
NEW: Specialized K2 Development Support at GWS-Desk


Top
   
 
Posted: Thu Nov 27, 2008 5:13 pm 
User avatar
Joomla! Enthusiast
Joomla! Enthusiast
Offline

Joined: Fri May 12, 2006 8:37 am
Posts: 119
Location: Sofia
As I expected, the script does NOT convert cp1251 text to utf8. Instead, it managed to ruin 120,000 entries in a Russian encyclopedia. They simply disappeared - ALL of them. So guys, if your language is not English, DO NOT use that killer script.

_________________
Prof. Dr. Plamen Gradinarov - CEO and Founder, Eurasia Ltd


Top
  E-mail  
 
Posted: Fri Nov 28, 2008 1:01 am 
User avatar
Joomla! Ace
Joomla! Ace
Offline

Joined: Thu Aug 18, 2005 9:27 am
Posts: 1279
Location: Toowoomba, Queensland, Australia
The topic states latin1 which isn't cp1251, so this is more than likely the reason its not working. If you find that iconv is having issues with a few characters than I'd suggest that this is potentially one of a few problems:
1) Its a bug in iconv (very unlikely, but still possible)
2) The characters you have in there aren't cp1251 - they may already be utf-8 or some other character set again. iconv can't help you where you've got dirty data, most automated tools will have issues there anyway.

If your data is dirty then getting that converted properly is a real pain where you will have to manually eye ball most entries to look for the problem. There was a blog on O'Reilly that I've pasted a few times which is one guy converting his old mysql db into utf-8 and having insane amounts of fun with different character sets and that may help you in this specific case. The problem itself is dirty data and hopefully once we're all on UTF-8 it'll disappear for the majority of users (then we end up with issues for Japanese and Chinese characters but thats another problem).

_________________
Sam Moffatt
Dev WG: Update and Install Technologies, Authentication Systems, Migrator Component, Summer of Code Mentor
S&I WG: Package Builder Maintainer, JoomlaCode Backend Systems
Core Team: Collector of Titles


Top
   
 
Posted: Fri Nov 28, 2008 4:14 am 
User avatar
Joomla! Champion
Joomla! Champion
Offline

Joined: Mon Aug 29, 2005 10:17 am
Posts: 7459
Location: Netherlands/S'pore/Bali/North America
Imago wrote:
As I expected, the script does NOT convert cp1251 text to utf8. Instead, it managed to ruin 120,000 entries in a Russian encyclopedia. They simply disappeared - ALL of them. So guys, if your language is not English, DO NOT use that killer script.
You should be a little bit more careful in your postings. The thread reads: Migrating from latin1_swedish_ci to utf8_general_ci database The script does this excellent.

Next I just changed the collation of a default Joomla site from utf8_general_ci into cp1251_ukrainian_ci without a hick. examples of the output:

ALTER DATABASE joomad1_jo151 COLLATE cp1251_ukrainian_ci ... OK
ALTER TABLE jos_banner COLLATE cp1251_ukrainian_ci ... OK
ALTER TABLE jos_banner MODIFY type varchar(30) CHARACTER SET binary ... OK
ALTER TABLE jos_banner MODIFY type varchar(30) COLLATE cp1251_ukrainian_ci DEFAULT 'banner' ... OK
ALTER TABLE jos_banner MODIFY name varchar(255) CHARACTER SET binary ... OK
ALTER TABLE jos_banner MODIFY name varchar(255) COLLATE cp1251_ukrainian_ci ... OK
ALTER TABLE jos_banner MODIFY alias varchar(255) CHARACTER SET binary ... OK
ALTER TABLE jos_banner MODIFY alias varchar(255) COLLATE cp1251_ukrainian_ci ... OK
ALTER TABLE jos_banner MODIFY imageurl varchar(100) CHARACTER SET binary ... OK
ALTER TABLE jos_banner MODIFY imageurl varchar(100) COLLATE cp1251_ukrainian_ci ... OK
ALTER TABLE jos_banner MODIFY clickurl varchar(200) CHARACTER SET binary ... OK
ALTER TABLE jos_banner MODIFY clickurl varchar(200) COLLATE cp1251_ukrainian_ci ... OK
ALTER TABLE jos_banner MODIFY editor varchar(50) CHARACTER SET binary ... OK
ALTER TABLE jos_banner MODIFY editor varchar(50) COLLATE cp1251_ukrainian_ci ... OK
ALTER TABLE jos_banner MODIFY custombannercode text CHARACTER SET binary ... OK
ALTER TABLE jos_banner MODIFY custombannercode text COLLATE cp1251_ukrainian_ci ... OK
ALTER TABLE jos_banner MODIFY description text CHARACTER SET binary ... OK
ALTER TABLE jos_banner MODIFY description text COLLATE cp1251_ukrainian_ci ... OK
ALTER TABLE jos_banner MODIFY tags text CHARACTER SET binary ... OK
ALTER TABLE jos_banner MODIFY tags text COLLATE cp1251_ukrainian_ci ... OK
ALTER TABLE jos_banner MODIFY params text CHARACTER SET binary ... OK
ALTER TABLE jos_banner MODIFY params text COLLATE cp1251_ukrainian_ci ... OK


proof of success is attached.

Also I can show you sites which are in German (no English) which is a language that has very special characters where this script runs very well. It also works well with Greek amongst others........Do not generalize all issues.......The script is a fabulous piece of code for the task.

You can find a specific non Joomla script which works also very, very well (and we use that for non-joomla sites where we run into issues with character sets:

http://bogdan.org.ua/2008/02/08/convert ... other.html

Where it wont work (Chinese most likely and Japanese as Sam mentioned yes we do it manually in the way you described. In most cases not needed though....

Positivism is a virtue and the opposite is destruction

Cheers
Leo
Attachment:
coll1.jpg


You do not have the required permissions to view the files attached to this post.

_________________
For Specialized & Individual Support:: http://gws-desk.com
Joomla Specialized Development Solutions:: http://gws-studio.com
Professional & Dedicated Joomla Hosting :: gws-host.com
NEW: Specialized K2 Development Support at GWS-Desk


Top
   
 
Posted: Fri Nov 28, 2008 9:09 am 
User avatar
Joomla! Champion
Joomla! Champion
Offline

Joined: Mon Aug 29, 2005 10:17 am
Posts: 7459
Location: Netherlands/S'pore/Bali/North America
Please note the following:

I have entered in Joomadev1.info the following in "we are volunteers" :
Tôi không hiểu ý bạn muốn nói gì Site runs default encoding as was the fact with poster henryvuong

I have than run the script and changed to latin1_swedish_ci (with the script

see image number1
Attachment:
number1.jpg


Please note date and time of update (just in case) Last Updated on Friday, 28 November 2008 08:51

I than run the script which can be found at http://joomadev1.info/changing_collation/index.php

result:
Attachment:
number2.jpg


See http://joomadev1.info

Leo


You do not have the required permissions to view the files attached to this post.

_________________
For Specialized & Individual Support:: http://gws-desk.com
Joomla Specialized Development Solutions:: http://gws-studio.com
Professional & Dedicated Joomla Hosting :: gws-host.com
NEW: Specialized K2 Development Support at GWS-Desk


Top
   
 
Posted: Fri Nov 28, 2008 1:48 pm 
User avatar
Joomla! Master
Joomla! Master
Offline

Joined: Thu Aug 18, 2005 8:55 pm
Posts: 13220
Location: Nijmegen, The Netherlands
MOD note: We've deleted some messages in this thread and made some small edits to the current ones.

Keep all commentary civil, and be courteous at all times. Constructive criticism is welcome, but insults directed towards other users or the site admins will not be tolerated. Coarse/insulting language will not be tolerated.

_________________
Kind Regards,
Peter Martin, Global Moderator - Community Leadership Team
www.db8.nl - Joomla specialist, Nijmegen, Nederland
Joomla 1.5 Quick Reference Guide: www.db8.nl/en/downloads/misc-downloads/ ... glish.html


Top
   
 
Posted: Sun Nov 30, 2008 10:26 pm 
Joomla! Apprentice
Joomla! Apprentice
Offline

Joined: Thu Sep 14, 2006 7:51 pm
Posts: 16
Thanks to all the advices. My baby is born on the Thanksgiving Day. I'll be busy for a while. I'll try it later.


Top
   
 
Posted: Thu Feb 19, 2009 10:35 am 
Joomla! Apprentice
Joomla! Apprentice
Offline

Joined: Tue Dec 25, 2007 8:58 pm
Posts: 29
It worked for me! thank you phoca! And thanks leolam! :) (using 1.5.9)

Now I have a question - why did this happen in the first place? I installed joomla using simplescripts, thats supposed to be utf-8, no? Is there a way to control the collation?
Then I imported a new DB from yootheme, could that have made the problem? ???


Top
  E-mail  
 
Posted: Thu Feb 19, 2009 11:04 am 
User avatar
Joomla! Champion
Joomla! Champion
Offline

Joined: Mon Aug 29, 2005 10:17 am
Posts: 7459
Location: Netherlands/S'pore/Bali/North America
mordechai wrote:
Then I imported a new DB from yootheme, could that have made the problem? ???
You can see that in the database which you received from Yootheme. Most unlikely though in case of a Joomla 1.5-DB. If you are not sure you can send that database to the email in my profile and i will have a look.

I think though that you created a database with the wrong collation maybe manually?

Leo

_________________
For Specialized & Individual Support:: http://gws-desk.com
Joomla Specialized Development Solutions:: http://gws-studio.com
Professional & Dedicated Joomla Hosting :: gws-host.com
NEW: Specialized K2 Development Support at GWS-Desk


Top
   
 
Posted: Sun Jan 03, 2010 10:29 pm 
Joomla! Apprentice
Joomla! Apprentice
Offline

Joined: Sun Jan 14, 2007 1:25 pm
Posts: 7
leolam wrote:
the people at Phoca have made this very practical tool to change your existing site's collation in less than a minute. Works flawless and is free! http://www.phoca.cz/phoca-changing-collation/

Leo 8)


thanks for this....
i have changed collation for my DB using this, it cut-off some words, but it was easier for me to correct it after then to meddle with manual converting

regards


Top
  E-mail  
 
Display posts from previous:  Sort by  
Post new topic Reply to topic  [ 16 posts ] 

Quick reply

 



Who is online

Users browsing this forum: larrybg and 37 guests


You cannot post new topics in this forum
You cannot reply to topics in this forum
You cannot edit your posts in this forum
You cannot delete your posts in this forum
You cannot post attachments in this forum

Jump to:  
Powered by phpBB © 2000, 2002, 2005, 2007 phpBB Group