Migrating from latin1_swedish_ci to utf8_general_ci database

Joomla version 1.5 is end-of-life and are no longer supported. Please use Joomla 3.x instead.

Moderator: General Support Moderators

Forum rules
Forum Rules
Absolute Beginner's Guide to Joomla! <-- please read before posting, this means YOU.
Forum Post Assistant - If you are serious about wanting help, you will use this tool to help you post.
Locked
henryvuong
Joomla! Apprentice
Joomla! Apprentice
Posts: 24
Joined: Thu Sep 14, 2006 7:51 pm

Migrating from latin1_swedish_ci to utf8_general_ci database

Post by henryvuong » Wed Nov 26, 2008 4:40 pm

I run a Vietnamese web site and am migrating from version 1.0.13 to 1.5.8. In version 1.0.13, I made a mistake by setting the database collation to latin1_swedish_gi. As a result, this database stores the Unicode characters in an unreadable format, for example, the string (readable):

Tôi không hiểu ý bạn muốn nói gì

is stored as (unreadable):

Tôi không hiểu ý bạn muốn nói gì

The old site, however, has no problem displaying all the text as readable format. When I create the new site, I set the new database collation to utf8_general_ci. With the migrating tool, I have successfully migrated and converted the text into readable format. But this tool only handles the core tables. I have the extension mXcomment and I must migrate the data manually. I need to convert the text into readable format because the new site cannot display it correctly. So I need to do one of the followings:

1) Convert the text in the database into readable format. Or:
2) Reconfigure the php files on the new site to make it display the text in the desired format.

Can someone give me some direction? Thanks.

Note: All the text entered after the migration is stored as readable format, there's no problem with that.

pasamio
Joomla! Ace
Joomla! Ace
Posts: 1318
Joined: Thu Aug 18, 2005 9:27 am
Location: San Jose, CA, USA
Contact:

Re: Migrating from latin1_swedish_ci to utf8_general_ci database

Post by pasamio » Thu Nov 27, 2008 1:10 am

You can write a migration plugin to handle the extra tables, its not too hard to do in practice.

http://blog.joomlatools.org/2008/04/rol ... ugins.html

Alternatively you can use iconv or a similar tool to convert the SQL file from latin1 to utf8, but if you've got enough PHP experience (you don't need much) building a migrator plugin is pretty easy and will get it in the main migration SQL file which will be converted with everthing else.
Sam Moffatt
Updater, Installer and Authentication Systems
JoomlaCode Backend Systems
Pie.

User avatar
leolam
Joomla! Master
Joomla! Master
Posts: 20652
Joined: Mon Aug 29, 2005 10:17 am
Location: Netherlands/ Germany/ S'pore/Bogor/ North America
Contact:

Re: Migrating from latin1_swedish_ci to utf8_general_ci database

Post by leolam » Thu Nov 27, 2008 5:31 am

the people at Phoca have made this very practical tool to change your existing site's collation in less than a minute. Works flawless and is free! http://www.phoca.cz/phoca-changing-collation/

Leo 8)
Joomla's #1 Professional Services Provider:
#Joomla Professional Support: https://gws-desk.com -
#Joomla Specialized Hosting Solutions: https://gws-host.com -

User avatar
Imago
Joomla! Enthusiast
Joomla! Enthusiast
Posts: 113
Joined: Fri May 12, 2006 8:37 am
Location: Sofia
Contact:

Re: Migrating from latin1_swedish_ci to utf8_general_ci database

Post by Imago » Thu Nov 27, 2008 7:40 am

Using iconv does not produce very consistent results. There were at least three symbols that remained corrupted. Instead, use the following procedure:

1. Export your DB
2. Open it with Notepad++
3. Convert it to UTF-8 w/o BOM
4. Check that your text is readable
5. Replace all your present collation with utf8_general_ci
6. Save as UTF-8
7. Copy your online DB to a new DB and change to it
8. Delete all tables of the new DB
9. Go to Operations and select utf8_general_ci for collation
10. Import the converted dump
11. Change the access in your config to point to the new DB
Prof. Dr. Plamen Gradinarov - CEO and Founder, Eurasia Ltd

User avatar
leolam
Joomla! Master
Joomla! Master
Posts: 20652
Joined: Mon Aug 29, 2005 10:17 am
Location: Netherlands/ Germany/ S'pore/Bogor/ North America
Contact:

Re: Migrating from latin1_swedish_ci to utf8_general_ci database

Post by leolam » Thu Nov 27, 2008 8:39 am

Absolute no need Imago. The tool i pointed you at does exactly what you do manually with a lot of work and a lot of error chances. It is a full automated script which works very well. Why doing things manually when you can have the script do this for you?
I consider this an advise which is useful for experienced users but still not needed. You think I will spend 10 minutes on doing this when i can do it in 10 seconds automated without errors?
collatool.jpg
You do not have the required permissions to view the files attached to this post.
Joomla's #1 Professional Services Provider:
#Joomla Professional Support: https://gws-desk.com -
#Joomla Specialized Hosting Solutions: https://gws-host.com -

User avatar
Imago
Joomla! Enthusiast
Joomla! Enthusiast
Posts: 113
Joined: Fri May 12, 2006 8:37 am
Location: Sofia
Contact:

Re: Migrating from latin1_swedish_ci to utf8_general_ci database

Post by Imago » Thu Nov 27, 2008 3:18 pm

Changing the collation of DB, tables and columns is only a small portion of the task. The most important thing is to convert the text, say, from cp1251 to utf8. Since there is no option to define the starting encoding of the text in the DB, I doubt the script does this kind of conversion. Would be happy if you prove me wrong in this disbelief.
Prof. Dr. Plamen Gradinarov - CEO and Founder, Eurasia Ltd

User avatar
leolam
Joomla! Master
Joomla! Master
Posts: 20652
Joined: Mon Aug 29, 2005 10:17 am
Location: Netherlands/ Germany/ S'pore/Bogor/ North America
Contact:

Re: Migrating from latin1_swedish_ci to utf8_general_ci database

Post by leolam » Thu Nov 27, 2008 4:01 pm

Load and try before commenting. So go ahead as expected, setup a default Joomla site (if convenient for you as all beginners with Fantastico) and run the script. For me the script runs well. I can give you access to one of our 25 development sites if you have yourself no options to test this?

Leo
Joomla's #1 Professional Services Provider:
#Joomla Professional Support: https://gws-desk.com -
#Joomla Specialized Hosting Solutions: https://gws-host.com -

User avatar
Imago
Joomla! Enthusiast
Joomla! Enthusiast
Posts: 113
Joined: Fri May 12, 2006 8:37 am
Location: Sofia
Contact:

Re: Migrating from latin1_swedish_ci to utf8_general_ci database

Post by Imago » Thu Nov 27, 2008 5:13 pm

As I expected, the script does NOT convert cp1251 text to utf8. Instead, it managed to ruin 120,000 entries in a Russian encyclopedia. They simply disappeared - ALL of them. So guys, if your language is not English, DO NOT use that killer script.
Prof. Dr. Plamen Gradinarov - CEO and Founder, Eurasia Ltd

pasamio
Joomla! Ace
Joomla! Ace
Posts: 1318
Joined: Thu Aug 18, 2005 9:27 am
Location: San Jose, CA, USA
Contact:

Re: Migrating from latin1_swedish_ci to utf8_general_ci database

Post by pasamio » Fri Nov 28, 2008 1:01 am

The topic states latin1 which isn't cp1251, so this is more than likely the reason its not working. If you find that iconv is having issues with a few characters than I'd suggest that this is potentially one of a few problems:
1) Its a bug in iconv (very unlikely, but still possible)
2) The characters you have in there aren't cp1251 - they may already be utf-8 or some other character set again. iconv can't help you where you've got dirty data, most automated tools will have issues there anyway.

If your data is dirty then getting that converted properly is a real pain where you will have to manually eye ball most entries to look for the problem. There was a blog on O'Reilly that I've pasted a few times which is one guy converting his old mysql db into utf-8 and having insane amounts of fun with different character sets and that may help you in this specific case. The problem itself is dirty data and hopefully once we're all on UTF-8 it'll disappear for the majority of users (then we end up with issues for Japanese and Chinese characters but thats another problem).
Sam Moffatt
Updater, Installer and Authentication Systems
JoomlaCode Backend Systems
Pie.

User avatar
leolam
Joomla! Master
Joomla! Master
Posts: 20652
Joined: Mon Aug 29, 2005 10:17 am
Location: Netherlands/ Germany/ S'pore/Bogor/ North America
Contact:

Re: Migrating from latin1_swedish_ci to utf8_general_ci database

Post by leolam » Fri Nov 28, 2008 4:14 am

Imago wrote:As I expected, the script does NOT convert cp1251 text to utf8. Instead, it managed to ruin 120,000 entries in a Russian encyclopedia. They simply disappeared - ALL of them. So guys, if your language is not English, DO NOT use that killer script.
You should be a little bit more careful in your postings. The thread reads: Migrating from latin1_swedish_ci to utf8_general_ci database The script does this excellent.

Next I just changed the collation of a default Joomla site from utf8_general_ci into cp1251_ukrainian_ci without a hick. examples of the output:

ALTER DATABASE joomad1_jo151 COLLATE cp1251_ukrainian_ci ... OK
ALTER TABLE jos_banner COLLATE cp1251_ukrainian_ci ... OK
ALTER TABLE jos_banner MODIFY type varchar(30) CHARACTER SET binary ... OK
ALTER TABLE jos_banner MODIFY type varchar(30) COLLATE cp1251_ukrainian_ci DEFAULT 'banner' ... OK
ALTER TABLE jos_banner MODIFY name varchar(255) CHARACTER SET binary ... OK
ALTER TABLE jos_banner MODIFY name varchar(255) COLLATE cp1251_ukrainian_ci ... OK
ALTER TABLE jos_banner MODIFY alias varchar(255) CHARACTER SET binary ... OK
ALTER TABLE jos_banner MODIFY alias varchar(255) COLLATE cp1251_ukrainian_ci ... OK
ALTER TABLE jos_banner MODIFY imageurl varchar(100) CHARACTER SET binary ... OK
ALTER TABLE jos_banner MODIFY imageurl varchar(100) COLLATE cp1251_ukrainian_ci ... OK
ALTER TABLE jos_banner MODIFY clickurl varchar(200) CHARACTER SET binary ... OK
ALTER TABLE jos_banner MODIFY clickurl varchar(200) COLLATE cp1251_ukrainian_ci ... OK
ALTER TABLE jos_banner MODIFY editor varchar(50) CHARACTER SET binary ... OK
ALTER TABLE jos_banner MODIFY editor varchar(50) COLLATE cp1251_ukrainian_ci ... OK
ALTER TABLE jos_banner MODIFY custombannercode text CHARACTER SET binary ... OK
ALTER TABLE jos_banner MODIFY custombannercode text COLLATE cp1251_ukrainian_ci ... OK
ALTER TABLE jos_banner MODIFY description text CHARACTER SET binary ... OK
ALTER TABLE jos_banner MODIFY description text COLLATE cp1251_ukrainian_ci ... OK
ALTER TABLE jos_banner MODIFY tags text CHARACTER SET binary ... OK
ALTER TABLE jos_banner MODIFY tags text COLLATE cp1251_ukrainian_ci ... OK
ALTER TABLE jos_banner MODIFY params text CHARACTER SET binary ... OK
ALTER TABLE jos_banner MODIFY params text COLLATE cp1251_ukrainian_ci ... OK


proof of success is attached.

Also I can show you sites which are in German (no English) which is a language that has very special characters where this script runs very well. It also works well with Greek amongst others........Do not generalize all issues.......The script is a fabulous piece of code for the task.

You can find a specific non Joomla script which works also very, very well (and we use that for non-joomla sites where we run into issues with character sets:

http://bogdan.org.ua/2008/02/08/convert ... other.html

Where it wont work (Chinese most likely and Japanese as Sam mentioned yes we do it manually in the way you described. In most cases not needed though....

Positivism is a virtue and the opposite is destruction

Cheers
Leo
coll1.jpg
You do not have the required permissions to view the files attached to this post.
Joomla's #1 Professional Services Provider:
#Joomla Professional Support: https://gws-desk.com -
#Joomla Specialized Hosting Solutions: https://gws-host.com -

User avatar
leolam
Joomla! Master
Joomla! Master
Posts: 20652
Joined: Mon Aug 29, 2005 10:17 am
Location: Netherlands/ Germany/ S'pore/Bogor/ North America
Contact:

Re: Migrating from latin1_swedish_ci to utf8_general_ci database

Post by leolam » Fri Nov 28, 2008 9:09 am

Please note the following:

I have entered in Joomadev1.info the following in "we are volunteers" :
Tôi không hiểu ý bạn muốn nói gì Site runs default encoding as was the fact with poster henryvuong

I have than run the script and changed to latin1_swedish_ci (with the script

see image number1
number1.jpg
Please note date and time of update (just in case) Last Updated on Friday, 28 November 2008 08:51

I than run the script which can be found at http://joomadev1.info/changing_collation/index.php

result:
number2.jpg
See http://joomadev1.info

Leo
You do not have the required permissions to view the files attached to this post.
Joomla's #1 Professional Services Provider:
#Joomla Professional Support: https://gws-desk.com -
#Joomla Specialized Hosting Solutions: https://gws-host.com -

User avatar
pe7er
Joomla! Master
Joomla! Master
Posts: 24926
Joined: Thu Aug 18, 2005 8:55 pm
Location: Nijmegen, Netherlands
Contact:

Re: Migrating from latin1_swedish_ci to utf8_general_ci database

Post by pe7er » Fri Nov 28, 2008 1:48 pm

MOD note: We've deleted some messages in this thread and made some small edits to the current ones.

Keep all commentary civil, and be courteous at all times. Constructive criticism is welcome, but insults directed towards other users or the site admins will not be tolerated. Coarse/insulting language will not be tolerated.
Kind Regards,
Peter Martin, Global Moderator
Company website: https://db8.nl/en/ - Joomla specialist, Nijmegen, Netherlands
The best website: https://the-best-website.com

henryvuong
Joomla! Apprentice
Joomla! Apprentice
Posts: 24
Joined: Thu Sep 14, 2006 7:51 pm

Re: Migrating from latin1_swedish_ci to utf8_general_ci database

Post by henryvuong » Sun Nov 30, 2008 10:26 pm

Thanks to all the advices. My baby is born on the Thanksgiving Day. I'll be busy for a while. I'll try it later.

mordechai
Joomla! Apprentice
Joomla! Apprentice
Posts: 38
Joined: Tue Dec 25, 2007 8:58 pm

Re: Migrating from latin1_swedish_ci to utf8_general_ci database

Post by mordechai » Thu Feb 19, 2009 10:35 am

It worked for me! thank you phoca! And thanks leolam! :) (using 1.5.9)

Now I have a question - why did this happen in the first place? I installed joomla using simplescripts, thats supposed to be utf-8, no? Is there a way to control the collation?
Then I imported a new DB from yootheme, could that have made the problem? ???

User avatar
leolam
Joomla! Master
Joomla! Master
Posts: 20652
Joined: Mon Aug 29, 2005 10:17 am
Location: Netherlands/ Germany/ S'pore/Bogor/ North America
Contact:

Re: Migrating from latin1_swedish_ci to utf8_general_ci database

Post by leolam » Thu Feb 19, 2009 11:04 am

mordechai wrote: Then I imported a new DB from yootheme, could that have made the problem? ???
You can see that in the database which you received from Yootheme. Most unlikely though in case of a Joomla 1.5-DB. If you are not sure you can send that database to the email in my profile and i will have a look.

I think though that you created a database with the wrong collation maybe manually?

Leo
Joomla's #1 Professional Services Provider:
#Joomla Professional Support: https://gws-desk.com -
#Joomla Specialized Hosting Solutions: https://gws-host.com -

dado023
Joomla! Apprentice
Joomla! Apprentice
Posts: 34
Joined: Sun Jan 14, 2007 1:25 pm

Re: Migrating from latin1_swedish_ci to utf8_general_ci database

Post by dado023 » Sun Jan 03, 2010 10:29 pm

leolam wrote:the people at Phoca have made this very practical tool to change your existing site's collation in less than a minute. Works flawless and is free! http://www.phoca.cz/phoca-changing-collation/

Leo 8)
thanks for this....
i have changed collation for my DB using this, it cut-off some words, but it was easier for me to correct it after then to meddle with manual converting

regards

shamboz
Joomla! Fledgling
Joomla! Fledgling
Posts: 1
Joined: Tue Nov 01, 2011 7:45 pm

Re: Migrating from latin1_swedish_ci to utf8_general_ci data

Post by shamboz » Tue Nov 01, 2011 7:56 pm

joomla uses defaults for making tables ..

if you do this BEFORE THE INSTALLATION you will have every collation as "utf8_general_ci"

1. open notepad++ (google it for the download link)
2. crtl + f , and click "find in files" tab
3. Enter "<joomla directory>\installation" in Directory field.
3. enter "latin1" in "find what" and "utf8" in "replace with".
5. press "replace in files"

or simply replace every "latin1" with "utf8" in the <joomla directory>\installation directory.

nimaasdaghi
Joomla! Apprentice
Joomla! Apprentice
Posts: 5
Joined: Mon Jul 29, 2013 8:24 pm

Re: Migrating from latin1_swedish_ci to utf8_general_ci data

Post by nimaasdaghi » Tue Sep 02, 2014 8:55 am

leolam wrote:the people at Phoca have made this very practical tool to change your existing site's collation in less than a minute. Works flawless and is free! http://www.phoca.cz/phoca-changing-collation/

Leo 8)
how can we use it? i have upload it too my root server and then what should i do?

User avatar
dpacadmin
Joomla! Champion
Joomla! Champion
Posts: 6029
Joined: Sat Aug 16, 2008 1:46 pm
Location: the Bat Cave
Contact:

Re: Migrating from latin1_swedish_ci to utf8_general_ci data

Post by dpacadmin » Wed Sep 03, 2014 3:21 am

There are instructions here;
http://www.phoca.cz/documents/38-tools/ ... n-database
please note this point it makes;
"Please, run this script only if you know what you do"

Waila
Joomla! Apprentice
Joomla! Apprentice
Posts: 27
Joined: Mon Sep 22, 2008 11:28 am

Re: Migrating from latin1_swedish_ci to utf8_general_ci database

Post by Waila » Wed Oct 12, 2016 6:46 am

is there such tool to convert the sql file ?


Locked

Return to “Migrating and Upgrading to Joomla! 1.5”