•Language slugs transliteration
- infograf768
- Joomla! Master
- Posts: 19133
- Joined: Fri Aug 12, 2005 3:47 pm
- Location: **Translation Matters**
•Language slugs transliteration
Present status.
Transliteration is implemented in 1.5.x for Latin-script languages only.
Concretely it means that when the alias field remains empty, it is automatically filled by:
1. Latin languages
The ASCII equivalent of accented letters, i.e. the non-accented letters.
à -> a
ë -> e
Ŝ -> s
etc.
2. Non-Latin languages (Greek, Cyrillic, Chinese, Arabic scripts, etc.)
The date when the article is saved.
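To make the two cases above concrete, here is a minimal sketch of the behaviour as described. It is not the Joomla! core code: the function name, the three-entry table and the date format are assumptions for illustration only, and the file is assumed to be saved as UTF-8.

```php
<?php
// Hypothetical sketch of the behaviour described above; not Joomla! core code.
function sketch_alias($title, $now)
{
    // Tiny illustrative table; a real one would cover far more letters.
    $table = array('à' => 'a', 'ë' => 'e', 'Ŝ' => 's');
    $ascii = strtolower(strtr($title, $table));

    // Keep ASCII letters and digits, collapse everything else into hyphens.
    $alias = trim(preg_replace('/[^a-z0-9]+/', '-', $ascii), '-');

    // Nothing usable left (e.g. a Greek or Chinese title): fall back to the date.
    // The date format is an assumption.
    return $alias !== '' ? $alias : gmdate('Y-m-d-H-i-s', $now);
}

echo sketch_alias('Voilà Noël', 0), "\n";      // voila-noel
echo sketch_alias('Μεγάλη έκρηξη', 0), "\n";   // 1970-01-01-00-00-00
```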
Proposal
To implement in core a way to apply transliteration to any language that provides the necessary ini file.
At first sight, this looks easy: a parameter could decide whether or not to use an available ini file, based on the language defined for the site or the admin.
At second sight, it is not so easy, as any Unicode glyph can be used in 1.5.x, whatever languages are chosen for the front end or back end.
Yvolk proposed, a year ago, a solution which was not followed up:
http://forum.joomla.org/viewtopic.php?f ... 90#p749972
Could anyone have a look and consider its feasibility for 1.6?
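As a rough idea of what an ini-driven table could look like, here is a sketch under stated assumptions: the "glyph = replacement" line format, the function names and the sample Russian entries are all hypothetical, not an existing Joomla! file format or API.

```php
<?php
// Hypothetical helper: read "glyph = replacement" lines, as a language pack might ship them.
function load_translit_table($iniText)
{
    $table = array();
    foreach (explode("\n", $iniText) as $line) {
        $line = trim($line);
        // Skip blank lines, ";" comments and lines without a "=" separator.
        if ($line === '' || $line[0] === ';' || strpos($line, '=') === false) {
            continue;
        }
        list($glyph, $replacement) = explode('=', $line, 2);
        $glyph = trim($glyph);
        if ($glyph !== '') {
            $table[$glyph] = trim($replacement);
        }
    }
    return $table;
}

// Apply the table; strtr() replaces the longest matching keys first.
function transliterate_with_table($string, array $table)
{
    return $table ? strtr($string, $table) : $string;
}

$russian = "; hypothetical ru-RU table\nж = zh\nу = u\nк = k\n";
echo transliterate_with_table('жук', load_translit_table($russian)), "\n"; // zhuk
```

Glyphs that are missing from the table pass through unchanged, which is exactly the "second sight" problem: the table for the site language says nothing about glyphs from other scripts.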
Jean-Marie Simonet / infograf
---------------------------------
ex-Joomla Translation Coordination Team • ex-Joomla! Production Working Group
- newart
- Joomla! Virtuoso
- Posts: 3177
- Joined: Fri Sep 02, 2005 10:06 am
- Location: Solar system - Earth - European Union
Re: •Language slugs transliteration
Your thread about more international language support is very interesting... I think this can be a big issue, since a URL could be written directly in Russian, Chinese and so on...
former Q&T WorkGroup Joomla member - Italian Translation Team Member
- AmyStephen
- Joomla! Champion
- Posts: 7018
- Joined: Wed Nov 22, 2006 3:35 pm
- Location: Nebraska
- Contact:
Re: •Language slugs transliteration
I don't know what all this involves and how a solution would be designed, but I would welcome this improvement.
- yvolk
- Joomla! Guru
- Posts: 979
- Joined: Thu Jun 01, 2006 1:52 pm
- Location: Moscow, Russia
- Contact:
Re: Language slugs transliteration
Thank you, infograf768, for relaunching this topic
I'm still ready to help if needed.
- yvolk
- Joomla! Guru
- Posts: 979
- Joined: Thu Jun 01, 2006 1:52 pm
- Location: Moscow, Russia
- Contact:
Re: Language slugs transliteration
Hi, I have good news!
I've turned last year's work into a working multilingual plugin, yvTransliterate.
The plugin works, although its integration with Joomla! is not smart enough yet, and this may give the Joomla! core team a new impulse to improve the multilingual support of Joomla!
See more information in the thread.
- AmyStephen
- Joomla! Champion
- Posts: 7018
- Joined: Wed Nov 22, 2006 3:35 pm
- Location: Nebraska
- Contact:
Re: •Language slugs transliteration
Free software communities are amazing creatures. When barriers are lowered so that anyone can participate, something very cool happens. People pick up problems to solve that they find interesting. Combined, as we share the results of our work, we end up with far more than we could begin to accomplish individually.
Currently, there are eleven members of the Joomla! core team. Recently, we welcomed our 200,000th forum member and someone downloaded the 5,000,000th copy of Joomla!. Not to discredit the considerable efforts of those who are and who have served on the core team, but, if anyone is waiting on *them* to improve anything without significant help from this great community, those people will be waiting a very, very long time.
Thanks, Yuri, for selecting another problem that you find interesting to solve. I am confident you will solve it. I hope you have fun in the process and that you learn cool things. It would be awesome if you improved this area of study in such a way that other free software communities learned from your improvements.
Amy
- yvolk
- Joomla! Guru
- Posts: 979
- Joined: Thu Jun 01, 2006 1:52 pm
- Location: Moscow, Russia
- Contact:
Re: •Language slugs transliteration
AmyStephen wrote: Thanks, Yuri ... I hope you have fun in the process...
Yeah, that was a really fun game. I went to sleep at 3 AM last night, which is VERY unusual for me.
- AmyStephen
- Joomla! Champion
- Posts: 7018
- Joined: Wed Nov 22, 2006 3:35 pm
- Location: Nebraska
- Contact:
Re: •Language slugs transliteration
That is so cool when a project is that interesting that we sacrifice sleep! There have been a number of times I worked all through the night to get something done. Talk about feeling alive! Being able to accomplish something that is important to me is fulfilling. Anyway, congratulations and good luck! I hope it continues to be fun.
Amy
- dawnfantasy
- Joomla! Fledgling
- Posts: 4
- Joined: Tue Oct 21, 2008 12:03 pm
Re: •Language slugs transliteration
infograf768 wrote: Concretely it means that when the alias field remains empty, it is automatically filled by:
I wrote some Chinese characters in the alias field, but it falls back to the date.
Is there any particular reason that the alias has to be ASCII only? Is it because of the way Joomla! is designed, or for SEO/SEF purposes?
For SEO/SEF, search engines (e.g. Google) support CJK URLs such as "zh.wikipedia.org/wiki/首页". I don't think transliteration is necessary in this situation, and I would rather see a URL in Chinese than "zh.wikipedia.org/wiki/1111-22-3-4-5.html".
regards.
- yvolk
- Joomla! Guru
- Posts: 979
- Joined: Thu Jun 01, 2006 1:52 pm
- Location: Moscow, Russia
- Contact:
Re: •Language slugs transliteration
dawnfantasy wrote: Is there any particular reason that the alias has to be ASCII only? Is it because the way joomla designed or for SEO/SEF purpose?
IMHO it is for both reasons:
1. Joomla! makes this string 'URLSafe'
2. It is used for SEO/SEF
In my opinion, this is a matter of compatibility. Yes, Google uses non-ASCII chars in URLs and some browsers understand them in some cases... but even Chinese Wikipedia, in your example "zh.wikipedia.org/wiki/首页", converts such URLs to something like this:
"http://zh.wikipedia.org/w/index.php?tit ... iant=zh-cn".
Looking at this URL, I see that transliteration would be much nicer, than HEX codes...
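To illustrate the hex codes in question, here is a small sketch using PHP's built-in rawurlencode on the two-character path segment from the example (the file is assumed to be saved as UTF-8; the variable names are only for illustration):

```php
<?php
$title = '首页'; // the path segment from the Wikipedia example above
$encoded = rawurlencode($title);

// Each character is three UTF-8 bytes, and each byte becomes a %XX triplet.
echo $encoded, "\n";          // %E9%A6%96%E9%A1%B5
echo strlen($encoded), "\n";  // 18
```

Two visible characters turn into eighteen, which is what ends up on the wire whenever the browser or server does not display the Unicode form.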
Text of all my messages is available under the terms of the GNU Free Documentation License: http://www.gnu.org/copyleft/fdl.html
- dawnfantasy
- Joomla! Fledgling
- Posts: 4
- Joined: Tue Oct 21, 2008 12:03 pm
Re: •Language slugs transliteration
It would be nice if users could choose the method of transformation:
1. Leave it unchanged, saved as UTF-8 in the DB.
2. Change it to the date.
3. Use a transliteration engine.
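The three options above could be sketched roughly as follows. The function name, the mode names and the date format are hypothetical, chosen only to illustrate the idea of a user-selectable method:

```php
<?php
// Hypothetical function and mode names, for illustration only.
function make_alias($title, $mode, array $table = array(), $now = null)
{
    switch ($mode) {
        case 'keep':      // 1. leave the UTF-8 string untouched
            return $title;
        case 'date':      // 2. replace it with the save date (format is an assumption)
            return gmdate('Y-m-d-H-i-s', $now === null ? time() : $now);
        case 'translit':  // 3. run it through a transliteration table
            return $table ? strtr($title, $table) : $title;
        default:
            throw new InvalidArgumentException("Unknown alias mode: $mode");
    }
}

echo make_alias('首页', 'keep'), "\n";                 // 首页
echo make_alias('首页', 'date', array(), 0), "\n";     // 1970-01-01-00-00-00
echo make_alias('жук', 'translit', array('ж' => 'zh', 'у' => 'u', 'к' => 'k')), "\n"; // zhuk
```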
Regards
- tudorilisoi
- Joomla! Enthusiast
- Posts: 161
- Joined: Sun Nov 27, 2005 9:57 am
- Location: Romania
- Contact:
Re: •Language slugs transliteration
Hi,
There are a number of PHP scripts that can transliterate for you.
The simplest method would be to check for the iconv library and transliterate the slugs into ASCII, as simple as this:
In libraries/joomla/language/language.php, replace this line (around line 224)
Code: Select all
$string = htmlentities(utf8_decode($string));
with
Code: Select all
// TMJ MOD
if (function_exists('iconv')) {
    $string = iconv('UTF-8', 'ASCII//TRANSLIT', $string);
    $string = htmlentities($string);
} else {
    $string = htmlentities(utf8_decode($string));
}
This will work well on Romanian, Swedish, Danish, German, etc.
I don't know about Russian or Chinese.
TeachMeJoomla.net - Joomla tutorials, tips, mods, and extensions. Joomla freelance custom programming/development
- infograf768
- Joomla! Master
- Posts: 19133
- Joined: Fri Aug 12, 2005 3:47 pm
- Location: **Translation Matters**
Re: •Language slugs transliteration
tudorilisoi wrote:
Hi,
There are a number of php scripts that can transliterate for you.
The simplest method would be checking for iconv library and transliterating the slugs into ascii
as simple as this:
libraries/joomla/language/language.php
replace this line (around line 224)
Code: Select all
$string = htmlentities(utf8_decode($string));
with
Code: Select all
// TMJ MOD
if (function_exists('iconv')) {
    $string = iconv('UTF-8', 'ASCII//TRANSLIT', $string);
    $string = htmlentities($string);
} else {
    $string = htmlentities(utf8_decode($string));
}
this will work well on romanian, swedish, danish, german, etc.
I don't know about russian or chinese

I am going to test this right now.
- infograf768
- Joomla! Master
- Posts: 19133
- Joined: Fri Aug 12, 2005 3:47 pm
- Location: **Translation Matters**
Re: •Language slugs transliteration
My tests show that it solves some issues but not all.
For example:
çĆĂğøµşçıÐöüŸÏ
gives
ccagouscidquotoquotuquotyquoti
i.e. only the first 10 glyphs are correctly transliterated. Not öüŸÏ
which would mean that we get:
Basic Latin
Latin-1 Supplement
Only part of Latin Extended-A
Better than before indeed.
BTW, code should be:
Code: Select all
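function transliterate($string)
{
    // TMJ MOD: prefer iconv when the extension is available
    if (function_exists('iconv')) {
        $string = iconv('UTF-8', 'ASCII//TRANSLIT', $string);
        $string = htmlentities($string);
    } else {
        // Fallback: the original 1.5 line, via Latin-1 HTML entities
        $string = htmlentities(utf8_decode($string));
        // ß is matched as the entity &szlig;, since no literal ß is left after htmlentities()
        $string = preg_replace(
            array('/&szlig;/', '/&(..)lig;/', '/&([aouAOU])uml;/', '/&(.)[^;]*;/'),
            array('ss', "$1", "$1" . 'e', "$1"),
            $string
        );
    }
    return $string;
}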
- tudorilisoi
- Joomla! Enthusiast
- Posts: 161
- Joined: Sun Nov 27, 2005 9:57 am
- Location: Romania
- Contact:
Re: •Language slugs transliteration
Iconv is locale dependent.
So when using the correct language and its particular accented/special characters, it will work well.
If you set Romanian as the language and write some Turkish titles, it will not work as expected.
http://taschenorakel.de/mathias/2007/11 ... terations/
- tudorilisoi
- Joomla! Enthusiast
- Posts: 161
- Joined: Sun Nov 27, 2005 9:57 am
- Location: Romania
- Contact:
Re: •Language slugs transliteration
Also, the strings should not be HTML-encoded. rawurlencode after transliteration would be better and would retain the special characters. I did not test it, but I'm sure it's the best way to go.
- infograf768
- Joomla! Master
- Posts: 19133
- Joined: Fri Aug 12, 2005 3:47 pm
- Location: **Translation Matters**
Re: •Language slugs transliteration
rawurlencode has unwanted results, in the sense that its encoding can make a URL so long and so user-unfriendly that one could not pass it on by any means other than electronic devices.
Example:
Code: Select all
http://el.wikipedia.org/wiki/%CE%9C%CE%B5%CE%B3%CE%AC%CE%BB%CE%B7_%CE%AD%CE%BA%CF%81%CE%B7%CE%BE%CE%B7
is in Greek.
Concerning the locale, I have tested by using the French language packs for 1.5, whose locale takes care of the glyph " ë "
I am still getting in the alias
" quote " which, as we know from the sample posted above, is composed of " quot " and " e "
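One plausible explanation for the stray " quot ", offered as a sketch and not a confirmed diagnosis: if iconv's //TRANSLIT renders " ë " as a double quote followed by " e " (an assumption about the iconv build and locale in use), then htmlentities turns that double quote into an entity, and the later alias clean-up keeps only its letters. The clean-up regex below is a hypothetical stand-in for whatever the core does afterwards.

```php
<?php
// Assumption: iconv('UTF-8', 'ASCII//TRANSLIT', 'ë') returned the two characters "e here.
$afterIconv = '"e';

// htmlentities() escapes the double quote as an entity.
$encoded = htmlentities($afterIconv);   // &quot;e

// Hypothetical stand-in for the alias clean-up that drops non-alphanumerics.
$alias = preg_replace('/[^a-z0-9]+/i', '', $encoded);

echo $alias, "\n"; // quote
```

If that is what happens, dropping the htmlentities call from the iconv branch (the output is already ASCII) would be the thing to test next.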
- yvolk
- Joomla! Guru
- Posts: 979
- Joined: Thu Jun 01, 2006 1:52 pm
- Location: Moscow, Russia
- Contact:
Re: •Language slugs transliteration
BTW, I published v.1.1 of the yvTransliterate plugin, which uses a different transliteration table (stored in an XML file) for each (source) language. So one may use a transliteration that is "standardized" (ISO...) or any custom one...
In v.1.1:
Added another 'hook' into Joomla! core: the parameter 'Extend JLanguage class'. Now yvTransliterate can transliterate not only aliases of Articles (or comments, in the case of yvComment), but also aliases of other elements of the Joomla! site interface: menu items, sections, categories... In fact, yvTransliterate works everywhere Joomla! core calls the JLanguage::transliterate method.
Please note that in this case yvTransliterate uses the language of the current user as the source language for transliteration. So, for example, if you want a Section alias to be transliterated according to the Russian transliteration table, you have to log in to the Administrator site (backend) in Russian.
This feature works MUCH more efficiently under PHP5 (it creates a proxy to the JLanguage object instead of creating (and populating...) a second instance of the JLanguage class).
- infograf768
- Joomla! Master
- Posts: 19133
- Joined: Fri Aug 12, 2005 3:47 pm
- Location: **Translation Matters**
Re: •Language slugs transliteration
For those interested, I have made a unicode slugs system plugin which works OK in 1.5
See : http://info-graf.fr/infografcvs/Des-url ... C3%A9.html
(Alas this forum does not let it show as it could, i.e. info-graf.fr/infografcvs/Des-urls-de-toute-beauté.html)
FYI, core customizable transliteration was added yesterday in 1.6 SVN 12997 (thanks Ercan!)
- tudorilisoi
- Joomla! Enthusiast
- Posts: 161
- Joined: Sun Nov 27, 2005 9:57 am
- Location: Romania
- Contact:
Re: •Language slugs transliteration
Hi, it's been a while.
I stumbled upon a great ASCIIfier, built for performance.
It has an example covering accented characters in 130+ languages, including Greek, Hindi, Taiwanese and Chinese.
All transliterate nicely to their ASCII counterparts.
There are no PHP locale or extension dependencies.
Looks like the Holy Grail to me, as long as you take care to strip evil MS Word 3 byte illegal characters (such as the 0x96 long dash) before transliterating (otherwise the conversion fails and throws an error).
http://sourceforge.net/projects/phputf8 ... _to_ascii/
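A minimal sketch of the "strip illegal bytes first" step, assuming the input is meant to be UTF-8. The function name is hypothetical and this is not part of the library linked above; it only keeps well-formed UTF-8 sequences and silently drops every other byte, such as a lone 0x96.

```php
<?php
// Hypothetical helper: keep only well-formed UTF-8 sequences, drop stray bytes.
function strip_invalid_utf8($string)
{
    $wellFormed = '/
          [\x00-\x7F]                        # ASCII
        | [\xC2-\xDF][\x80-\xBF]             # 2-byte, no overlongs
        | \xE0[\xA0-\xBF][\x80-\xBF]         # 3-byte, no overlongs
        | [\xE1-\xEC\xEE\xEF][\x80-\xBF]{2}  # 3-byte, general
        | \xED[\x80-\x9F][\x80-\xBF]         # 3-byte, no surrogates
        | \xF0[\x90-\xBF][\x80-\xBF]{2}      # 4-byte, planes 1-3
        | [\xF1-\xF3][\x80-\xBF]{3}          # 4-byte, planes 4-15
        | \xF4[\x80-\x8F][\x80-\xBF]{2}      # 4-byte, plane 16
    /x';
    // Byte-wise matching (no "u" modifier), so invalid bytes are skipped, not fatal.
    preg_match_all($wellFormed, $string, $matches);
    return implode('', $matches[0]);
}

echo strip_invalid_utf8("caf\xC3\xA9 \x96 bar"), "\n"; // the stray 0x96 is dropped
```

Running this before the transliteration step means a pasted Word dash is lost, not converted; mapping such bytes to a plain hyphen first would be a friendlier variant.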
- joomfriend
- Joomla! Explorer
- Posts: 284
- Joined: Sun Feb 08, 2009 5:10 pm
- Contact:
Re: •Language slugs transliteration
If I am not mistaken, this alias/transliteration issue is already fixed in SEF components like SH404SEF. However, it would be great to have it in the Joomla! core.
Many thanks for all your efforts.
- https://www.adelnipet.com: Adelni Pet - Your Social Pet Network
- https://www.egliseprimitive.org: Christian Website