[MEDIUM:CONFIRMED:7714] Search Component - Problem with character "ň" and utf-8

Posted: Fri Jun 22, 2007 8:15 am
by H13
Hi to all, I am having a problem with the Search Component. If I enter the "ň" character into the search box, I get the following error:

Warning: utf8_to_unicode: Incomplete multi-octet sequence in UTF-8 at byte 1 in D:\www\joomla\libraries\phputf8\utils\unicode.php on line 176

The same error occurred in [6719] but no longer in [6731].
http://forum.joomla.org/index.php?actio ... board=11.0

The next problem is that the search component changes characters, e.g. in Czech: 'žluťoučký kůň' becomes 'žluÅ¥ouÄ�ký kůÅ_' in the search area.
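
For readers unfamiliar with these two symptoms, here is a minimal sketch in Python (used only for illustration; it is not Joomla's code). It assumes that the warning comes from a multi-byte UTF-8 character being cut off, and that the garbled text comes from UTF-8 bytes being read as a single-byte encoding such as Latin-1. Both are assumptions about the cause, not a confirmed diagnosis, and the helper names are hypothetical.

```python
# "ň" (U+0148) takes two bytes in UTF-8.
encoded = "ň".encode("utf-8")


def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes form complete, valid UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def misread_as_latin1(text: str) -> str:
    """Simulate UTF-8 bytes being interpreted as Latin-1 (assumed cause)."""
    return text.encode("utf-8").decode("latin-1")


# Symptom 1: only the lead byte of "ň" arrives, so a strict decoder rejects it.
complete_ok = is_valid_utf8(encoded)
truncated_ok = is_valid_utf8(encoded[:1])

# Symptom 2: every UTF-8 byte becomes its own character, so the text grows.
garbled = misread_as_latin1("kůň")

print(encoded, complete_ok, truncated_ok)
print(len("kůň"), len(garbled))
```

If the second assumption holds, the garbling is reversible as long as no bytes were lost: encoding the garbled text as Latin-1 and decoding it as UTF-8 returns the original.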

Re: [7714] J!1.5 - Search Component - Problem with character "ň" and utf-8

Posted: Fri Jun 22, 2007 9:22 am
by akede
Hi,

I notified one of the devs about it - they need to check.


Alex

Re: [7714] J!1.5 - Search Component - Problem with character "ň" and utf-8

Posted: Fri Jun 22, 2007 9:44 am
by friesengeist
H13 wrote: The next problem is that the search component changes characters, e.g. in Czech: 'žluťoučký kůň' becomes 'žluÅ¥ouÄ�ký kůÅ_' in the search area.
Confirmed.

Jinx, I'm wondering why we have to do a redirect after entering a search term at all. IMO it would be enough to have the search module perform a "GET" request to the correct URL, e.g. without SEF or with SEF. I don't see any reason why the search term should be part of the URL; usually one does not want search engines to index our own search results.

Alex, I think this thread should be moved to the 1.5 Q&T forum...

Re: [7714] J!1.5 - Search Component - Problem with character "ň" and utf-8

Posted: Thu Jun 28, 2007 8:30 am
by user deleted
Mod note: moving to 1.5 Q&T

Re: [7714] J!1.5 - Search Component - Problem with character "ň" and utf-8

Posted: Thu Jun 28, 2007 2:53 pm
by CirTap
friesengeist wrote:I don't see any reason why the search term should be part of the URL, ...
Users may want to bookmark a "search result", and since the purpose of a search is to GET data from a system, not to store it (POST), the search phrase/term must be present in the URL.
http://www.w3.org/2001/tag/doc/whenToUseGet.html

The fact that an application may in addition save the incoming search term is not relevant from the user's p.o.v. Even if GETting data may imply data storage (for statistics) in the backend, using GET is still appropriate.

Have fun,
CirTap

Re: [7714] J!1.5 - Search Component - Problem with character "ň" and utf-8

Posted: Thu Jun 28, 2007 5:18 pm
by friesengeist
CirTap wrote:
friesengeist wrote:I don't see any reason why the search term should be part of the URL, ...
Users may want to bookmark a "search result", and since the purpose of a search is to GET data from a system, not to store it (POST), the search phrase/term must be present in the URL.
http://www.w3.org/2001/tag/doc/whenToUseGet.html
I should have written more clearly what I meant ;) The search word should be part of the URL which the user sees, but not part of the "static" parts within slashes. So something like http://localhost/joomla/search?q=blablabla is OK, whereas I don't see any reason why it should be http://localhost/joomla/search/blablabla.
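
To make the distinction concrete, here is a small sketch in Python (for illustration only, not Joomla's router) of putting the search word into the query string rather than into a path segment. The base URL is the one from the post; the parameter name `q` comes from the same example, and `build_search_url` is a hypothetical helper.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

BASE_URL = "http://localhost/joomla/search"  # example URL from the post


def build_search_url(term: str) -> str:
    """Append the search term as a percent-encoded query parameter."""
    return BASE_URL + "?" + urlencode({"q": term})


url = build_search_url("kůň")

# The receiving side can recover the original term from the query string.
recovered = parse_qs(urlsplit(url).query)["q"][0]

print(url)
print(recovered)
```

The resulting URL can still be bookmarked, and the term never has to pass through path-based routing rules.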

Re: [UNDER REVIEW:7714] Search Component - Problem with character "ň" and utf-8

Posted: Thu Jun 28, 2007 6:20 pm
by CirTap
friesengeist wrote:something like http://localhost/joomla/search?q=blablabla is OK, whereas I don't see any reason why it should be http://localhost/joomla/search/blablabla.
OK, I see...
It is arguable whether form-submitted data should be transformed into a path-like URI. That is likely to become error-prone and doesn't make writing routing rules any easier.

CirTap

Re: [UNDER REVIEW:7714] Search Component - Problem with character "ň" and utf-8

Posted: Mon Jul 30, 2007 1:19 pm
by Websitemaker
rev 8242 - still the same problem with local characters in the search string ...

If I search for the string "čćžšđČĆŽŠĐ", Joomla searches for the string "Ä_Ä�žšÄ�Ä�Ä�ŽŠÄ_" instead.

Re: [UNDER REVIEW:7714] Search Component - Problem with character "ň" and utf-8

Posted: Sat Aug 25, 2007 2:17 pm
by drank
Hi,

I have RC1 and still get this behavior when I search in Bulgarian - "тест" becomes "Ñ�еÑ�Ñ�".

Regards

Re: [UNDER REVIEW:7714] Search Component - Problem with character "ň" and utf-8

Posted: Sat Aug 25, 2007 2:40 pm
by H13
Last SVN - [8553] - still the same problem ... :P

Re: [UNDER REVIEW:7714] Search Component - Problem with character "ň" and utf-8

Posted: Sat Aug 25, 2007 3:01 pm
by Jinx
I have made changes on SVN, could you guys recheck this issue ?

Re: [UNDER REVIEW:7714] Search Component - Problem with character "ň" and utf-8

Posted: Sun Aug 26, 2007 9:04 am
by Websitemaker
Well, it works with a few characters but not with others ...

For example, if I search for "čćžđČŽĆĐ" (Slovenian special characters),

I get this link and it produces an error (the first and last characters of the searchword are wrong):

http://localhost/j15/index.php?searchwo ... om_content


searchword should be: 

searchword=%C4%8D%C4%87%C5%BE%C4%91%C4%8C%C5%BD%C4%86%C4%90



Edit:

The string "šŠ" also produces an error

http://localhost/j15/index.php?searchwo ... om_content

but searchword is OK:  searchword=%C5%A1%C5%A0
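
The expected values quoted above can be checked with a short sketch in Python (for illustration only; `encode_searchword` is a hypothetical helper, not a Joomla function). It percent-encodes the UTF-8 bytes of each string.

```python
from urllib.parse import quote, unquote


def encode_searchword(term: str) -> str:
    """Percent-encode every byte of the UTF-8 form of the term."""
    return quote(term, safe="", encoding="utf-8")


first = encode_searchword("čćžđČŽĆĐ")
second = encode_searchword("šŠ")

print(first)
print(second)

# Decoding with the same encoding restores the original characters.
round_trip = unquote(first, encoding="utf-8")
print(round_trip)
```

Each character here is two bytes in UTF-8, so a searchword whose first or last character is wrong may have lost or altered one byte of such a pair.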

Re: [UNDER REVIEW:7714] Search Component - Problem with character "ň" and utf-8

Posted: Sun Aug 26, 2007 10:35 am
by H13
[8559] I still get the same error messages:

Notice: Trying to get property of non-object in D:\www\Joomla16\components\com_search\search.php on line 78
Warning: utf8_to_unicode: Incomplete multi-octet sequence in UTF-8 at byte 1 in D:\www\Joomla16\libraries\phputf8\utils\unicode.php on line 176
Warning: utf8_to_unicode: Incomplete multi-octet sequence in UTF-8 at byte 1 in D:\www\Joomla16\libraries\phputf8\utils\unicode.php on line 176


and "?_" appears instead of the Czech characters ň, Ř, Á in the Search Keyword form and in "Search for ... with Google".

Re: [UNDER REVIEW:7714] Search Component - Problem with character "ň" and utf-8

Posted: Sun Aug 26, 2007 11:44 am
by user deleted
Q&T Note: changing status to confirmed

Re: [MEDIUM:CONFIRMED:7714] Search Component - Problem with character "ň" and utf-8

Posted: Tue Oct 16, 2007 10:20 pm
by kelb
Fixed.

Re: [MEDIUM:CONFIRMED:7714] Search Component - Problem with character "ň" and ut

Posted: Wed Oct 17, 2007 8:43 am
by H13
Great!

Re: [MEDIUM:CONFIRMED:7714] Search Component - Problem with character "ň" and ut

Posted: Wed Oct 17, 2007 10:06 am
by user deleted
Confirmed and moving to resolved.

Re: [MEDIUM:FIXED:7714] Search Component - Problem with character "ň" and utf-8

Posted: Sat Oct 20, 2007 5:44 pm
by Jinx
Robin, you sure this has been fixed ?

Re: [MEDIUM:FIXED:7714] Search Component - Problem with character "ň" and utf-8

Posted: Sat Oct 20, 2007 6:53 pm
by user deleted
Jinx wrote: Robin, you sure this has been fixed ?
Hi Johan,

I tested just about everything that got reported in the thread. But I'll move it back to be sure, and re-test.

Re: [MEDIUM:FIXED:7714] Search Component - Problem with character "ň" and utf-8

Posted: Sat Oct 20, 2007 7:16 pm
by H13
Hi,

If I have a word, e.g. "ňiň", in an article and I search for it with the search function, I get no error message, but the search function doesn't find the word.

E.g. there is a word "ňiň":

- I search for "ňiň" - I get no results
- I search for "nin" - I get all words which contain "ňiň"

SVN 9256
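
One possible reading of the "nin" result is that somewhere a comparison ignores accents, as an accent-insensitive database collation would. That is an assumption, not something confirmed in this thread. A minimal sketch in Python (for illustration only; `fold_accents` is a hypothetical helper) shows how such folding makes the two strings equal:

```python
import unicodedata


def fold_accents(text: str) -> str:
    """Strip combining marks after canonical decomposition (NFD)."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


folded = fold_accents("ňiň")
matches = fold_accents("nin") == folded

print(folded, matches)
```

This would explain why "nin" matches, but not why the accented search term itself returns nothing; that part points to the term being damaged before it reaches the comparison.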

Jan

Re: [MEDIUM:FIXED:7714] Search Component - Problem with character "ň" and utf-8

Posted: Sun Oct 21, 2007 5:46 pm
by user deleted
I tested again, also with the latest results H13 posted. What I can confirm is the fact that too many results are returned so it does not look fixed yet. Re-opening report again.

Search box is not displaying special characters

Posted: Thu Apr 10, 2008 8:02 am
by darwajamadhu
I am searching for "vote “no,” will". The first time, it gives results, but the search box displays special characters in place of the double quotes. If I then search with the exact-phrase option, no results come back, because the double quotes have been replaced with special characters in the search box. Is there any solution for this? Please help me.

Re: [MEDIUM:CONFIRMED:7714] Search Component - Problem with char

Posted: Wed Apr 23, 2008 11:49 am
by giris
It seems that the problem with local characters is still there when you use IIS as the server. My test server runs Apache, and there it seems to work fine.

IIS: http://www.naturpasset.se/index.php/com ... phrase=all

Apache: http://nrespons.itmedia.se/index.php/co ... phrase=all

Any ideas how to solve it on the IIS?

/Per-Erik

Re: [MEDIUM:CONFIRMED:7714] Search Component - Problem with char

Posted: Wed Apr 23, 2008 12:45 pm
by H13
Hi, try changing the collation of your database (database, tables, columns). My columns had the latin1_swedish_ci collation (the MySQL default); after changing it to UTF-8, it works for me...
See: http://www.phoca.cz/articles/web/how-to ... -database/

Maybe it will help you.

Jan

Re: [MEDIUM:CONFIRMED:7714] Search Component - Problem with char

Posted: Wed Apr 23, 2008 8:25 pm
by giris
My tables and columns are already configured for UTF-8 :(

/Per-Erik

Re: [MEDIUM:CONFIRMED:7714] Search Component - Problem with

Posted: Wed May 11, 2011 10:28 am
by foo123
For anyone who might be having a similar problem, here is a workaround for the
"Incomplete multi-octet sequence" warning from unicode.php.

I had this error when transferring articles from JCE to Notepad++ and back.
Even with UTF-8 encoding, it seems some character(s) didn't get encoded properly. Anyway,
change this in libraries/phputf8/utils/unicode.php, from line 167 onwards:

Code:

            } else {
                /**
                 * ((0xC0 & (*in) != 0x80) && (mState != 0))
                 * Incomplete multi-octet sequence.
                 */
                /*
                trigger_error(
                        'utf8_to_unicode: Incomplete multi-octet '.
                        '   sequence in UTF-8 at byte '.$i,
                        E_USER_WARNING
                    );

                return FALSE;
                */

                // mine: initialize UTF8 cache instead of failing
                $mState = 0;
                $mUcs4  = 0;
                $mBytes = 1;
            }
As for this error appearing when foreign characters are entered into search boxes etc.:
in Joomla 1.5.22 with JoomFish 2 and VirtueMart 1.1.8, entering foreign characters into search boxes seems to work correctly.
Anyway, you can just type the characters in a text editor, encode them as UTF-8 and paste them into the search box,
but the hack above would also do the job.
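
The general idea behind such a workaround, tolerating a damaged sequence instead of aborting, can be sketched in Python (for illustration only; this is a swapped-in technique using Python's built-in decoder, not a port of the phputf8 change, and `decode_lenient` is a hypothetical helper):

```python
def decode_lenient(data: bytes) -> str:
    """Decode UTF-8, substituting U+FFFD for any invalid or cut-off sequence."""
    return data.decode("utf-8", errors="replace")


# Simulate damaged input: drop the final byte of the two-byte character "ň".
damaged = "kůň".encode("utf-8")[:-1]

try:
    damaged.decode("utf-8")  # strict decoding rejects the input
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True

lenient = decode_lenient(damaged)

print(strict_failed)
print(lenient)
```

The trade-off is the same as with the hack: the warning disappears, but the damaged character is silently lost, so the underlying encoding problem stays hidden.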

Re: [MEDIUM:CONFIRMED:7714] Search Component - Problem with

Posted: Sun Feb 05, 2012 1:34 pm
by meldweny
I have the same error after installing JomSocial on Joomla 1.7.
foo123,
I tried your code but it gives a blank page.

Re: [MEDIUM:CONFIRMED:7714] Search Component - Problem with

Posted: Sun Feb 05, 2012 7:16 pm
by Per Yngve Berg
meldweny: This thread is from 2007 and is for Joomla 1.5.

Post your question as a new topic in the appropriate forum (j2.5/1.7).