common words in dutch are not blocked

General questions regarding the use of languages in Joomla! 4.x.

onderzoekspraktijk
Joomla! Apprentice
Posts: 25
Joined: Sun Dec 25, 2005 2:01 pm
Location: Rotterdam

common words in dutch are not blocked

Post by onderzoekspraktijk » Thu Mar 09, 2023 3:02 pm

When using Smart Search you can filter out so-called common words.
Common words do not help select relevant content because they occur frequently in all texts.

If a website visitor uses Smart Search, you cannot block the input of common words. You can, however, block any search results for these common words.
For every content language installed there is a file called "com_finder.commonwords.txt". Inside that file there is a list of common words. The list is different for each language.
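
To make the idea concrete, here is a minimal Python sketch of common-word filtering. It is not Joomla's own code: the function name and the four sample words are made up for illustration, and the real list comes from com_finder.commonwords.txt.

```python
def filter_common_words(query, common_words):
    """Return the search terms from `query` that are not common words."""
    blocked = {word.lower() for word in common_words}
    return [term for term in query.lower().split() if term not in blocked]


# Hypothetical sample; a real list would be read from com_finder.commonwords.txt
dutch_common = {"de", "het", "een", "en"}

# "de" is dropped, the remaining terms are still searched
print(filter_common_words("de geschiedenis van Rotterdam", dutch_common))

# A query made up only of common words leaves nothing to search for
print(filter_common_words("het", dutch_common))
```

A query that ends up with no terms after filtering is what produces the empty search result.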

To activate this list, open the Options for the component, go to the Index tab, set Filter Common Words to Yes, and select an installed content language as the Default Language.
The Dutch language files also contain a list of Dutch common words.
When it is activated, however, none of the words in the Dutch common words list are blocked.

When the English or German language is selected it works: there are no search results for any common word in these lists.

I can see no difference at all between these lists.
I used the language package without any changes. I deleted and reinstalled the Dutch language, but it keeps returning search results for any common word in the Dutch list.

Is this a known bug, or am I missing something?

Cheers,
Paul

toivo
Joomla! Master
Posts: 17441
Joined: Thu Feb 15, 2007 5:48 am
Location: Sydney, Australia

Re: common words in dutch are not blocked

Post by toivo » Sun Mar 12, 2023 5:17 am

I attempted to set up a multi-language test site, but the results are strange. I first installed Joomla 4.2.8 from scratch with the en-GB language, which added 174 common words to the database table _finder_terms_common. I then installed the en-AU language pack, and that operation deleted all 174 rows from the table, leaving it empty. I will try to test this further and submit a bug report to Joomla! Issue Tracker - CMS.
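
A quick way to repeat this check after each language installation is to export the table as CSV and count the rows per language. The Python sketch below is only an illustration. It assumes a header-less export with the language code in the second column, and the function name is made up.

```python
import csv
import io
from collections import Counter


def count_terms_per_language(csv_text):
    """Count exported common-word rows per language code.

    Assumes each row looks like: term,language,custom (no header row).
    """
    counts = Counter()
    for row in csv.reader(io.StringIO(csv_text)):
        if len(row) >= 2:  # skip blank or malformed rows
            counts[row[1]] += 1
    return dict(counts)


# Hypothetical three-row export used only to show the call
sample_export = "a,en,0\nthe,en,0\nde,nl,0\n"
print(count_terms_per_language(sample_export))
```

An empty result after installing a language pack would match the behaviour described above.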
Toivo Talikka, Global Moderator

Re: common words in dutch are not blocked

Post by onderzoekspraktijk » Sat Mar 25, 2023 3:08 pm

You mentioned the database table _finder_terms_common. In the end that helped me find a workaround that gave me a working solution.

To get there I took these steps:
- log in to cPanel at your site's hosting provider;
- use the file manager to select and copy the list of common words in the file com_finder.commonwords.txt, inside the language/nl-NL folder;
- paste this into a word processor to switch the <br/> codes for </p><p> codes;
- paste the edited list of terms into a CSV table (I used Numbers for this);
- sort the list alphabetically, and delete all duplicates in the list (because duplicates block the import into the database table later);
- using phpMyAdmin, export the MySQL table for common words, _finder_terms_common, as a CSV file;
- copy and paste the edited list from your word processor into the exported CSV file;
- sanitise the file: add nl codes in the second column, add 0 as the code in the third column, and delete all en entries in this file;
- now, using phpMyAdmin, import this CSV file into the MySQL database table of the website itself.

After this it works as intended.
All searches for the Dutch language common words result in "not found". (In a perfect world this message would be something like "the search term is a common word that does not produce useful search results. Please use a less common word for your search.")

The Dutch language list has several duplicates, i.e. the same word appears two or more times in the list. Importing such a list into the MySQL table results in errors.
I suppose the intended mechanism is that the list of common words is imported into the database table when the language is selected as the common words language in the Smart Search component settings. If so, the mechanism does not work.
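
The manual clean-up steps above (removing duplicates, sorting, adding the nl and 0 columns) could also be scripted. This is only a rough Python sketch, not something I ran against a live site. It assumes the word list has one term per line, that lines starting with "#" or ";" are comments, and that terms should be lowercased. The function name is made up.

```python
import csv
import io


def build_common_terms_csv(raw_text, language="nl"):
    """Turn a raw word list into deduplicated CSV rows: term,language,0."""
    seen = set()
    rows = []
    for line in raw_text.splitlines():
        term = line.strip().lower()  # assumption: terms are stored lowercase
        # Skip blank lines, assumed comment lines, and duplicates
        if not term or term.startswith(("#", ";")) or term in seen:
            continue
        seen.add(term)
        rows.append((term, language, 0))
    rows.sort()  # alphabetical order, as in the manual steps
    out = io.StringIO()
    csv.writer(out, lineterminator="\n").writerows(rows)
    return out.getvalue()


# Hypothetical input with a duplicate and a blank line
print(build_common_terms_csv("de\nhet\nde\n\nEen\n"))
```

The resulting text can be saved as a CSV file and imported with phpMyAdmin, after checking that the column order matches the exported table.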

Re: common words in dutch are not blocked

Post by toivo » Mon Mar 27, 2023 5:04 am

That is a good workaround but a new language should add its list of words to the Common Words in Smart Search.

The issue in the language installation, where the Common Words are not updated, has now been raised and it should be categorised as a bug: [#40215] - [4.2.x] Installing new language does not update #__finder_terms_common.
Toivo Talikka, Global Moderator

Re: common words in dutch are not blocked

Post by toivo » Tue Mar 28, 2023 1:10 am

A response to the above error report asked me to test PR #39188 in Joomla 4.3.0-dev. It was possible to add the Common Words from the German language pack by installing it as an extension. However, the Dutch language pack did not add Common Words to the database table #__finder_terms_common, either when installed as an extension or from Install - Languages.

We'll keep this topic unresolved until there is a fix to the issue.
Toivo Talikka, Global Moderator

