SearXNG is a free internet metasearch engine which aggregates results from various search services and databases. Users are neither tracked nor profiled.
Markus Heiser 9ae409a05a [mod] add locale.get_engine_locale to get predictable results
The match_language function sometimes returns incorrect results, which is why a
new function, get_engine_locale, is required.

Fixing match_language is not easily possible, because it is almost entirely
undocumented and even its call parameters are not defined.  E.g. the function
processes values like the ones from yahoo::

    "yahoo": [
        "ar",
        ...
        "zh_chs",
        "zh_cht"
     ]

The get_engine_locale function is documented in detail: there is a clear
description of the assumptions, the requirements and the approximation
rules (read the doc-string for more details)::

    Argument ``engine_locales`` is a python dict that maps *SearXNG locales* to
    corresponding *engine locales*:

      <engine>: {
          # SearXNG string : engine-string
          'ca-ES'          : 'ca_ES',
          'fr-BE'          : 'fr_BE',
          'fr-CA'          : 'fr_CA',
          'fr-CH'          : 'fr_CH',
          'fr'             : 'fr_FR',
          ...
          'pl-PL'          : 'pl_PL',
          'pt-PT'          : 'pt_PT'
      }

    .. hint::

       The *SearXNG locale* string has to be known by babel!

In the following you will find a comparison:

>>> import babel.languages
>>> from searx.utils import match_language
>>> from searx.locales import get_engine_locale

Assume we have an engine that supports the following locales:

>>> lang_list = {
...     "zh-CN": "zh_CN",
...     "zh-HK": "zh_HK",
...     "nl-BE": "nl_BE",
...     "fr-CA": "fr_CA",
... }

Assumptions:

  A. When a user selects a language the results should be optimized according to
     the selected language.

  B. When a user selects a language and a territory the results should be
     optimized with first priority on territory and second on language.
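
The two assumptions can be read as a lookup order: exact match first, then the
same territory, then the same language.  The following is only a minimal sketch
of that order, not the get_engine_locale implementation.  The name
``sketch_engine_locale`` is hypothetical, and the alphabetical tie-break is a
placeholder.  The sketch also ignores scripts, so it would not reproduce the
'zh-TW' example below:

```python
def sketch_engine_locale(searxng_locale, engine_locales, default=None):
    """Hypothetical helper that mirrors assumptions A and B only."""
    # An exact match always wins.
    if searxng_locale in engine_locales:
        return engine_locales[searxng_locale]

    language, _, territory = searxng_locale.partition("-")

    # Assumption B: first priority is on the territory.
    if territory:
        for tag, engine_tag in sorted(engine_locales.items()):
            if tag.partition("-")[2] == territory:
                return engine_tag

    # Assumption A: otherwise optimize for the selected language.
    # Sorting makes the result independent of the dict order; the
    # alphabetical tie-break is an assumption of this sketch.
    for tag, engine_tag in sorted(engine_locales.items()):
        if tag.partition("-")[0] == language:
            return engine_tag

    return default


example_locales = {
    "zh-CN": "zh_CN",
    "zh-HK": "zh_HK",
    "nl-BE": "nl_BE",
    "fr-CA": "fr_CA",
}
print(sketch_engine_locale("fr-BE", example_locales, default="unknown"))
print(sketch_engine_locale("fr", example_locales))
```

With this list, 'fr-BE' falls through to the territory step and 'fr' to the
language step.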

----

Example: (Assumption A.)

  A user selects region 'zh-TW' which should end in zh_HK

hint:
  CN is 'Hans' and HK ('Hant') fits better to TW ('Hant')

>>> get_engine_locale('zh-TW', lang_list)
'zh_HK'
>>> lang_list[match_language('zh-TW', lang_list)]
'zh_CN'

----

Example: (Assumption A.)

  A user selects only the language 'zh' which should end in CN

>>> get_engine_locale('zh', lang_list)
'zh_CN'
>>> lang_list[match_language('zh', lang_list)]
'zh_CN'

----

Example: (Assumption B.)

  A user selects region 'fr-BE' which should end in nl-BE

hint:
  priority should be on the territory the user selected.  If the user
  prefers 'fr' he will select 'fr' without a region tag.

>>> get_engine_locale('fr-BE', lang_list, default='unknown')
'nl_BE'
>>> match_language('fr-BE', lang_list, fallback='unknown')
'fr-CA'

----

Example: (Assumption A.)

  A user selects only the language 'fr' which should end in fr_CA

>>> get_engine_locale('fr', lang_list)
'fr_CA'
>>> lang_list[match_language('fr', lang_list)]
'fr_CA'

----

The difference in priority on the territory is best shown with an engine that
supports the following locales:

>>> lang_list = {
...     "fr-FR": "fr_FR",
...     "fr-CA": "fr_CA",
...     "en-GB": "en_GB",
...     "nl-BE": "nl_BE",
... }

----

Example: (Assumption A.)

   A user selects only a language

>>> get_engine_locale('en', lang_list)
'en_GB'
>>> match_language('en', lang_list)
'en-GB'

hint: the engine supports fr_FR and fr_CA; since no territory is given, fr_FR
takes priority.

>>> get_engine_locale('fr', lang_list)
'fr_FR'
>>> lang_list[match_language('fr', lang_list)]
'fr_FR'

----

Example: (Assumption B.)

  A user selects region 'fr-BE' which should end in nl-BE

>>> get_engine_locale('fr-BE', lang_list)
'nl_BE'
>>> lang_list[match_language('fr-BE', lang_list)]
'fr_FR'

----

If the user selects a language and there are two locales like the following:

>>> lang_list = {
...      "fr-BE": "fr_BE",
...      "fr-CH": "fr_CH",
...  }
>>>

>>> get_engine_locale('fr', lang_list)
'fr_BE'
>>> lang_list[match_language('fr', lang_list)]
'fr_BE'

It looks as if both functions return the same value, but the result of
match_language depends on the order of the dictionary (which is not predictable):

>>> lang_list = {
...      "fr-CH": "fr_CH",
...      "fr-BE": "fr_BE",
...  }
>>> get_engine_locale('fr', lang_list)
'fr_BE'
>>> lang_list[match_language('fr', lang_list)]
'fr_CH'
>>>

The get_engine_locale function selects the locale by looking at the "population
percent", and this percentage is higher in BE (68%) than in CH (21%).
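
To illustrate why a population-based choice is stable while a first-match
lookup is not, here is a small sketch.  Both functions are hypothetical and are
not the code of match_language or get_engine_locale, and the
``POPULATION_PERCENT`` table is filled by hand with the rough figures quoted
above instead of being read from babel:

```python
# Illustrative figures only (rough values from the text above); the real
# function takes the "population percent" from babel's territory data.
POPULATION_PERCENT = {"fr-BE": 68.0, "fr-CH": 21.0}


def first_match(language, engine_locales):
    """Return the first entry for the language: depends on the dict order."""
    for tag, engine_tag in engine_locales.items():
        if tag.split("-")[0] == language:
            return engine_tag
    return None


def pick_by_population(language, engine_locales, population=POPULATION_PERCENT):
    """Return the entry whose territory has the highest population percent."""
    candidates = [tag for tag in engine_locales if tag.split("-")[0] == language]
    if not candidates:
        return None
    # The tag itself is the secondary key so that ties are resolved the
    # same way regardless of the dict order.
    best = max(candidates, key=lambda tag: (population.get(tag, 0.0), tag))
    return engine_locales[best]


be_first = {"fr-BE": "fr_BE", "fr-CH": "fr_CH"}
ch_first = {"fr-CH": "fr_CH", "fr-BE": "fr_BE"}

print(first_match("fr", be_first), first_match("fr", ch_first))
print(pick_by_population("fr", be_first), pick_by_population("fr", ch_first))
```

The first-match lookup changes its answer when the entries are swapped, while
the population-based choice does not.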

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-08-14 10:35:55 +02:00
.github [clean up] drop obsolete searx, filtron and morty install scripts 2022-07-30 13:39:35 +02:00
dockerfiles docker: log to stdout 2022-03-19 13:47:45 +01:00
docs [mod] add 'Accept-Language' HTTP header to online processores 2022-08-01 17:01:59 +02:00
examples Fix whitespaces 2016-07-11 18:52:37 +07:00
searx [mod] add locale.get_engine_locale to get predictable results 2022-08-14 10:35:55 +02:00
searxng_extra fix searxng_extra/update/update*.py scripts 2022-07-02 12:16:00 +02:00
src/brand [simple] ImageLayout.watch: img_load_error.svg if img load fails 2021-11-29 21:10:13 +01:00
tests [fix] improve OpenSearch description 2022-08-11 19:04:36 +02:00
utils [fix] uWSGI: increase buffer-size 2022-07-31 12:40:06 +02:00
.coveragerc [mod] use github actions instead of travis 2020-11-17 15:09:06 +01:00
.dir-locals.el [emacs] flycheck should use the eslint checker from developer tools 2022-01-24 11:34:34 +01:00
.dockerignore [fix] tidy up ignore lists .gitignore & .dockerignore 2021-06-22 16:55:30 +02:00
.gitattributes [fix] update .gitattributes 2021-06-22 20:34:39 +02:00
.gitignore [fix] ensure that test.pyright installs pyright 2022-01-23 08:00:39 +01:00
.nvmrc Node: update to node 16.15.1 2022-06-25 14:07:39 +02:00
.pylintrc [fix] prepare for pylint 2.14.0 2022-06-03 15:41:52 +02:00
.weblate [translations] web integration 2021-08-07 15:06:06 +02:00
.yamllint.yml [enh] add test.yamllint - lint yaml files 2021-06-05 17:41:24 +02:00
AUTHORS.rst [mod] link to public-instances can be set to hidden 2022-07-04 13:26:01 +02:00
babel.cfg [fix] jinja/babel: WithExtension and AutoEscapeExtension are built-in now. 2022-03-25 09:42:12 +01:00
CHANGELOG.rst reference docs.searxng.org 2022-01-02 21:18:29 +01:00
CONTRIBUTING.md reference docs.searxng.org 2022-01-02 21:18:29 +01:00
Dockerfile Dockerfile: use alpine 3.16 2022-06-27 17:44:30 +00:00
LICENSE [fix] full AGPLv3+ license according to #382 2015-07-04 18:23:54 +02:00
Makefile [clean up] drop obsolete searx, filtron and morty install scripts 2022-07-30 13:39:35 +02:00
manage [fix] pyright repported errors 2022-07-30 18:04:44 +02:00
package.json Node: update to node 16.15.1 2022-06-25 14:07:39 +02:00
PULL_REQUEST_TEMPLATE.md Add PR template and contribution guidelines 2020-07-10 17:10:02 +02:00
pyrightconfig-ci.json [mod] add test.pyright to test & ci.test targets 2022-01-23 08:00:39 +01:00
pyrightconfig.json [fix] pyrightconfig.json include only dedicated folders in the test 2022-01-23 08:00:39 +01:00
README.rst [README] add doc-links: disable metrics & hostname replace 2022-07-05 14:19:48 +02:00
requirements-dev.txt Bump selenium from 4.3.0 to 4.4.0 2022-08-12 07:04:30 +00:00
requirements.txt Merge pull request #1656 from searxng/dependabot/pip/master/flask-2.2.2 2022-08-13 18:33:39 +02:00
SECURITY.md [enh] add security policy 2022-01-25 00:56:20 +01:00
setup.py [mod] replace /help by /info pages and include pages in project docs 2022-03-12 11:36:31 +01:00



Privacy-respecting, hackable metasearch engine

If you are looking for running, ready-to-use instances, visit searx.space. Otherwise, jump to the user, admin and developer handbooks on our homepage.



Contact

Come join us if you have questions or just want to chat about SearXNG.

Matrix

#searxng:matrix.org

IRC

#searxng on libera.chat which is bridged to Matrix.

Differences to searx

SearXNG is a fork of searx. Here are some of the changes:

User experience

  • Huge update of the simple theme:
    • usable on desktop, tablet and mobile
    • light and dark versions (you can choose in the preferences)
    • support for right-to-left languages
    • see the screenshots
  • the translations are up to date; you can contribute on Weblate
  • the preferences page has been updated:
    • you can see which engines are reliable or not
    • engines are grouped inside each tab
    • each engine has a description
  • thanks to the anonymous metrics, it is easier to report a bug in an engine, so engines get fixed more quickly
  • administrators can block and/or replace the URLs in the search results

Setup

  • you don't need Morty to proxy the images even on a public instance
  • you don't need Filtron to block bots, because we implemented the built-in limiter
  • you get a well-maintained Docker image, now also built for ARM64 and ARM/v7 architectures
  • alternatively, we have up-to-date installation scripts

Contributing is easier

  • readable debug log
  • contributions to the themes are made easier; check out our Development Quickstart guide
  • a lot of code cleanup and bug fixes
  • the dependencies are up to date

Translations

We need translators. Suggestions are welcome at https://weblate.bubu1.eu/projects/searxng/searxng/

Make a donation

You can support the SearXNG project via the donation page: https://docs.searxng.org/donate.html