Commit graph

4639 commits

Author SHA1 Message Date
Hackurei
cb92767f19 [feat] enigine: add CrowdView forum search engine 2023-07-07 21:36:11 +02:00
searxng-bot
4a2f310da3 [translations] update from Weblate
152f2008 - 2023-07-05 - return42 <markus.heiser@darmarit.de>
9dbf6b22 - 2023-07-01 - return42 <markus.heiser@darmarit.de>
4ad4c00f - 2023-07-01 - Bananhylsa <thayer@hjemmeserver.net>
2023-07-07 21:13:47 +02:00
Markus Heiser
5720844fcd [doc] rearranges Settings & Engines docs for better readability
We have built up detailed documentation of the *settings* and the *engines* over
the past few years.  However, this documentation was still spread over various
chapters and was difficult to navigate in its entirety.

This patch rearranges the Settings & Engines documentation for better
readability.

To review new ordered docs::

   make docs.clean docs.live

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-07-01 22:45:19 +02:00
searxng-bot
81c9a18456 [translations] update from Weblate
2238e87b - 2023-06-28 - jenishngl <jenishngl+codeberg@gmail.com>
c70d228a - 2023-06-24 - nogb <u8cn71wq@yogibo.anonaddy.me>
389c0c62 - 2023-06-24 - return42 <markus.heiser@darmarit.de>
656d9fcb - 2023-06-23 - return42 <markus.heiser@darmarit.de>
a9c9b116 - 2023-06-25 - alma <alma@users.noreply.translate.codeberg.org>
528b845f - 2023-06-24 - nogb <u8cn71wq@yogibo.anonaddy.me>
b8c50f23 - 2023-06-23 - return42 <markus.heiser@darmarit.de>
39f47c0f - 2023-06-23 - return42 <markus.heiser@darmarit.de>
ae0aa811 - 2023-06-24 - Fjuro <ifjuro@proton.me>
c8216259 - 2023-06-26 - lemonadeforlife <nahianlabiblimon44@gmail.com>
2023-06-30 11:49:07 +02:00
dalf
fbb72fc1f4 Update searx.data - update_engine_descriptions.py 2023-06-29 13:59:25 +02:00
Markus Heiser
87e7926ae9 [fix] engine: Anna's Archive - grep results from '.js-scroll-hidden' elements
The renderuing of the WEB page is very strange; except the firts position all
other positions of Anna's result page are enclosed in SGML comments.  These
cooments are *uncommented* by some JS code, see query of the class
'.js-scroll-hidden' in Anna's HTML template [1].

[1] https://annas-software.org/AnnaArchivist/annas-archive/-/blob/main/allthethings/templates/macros/md5_list.html

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-06-29 09:32:57 +02:00
Markus Heiser
e2df6b77a3 [mod] engine: Anna's Archive - additionl settings (content, sort, ext)
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-06-29 09:32:57 +02:00
Markus Heiser
eafc2906f1 [mod] engine: Anna's Archive - fetch search arguments from search form
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-06-29 09:32:57 +02:00
Paolo Basso
7adb9090e5 [mod] engine: Anna's Archive - add language support 2023-06-29 09:32:57 +02:00
Paolo Basso
e5637fe7b9 [feat] engine: implementation of Anna's Archive
Anna's Archive [1] is a free non-profit online shadow library metasearch engine
providing access to a variety of book resources (also via IPFS), created by a
team of anonymous archivists [2].

[1] https://annas-archive.org/
[2] https://annas-software.org/AnnaArchivist/annas-archive
2023-06-29 09:32:57 +02:00
Markus Heiser
fd26f37073 [upd] make data.all
- ahmia_blacklist.txt
- currencies.json
- engine_descriptions.json
- engine_traits.json
- osm_keys_tags.json
- useragents.json

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-06-28 21:21:53 +02:00
Markus Heiser
efea962504 [fix] simple template: preferences - add missing icon_smal import
Related: https://github.com/searxng/searxng/commit/2149e88bdd64#r119535272
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-06-28 18:36:52 +02:00
Paolo Basso
401561cb58 [mod] engine torznab - refactor & option to hide links
- torznab engine using types and clearer code
- torznab option to hide torrent and magnet links.
- document the torznab engine
- add myself to authors

Closes: https://github.com/searxng/searxng/issues/1124
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-06-28 10:03:44 +02:00
Markus Heiser
da7c30291d [fix] Google API changed
It seems that Google is rolling out a modified WEB API [1][2].

In the past there was only the UI language in the `hl` argument but nowadays it
seems a combination of the UI language and the "search region" is mixed in this
argument and the `gl` argument has been removed.  I'm very surprised that google
is starting to mix the parameters of the UI with the parameters of the search
index.

This patch modifies the get_google_info(..) function.  Beside Google-WEB this
function is also used by other Google services, here are some examples to test
region & language of ..

- Google-WEB:    `!go dragon boat :en-CA`
- Google-News:   `!gon dragon boat :en-CA`
- Google-Videos: `!gov bmw :en-CA`
- Goolge-Images  `!goi bmw :en-CA`

- [1] https://github.com/searxng/searxng/issues/2515#issuecomment-1606294635
- [2] https://github.com/searxng/searxng/issues/2515#issuecomment-1607150817

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-06-26 18:28:09 +02:00
Markus Heiser
e8706fb738 [fix] engine & network issues / documentation and type annotations
This patch fixes some quirks and issues related to the engines and the network.
Each engine has its own network and this network was broken for the following
engines[1]:

- archlinux
- bing
- dailymotion
- duckduckgo
- google
- peertube
- startpage
- wikipedia

Since the files have been touched anyway, the type annotaions of the engine
modules has also been completed so that error messages from the type checker are
no longer reported.

Related and (partial) fixed issue:

- [1] https://github.com/searxng/searxng/issues/762#issuecomment-1605323861
- [2] https://github.com/searxng/searxng/issues/2513
- [3] https://github.com/searxng/searxng/issues/2515

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-06-25 13:58:26 +02:00
searxng-bot
2e4a435134 [translations] update from Weblate
9512b92a - 2023-06-23 - Coccocoas_Helper <coccocoahelper@gmail.com>
ca08c51e - 2023-06-23 - Coccocoas_Helper <coccocoahelper@gmail.com>
56ad4f21 - 2023-06-21 - return42 <markus.heiser@darmarit.de>
3ee419d6 - 2023-06-21 - return42 <markus.heiser@darmarit.de>
2023-06-23 09:34:46 +02:00
Markus Heiser
86db08793b [fix] implement a JSONEncoder for the json format
This patch implements a simple JSONEncoder just to fix #2502 / on the long term
SearXNG needs a data schema for the result items and a json generator for the
result list.

Closes: https://github.com/searxng/searxng/issues/2505
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-06-19 19:49:44 +02:00
Markus Heiser
fa1ef9a07b [mod] move some code from webapp module to webutils module (no functional change)
Over the years the webapp module became more and more a mess.  To improve the
modulaization a little this patch moves some implementations from the webapp
module to webutils module.

HINT: this patch brings non functional change

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-06-19 19:49:44 +02:00
searxng-bot
71b6ff07ca [translations] update from Weblate
98f61c70 - 2023-06-15 - alexgabi <alexgabi@disroot.org>
a1679b93 - 2023-06-13 - return42 <markus.heiser@darmarit.de>
ebd1d574 - 2023-06-13 - return42 <markus.heiser@darmarit.de>
b28a1da3 - 2023-06-13 - return42 <markus.heiser@darmarit.de>
56409bf0 - 2023-06-11 - return42 <markus.heiser@darmarit.de>
abc4916c - 2023-06-10 - return42 <markus.heiser@darmarit.de>
b1900abe - 2023-06-10 - return42 <markus.heiser@darmarit.de>
b48e84c4 - 2023-06-10 - return42 <markus.heiser@darmarit.de>
bf395e32 - 2023-06-10 - return42 <markus.heiser@darmarit.de>
c9c0a3c9 - 2023-06-10 - return42 <markus.heiser@darmarit.de>
3f50d31e - 2023-06-10 - return42 <markus.heiser@darmarit.de>
9da1c142 - 2023-06-09 - artnay <jiri.gronroos@iki.fi>
2023-06-16 09:20:43 +02:00
searxng-bot
1be27d5d83 [translations] update from Weblate
b40da1a3 - 2023-06-06 - return42 <markus.heiser@darmarit.de>
666ee7d4 - 2023-06-06 - return42 <markus.heiser@darmarit.de>
1e0e8ead - 2023-06-06 - return42 <markus.heiser@darmarit.de>
404b9937 - 2023-06-07 - Ivan Gabaldon <admin@inetol.net>
a627f9a1 - 2023-06-04 - return42 <markus.heiser@darmarit.de>
a234d2f8 - 2023-06-04 - gallegonovato <fran-carro@hotmail.es>
cc41f9b5 - 2023-06-02 - return42 <markus.heiser@darmarit.de>
24651eac - 2023-06-02 - return42 <markus.heiser@darmarit.de>
c37b0627 - 2023-06-02 - return42 <markus.heiser@darmarit.de>
9a435ea1 - 2023-06-02 - return42 <markus.heiser@darmarit.de>
40e0adad - 2023-06-02 - return42 <markus.heiser@darmarit.de>
6833b142 - 2023-06-02 - return42 <markus.heiser@darmarit.de>
00f397ad - 2023-06-02 - tentsbet <remendne@pentrens.jp>
7d3d4a97 - 2023-06-02 - return42 <markus.heiser@darmarit.de>
f7d713a4 - 2023-06-02 - return42 <markus.heiser@darmarit.de>
b1ec3160 - 2023-06-03 - ghose <correo@xmgz.eu>
04591a3a - 2023-06-02 - return42 <markus.heiser@darmarit.de>
cb3ac67c - 2023-06-02 - return42 <markus.heiser@darmarit.de>
fe81dbc7 - 2023-06-02 - return42 <markus.heiser@darmarit.de>
7882670f - 2023-06-02 - return42 <markus.heiser@darmarit.de>
38882f3b - 2023-06-02 - return42 <markus.heiser@darmarit.de>
c6df5047 - 2023-06-02 - return42 <markus.heiser@darmarit.de>
6ca23c3b - 2023-06-02 - return42 <markus.heiser@darmarit.de>
72f1ee09 - 2023-06-02 - return42 <markus.heiser@darmarit.de>
2023-06-09 07:07:51 +00:00
Markus Heiser
22b13f4fa5 [mod] tools.Config.get(): add missing type annotations
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-06-05 14:07:19 +02:00
Markus Heiser
f3763d73ad [mod] limiter: blocklist and passlist (ip_lists)
A blocklist and a passlist can be configured in /etc/searxng/limiter.toml::

    [botdetection.ip_lists]
    pass_ip = [
      '51.15.252.168',  # IPv4 of check.searx.space
    ]

    block_ip = [
      '93.184.216.34',  # IPv4 of example.org
    ]

Closes: https://github.com/searxng/searxng/issues/2127
Closes: https://github.com/searxng/searxng/pull/2129
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-06-05 14:07:19 +02:00
Markus Heiser
f77807257b [fix] engines: don't spam marginalia.nu with default settings
The engine configuration of marginalia [2][3][4][5] spams marginalia.nu with
requests from SearXNG instances [1].  It is not in the interest of SearXNG to
disturb other FOSS projects, so the engine will be removed::

    - name: marginalia
      engine: json_engine
      shortcut: mar
      categories: general
      paging: false
      # Key and license: https://www.marginalia.nu/marginalia-search/api/
      # index: 0 popular, 1 blogs, 2 big_sites, 3 default, 4 experimental
      search_url: https://api.marginalia.nu/<insert your key here>/search/{query}?index=4&count=20
      results_query: results
      url_query: url
      title_query: title
      content_query: description
      timeout: 1.5
      disabled: true
      about:
        website: https://www.marginalia.nu/
        official_api_documentation: https://api.marginalia.nu/
        use_official_api: true
        require_api_key: true
        results: JSON

[1] https://github.com/searxng/searxng/issues/1673
[2] https://github.com/searxng/searxng/pull/1627
[3] https://github.com/searxng/searxng/issues/1620
[4] https://news.ycombinator.com/item?id=35874640
[5] d82a858491/code/services-satellite/api-service/src/main/java/nu/marginalia/api/svc/ResponseCache.java (L12-L20)

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-06-05 08:23:17 +02:00
Markus Heiser
80aaef6c95
Merge pull request #2357 / limiter -> botdetection
The monolithic implementation of the limiter was divided into methods and
implemented in the Python package searx.botdetection.  Detailed documentation on
the methods has been added.

The methods are divided into two groups:

1. Probe HTTP headers

- Method http_accept
- Method http_accept_encoding
- Method http_accept_language
- Method http_connection
- Method http_user_agent

2. Rate limit:

- Method ip_limit
- Method link_token (new)

The (reduced) implementation of the limiter is now in the module
searx.botdetection.limiter.  The first group was transferred unchanged to this
module.  The ip_limit contains the sliding windows implemented by the limiter so
far.

This merge also fixes some long outstandig issue:

- limiter does not evaluate the Accept-Language correct [1]
- limiter needs a IPv6 prefix to block networks instead of IPs [2]

Without additional configuration the limiter works as before (apart from the
bugfixes).  For the commissioning of additional methods (link_toke), a
configuration must be made in an additional configuration file.  Without this
configuration, the limiter runs as before (zero configuration).

The ip_limit Method implements the sliding windows of the vanilla limiter,
additionally the link_token method can be used in this method.  The link_token
method can be used to investigate whether a request is suspicious. To activate
the link_token method in the ip_limit method add the following to your
/etc/searxng/limiter.toml::

    [botdetection.ip_limit]
    link_token = true


[1] https://github.com/searxng/searxng/issues/2455
[2] https://github.com/searxng/searxng/issues/2477
2023-06-03 06:00:15 +02:00
Markus Heiser
1a1ab34d9d [fix] URL percent-encoding in translations fail in babel
Closes: https://github.com/searxng/searxng/issues/2482
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-06-02 20:30:41 +02:00
Markus Heiser
b867c39ce0 [build] /static 2023-06-02 19:05:43 +02:00
Markus Heiser
2149e88bdd [mod] template preferences: split into elements (no functional change)
HINT: this patch has no functional change / it is the preparation for following
      changes and bugfixes

Over the years, the preferences template became an unmanageable beast.  To make
the source code more readable the monolith is splitted into elements.  The
splitting into elements also has the advantage that a new template can make use
of them.

The reversed checkbox is a quirk that is only used in the prefereces and must be
eliminated in the long term.  For this the macro 'checkbox_onoff_reversed' was
added to the preferences.html template.  The 'checkbox' macro is also a quirk of
the preferences.html we don't want to use in other templates (it is an
input-checkbox in a HTML form that was misused for status display).

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-06-02 19:05:43 +02:00
searxng-bot
789b43ab60 [translations] update from Weblate
5344314f - 2023-05-30 - return42 <markus.heiser@darmarit.de>
ee8fd955 - 2023-06-01 - BBTranslate <357835338@qq.com>
1ce31caf - 2023-05-29 - return42 <markus.heiser@darmarit.de>
fe75c53d - 2023-05-29 - return42 <markus.heiser@darmarit.de>
ca60af52 - 2023-05-30 - return42 <markus.heiser@darmarit.de>
f34b88f3 - 2023-05-29 - return42 <markus.heiser@darmarit.de>
22d76a26 - 2023-05-29 - return42 <markus.heiser@darmarit.de>
43d8c982 - 2023-05-29 - return42 <markus.heiser@darmarit.de>
43a92e85 - 2023-05-30 - return42 <markus.heiser@darmarit.de>
2bfc12dd - 2023-05-29 - return42 <markus.heiser@darmarit.de>
e2b5fb5f - 2023-05-29 - return42 <markus.heiser@darmarit.de>
9f088420 - 2023-05-30 - return42 <markus.heiser@darmarit.de>
bdf81b4c - 2023-05-29 - return42 <markus.heiser@darmarit.de>
f6a24c5d - 2023-05-30 - return42 <markus.heiser@darmarit.de>
01bcea56 - 2023-05-29 - return42 <markus.heiser@darmarit.de>
8c0209f8 - 2023-05-29 - return42 <markus.heiser@darmarit.de>
c629c610 - 2023-05-29 - return42 <markus.heiser@darmarit.de>
a4e4945d - 2023-05-29 - return42 <markus.heiser@darmarit.de>
96bad166 - 2023-06-01 - mradalbert <mister.adalbert@gmail.com>
b0032d90 - 2023-05-29 - return42 <markus.heiser@darmarit.de>
366adaef - 2023-05-29 - return42 <markus.heiser@darmarit.de>
2e4271bf - 2023-05-29 - return42 <markus.heiser@darmarit.de>
c5856fd6 - 2023-05-29 - return42 <markus.heiser@darmarit.de>
790b5a6f - 2023-05-29 - return42 <markus.heiser@darmarit.de>
6c9f92a9 - 2023-05-29 - return42 <markus.heiser@darmarit.de>
f5a6a35d - 2023-05-29 - return42 <markus.heiser@darmarit.de>
4c8eeb32 - 2023-05-29 - return42 <markus.heiser@darmarit.de>
7b8c0618 - 2023-05-30 - nicfab <nicfab@icloud.com>
4e851dd4 - 2023-05-29 - return42 <markus.heiser@darmarit.de>
0fa6006e - 2023-05-29 - return42 <markus.heiser@darmarit.de>
877f4396 - 2023-05-30 - return42 <markus.heiser@darmarit.de>
c3bb1da7 - 2023-05-29 - return42 <markus.heiser@darmarit.de>
e66e6fae - 2023-05-30 - return42 <markus.heiser@darmarit.de>
1cac4771 - 2023-05-30 - return42 <markus.heiser@darmarit.de>
949e994f - 2023-05-28 - ghose <correo@xmgz.eu>
8b181582 - 2023-05-29 - return42 <markus.heiser@darmarit.de>
65f8fb93 - 2023-05-30 - return42 <markus.heiser@darmarit.de>
e5088e1c - 2023-05-29 - return42 <markus.heiser@darmarit.de>
f151100c - 2023-05-29 - return42 <markus.heiser@darmarit.de>
51d169fa - 2023-05-29 - return42 <markus.heiser@darmarit.de>
e68ac961 - 2023-05-30 - return42 <markus.heiser@darmarit.de>
c336c5a1 - 2023-05-31 - dom1torii <djmdmitri.a@gmail.com>
88bda0d0 - 2023-05-30 - Fijxu <fijxu@zzls.xyz>
6a57c29a - 2023-05-29 - return42 <markus.heiser@darmarit.de>
0c585b4d - 2023-05-30 - return42 <markus.heiser@darmarit.de>
e8ca9891 - 2023-05-29 - return42 <markus.heiser@darmarit.de>
817b2da4 - 2023-05-29 - return42 <markus.heiser@darmarit.de>
6b2508aa - 2023-05-29 - return42 <markus.heiser@darmarit.de>
3a5b1842 - 2023-05-30 - return42 <markus.heiser@darmarit.de>
fd826ab8 - 2023-05-29 - return42 <markus.heiser@darmarit.de>
a3938c43 - 2023-05-30 - return42 <markus.heiser@darmarit.de>
30cad6b2 - 2023-05-30 - Ivan Gabaldon <admin@inetol.net>
e997055f - 2023-05-30 - return42 <markus.heiser@darmarit.de>
de6bd3d8 - 2023-05-30 - return42 <markus.heiser@darmarit.de>
ba5e0129 - 2023-05-29 - return42 <markus.heiser@darmarit.de>
e48fd248 - 2023-05-29 - return42 <markus.heiser@darmarit.de>
b0e7d3f1 - 2023-05-30 - return42 <markus.heiser@darmarit.de>
2023-06-02 09:34:36 +02:00
Markus Heiser
80af38d37b [mod] increase SUSPICIOUS_IP_WINDOW from one day to 30 days
In my tests I see bots rotating IPs (with endless IP lists).  If such a bot has
100 IPs and has three attempts (SUSPICIOUS_IP_MAX = 3) then it can successfully
send up to 300 requests in one day while rotating the IP.  To block the bots for
a longer period of time the SUSPICIOUS_IP_WINDOW, as the time period in which an
IP is observed, must be increased.

For normal WEB-browsers this is no problem, because the SUSPICIOUS_IP_WINDOW is
deleted as soon as the CSS with the token is loaded.

SUSPICIOUS_IP_WINDOW = 3600 * 24 * 30
  Time (sec) before sliding window for one suspicious IP expires.

SUSPICIOUS_IP_MAX = 3
  Maximum requests from one suspicious IP in the :py:obj:`SUSPICIOUS_IP_WINDOW`."""

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-06-01 16:00:49 +02:00
Markus Heiser
281e36f4b7 [fix] limiter: replace real_ip by IPv4/v6 network
Closes: https://github.com/searxng/searxng/issues/2477
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-06-01 15:51:14 +02:00
Markus Heiser
38431d2e14 [fix] correct determination of the IP for the request
For correct determination of the IP to the request the function
botdetection.get_real_ip() is implemented.  This fonction is used in the
ip_limit and link_token method of the botdetection and it is used in the
self_info plugin.

A documentation about the X-Forwarded-For header has been added.

[1] https://github.com/searxng/searxng/pull/2357#issuecomment-1566211059

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-06-01 14:38:53 +02:00
Markus Heiser
b8c7c2c9aa [mod] botdetection - improve ip_limit and link_token methods
- counting requests in LONG_WINDOW and BURST_WINDOW is not needed when the
  request is validated by the link_token method [1]

- renew a ping-key on validation [2], this is needed for infinite scrolling,
  where no new token (CSS) is loaded. / this does not fix the BURST_MAX issue in
  the vanilla limiter

- normalize the counter names of the ip_limit method to 'ip_limit.*'

- just integrate the ip_limit method straight forward in the limiter plugin /
  non intermediate code --> ip_limit now returns None or a werkzeug.Response
  object that can be passed by the plugin to the flask application / non
  intermediate code that returns a tuple

[1] https://github.com/searxng/searxng/pull/2357#issuecomment-1566113277
[2] https://github.com/searxng/searxng/pull/2357#discussion_r1208542206
[3] https://github.com/searxng/searxng/pull/2357#issuecomment-1566125979

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-06-01 14:38:53 +02:00
Markus Heiser
52f1452c09 [mod] limiter: ip_limt - monitore suspicious IPs
To intercept bots that get their IPs from a range of IPs, there is a
``SUSPICIOUS_IP_WINDOW``.  In this window the suspicious IPs are stored for a
longer time.  IPs stored in this sliding window have a maximum of
``SUSPICIOUS_IP_MAX`` accesses before they are blocked.  As soon as the IP makes
a request that is not suspicious, the sliding window for this IP is droped.

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-06-01 14:38:53 +02:00
Markus Heiser
9d7456fd6c [fix] limiter.toml: botdetection.ip_limit turn off link_token by default
To activate the ``link_token`` method in the ``ip_limit`` method add the
following to your ``/etc/searxng/limiter.toml``::

   [botdetection.ip_limit]
   link_token = true

Related: https://github.com/searxng/searxng/pull/2357#issuecomment-1554116941
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-06-01 14:38:53 +02:00
Markus Heiser
66fdec0eb9 [mod] limiter: add config file /etc/searxng/limiter.toml
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-06-01 14:38:53 +02:00
Markus Heiser
1ec325adcc [mod] limiter -> botdetection: modularization and documentation
In order to be able to meet the outstanding requirements, the implementation is
modularized and supplemented with documentation.

This patch does not contain functional change, except it fixes issue #2455

----

Aktivate limiter in the settings.yml and simulate a bot request by::

    curl -H 'Accept-Language: de-DE,en-US;q=0.7,en;q=0.3' \
         -H 'Accept: text/html'
         -H 'User-Agent: xyz' \
         -H 'Accept-Encoding: gzip' \
         'http://127.0.0.1:8888/search?q=foo'

In the LOG:

    DEBUG   searx.botdetection.link_token : missing ping for this request: .....

Since ``BURST_MAX_SUSPICIOUS = 2`` you can repeat the query above two time
before you get a "Too Many Requests" response.

Closes: https://github.com/searxng/searxng/issues/2455
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-05-29 14:54:56 +02:00
Markus Heiser
5226044c13 [mod] limiter: add random token to the limiter URL
By adding a random component in the limiter URL a bot can no longer send a ping
by request a static URL.

Related: https://github.com/searxng/searxng/pull/2357#issuecomment-1518525094
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-05-29 14:54:56 +02:00
Markus Heiser
dba569462d [mod] limiter: reduce request rates for requests without a ping
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-05-29 14:54:56 +02:00
dalf
c1b5ff7e1c Update searx.data - update_engine_descriptions.py 2023-05-29 07:28:50 +02:00
dalf
2ba50d392e Update searx.data - update_currencies.py 2023-05-29 07:28:18 +02:00
dalf
cb843ef13c Update searx.data - update_engine_traits.py 2023-05-29 07:27:40 +02:00
dalf
512e001277 Update searx.data - update_firefox_version.py 2023-05-29 07:27:07 +02:00
dalf
f03ac9b152 Update searx.data - update_wikidata_units.py 2023-05-29 07:26:47 +02:00
dalf
e12e350f7f Update searx.data - update_ahmia_blacklist.py 2023-05-29 07:26:20 +02:00
Markus Heiser
3ca97cf5e3 [fix] simple theme: move engine alerts in case of no results into sidebar
If there were no results but errors in the engines then the error dialogs of the
engines was displayed in the result list.

With the new design errors of the engines should only be displayed in the
sidebar and at the same time duplications of the (template) code will be
avoided.

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-05-28 12:19:32 +02:00
mrpaulblack
60b94dfdca [build] /static 2023-05-28 12:19:32 +02:00
mrpaulblack
f087959b02 [mod] simple theme: build design for details (collapsables)
* set border top and bottom on sidebar collasables
* inrease peading on summary so its easier to click on mobile
* remove margins and add flex wrapper to normalize elements in sidebar
2023-05-28 12:19:32 +02:00
Markus Heiser
b7e315563d [mod] simple theme: collaps/expand elements in the sidebar
Make elements in the sidebar collapse able.  Except infoboxes all elements in
the sidebar are collapsed by default.

By folding out the sidebar elements, the UI looks less cluttered.  Especially on
small devices like smartphones, where the sidebar is above the results list, the
UX should be improved [1].

[1] https://github.com/searxng/searxng/issues/2140

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-05-28 12:19:32 +02:00
searxng-bot
e8054026fb [translations] update from Weblate
69171f12 - 2023-05-25 - fabiosantoscode <fabiosantosart@gmail.com>
2caaed0a - 2023-05-23 - trmx <borcan.cristian1@gmail.com>
84d1702b - 2023-05-21 - return42 <markus.heiser@darmarit.de>
65cc6eb8 - 2023-05-21 - return42 <markus.heiser@darmarit.de>
e0ab3383 - 2023-05-22 - return42 <markus.heiser@darmarit.de>
23e87f15 - 2023-05-21 - return42 <markus.heiser@darmarit.de>
14f0fc6b - 2023-05-21 - return42 <markus.heiser@darmarit.de>
5b7c7b7d - 2023-05-21 - return42 <markus.heiser@darmarit.de>
c725b38d - 2023-05-21 - return42 <markus.heiser@darmarit.de>
2023-05-26 07:08:21 +00:00
Markus Heiser
ebe22a4319 [fix] typo: dues --> does
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-05-22 13:18:22 +02:00
Markus Heiser
bc647fabaf [fix] ClientPref - don't raise exception if Accept-Language is invalid
If the Accept-Language header [1] is set but empty or holds a value that is
unknown to babel, an excpetion is raised::

    $ curl --header 'Accept-Language: xyz' 'http://127.0.0.1:8888/search?q=foo'
    ...
    Traceback (most recent call last):
      File "searx/preferences.py", line 335, in from_http_request
        return cls(locale=pairs[0][0])
    IndexError: list index out of range

[1] https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Accept-Language

Reported by: @Eolien55 in https://github.com/searxng/searxng/issues/2434#issuecomment-1556199789
Closes: https://github.com/searxng/searxng/issues/2434
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-05-22 12:38:59 +02:00
Markus Heiser
bab730c8a8
Merge pull request #2446 from searxng/translations_update
Update translations
2023-05-19 19:59:58 +02:00
pankaj
4900c091a6 use logger.warning
logger.warn() is depricated.
logger.warning is already being used in some files.
2023-05-19 19:35:29 +05:30
searxng-bot
23b53a03c2 [translations] update from Weblate
2eeec66c - 2023-05-13 - return42 <markus.heiser@darmarit.de>
87058e51 - 2023-05-13 - return42 <markus.heiser@darmarit.de>
2023-05-19 07:07:56 +00:00
Markus Heiser
007a615ffa [mod] donation_url: disable by default
SearXNG's donation campaign has been ended.

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-05-15 09:19:17 +02:00
Markus Heiser
6625b6193e
Merge pull request #2420 from searxng/translations_update
Update translations
2023-05-13 06:06:07 +02:00
Markus Heiser
caebd297e9 [fix] engine ddg: minor change in the API of ddg
Closes: https://github.com/searxng/searxng/issues/2419
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-05-12 18:58:49 +02:00
searxng-bot
f5c407eaa0 [translations] update from Weblate
df7e1be3 - 2023-05-10 - return42 <markus.heiser@darmarit.de>
7ae9877e - 2023-05-08 - return42 <markus.heiser@darmarit.de>
c2fe5131 - 2023-05-07 - KDesp73 <kdesp2003@gmail.com>
2023-05-12 07:07:40 +00:00
Markus Heiser
a60851bd59 [fix] version format string generated by 'git show'
Newer versions of git [1] do no longer support a format string that includes a minus
to remove leading zeros [2].  The format string '%Y.%m.%d'  is more version rod.

[1] https://github.com/searxng/searxng/issues/2413#issuecomment-1542320387
[2] https://github.com/searxng/searxng/pull/2122/files

Closes: https://github.com/searxng/searxng/issues/2413
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-05-10 18:33:45 +02:00
Markus Heiser
db95c4713b
Merge pull request #2406 from searxng/dependabot/npm_and_yarn/searx/static/themes/simple/master/grunt-contrib-cssmin-5.0.0
Bump grunt-contrib-cssmin from 4.0.0 to 5.0.0 in /searx/static/themes/simple
2023-05-06 05:54:15 +02:00
dependabot[bot]
673837f83e
Bump grunt-contrib-cssmin in /searx/static/themes/simple
Bumps [grunt-contrib-cssmin](https://github.com/gruntjs/grunt-contrib-cssmin) from 4.0.0 to 5.0.0.
- [Release notes](https://github.com/gruntjs/grunt-contrib-cssmin/releases)
- [Changelog](https://github.com/gruntjs/grunt-contrib-cssmin/blob/main/CHANGELOG)
- [Commits](https://github.com/gruntjs/grunt-contrib-cssmin/compare/v4.0.0...v5.0.0)

---
updated-dependencies:
- dependency-name: grunt-contrib-cssmin
  dependency-type: direct:development
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
2023-05-05 07:57:12 +00:00
searxng-bot
54c4d08167 [translations] update from Weblate
70336613 - 2023-05-03 - return42 <markus.heiser@darmarit.de>
55d82b96 - 2023-05-03 - artnay <jiri.gronroos@iki.fi>
3911fe35 - 2023-05-03 - return42 <markus.heiser@darmarit.de>
81b6ebd1 - 2023-05-03 - return42 <markus.heiser@darmarit.de>
6655ac63 - 2023-05-01 - return42 <markus.heiser@darmarit.de>
3b9cccb8 - 2023-04-30 - return42 <markus.heiser@darmarit.de>
51601c0b - 2023-04-30 - return42 <markus.heiser@darmarit.de>
2023-05-05 07:07:41 +00:00
Markus Heiser
823c490c84 [mod] limiter: block requests from PetalBot
Block requests from PetalBlock.  Normally robots.txt is enough to stop
PetalBlock from making requests [1].  However, if SearXNG is offered below a
path (example.org/search), then the robots.txt is not available in the root
paths of the domain / subdomain.

[1] https://webmaster.petalsearch.com/site/petalbot

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-04-30 09:49:26 +02:00
Markus Heiser
7b43f98ebb
Merge pull request #2382 from searxng/translations_update
Update translations
2023-04-30 08:11:00 +02:00
Markus Heiser
e0c8e1923d
Merge pull request #2390 from searxng/update_data_update_wikidata_units.py
Update searx.data - update_wikidata_units.py
2023-04-30 07:16:31 +02:00
Markus Heiser
d63dbb10fc
Merge pull request #2391 from searxng/update_data_update_firefox_version.py
Update searx.data - update_firefox_version.py
2023-04-30 07:15:55 +02:00
Markus Heiser
e9fdfab76e
Merge pull request #2392 from searxng/update_data_update_currencies.py
Update searx.data - update_currencies.py
2023-04-30 07:15:28 +02:00
Markus Heiser
836827517d
Merge pull request #2393 from searxng/update_data_update_ahmia_blacklist.py
Update searx.data - update_ahmia_blacklist.py
2023-04-30 07:14:53 +02:00
Markus Heiser
cfc01ea068
Merge pull request #2394 from searxng/update_data_update_engine_traits.py
Update searx.data - update_engine_traits.py
2023-04-30 07:14:23 +02:00
dalf
c2fbace534 Update searx.data - update_engine_descriptions.py 2023-04-29 00:24:27 +00:00
dalf
4f31ab7d4b Update searx.data - update_engine_traits.py 2023-04-29 00:14:09 +00:00
dalf
df4cc070ec Update searx.data - update_ahmia_blacklist.py 2023-04-29 00:13:21 +00:00
dalf
5b93f97fb2 Update searx.data - update_currencies.py 2023-04-29 00:13:19 +00:00
dalf
7c90a6a222 Update searx.data - update_firefox_version.py 2023-04-29 00:13:09 +00:00
dalf
4336f70b59 Update searx.data - update_wikidata_units.py 2023-04-29 00:13:08 +00:00
searxng-bot
274fcd46b6 [translations] update from Weblate
b6658877 - 2023-04-27 - return42 <markus.heiser@darmarit.de>
d7a3917b - 2023-04-25 - return42 <markus.heiser@darmarit.de>
879248ad - 2023-04-25 - return42 <markus.heiser@darmarit.de>
6ccafe4e - 2023-04-25 - return42 <markus.heiser@darmarit.de>
d202aed8 - 2023-04-23 - Parsa Ranjbar <parsa@disr.it>
2023-04-28 07:08:09 +00:00
Markus Heiser
45529f51a1
Merge pull request #2347 from return42/mod-lang-detection
If language recognition fails use the Accept-Language
2023-04-25 15:46:26 +02:00
Jakub Łukasiewicz
7da47edd93 [mod] external bang: go to main instead of search page when query is empty
Closes: #2368
2023-04-25 15:02:34 +02:00
Markus Heiser
4080eca3dd [build] /static 2023-04-21 13:38:29 +02:00
searxng-bot
b41e47eea6 [translations] update from Weblate
72d42638 - 2023-04-15 - tentsbet <remendne@pentrens.jp>
560e1885 - 2023-04-16 - return42 <markus.heiser@darmarit.de>
7370b026 - 2023-04-16 - return42 <markus.heiser@darmarit.de>
20946697 - 2023-04-16 - return42 <markus.heiser@darmarit.de>
2023-04-21 07:07:58 +00:00
Markus Heiser
9b575a997b [fix] doc of locales.get_engine_locale() / zh-classical is missleading
Wikipedia's zh-classical is not zh_Hant (see doc-string of engines.wikipedia).
Fixed the example in the doc-string of locales.get_engine_locale() to 'zh_TW'.

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-04-17 08:48:57 +02:00
Markus Heiser
f1b6351ae1 [fix] engine: google play movies
Closes: https://github.com/searxng/searxng/pull/1746
Closes: https://github.com/searxng/searxng/issues/1599

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-04-16 19:15:44 +02:00
Marc Abonce Seguin
be1f6ee1c4 Update searx.data - update_engine_traits.py 2023-04-16 08:40:44 +02:00
Markus Heiser
8adbc4fcec [mod] settings.yml: enable language detection by default_lang (auto)
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-04-15 22:24:59 +02:00
Markus Heiser
d5ecda9930 [mod] move language recognition to get_search_query_from_webapp
To set the language from language recognition and hold the value selected by the
client, the previous implementation creates a copy of the SearchQuery object and
manipulates the SearchQuery object by calling function replace_auto_language().

This patch tries to implement a similar functionality in a more central place,
in function get_search_query_from_webapp() when the SearchQuery object is build
up.

Additional this patch uses the language preferred by the client, if language
recognition does not have a match / the existing implementation does not care
about client preferences and uses 'all' in case of no match.

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-04-15 22:23:33 +02:00
Markus Heiser
c03b0ea650 [mod] add a Preferences.client property to store client prefs
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-04-15 21:41:57 +02:00
Markus Heiser
6592f0d36e [build] /static 2023-04-15 18:59:46 +02:00
Markus Heiser
abf574dc0c [mod] Ignore autocomplete_min on queries that include '!' (!bang)
Closes: https://github.com/searxng/searxng/issues/1566
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-04-15 18:54:47 +02:00
Markus Heiser
09295a3fd1 Update searx.data - update_engine_descriptions.py
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-04-15 16:04:05 +02:00
Markus Heiser
a8fb6dffb2 [upd] make data.traits --> searx/data/engine_traits.json
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-04-15 16:04:05 +02:00
Markus Heiser
27369ebec2 [fix] searxng_extra/update/update_engine_descriptions.py (part 1)
Follow up of #2269

The script to update the descriptions of the engines does no longer work since
PR #2269 has been merged.

searx/engines/wikipedia.py
==========================

1. There was a misusage of zh-classical.wikipedia.org:

   - `zh-classical` is dedicate to classical Chinese [1] which is not
     traditional Chinese [2].

   - zh.wikipedia.org has LanguageConverter enabled [3] and is going to
     dynamically show simplified or traditional Chinese according to the
     HTTP Accept-Language header.

2. The update_engine_descriptions.py needs a list of all wikipedias.  The
   implementation from #2269 included only a reduced list:

   - https://meta.wikimedia.org/wiki/Wikipedia_article_depth
   - https://meta.wikimedia.org/wiki/List_of_Wikipedias

searxng_extra/update/update_engine_descriptions.py
==================================================

Before PR #2269 there was a match_language() function that did an approximation
using various methods.  With PR #2269 there are only the types in the data model
of the languages, which can be recognized by babel.  The approximation methods,
which are needed (only here) in the determination of the descriptions, must be
replaced by other methods.

[1] https://en.wikipedia.org/wiki/Classical_Chinese
[2] https://en.wikipedia.org/wiki/Traditional_Chinese_characters
[3] https://www.mediawiki.org/wiki/Writing_systems#LanguageConverter

Closes: https://github.com/searxng/searxng/issues/2330
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-04-15 16:03:59 +02:00
Markus Heiser
0adfed195e
Merge pull request #2338 from return42/block-old-farside
[mod] limiter: block unmaintained Farside instances
2023-04-15 08:40:30 +02:00
Markus Heiser
5f6ee2f82c
Merge pull request #2342 from searxng/dependabot/pip/master/pygments-2.15.0
Bump pygments from 2.14.0 to 2.15.0
2023-04-15 08:28:05 +02:00
Markus Heiser
b3d8d305d2 [build] /static 2023-04-15 08:21:02 +02:00
searxng-bot
a938aeaf1b [translations] update from Weblate
8d1975e8 - 2023-04-12 - return42 <markus.heiser@darmarit.de>
76ff8083 - 2023-04-13 - return42 <markus.heiser@darmarit.de>
a53721ef - 2023-04-13 - return42 <markus.heiser@darmarit.de>
9d8b4810 - 2023-04-08 - Eryk Michalak <gnu.ewm@protonmail.com>
1587a991 - 2023-04-10 - tentsbet <remendne@pentrens.jp>
16c84cef - 2023-04-10 - return42 <markus.heiser@darmarit.de>
29845d20 - 2023-04-09 - ghose <correo@xmgz.eu>
ccdee956 - 2023-04-08 - gallegonovato <fran-carro@hotmail.es>
402b3d27 - 2023-04-08 - return42 <markus.heiser@darmarit.de>
44b1ea99 - 2023-04-10 - return42 <markus.heiser@darmarit.de>
2023-04-14 07:07:48 +00:00
Markus Heiser
8c83547683 [mod] limiter: block unmaintained Farside instances
Since [bb3a01f8] has been merged to the Farside project, Farside instances do no
longer need to send requests to SearXNG instances [1].

There are some old unmaintained Farside instances on the web that continue to
query SearXNG instances --> we can safely block their requests.

[1] https://github.com/benbusby/farside/issues/95
[bb3a01f8] https://github.com/benbusby/farside/commit/bb3a01f8

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-04-13 16:53:37 +02:00
Markus Heiser
c678b82625 [build] /static 2023-04-10 16:26:41 +02:00
Markus Heiser
962b4c719f [mod] Update input when selecting by TAB
When the user press [TAB] the input form should be filled with the highlighted
item from the autocomplete list, but not release a search / with other words:
what we now have by pressing once on [ENTER] should be mapped to the [TAB] key
and pressing [ENTER] once should release a search query. [1]

[1] https://github.com/searxng/searxng/issues/778#issuecomment-1016593816

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-04-10 16:26:41 +02:00
rinagorsha
e9fb9f2705 [mod] Update input when selecting autocomplete prediction with keyboard
- Update input when selecting autocomplete prediction with keyboard
- Search immediately by pressing enter key
- Search immediately by clicking on an autocomplete suggestion

Related:
- https://github.com/searxng/searxng/issues/778
2023-04-10 14:36:05 +02:00
Markus Heiser
7d17d37ac7 [fix] don't show a category if there is no active engine in
When deactivate all the engines of a category, this category should disappeare.
This feature has been lost in commit 8e9ad1cc.

For better readability, webapp.get_enabled_categories() has been rewritten with
identical functionality.

Related:

- https://github.com/searxng/searxng/issues/1020
- https://github.com/searxng/searxng/issues/1604

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-04-10 09:22:21 +02:00
Markus Heiser
fed2bab65b
Merge pull request #2329 from return42/fix-bing-no-descr
[fix] Bing-WEB: use <span class='algoSlug_icon'> for the description
2023-04-08 17:03:34 +02:00
Markus Heiser
0def555869 [build] /static 2023-04-08 11:10:34 +02:00
Markus Heiser
f117f969d8 [mod] in the preference page, show !bang of subgrouping categories
The names of the subgrouping categories in the preference page are translated,
to use this categories the user needs to know by which !bang the category can be
selected.  Related to "Make 'non tab category' bangs discoverable" in [#690].

Related:

- [#690] https://github.com/searxng/searxng/issues/690
- https://github.com/searxng/searxng/issues/1604
- https://github.com/searxng/searxng/pull/1545

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-04-08 11:10:14 +02:00
Markus Heiser
bbb2af7d94 [fix] minor typo in de/search-syntax page
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-04-08 10:25:52 +02:00
Markus Heiser
23ac964e35 [fix] Bing-WEB: use <span class='algoSlug_icon'> for the description
On some result items from Bing-WEB the `<span class='algoSlug_icon'>` tag is the
only tag that contains a description.  The issue can be reproduced by [1]::

    !bi vmware

[1] https://github.com/searxng/searxng/issues/1764#issuecomment-1417990531

Reported-by: @AlyoshaVasilieva
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-04-08 09:43:04 +02:00
Markus Heiser
393e14965a
Merge pull request #2326 from return42/ungrouped
[mod] clarify the difference of the default category and subgrouping
2023-04-07 13:34:24 +02:00
Markus Heiser
8f79dd7659 [doc] additional descriptions about categories & categories_as_tabs
Add missing documentation of PR [#634].  Related to checkbox "Document how to
categorize engines" in [#690].

Related:

- [#634] https://github.com/searxng/searxng/pull/634#issuecomment-1004757502
- [#690] https://github.com/searxng/searxng/issues/690
- https://github.com/searxng/searxng/issues/1604
- https://github.com/searxng/searxng/pull/1545

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-04-07 12:56:45 +02:00
Markus Heiser
2ffd446e5c [mod] clarify the difference of the default category and subgrouping
This PR does no functional change it is just an attempt to make more clear in
the code, what a default category is and what a subcategory is.  The previous
name 'others' leads to confusion with the **category 'other'**.

If a engine is not assigned to a category, the default is assigned::

    DEFAULT_CATEGORY = 'other'

If an engine has only one category and this category is shown as tab in the user
interface, this engine has no further subgrouping::

    NO_SUBGROUPING = 'without further subgrouping'

Related:

- https://github.com/searxng/searxng/issues/1604
- https://github.com/searxng/searxng/pull/1545

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-04-07 11:03:25 +02:00
searxng-bot
7850c77c2c [translations] update from Weblate
d2fb466c - 2023-04-02 - return42 <markus.heiser@darmarit.de>
5576f597 - 2023-04-02 - return42 <markus.heiser@darmarit.de>
4b28cab9 - 2023-03-31 - Vistaus <vistausss@fastmail.com>
2023-04-07 07:07:51 +00:00
Markus Heiser
f46d0584ef
Merge pull request #2322 from return42/fix-2321
[fix] Gigablast.com has been erased
2023-04-06 23:49:58 +02:00
Markus Heiser
64e221426a
Merge pull request #2312 from return42/fix-1020-part-2
[fix] categories can't be removed from UI (categories_as_tabs)
2023-04-06 10:36:51 +02:00
Markus Heiser
5234e45010 [fix] Gigablast.com has been erased
[1] https://www.reddit.com/r/searchengines/comments/128wdcp/gigablastcom_has_been_erased/

Closes: https://github.com/searxng/searxng/issues/2321
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-04-06 08:22:57 +02:00
Markus Heiser
03f94962b6 [fix] limiter: never block a /healthz request
Related: https://github.com/searxng/searxng/issues/2310#issuecomment-1494417531
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-04-03 19:36:28 +02:00
Markus Heiser
0c4003ab2d [fix] categories can't be removed from UI (categories_as_tabs)
When using ``use_default_settings: true``, removing default categories from
settings.yml will not remove them from the UI.

The value ``categories_as_tabs`` is a dictionary type (a4c2cfb) and dictionary
types are merged additive by ``settings_loader.update_settings()``.

This patch replaces the default ``categories_as_tabs`` by the one from the
``user_settings``.

Related: https://github.com/searxng/searxng/issues/1019#issuecomment-1193145654
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-04-03 19:08:27 +02:00
Markus Heiser
a762172bf7 [fix] engine ddg: quote !bangs in a request send to ddg
Closes: https://github.com/searxng/searxng/issues/392
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-04-03 09:52:16 +02:00
Markus Heiser
0430662189 [fix] engine google-News: fix decoding of URLs (part 2)
Follow up of 8de8070ed to fix the issue reported by AlyoshaVasilieva [1].

[1] https://github.com/searxng/searxng/issues/1959#issuecomment-1493300574

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-04-02 19:19:59 +02:00
Markus Heiser
a5155a32c0
Merge pull request #2306 from return42/fix-1959
[fix] engine google-News: fix decoding of URLs
2023-04-02 08:02:37 +02:00
Markus Heiser
66810ce711 [mod] limiter: minor improvements
- requests without HTTP header 'Connection' or missing 'User-Agent' will be
  blocked by the limiter

- re_bot is related to 'User-Agent' and has been renamed to block_user_agent

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-04-01 19:42:49 +02:00
Markus Heiser
8de8070ed9 [fix] engine google-News: fix decoding of URLs
Google-News returns internal links where the origin URL is encoded in a
base64 (RFC 2045 aka URL-safe) string.

Closes: https://github.com/searxng/searxng/issues/1959
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-04-01 19:33:13 +02:00
Markus Heiser
afd8fcce36 [mod] plugin limiter: improve the log messages
In debug mode more detailed logging is needed to evaluate if an access should
have been blocked by the limiter.

BTW: remove duplicate code checking bot signature ``re_bot.match(user_agent)``

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-04-01 09:20:58 +02:00
Markus Heiser
509afbbb84 [fix] engine seznam: fix issues reported by black & pylint
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-03-31 17:25:39 +02:00
Venca24
c8d78355ff [fix] engine seznam 2023-03-31 16:11:27 +02:00
dependabot[bot]
b99e028ed0
Bump sharp from 0.31.3 to 0.32.0 in /searx/static/themes/simple
Bumps [sharp](https://github.com/lovell/sharp) from 0.31.3 to 0.32.0.
- [Release notes](https://github.com/lovell/sharp/releases)
- [Changelog](https://github.com/lovell/sharp/blob/main/docs/changelog.md)
- [Commits](https://github.com/lovell/sharp/compare/v0.31.3...v0.32.0)

---
updated-dependencies:
- dependency-name: sharp
  dependency-type: direct:development
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2023-03-31 11:39:17 +00:00
Markus Heiser
68703ba22e
Merge pull request #2282 from searxng/dependabot/npm_and_yarn/searx/static/themes/simple/master/ionicons-7.1.0
Bump ionicons from 6.1.3 to 7.1.0 in /searx/static/themes/simple
2023-03-31 13:36:12 +02:00
searxng-bot
c1c24fc231 [translations] update from Weblate
17ad1118 - 2023-03-29 - return42 <markus.heiser@darmarit.de>
61446791 - 2023-03-29 - return42 <markus.heiser@darmarit.de>
2023-03-31 07:07:59 +00:00
Markus Heiser
270ad18897 [fix] engine flickr: adapt to the new data model from flicker's response
Closes: https://github.com/searxng/searxng/issues/1879
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-03-30 21:04:53 +02:00
Markus Heiser
2b8dfab33f [fix] engine gigablast: add &userid=<User ID>&code=<Feed Code>
Gigablast's API does block unauthorized request[1].

[1] https://gigablast.com/searchfeed.html

Closes: https://github.com/searxng/searxng/issues/1454
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-03-29 16:18:02 +02:00
Markus Heiser
d179b62cf5
Merge pull request #2288 from searxng/update_data_update_firefox_version.py
Update searx.data - update_firefox_version.py
2023-03-29 13:48:32 +02:00
Markus Heiser
381c6751d6
Merge pull request #2289 from searxng/update_data_update_ahmia_blacklist.py
Update searx.data - update_ahmia_blacklist.py
2023-03-29 13:48:03 +02:00
Markus Heiser
fc51d9a0fe
Merge pull request #2291 from searxng/update_data_update_currencies.py
Update searx.data - update_currencies.py
2023-03-29 13:46:40 +02:00
Markus Heiser
2fbe4ab0c0
Merge pull request #2292 from searxng/update_data_update_engine_descriptions.py
Update searx.data - update_engine_descriptions.py
2023-03-29 13:46:14 +02:00
dalf
4c80340b62 Update searx.data - update_engine_descriptions.py 2023-03-29 00:28:45 +00:00
dalf
b39ce7ff82 Update searx.data - update_currencies.py 2023-03-29 00:16:21 +00:00
dalf
814ac8cacb Update searx.data - update_ahmia_blacklist.py 2023-03-29 00:16:17 +00:00
dalf
43d30cab81 Update searx.data - update_firefox_version.py 2023-03-29 00:16:14 +00:00
Markus Heiser
6f9e678346 [fix] engine: google has changed the layout of its response
Since 28. March google has changed its response, this patch fixes the google
engine to scrap out the results & images from the new designed response.

closes: https://github.com/searxng/searxng/issues/2287

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-03-28 14:39:16 +02:00
dalf
1498202b0b Update searx.data - update_engine_traits.py 2023-03-24 11:30:18 +01:00
Markus Heiser
16f0db4493 [mod] replace utils.match_language by locales.match_locale
This patch replaces the *full of magic* ``utils.match_language`` function by a
``locales.match_locale``.  The ``locales.match_locale`` function is based on the
``locales.build_engine_locales`` introduced in 9ae409a0 [1].

In the past SearXNG did only support a search by a language but not in a region.
This has been changed a long time ago and regions have been added to SearXNG
core but not to the engines.  The ``utils.match_language`` was the function to
handle the different aspects of language/regions in SearXNG core and the
supported *languages* in the engine.  The ``utils.match_language`` did it with
some magic and works good for most use cases but fails in some edge case.

To replace the concurrence of languages and regions in the SearXNG core the
``locales.build_engine_locales`` was introduced in 9ae409a0 [1].  With the last
patches all engines has been migrated to a ``fetch_traits`` and a
language/region concept that is based on ``locales.build_engine_locales``.

To summarize: there is no longer a need for the ``locales.match_language``.

[1] https://github.com/searxng/searxng/pull/1652

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-03-24 10:37:42 +01:00
Markus Heiser
4d4aa13e1f [mod] remove obsolete EngineTraits.supported_languages
All engines has been migrated from ``supported_languages`` to the
``fetch_traits`` concept.  There is no longer a need for the obsolete code that
implements the ``supported_languages`` concept.

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-03-24 10:37:42 +01:00
Markus Heiser
96a2eec3b5 [mod] Archlinux Wiki: improved request API & upgrade to data_type: traits_v1
re-implementation of the Archlinux Wiki:

- fetch_traits(): fetch languages, wiki URLs and title arguments
- add content field to the result list
- add documentation

Wikis from wiki.archlinux.fr, wiki.archlinux.ro, archtr.org/wiki do no longer
exists (has been merged in the main wiki).

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-03-24 10:37:42 +01:00
Markus Heiser
057e9bc1d1 [mod] SepiaSearch: re-engineered & upgrade to data_type: traits_v1
- fetch_traits() SepiaSearch and Peertube are using identical languages.
  Replace module's dictionary `supported_languages` by `engine.traits.languages`
  (data_type: `traits_v1`).
- fixed code to pass pylint
- request(): add argument boostLanguages
- response(): is replaced by peertube's video_response() function, which adds
  metadata from channel name, host & tags

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-03-24 10:37:42 +01:00
Markus Heiser
8a8c584fec [mod] Dailymotion: improved request API & upgrade to data_type: traits_v1
- fetch_traits(): fetch locales (and languages) from dailymotion API
- removed obsolete data-type "supported_languages"
- add documentation
- improved argument list of the HTTP request:
  - add argument: family_filter_map
  - add conditional argument: localization
    Don't add localization and country arguments if the user does select a
    language (:de, :en, ..)
- improve code quality (mainly improve readability)

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-03-24 10:37:42 +01:00
Markus Heiser
2499899554 [mod] Google: reversed engineered & upgrade to data_type: traits_v1
Partial reverse engineering of the Google engines including a improved language
and region handling based on the engine.traits_v1 data.

When ever possible the implementations of the Google engines try to make use of
the async REST APIs.  The get_lang_info() has been generalized to a
get_google_info() function / especially the region handling has been improved by
adding the cr parameter.

searx/data/engine_traits.json
  Add data type "traits_v1" generated by the fetch_traits() functions from:

  - Google (WEB),
  - Google images,
  - Google news,
  - Google scholar and
  - Google videos

  and remove data from obsolete data type "supported_languages".

  A traits.custom type that maps region codes to *supported_domains* is fetched
  from https://www.google.com/supported_domains

searx/autocomplete.py:
  Reversed engineered autocomplete from Google WEB.  Supports Google's languages and
  subdomains.  The old API suggestqueries.google.com/complete has been replaced
  by the async REST API: https://{subdomain}/complete/search?{args}

searx/engines/google.py
  Reverse engineering and extensive testing ..
  - fetch_traits():  Fetch languages & regions from Google properties.
  - always use the async REST API (formally known as 'use_mobile_ui')
  - use *supported_domains* from traits
  - improved the result list by fetching './/div[@data-content-feature]'
    and parsing the type of the various *content features* --> thumbnails are
    added

searx/engines/google_images.py
  Reverse engineering and extensive testing ..
  - fetch_traits():  Fetch languages & regions from Google properties.
  - use *supported_domains* from traits
  - if exists, freshness_date is added to the result
  - issue 1864: result list has been improved a lot (due to the new cr parameter)

searx/engines/google_news.py
  Reverse engineering and extensive testing ..
  - fetch_traits():  Fetch languages & regions from Google properties.
    *supported_domains* is not needed but a ceid list has been added.
  - different region handling compared to Google WEB
  - fixed for various languages & regions (due to the new ceid parameter) /
    avoid CONSENT page
  - Google News do no longer support time range
  - result list has been fixed: XPath of pub_date and pub_origin

searx/engines/google_videos.py
  - fetch_traits():  Fetch languages & regions from Google properties.
  - use *supported_domains* from traits
  - add paging support
  - implement a async request ('asearch': 'arc' & 'async':
    'use_ac:true,_fmt:html')
  - simplified code (thanks to '_fmt:html' request)
  - issue 1359: fixed xpath of video length data

searx/engines/google_scholar.py
  - fetch_traits():  Fetch languages & regions from Google properties.
  - use *supported_domains* from traits
  - request(): include patents & citations
  - response(): fixed CAPTCHA detection (Scholar has its own CATCHA manager)
  - hardening XPath to iterate over results
  - fixed XPath of pub_type (has been change from gs_ct1 to gs_cgt2 class)
  - issue 1769 fixed: new request implementation is no longer incompatible

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-03-24 10:37:42 +01:00
Markus Heiser
c80e82a855 [mod] DuckDuckGo: reversed engineered & upgrade to data_type: traits_v1
Partial reverse engineering of the DuckDuckGo (DDG) engines including a
improved language and region handling based on the enigne.traits_v1 data.

- DDG Lite
- DDG Instant Answer API
- DDG Images
- DDG Weather

docs/src/searx.engine.duckduckgo.rst:
  Online documentation of the DDG engines (make docs.live)

searx/data/engine_traits.json
  Add data type "traits_v1" generated by the fetch_traits() functions from:

  - "duckduckgo" (WEB),
  - "duckduckgo images" and
  - "duckduckgo weather"

  and remove data from obsolete data type "supported_languages".

searx/autocomplete.py:
  Reversed engineered Autocomplete from DDG.  Supports DDG's languages.

searx/engines/duckduckgo.py:
  - fetch_traits():  Fetch languages & regions from DDG.

  - get_ddg_lang(): Get DDG's language identifier from SearXNG's locale.  DDG
    defines its languages by region codes.  DDG-Lite does not offer a language
    selection to the user, only a region can be selected by the user.

  - Cache ``vqd`` value: The vqd value depends on the query string and is needed
    for the follow up pages or the images loaded by a XMLHttpRequest (DDG
    images).  The ``vqd`` value of a search term is stored for 10min in the
    redis DB.

  - DDG Lite engine: reversed engineered request method with improved Language
    and region support and better ``vqd`` handling.

searx/engines/duckduckgo_definitions.py: DDG Instant Answer API
  The *instant answers* API does not support languages, or at least we could not
  find out how language support should work.  It seems that most of the features
  are based on English terms.

searx/engines/duckduckgo_images.py: DDG Images
  Reversed engineered request method.  Improved language and region handling
  based on cookies and the enigne.traits_v1 data.  Response: add image format to
  the result list

searx/engines/duckduckgo_weather.py: DDG Weather
  Improved language and region handling based on cookies and the
  enigne.traits_v1 data.

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-03-24 10:37:42 +01:00
Markus Heiser
e9afc4f8ce [mod] Startpage: reversed engineered & upgrade to data_type: traits_v1
One reason for the often seen CAPTCHA of the Startpage requests are the
incomplete requests SearXNG sends to startpage.com: this patch is a complete new
implementation of the ``request()`` function, reversed engineered from the
Startpage's search form.  The new implementation:

- use traits of data_type: traits_v1 and drop deprecated data_type: supported_languages
- adds time-range support
- adds save-search support
- fix searxng/searxng/issues 1884
- fix searxng/searxng/issues 1081 --> improvements to avoid CAPTCHA

In preparation for more categories (News, Images, Videos ..) from Startpage, the
variable ``startpage_categ`` was set up.  The default value is ``web`` and other
categories from Startpage are not yet implemented.

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-03-24 10:37:42 +01:00
Markus Heiser
858aa3e604 [mod] wikipedia & wikidata: upgrade to data_type: traits_v1
BTW this fix an issue in wikipedia: SearXNG's locales zh-TW and zh-HK are now
using language `zh-classical` from wikipedia (and not `zh`).

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-03-24 10:37:42 +01:00
Markus Heiser
e0a6ca96cc [doc] add a description of bing engines (web, news, video, images)
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-03-24 10:37:42 +01:00
Markus Heiser
15eaf0f15f [mod] bing_news: use async API & upgrade to data_type: traits_v1
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-03-24 10:37:42 +01:00
Markus Heiser
ff80e7637e [mod] bing_images: use async API & upgrade to data_type: traits_v1
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-03-24 10:37:42 +01:00
Markus Heiser
bc21d28298 [mod] bing_videos: use async API & upgrade to data_type: traits_v1
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-03-24 10:37:42 +01:00