[refactor] typification of SearXNG (initial) / result items (part 1)

Typification of SearXNG
=======================

This patch introduces typing for the results.  The why and how are described
in the documentation; please build the documentation:

    $ make docs.clean docs.live

and read the following articles in the "Developer documentation":

- result types --> http://0.0.0.0:8000/dev/result_types/index.html

The result types are available from the `searx.result_types` module.  The
following have been implemented so far:

- base result type: `searx.result_types.Result`
  --> http://0.0.0.0:8000/dev/result_types/base_result.html

- answer results
  --> http://0.0.0.0:8000/dev/result_types/answer.html

including the type for translations (inspired by #3925).  For all other
types (which still need to be set up in subsequent PRs), template documentation
has been created for the transition period.

Doc of the fields used in Templates
===================================

The template documentation is the basis for the typing and is the first complete
documentation of the results (needed for engine development).  It is the
"working paper" (the plan) with which further typifications can be implemented
in subsequent PRs.

- https://github.com/searxng/searxng/issues/357

Answer Templates
================

With the new (sub)types for `Answer`, the templates for the answers have also
been revised; `Translation` items are now displayed with collapsible entries
(inspired by #3925).  To try it, search for:

    !en-de dog

Plugins & Answerer
==================

The implementation for `Plugin` and `Answer` has been revised, see
documentation:

- Plugin: http://0.0.0.0:8000/dev/plugins/index.html
- Answerer: http://0.0.0.0:8000/dev/answerers/index.html

`PluginStorage` and `AnswerStorage` manage those items (in follow-up PRs,
`ArticleStorage`, `InfoStorage` and .. will be implemented).

Autocomplete
============

The autocompletion had a bug: the results from `Answer` were never shown.  To
test, activate autocompletion and try search terms for which we have
answerers:

- statistics: type `min 1 2 3` .. in the completion list you should find an
  entry like `[de] min(1, 2, 3) = 1`

- random: type `random uuid` .. in the completion list, the first item is a
  random UUID
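
A minimal sketch of how a statistics-style answer such as `min(1, 2, 3) = 1`
could be derived from the typed query.  The function name, the supported
functions and the parsing are assumptions made for illustration; this is not
the code of `searx.answerers.statistics`, and the `[de]` prefix from the
completion list is not reproduced here:

```python
# Hypothetical mapping; the real answerer may support other functions.
FUNCTIONS = {"min": min, "max": max, "sum": sum}


def statistics_answer(query):
    """Return e.g. 'min(1, 2, 3) = 1' for the query 'min 1 2 3', else None."""
    parts = query.split()
    if len(parts) < 2 or parts[0] not in FUNCTIONS:
        return None
    try:
        args = [float(part) for part in parts[1:]]
    except ValueError:
        # At least one argument is not a number: no answer for this query.
        return None
    value = FUNCTIONS[parts[0]](args)
    shown = ", ".join(f"{arg:g}" for arg in args)
    return f"{parts[0]}({shown}) = {value:g}"


print(statistics_answer("min 1 2 3"))
```

Queries that do not start with a known function name simply yield no answer,
so other answerers can still handle them.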

Extended Types
==============

SearXNG extends, e.g., the request and response types of flask and httpx.  A
module has been set up for these type extensions:

- Extended Types
  --> http://0.0.0.0:8000/dev/extended_types.html

Unit-Tests
==========

The unit tests have been completely revised.  In the previous implementation,
the runtime (the global variables such as `searx.settings`) was not initialized
before each test, so the runtime environment with which a test ran was always
determined by the tests that ran before it.  This is also why we sometimes saw
non-deterministic errors in the tests in the past:

- https://github.com/searxng/searxng/issues/2988 is one example of the runtime
  issues, with non-deterministic behavior ..

- https://github.com/searxng/searxng/pull/3650
- https://github.com/searxng/searxng/pull/3654
- https://github.com/searxng/searxng/pull/3642#issuecomment-2226884469
- https://github.com/searxng/searxng/pull/3746#issuecomment-2300965005
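
The idea of initializing the runtime before each test can be sketched with the
standard `unittest` module.  `SETTINGS` and `load_default_settings` are
hypothetical stand-ins for global state such as `searx.settings`; the revised
SearXNG test base class may look different:

```python
import unittest

# Hypothetical stand-in for process-wide runtime state such as `searx.settings`.
SETTINGS = {"debug": False}


def load_default_settings():
    """Return a fresh copy of the defaults (hypothetical helper)."""
    return {"debug": False}


class RuntimeIsolatedTestCase(unittest.TestCase):
    """Base class that re-initializes the global runtime before every test."""

    def setUp(self):
        SETTINGS.clear()
        SETTINGS.update(load_default_settings())


class TestA(RuntimeIsolatedTestCase):
    def test_mutates_global(self):
        SETTINGS["debug"] = True
        self.assertTrue(SETTINGS["debug"])


class TestB(RuntimeIsolatedTestCase):
    def test_sees_defaults(self):
        # Passes regardless of whether TestA ran before this test.
        self.assertFalse(SETTINGS["debug"])
```

Without the `setUp` reset, the outcome of `TestB` would depend on whether
`TestA` had already run in the same process.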

Why msgspec.Struct
==================

We have already discussed typing based on e.g. `TypedDict` or `dataclass` in the past:

- https://github.com/searxng/searxng/pull/1562/files
- https://gist.github.com/dalf/972eb05e7a9bee161487132a7de244d2
- https://github.com/searxng/searxng/pull/1412/files
- https://github.com/searxng/searxng/pull/1356

In my opinion, `TypedDict` is unsuitable because the objects are still
dictionaries and not instances of classes / `dataclass` objects are classes
but ...
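
The runtime difference between the two can be illustrated with the standard
library alone.  `AnswerDict` and `AnswerClass` are hypothetical shapes used
only for this sketch, they are not types from SearXNG:

```python
from dataclasses import dataclass
from typing import TypedDict


class AnswerDict(TypedDict):
    # Hypothetical result shape, used only for illustration.
    answer: str
    url: str


@dataclass
class AnswerClass:
    answer: str
    url: str = ""


as_dict = AnswerDict(answer="42", url="")
as_obj = AnswerClass(answer="42")

# A TypedDict is a plain dict at runtime: no class identity, no attribute access.
print(type(as_dict) is dict)
print(hasattr(as_dict, "answer"))

# A dataclass instance is a real object of its class.
print(isinstance(as_obj, AnswerClass))
print(as_obj.answer)
```

The `TypedDict` annotations only help a static type checker; at runtime
nothing distinguishes `as_dict` from any other dictionary.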

`msgspec.Struct` combines the advantages of typing and runtime behaviour and
also offers the option of (fast) serialization (incl. type check) of the
objects.

Currently not possible but conceivable with `msgspec`: outsourcing the engines
into separate processes.  What possibilities this opens up in the future is
left to the imagination!

Internally, we have already defined that it is desirable to decouple the
development of the engines from the development of the SearXNG core / The
serialization of the `Result` objects is a prerequisite for this.

HINT: The threads listed above were the template for this PR, even though the
implementation here is based on msgspec.  They should also be an inspiration for
the following PRs of typification, as the models and implementations can provide
a good direction.

Why just one commit?
====================

I tried to create several (thematically separated) commits, but gave up at some
point ... there are too many things to tackle at once / The comprehensibility of
the commits would not be improved by a thematic separation. On the contrary, we
would have to make multiple changes in the same places and the goal of a change
would be only vaguely recognizable in the fog of the commits.

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
Markus Heiser 2024-12-15 09:59:50 +01:00 committed by Markus Heiser
parent 9079d0cac0
commit edfbf1e118
143 changed files with 3877 additions and 2118 deletions

View file

@ -50,8 +50,8 @@ search.checker.%: install
$(Q)./manage pyenv.cmd searxng-checker -v "$(subst _, ,$(patsubst search.checker.%,%,$@))"
PHONY += test ci.test test.shell
ci.test: test.yamllint test.black test.pyright test.pylint test.unit test.robot test.rst test.pybabel test.themes
test: test.yamllint test.black test.pyright test.pylint test.unit test.robot test.rst test.shell
ci.test: test.yamllint test.black test.types.ci test.pylint test.unit test.robot test.rst test.shell test.pybabel test.themes
test: test.yamllint test.black test.types.dev test.pylint test.unit test.robot test.rst test.shell
test.shell:
$(Q)shellcheck -x -s dash \
dockerfiles/docker-entrypoint.sh
@ -83,7 +83,7 @@ MANAGE += node.env node.env.dev node.clean
MANAGE += py.build py.clean
MANAGE += pyenv pyenv.install pyenv.uninstall
MANAGE += format.python
MANAGE += test.yamllint test.pylint test.pyright test.black test.pybabel test.unit test.coverage test.robot test.rst test.clean test.themes
MANAGE += test.yamllint test.pylint test.black test.pybabel test.unit test.coverage test.robot test.rst test.clean test.themes test.types.dev test.types.ci
MANAGE += themes.all themes.fix themes.test
MANAGE += themes.simple themes.simple.pygments themes.simple.fix
MANAGE += static.build.commit static.build.drop static.build.restore

View file

@ -1,12 +1,14 @@
.. _plugins generic:
.. _plugins admin:
===============
Plugins builtin
List of plugins
===============
.. sidebar:: Further reading ..
- :ref:`SearXNG settings <settings plugins>`
- :ref:`dev plugin`
- :ref:`builtin plugins`
Configuration defaults (at built time):
@ -25,15 +27,10 @@ Configuration defaults (at built time):
- DO
- Description
JS & CSS dependencies
{% for plg in plugins %}
{% for plgin in plugins %}
* - {{plgin.name}}
- {{(plgin.default_on and "y") or ""}}
- {{plgin.description}}
{% for dep in (plgin.js_dependencies + plgin.css_dependencies) %}
| ``{{dep}}`` {% endfor %}
* - {{plg.info.name}}
- {{(plg.default_on and "y") or ""}}
- {{plg.info.description}}
{% endfor %}

View file

@ -22,6 +22,6 @@ Settings
settings_redis
settings_outgoing
settings_categories_as_tabs
settings_plugins

View file

@ -0,0 +1,67 @@
.. _settings plugins:
=======
Plugins
=======
.. sidebar:: Further reading ..
- :ref:`plugins admin`
- :ref:`dev plugin`
- :ref:`builtin plugins`
The built-in plugins can be activated or deactivated via the settings
(:ref:`settings enabled_plugins`) and external plugins can be integrated into
SearXNG (:ref:`settings external_plugins`).
.. _settings enabled_plugins:
``enabled_plugins:`` (internal)
===============================
In :ref:`plugins admin` you find a complete list of all plugins.  The default
configuration looks like:
.. code:: yaml
enabled_plugins:
- 'Basic Calculator'
- 'Hash plugin'
- 'Self Information'
- 'Tracker URL remover'
- 'Unit converter plugin'
- 'Ahmia blacklist'
.. _settings external_plugins:
``plugins:`` (external)
=======================
SearXNG supports *external plugins*.  There is no need to install one; SearXNG
runs out of the box.  To demonstrate, the example below installs the
SearXNG plugins from *The Green Web Foundation* `[ref]
<https://www.thegreenwebfoundation.org/news/searching-the-green-web-with-searx/>`__:
.. code:: bash
$ sudo utils/searxng.sh instance cmd bash -c
(searxng-pyenv)$ pip install git+https://github.com/return42/tgwf-searx-plugins
In the :ref:`settings.yml` activate the ``plugins:`` section and add module
``only_show_green_results`` from ``tgwf-searx-plugins``.
.. code:: yaml
plugins:
- only_show_green_results
# - mypackage.mymodule.MyPlugin
# - mypackage.mymodule.MyOtherPlugin
.. hint::
``only_show_green_results`` is an old plugin that was still implemented in
the old style. There is a legacy treatment for backward compatibility, but
new plugins should be implemented as a :py:obj:`searx.plugins.Plugin` class.

View file

@ -54,7 +54,7 @@ searx.engines.load_engines(searx.settings['engines'])
jinja_contexts = {
'searx': {
'engines': searx.engines.engines,
'plugins': searx.plugins.plugins,
'plugins': searx.plugins.STORAGE,
'version': {
'node': os.getenv('NODE_MINIMUM_VERSION')
},
@ -129,8 +129,9 @@ extensions = [
'notfound.extension', # https://github.com/readthedocs/sphinx-notfound-page
]
# autodoc_typehints = "description"
autodoc_default_options = {
'member-order': 'groupwise',
'member-order': 'bysource',
}
myst_enable_extensions = [

View file

@ -0,0 +1,11 @@
.. _builtin answerers:
==================
Built-in Answerers
==================
.. toctree::
:maxdepth: 1
random
statistics

View file

@ -0,0 +1,7 @@
.. _dev answerers:
====================
Answerer Development
====================
.. automodule:: searx.answerers

View file

@ -0,0 +1,9 @@
=========
Answerers
=========
.. toctree::
:maxdepth: 2
development
builtins

View file

@ -0,0 +1,8 @@
.. _answerer.random:
======
Random
======
.. autoclass:: searx.answerers.random.SXNGAnswerer
:members:

View file

@ -0,0 +1,8 @@
.. _answerer.statistics:
==========
Statistics
==========
.. autoclass:: searx.answerers.statistics.SXNGAnswerer
:members:

View file

@ -237,335 +237,18 @@ following parameters can be used to specify a search request:
=================== =========== ==========================================================================
.. _engine results:
.. _engine media types:
Making a Response
=================
Result Types (``template``)
===========================
In the ``response`` function of the engine, the HTTP response (``resp``) is
parsed and a list of results is returned.
Each result item of an engine can be of different media-types. Currently the
following media-types are supported. To set another media-type as
:ref:`template default`, the parameter ``template`` must be set to the desired
type.
An engine can append result items of different media types and different
result types to the result list.  The list of result items is rendered to HTML
by templates.  For more details read section:
.. _template default:
- :ref:`simple theme templates`
- :ref:`result types`
``default``
-----------
.. table:: Parameter of the **default** media type:
:width: 100%
========================= =====================================================
result-parameter information
========================= =====================================================
url string, url of the result
title string, title of the result
content string, general result-text
publishedDate :py:class:`datetime.datetime`, time of publish
========================= =====================================================
.. _template images:
``images``
----------
.. list-table:: Parameter of the **images** media type
:header-rows: 2
:width: 100%
* - result-parameter
- Python type
- information
* - template
- :py:class:`str`
- is set to ``images.html``
* - url
- :py:class:`str`
- url to the result site
* - title
- :py:class:`str`
- title of the result
* - content
- :py:class:`str`
- description of the image
* - publishedDate
- :py:class:`datetime <datetime.datetime>`
- time of publish
* - img_src
- :py:class:`str`
- url to the result image
* - thumbnail_src
- :py:class:`str`
- url to a small-preview image
* - resolution
- :py:class:`str`
- the resolution of the image (e.g. ``1920 x 1080`` pixel)
* - img_format
- :py:class:`str`
- the format of the image (e.g. ``png``)
* - filesize
- :py:class:`str`
- size of bytes in :py:obj:`human readable <searx.humanize_bytes>` notation
(e.g. ``MB`` for 1024 \* 1024 Bytes filesize).
.. _template videos:
``videos``
----------
.. table:: Parameter of the **videos** media type:
:width: 100%
========================= =====================================================
result-parameter information
------------------------- -----------------------------------------------------
template is set to ``videos.html``
========================= =====================================================
url string, url of the result
title string, title of the result
content *(not implemented yet)*
publishedDate :py:class:`datetime.datetime`, time of publish
thumbnail string, url to a small-preview image
length :py:class:`datetime.timedelta`, duration of result
views string, view count in humanized number format
========================= =====================================================
.. _template torrent:
``torrent``
-----------
.. _magnetlink: https://en.wikipedia.org/wiki/Magnet_URI_scheme
.. table:: Parameter of the **torrent** media type:
:width: 100%
========================= =====================================================
result-parameter information
------------------------- -----------------------------------------------------
template is set to ``torrent.html``
========================= =====================================================
url string, url of the result
title string, title of the result
content string, general result-text
publishedDate :py:class:`datetime.datetime`,
time of publish *(not implemented yet)*
seed int, number of seeder
leech int, number of leecher
filesize int, size of file in bytes
files int, number of files
magnetlink string, magnetlink_ of the result
torrentfile string, torrentfile of the result
========================= =====================================================
.. _template map:
``map``
-------
.. table:: Parameter of the **map** media type:
:width: 100%
========================= =====================================================
result-parameter information
------------------------- -----------------------------------------------------
template is set to ``map.html``
========================= =====================================================
url string, url of the result
title string, title of the result
content string, general result-text
publishedDate :py:class:`datetime.datetime`, time of publish
latitude latitude of result (in decimal format)
longitude longitude of result (in decimal format)
boundingbox boundingbox of result (array of 4. values
``[lat-min, lat-max, lon-min, lon-max]``)
geojson geojson of result (https://geojson.org/)
osm.type type of osm-object (if OSM-Result)
osm.id id of osm-object (if OSM-Result)
address.name name of object
address.road street name of object
address.house_number house number of object
address.locality city, place of object
address.postcode postcode of object
address.country country of object
========================= =====================================================
.. _template paper:
``paper``
---------
.. _BibTeX format: https://www.bibtex.com/g/bibtex-format/
.. _BibTeX field types: https://en.wikipedia.org/wiki/BibTeX#Field_types
.. list-table:: Parameter of the **paper** media type /
see `BibTeX field types`_ and `BibTeX format`_
:header-rows: 2
:width: 100%
* - result-parameter
- Python type
- information
* - template
- :py:class:`str`
- is set to ``paper.html``
* - title
- :py:class:`str`
- title of the result
* - content
- :py:class:`str`
- abstract
* - comments
- :py:class:`str`
- free text display in italic below the content
* - tags
- :py:class:`List <list>`\ [\ :py:class:`str`\ ]
- free tag list
* - publishedDate
- :py:class:`datetime <datetime.datetime>`
- last publication date
* - type
- :py:class:`str`
- short description of medium type, e.g. *book*, *pdf* or *html* ...
* - authors
- :py:class:`List <list>`\ [\ :py:class:`str`\ ]
- list of authors of the work (authors with a "s")
* - editor
- :py:class:`str`
- list of editors of a book
* - publisher
- :py:class:`str`
- name of the publisher
* - journal
- :py:class:`str`
- name of the journal or magazine the article was
published in
* - volume
- :py:class:`str`
- volume number
* - pages
- :py:class:`str`
- page range where the article is
* - number
- :py:class:`str`
- number of the report or the issue number for a journal article
* - doi
- :py:class:`str`
- DOI number (like ``10.1038/d41586-018-07848-2``)
* - issn
- :py:class:`List <list>`\ [\ :py:class:`str`\ ]
- ISSN number like ``1476-4687``
* - isbn
- :py:class:`List <list>`\ [\ :py:class:`str`\ ]
- ISBN number like ``9780201896831``
* - pdf_url
- :py:class:`str`
- URL to the full article, the PDF version
* - html_url
- :py:class:`str`
- URL to full article, HTML version
.. _template packages:
``packages``
------------
.. list-table:: Parameter of the **packages** media type
:header-rows: 2
:width: 100%
* - result-parameter
- Python type
- information
* - template
- :py:class:`str`
- is set to ``packages.html``
* - title
- :py:class:`str`
- title of the result
* - content
- :py:class:`str`
- abstract
* - package_name
- :py:class:`str`
- the name of the package
* - version
- :py:class:`str`
- the current version of the package
* - maintainer
- :py:class:`str`
- the maintainer or author of the project
* - publishedDate
- :py:class:`datetime <datetime.datetime>`
- date of latest update or release
* - tags
- :py:class:`List <list>`\ [\ :py:class:`str`\ ]
- free tag list
* - popularity
- :py:class:`str`
- the popularity of the package, e.g. rating or download count
* - license_name
- :py:class:`str`
- the name of the license
* - license_url
- :py:class:`str`
- the web location of a license copy
* - homepage
- :py:class:`str`
- the url of the project's homepage
* - source_code_url
- :py:class:`str`
- the location of the project's source code
* - links
- :py:class:`dict`
- additional links in the form of ``{'link_name': 'http://example.com'}``

View file

@ -0,0 +1,7 @@
.. _extended_types.:
==============
Extended Types
==============
.. automodule:: searx.extended_types

View file

@ -8,9 +8,13 @@ Developer documentation
quickstart
rtm_asdf
contribution_guide
extended_types
engines/index
result_types/index
templates
search_api
plugins
plugins/index
answerers/index
translation
lxcdev
makefile

View file

@ -1,106 +0,0 @@
.. _dev plugin:
=======
Plugins
=======
.. sidebar:: Further reading ..
- :ref:`plugins generic`
Plugins can extend or replace functionality of various components of searx.
Example plugin
==============
.. code:: python
name = 'Example plugin'
description = 'This plugin extends the suggestions with the word "example"'
default_on = False # disabled by default
# attach callback to the post search hook
# request: flask request object
# ctx: the whole local context of the post search hook
def post_search(request, search):
search.result_container.suggestions.add('example')
return True
External plugins
================
SearXNG supports *external plugins* / there is no need to install one, SearXNG
runs out of the box. But to demonstrate; in the example below we install the
SearXNG plugins from *The Green Web Foundation* `[ref]
<https://www.thegreenwebfoundation.org/news/searching-the-green-web-with-searx/>`__:
.. code:: bash
$ sudo utils/searxng.sh instance cmd bash -c
(searxng-pyenv)$ pip install git+https://github.com/return42/tgwf-searx-plugins
In the :ref:`settings.yml` activate the ``plugins:`` section and add module
``only_show_green_results`` from ``tgwf-searx-plugins``.
.. code:: yaml
plugins:
...
- only_show_green_results
...
Plugin entry points
===================
Entry points (hooks) define when a plugin runs. Right now only three hooks are
implemented. So feel free to implement a hook if it fits the behaviour of your
plugin. A plugin doesn't need to implement all the hooks.
.. py:function:: pre_search(request, search) -> bool
Runs BEFORE the search request.
`search.result_container` can be changed.
Return a boolean:
* True to continue the search
* False to stop the search
:param flask.request request:
:param searx.search.SearchWithPlugins search:
:return: False to stop the search
:rtype: bool
.. py:function:: post_search(request, search) -> None
Runs AFTER the search request.
:param flask.request request: Flask request.
:param searx.search.SearchWithPlugins search: Context.
.. py:function:: on_result(request, search, result) -> bool
Runs for each result of each engine.
`result` can be changed.
If `result["url"]` is defined, then `result["parsed_url"] = urlparse(result['url'])`
.. warning::
`result["url"]` can be changed, but `result["parsed_url"]` must be updated too.
Return a boolean:
* True to keep the result
* False to remove the result
:param flask.request request:
:param searx.search.SearchWithPlugins search:
:param typing.Dict result: Result, see - :ref:`engine results`
:return: True to keep the result
:rtype: bool

View file

@ -0,0 +1,15 @@
.. _builtin plugins:
================
Built-in Plugins
================
.. toctree::
:maxdepth: 1
calculator
hash_plugin
hostnames
self_info
tor_check
unit_converter

View file

@ -0,0 +1,8 @@
.. _plugins.calculator:
==========
Calculator
==========
.. automodule:: searx.plugins.calculator
:members:

View file

@ -0,0 +1,7 @@
.. _dev plugin:
==================
Plugin Development
==================
.. automodule:: searx.plugins

View file

@ -0,0 +1,8 @@
.. _hash_plugin plugin:
===========
Hash Values
===========
.. autoclass:: searx.plugins.hash_plugin.SXNGPlugin
:members:

View file

@ -1,9 +1,8 @@
.. _hostnames plugin:
================
Hostnames plugin
================
=========
Hostnames
=========
.. automodule:: searx.plugins.hostnames
:members:
:members:

View file

@ -0,0 +1,9 @@
=======
Plugins
=======
.. toctree::
:maxdepth: 2
development
builtins

View file

@ -0,0 +1,8 @@
.. _self_info plugin:
=========
Self-Info
=========
.. autoclass:: searx.plugins.self_info.SXNGPlugin
:members:

View file

@ -1,9 +1,8 @@
.. _tor check plugin:
================
Tor check plugin
================
=========
Tor check
=========
.. automodule:: searx.plugins.tor_check
:members:
:members:

View file

@ -1,9 +1,8 @@
.. _unit converter plugin:
=====================
Unit converter plugin
=====================
==============
Unit Converter
==============
.. automodule:: searx.plugins.unit_converter
:members:

View file

@ -0,0 +1,7 @@
.. _result_types.answer:
==============
Answer Results
==============
.. automodule:: searx.result_types.answer

View file

@ -0,0 +1,5 @@
======
Result
======
.. automodule:: searx.result_types._base

View file

@ -0,0 +1,34 @@
.. _result_types.corrections:
==================
Correction Results
==================
.. hint::
There is still no typing for these result items. The templates can be used as
orientation until the final typing is complete.
The corrections area shows the user alternative search terms.
A result of this type is a very simple dictionary with only one key/value pair
.. code:: python
{"correction" : "lorem ipsum .."}
From this simple dict another dict is built up:
.. code:: python
# use RawTextQuery to get the corrections URLs with the same bang
{"url" : "!bang lorem ipsum ..", "title": "lorem ipsum .." }
and used in the template :origin:`corrections.html
<searx/templates/simple/elements/corrections.html>`:
title : :py:class:`str`
Corrected search term.
url : :py:class:`str`
Not really a URL; it is the value to insert in an HTML form for a SearXNG query.
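
The mapping from the engine's ``correction`` item to the dict used by the
template can be sketched as follows.  The helper ``correction_item`` is
hypothetical and only concatenates strings; according to the comment in the
example above, SearXNG itself uses ``RawTextQuery`` to keep the bang:

```python
def correction_item(correction, bang=""):
    """Build the template dict for one correction (hypothetical helper).

    ``bang`` is the bang prefix of the original query, e.g. ``!bang``.
    """
    query = f"{bang} {correction}".strip()
    return {"url": query, "title": correction}


engine_result = {"correction": "lorem ipsum"}
print(correction_item(engine_result["correction"], bang="!bang"))
```

Without a bang, ``url`` and ``title`` carry the same search term.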

View file

@ -0,0 +1,95 @@
.. _result types:
============
Result Types
============
To understand the typification of the results, let's take a brief look at the
structure of SearXNG .. At its core, SearXNG is nothing more than an aggregator
that aggregates the results from various sources, renders them via templates and
displays them to the user.
The **sources** can be:
1. :ref:`engines <engine implementations>`
2. :ref:`plugins <dev plugin>`
3. :ref:`answerers <dev answerers>`
The sources provide the results, which are displayed in different **areas**
depending on the type of result. The areas are:
main results:
It is the main area in which -- as is typical for search engines -- the
results that a search engine has found for the search term are displayed.
answers:
This area displays short answers that could be found for the search term.
info box:
An area in which additional information can be displayed, e.g. excerpts from
wikipedia or other sources such as maps.
suggestions:
Suggestions for alternative search terms can be found in this area. These can
be clicked on and a search is carried out with these search terms.
corrections:
Results in this area are like the suggestions of alternative search terms,
but they usually result from spelling corrections.
At this point it is important to note that all **sources** can contribute
results to all of the areas mentioned above.
In most cases, however, the :ref:`engines <engine implementations>` will fill
the *main results* and the :ref:`answerers <dev answerers>` will generally
provide the contributions for the *answer* area. Not necessary to mention here
but for a better understanding: the plugins can also filter out or change
results from the main results area (e.g. the URL of the link).
The result items are organized in the :py:obj:`results.ResultContainer` and
after all sources have delivered their results, this container is passed to the
templating to build the output.  The output is usually HTML, but it is also
possible to output the result lists as JSON or RSS feed.  That's about all we
need to know before we dive into typification of result items.
.. hint::
Typification of result items: we are at the very beginning!
The first thing we have to realize is that there is no typification of the
result items so far, we have to build it up first .. and that is quite a big
task, which we will only be able to accomplish gradually.
The foundation for the typeless results was laid back in 2013 in the very first
commit :commit:`ae9fb1d7d`, and the principle has not changed since then. At
the time, the approach was perfectly adequate, but we have since evolved and the
demands on SearXNG increase with every feature request.
**Motivation:** in the meantime, it has become very difficult to develop new
features that require structural changes and it is especially hard for newcomers
to find their way in this typeless world. As long as the results are only
simple key/value dictionaries, it is not even possible for the IDEs to support
the application developer in his work.
**Planning:** The procedure for subsequent typing will have to be based on the
circumstances ..
.. attention::
As long as there is no type defined for a kind of result, the HTML template
specifies what the properties of a type are.
In this sense, you will either find a type definition here in the
documentation or, if this does not yet exist, a description of the HTML
template.
.. toctree::
:maxdepth: 2
base_result
main_result
answer
correction
suggestion
infobox

View file

@ -0,0 +1,60 @@
.. _result_types.infobox:
===============
Infobox Results
===============
.. hint::
There is still no typing for these result items. The templates can be used as
orientation until the final typing is complete.
The infobox is an area where additional information is shown to the user.
Fields used in the :origin:`infobox.html
<searx/templates/simple/elements/infobox.html>`:
img_src: :py:class:`str`
URL of an image or thumbnail that is displayed in the infobox.
infobox: :py:class:`str`
Title of the info box.
content: :py:class:`str`
Text of the info box.
The infobox has additional subsections for *attributes*, *urls* and
*relatedTopics*:
attributes: :py:class:`List <list>`\ [\ :py:class:`dict`\ ]
A list of attributes. An *attribute* is a dictionary with keys:
- label :py:class:`str`: (mandatory)
- value :py:class:`str`: (mandatory)
- image :py:class:`List <list>`\ [\ :py:class:`dict`\ ] (optional)
A list of images. An *image* is a dictionary with keys:
- src :py:class:`str`: URL of an image/thumbnail (mandatory)
- alt :py:class:`str`: alternative text for the image (mandatory)
urls: :py:class:`List <list>`\ [\ :py:class:`dict`\ ]
A list of links. A *link* is a dictionary with keys:
- url :py:class:`str`: URL of the link (mandatory)
- title :py:class:`str`: Title of the link (mandatory)
relatedTopics: :py:class:`List <list>`\ [\ :py:class:`dict`\ ]
A list of topics. A *topic* is a dictionary with keys:
- name: :py:class:`str`: (mandatory)
- suggestions: :py:class:`List <list>`\ [\ :py:class:`dict`\ ] (optional)
A list of suggestions. A *suggestion* is a simple dictionary with just one
key/value pair:
- suggestion: :py:class:`str`: suggested search term (mandatory)
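
As an illustration of the fields listed above, an infobox result could look
like the dictionary below.  All values are made up, and the small check
``missing_mandatory`` is a hypothetical helper that only tests the keys marked
as mandatory in this section:

```python
# Example values are made up; only the key names come from the field list above.
infobox_result = {
    "infobox": "Example Project",
    "content": "Short description shown in the info box.",
    "img_src": "https://example.org/logo.png",
    "attributes": [
        {"label": "License", "value": "AGPL-3.0"},
    ],
    "urls": [
        {"url": "https://example.org", "title": "Homepage"},
    ],
    "relatedTopics": [
        {"name": "See also", "suggestions": [{"suggestion": "example search"}]},
    ],
}

# Mandatory keys per subsection, as listed above.
MANDATORY = {
    "attributes": ("label", "value"),
    "urls": ("url", "title"),
    "relatedTopics": ("name",),
}


def missing_mandatory(infobox):
    """Return (section, index, key) for each missing mandatory key."""
    problems = []
    for section, keys in MANDATORY.items():
        for index, item in enumerate(infobox.get(section, [])):
            problems.extend((section, index, key) for key in keys if key not in item)
    return problems


print(missing_mandatory(infobox_result))
```

A check like this is exactly what a typed result class would make unnecessary,
since missing or misspelled fields would be reported by the type itself.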

View file

@ -0,0 +1,17 @@
============
Main Results
============
There is still no typing for the items in the :ref:`main result list`. The
templates can be used as orientation until the final typing is complete.
- :ref:`template default`
- :ref:`template images`
- :ref:`template videos`
- :ref:`template torrent`
- :ref:`template map`
- :ref:`template paper`
- :ref:`template packages`
- :ref:`template code`
- :ref:`template files`
- :ref:`template products`

View file

@ -0,0 +1,38 @@
.. _result_types.suggestion:
==================
Suggestion Results
==================
.. hint::
There is still no typing for these result items. The templates can be used as
orientation until the final typing is complete.
The suggestions area shows the user alternative search terms.
A result of this type is a very simple dictionary with only one key/value pair
.. code:: python
{"suggestion" : "lorem ipsum .."}
From this simple dict another dict is built up:
.. code:: python
{"url" : "!bang lorem ipsum ..", "title": "lorem ipsum" }
and used in the template :origin:`suggestions.html
<searx/templates/simple/elements/suggestions.html>`:
.. code:: python
# use RawTextQuery to get the suggestion URLs with the same bang
{"url" : "!bang lorem ipsum ..", "title": "lorem ipsum" }
title : :py:class:`str`
Suggested search term
url : :py:class:`str`
Not really a URL; it is the value to insert in an HTML form for a SearXNG query.

View file

@ -60,6 +60,7 @@ Scripts to update static data in :origin:`searx/data/`
.. automodule:: searxng_extra.update.update_engine_traits
:members:
.. _update_osm_keys_tags.py:
``update_osm_keys_tags.py``
===========================

docs/dev/templates.rst Normal file
View file

@ -0,0 +1,577 @@
.. _simple theme templates:
======================
Simple Theme Templates
======================
The simple template is complex: it consists of many different elements and also
uses macros and include statements.  The following is a rough overview for the
developer; details must still be taken from the
:origin:`sources <searx/templates/simple/>`.
A :ref:`result item <result types>` can be of different media types. The media
type of a result is defined by the :py:obj:`result_type.Result.template`. To
set another media-type as :ref:`template default`, the field ``template``
in the result item must be set to the desired type.
.. contents:: Contents
:depth: 2
:local:
:backlinks: entry
.. _result template macros:
Result template macros
======================
.. _macro result_header:
``result_header``
-----------------
Except ``image.html`` and some others, this macro is used in nearly all result
types in the :ref:`main result list`.
Fields used in the template :origin:`macro result_header
<searx/templates/simple/macros.html>`:
url : :py:class:`str`
Link URL of the result item.
title : :py:class:`str`
Link title of the result item.
img_src, thumbnail : :py:class:`str`
URL of an image or thumbnail that is displayed in the result item.
.. _macro result_sub_header:
``result_sub_header``
---------------------
Except for ``images.html`` and some others, this macro is used in nearly all
result types in the :ref:`main result list`.
Fields used in the template :origin:`macro result_sub_header
<searx/templates/simple/macros.html>`:
publishedDate : :py:obj:`datetime.datetime`
The date on which the object was published.
length: :py:obj:`time.struct_time`
Playing duration in seconds.
views: :py:class:`str`
View count in humanized number format.
author : :py:class:`str`
Author of the title.
metadata : :py:class:`str`
Miscellaneous metadata.
.. _engine_data:
``engine_data_form``
--------------------
The ``engine_data_form`` macro is used in :origin:`results.html
<searx/templates/simple/results.html>` in an HTML ``<form/>`` element. The
intention of this macro is to pass data of an engine from one :py:obj:`response
<searx.engines.demo_online.response>` to the :py:obj:`searx.search.SearchQuery`
of the next :py:obj:`request <searx.engines.demo_online.request>`.
To pass data, the engine's response handler can append result items of type
``engine_data``. For example, this is used to pass a token from the response to
the next request:
.. code:: python
def response(resp):
...
results.append({
'engine_data': token,
'key': 'next_page_token',
})
...
return results
def request(query, params):
page_token = params['engine_data'].get('next_page_token')
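How the ``engine_data`` items reach the next request can be sketched roughly as below. The helper ``collect_engine_data`` is an assumption made for illustration; in SearXNG the round trip goes through the HTML form and the search query, which this sketch skips entirely.

```python
from __future__ import annotations


def collect_engine_data(results: list[dict]) -> dict[str, str]:
    """Gather all ``engine_data`` items of one response into a key/value dict.

    Hypothetical helper: it only mimics what ends up in ``params['engine_data']``.
    """
    data: dict[str, str] = {}
    for item in results:
        if "engine_data" in item:
            data[item["key"]] = item["engine_data"]
    return data


# Results of a first response: one normal item and one engine_data item.
results = [
    {"url": "https://example.org", "title": "Example"},
    {"engine_data": "abc123", "key": "next_page_token"},
]

# Parameters as the engine's request() function would see them next time.
params = {"engine_data": collect_engine_data(results)}
page_token = params["engine_data"].get("next_page_token")
print(page_token)
```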
.. _main result list:
Main Result List
================
The **media types** of the **main result type** are the template files in
the :origin:`result_templates <searx/templates/simple/result_templates>`.
.. _template default:
``default.html``
----------------
Displays result fields from:
- :ref:`macro result_header` and
- :ref:`macro result_sub_header`
Additional fields used in the :origin:`default.html
<searx/templates/simple/result_templates/default.html>`:
content : :py:class:`str`
General text of the result item.
iframe_src : :py:class:`str`
URL of an embedded ``<iframe>`` / the frame is collapsible.
audio_src : uri,
URL of an embedded ``<audio controls>``.
.. _template images:
``images.html``
---------------
The images are displayed as small thumbnails in the main results list.
title : :py:class:`str`
Title of the image.
thumbnail_src : :py:class:`str`
URL of a preview of the image.
resolution : :py:class:`str`
The resolution of the image (e.g. ``1920 x 1080`` pixel)
Image labels
~~~~~~~~~~~~
Clicking on the preview opens a gallery view in which all further metadata for
the image is displayed. Additional fields used in the :origin:`images.html
<searx/templates/simple/result_templates/images.html>`:
img_src : :py:class:`str`
URL of the full size image.
content: :py:class:`str`
Description of the image.
author: :py:class:`str`
Name of the author of the image.
img_format : :py:class:`str`
The format of the image (e.g. ``png``).
source : :py:class:`str`
Source of the image.
filesize: :py:class:`str`
Size in bytes in :py:obj:`human readable <searx.humanize_bytes>` notation
(e.g. ``MB`` for 1024 \* 1024 Bytes filesize).
url : :py:class:`str`
URL of the page the image comes from (source).
.. _template videos:
``videos.html``
---------------
Displays result fields from:
- :ref:`macro result_header` and
- :ref:`macro result_sub_header`
Additional fields used in the :origin:`videos.html
<searx/templates/simple/result_templates/videos.html>`:
iframe_src : :py:class:`str`
URL of an embedded ``<iframe>`` / the frame is collapsible.
The videos are displayed as small thumbnails in the main results list; there
is an additional button to collapse/open the embedded video.
content : :py:class:`str`
Description of the video.
.. _template torrent:
``torrent.html``
----------------
.. _magnet link: https://en.wikipedia.org/wiki/Magnet_URI_scheme
.. _torrent file: https://en.wikipedia.org/wiki/Torrent_file
Displays result fields from:
- :ref:`macro result_header` and
- :ref:`macro result_sub_header`
Additional fields used in the :origin:`torrent.html
<searx/templates/simple/result_templates/torrent.html>`:
magnetlink:
URL of the `magnet link`_.
torrentfile
URL of the `torrent file`_.
seed : ``int``
Number of seeders.
leech : ``int``
Number of leechers.
filesize : ``int``
Size in Bytes (rendered to human readable unit of measurement).
files : ``int``
Number of files.
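The rendering of ``filesize`` to a human readable unit can be pictured with the following standalone sketch. It is not the SearXNG implementation; the function name ``humanize_bytes_sketch``, the unit labels and the precision are assumptions chosen for this example.

```python
def humanize_bytes_sketch(size: float, precision: int = 2) -> str:
    """Render a size in bytes with a 1024-based unit (illustrative only)."""
    units = ["B", "KB", "MB", "GB", "TB"]
    index = 0
    # Divide by 1024 until the value fits the current unit or units run out.
    while size >= 1024 and index < len(units) - 1:
        size /= 1024
        index += 1
    return f"{size:.{precision}f} {units[index]}"


for n in (512, 1536, 1024 * 1024):
    print(n, "->", humanize_bytes_sketch(n))
```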
.. _template map:
``map.html``
------------
.. _GeoJSON: https://en.wikipedia.org/wiki/GeoJSON
.. _Leaflet: https://github.com/Leaflet/Leaflet
.. _bbox: https://wiki.openstreetmap.org/wiki/Bounding_Box
.. _HTMLElement.dataset: https://developer.mozilla.org/en-US/docs/Web/API/HTMLElement/dataset
.. _Nominatim: https://nominatim.org/release-docs/latest/
.. _Lookup: https://nominatim.org/release-docs/latest/api/Lookup/
.. _place_id is not a persistent id:
https://nominatim.org/release-docs/latest/api/Output/#place_id-is-not-a-persistent-id
.. _perma_id: https://wiki.openstreetmap.org/wiki/Permanent_ID
.. _country code: https://wiki.openstreetmap.org/wiki/Country_code
Displays result fields from:
- :ref:`macro result_header` and
- :ref:`macro result_sub_header`
Additional fields used in the :origin:`map.html
<searx/templates/simple/result_templates/map.html>`:
content : :py:class:`str`
Description of the item.
address_label : :py:class:`str`
Label of the address / default ``_('address')``.
geojson : GeoJSON_
Geometries mapped to HTMLElement.dataset_ (``data-map-geojson``) and used by
Leaflet_.
boundingbox : ``[ min-lon, min-lat, max-lon, max-lat]``
A bbox_ area defined by min longitude, min latitude, max longitude and max
latitude. The bounding box is mapped to HTMLElement.dataset_
(``data-map-boundingbox``) and is used by Leaflet_.
longitude, latitude : :py:class:`str`
Geographical coordinates, mapped to HTMLElement.dataset_ (``data-map-lon``,
``data-map-lat``) and used by Leaflet_.
address : ``{...}``
A dictionary with the address data:
.. code:: python
address = {
'name' : str, # name of object
'road' : str, # street name of object
'house_number' : str, # house number of object
'postcode' : str, # postcode of object
'country' : str, # country of object
'country_code' : str,
'locality' : str,
}
country_code : :py:class:`str`
`Country code`_ of the object.
locality : :py:class:`str`
The name of the city, town, township, village, borough, etc. in which this
object is located.
links : ``[link1, link2, ...]``
A list of links with labels:
.. code:: python
links.append({
'label' : str,
'url' : str,
'url_label' : str, # set by some engines but unused (oscar)
})
data : ``[data1, data2, ...]``
A list of additional data, shown in two columns and containing a label and
value.
.. code:: python
data.append({
'label' : str,
'value' : str,
'key' : str, # set by some engines but unused
})
type : :py:class:`str` # set by some engines but unused (oscar)
Tag label from :ref:`OSM_KEYS_TAGS['tags'] <update_osm_keys_tags.py>`.
type_icon : :py:class:`str` # set by some engines but unused (oscar)
Type's icon.
osm : ``{...}``
OSM-type and OSM-ID, can be used to Lookup_ OSM data (Nominatim_). There is
also a discussion about "`place_id is not a persistent id`_" and the
perma_id_.
.. code:: python
osm = {
'type': str,
'id': str,
}
type : :py:class:`str`
Type of osm-object (if OSM-Result).
id :
ID of osm-object (if OSM-Result).
.. hint::
The ``osm`` property is set by engine ``openstreetmap.py``, but it is not
used in the ``map.html`` template yet.
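Putting the ``map.html`` fields together, a dict-style result item might look like the sketch below. The helper ``make_map_result`` and all sample values are made up for illustration; the order of the bounding box follows the field description above.

```python
from __future__ import annotations


def make_map_result(title: str, url: str, lon: float, lat: float,
                    bbox: tuple[float, float, float, float]) -> dict:
    """Build a dict-style result item for ``map.html`` (illustrative helper)."""
    min_lon, min_lat, max_lon, max_lat = bbox
    if min_lon > max_lon or min_lat > max_lat:
        raise ValueError("boundingbox must be [min-lon, min-lat, max-lon, max-lat]")
    return {
        "template": "map.html",
        "title": title,
        "url": url,
        # the field list above documents the coordinates as strings
        "longitude": str(lon),
        "latitude": str(lat),
        "boundingbox": [min_lon, min_lat, max_lon, max_lat],
        "address": {"name": title, "country_code": "de"},  # made-up sample data
        "links": [{"label": "Website", "url": url}],
        "data": [{"label": "Type", "value": "museum"}],
    }


item = make_map_result("Example Museum", "https://example.org", 13.4, 52.5, (13.3, 52.4, 13.5, 52.6))
print(item["boundingbox"])
```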
.. _template paper:
``paper.html``
--------------
.. _BibTeX format: https://www.bibtex.com/g/bibtex-format/
.. _BibTeX field types: https://en.wikipedia.org/wiki/BibTeX#Field_types
Displays result fields from:
- :ref:`macro result_header`
Additional fields used in the :origin:`paper.html
<searx/templates/simple/result_templates/paper.html>`:
content : :py:class:`str`
An abstract or excerpt from the document.
comments : :py:class:`str`
Free text displayed in italics below the content.
tags : :py:class:`List <list>`\ [\ :py:class:`str`\ ]
Free tag list.
type : :py:class:`str`
Short description of medium type, e.g. *book*, *pdf* or *html* ...
authors : :py:class:`List <list>`\ [\ :py:class:`str`\ ]
List of authors of the work (``authors`` with an "s" suffix; the ``author``
field is in the :ref:`macro result_sub_header`).
editor : :py:class:`str`
Editor of the book/paper.
publisher : :py:class:`str`
Name of the publisher.
journal : :py:class:`str`
Name of the journal or magazine the article was published in.
volume : :py:class:`str`
Volume number.
pages : :py:class:`str`
Page range where the article is.
number : :py:class:`str`
Number of the report or the issue number for a journal article.
doi : :py:class:`str`
DOI number (like ``10.1038/d41586-018-07848-2``).
issn : :py:class:`List <list>`\ [\ :py:class:`str`\ ]
ISSN number like ``1476-4687``
isbn : :py:class:`List <list>`\ [\ :py:class:`str`\ ]
ISBN number like ``9780201896831``
pdf_url : :py:class:`str`
URL to the full article, the PDF version
html_url : :py:class:`str`
URL to full article, HTML version
.. _template packages:
``packages.html``
-----------------
Displays result fields from:
- :ref:`macro result_header`
Additional fields used in the :origin:`packages.html
<searx/templates/simple/result_templates/packages.html>`:
package_name : :py:class:`str`
The name of the package.
version : :py:class:`str`
The current version of the package.
maintainer : :py:class:`str`
The maintainer or author of the project.
publishedDate : :py:class:`datetime <datetime.datetime>`
Date of latest update or release.
tags : :py:class:`List <list>`\ [\ :py:class:`str`\ ]
Free tag list.
popularity : :py:class:`str`
The popularity of the package, e.g. rating or download count.
license_name : :py:class:`str`
The name of the license.
license_url : :py:class:`str`
The web location of a license copy.
homepage : :py:class:`str`
The url of the project's homepage.
source_code_url: :py:class:`str`
The location of the project's source code.
links : :py:class:`dict`
Additional links in the form of ``{'link_name': 'http://example.com'}``
.. _template code:
``code.html``
-------------
Displays result fields from:
- :ref:`macro result_header` and
- :ref:`macro result_sub_header`
Additional fields used in the :origin:`code.html
<searx/templates/simple/result_templates/code.html>`:
content : :py:class:`str`
Description of the code fragment.
codelines : ``[line1, line2, ...]``
Lines of the code fragment.
code_language : :py:class:`str`
Name of the code language, the value is passed to
:py:obj:`pygments.lexers.get_lexer_by_name`.
repository : :py:class:`str`
URL of the repository of the code fragment.
.. _template files:
``files.html``
--------------
Displays result fields from:
- :ref:`macro result_header` and
- :ref:`macro result_sub_header`
Additional fields used in the :origin:`files.html
<searx/templates/simple/result_templates/files.html>`:
filename, size, time: :py:class:`str`
Filename, Filesize and Date of the file.
mtype : ``audio`` | ``video`` | :py:class:`str`
MIME type of the file.
subtype : :py:class:`str`
MIME subtype of the file.
abstract : :py:class:`str`
Abstract of the file.
author : :py:class:`str`
Name of the author of the file
embedded : :py:class:`str`
URL of an embedded media type (``audio`` or ``video``) / is collapsible.
.. _template products:
``products.html``
-----------------
Displays result fields from:
- :ref:`macro result_header` and
- :ref:`macro result_sub_header`
Additional fields used in the :origin:`products.html
<searx/templates/simple/result_templates/products.html>`:
content : :py:class:`str`
Description of the product.
price : :py:class:`str`
The price must include the currency.
shipping : :py:class:`str`
Shipping details.
source_country : :py:class:`str`
Place from which the shipment is made.
.. _template answer results:
Answer results
==============
See :ref:`result_types.answer`
Suggestion results
==================
See :ref:`result_types.suggestion`
Correction results
==================
See :ref:`result_types.corrections`
Infobox results
===============
See :ref:`result_types.infobox`

@ -2,9 +2,9 @@
Source-Code
===========
This is a partial documentation of our source code. We are not aiming to document
every item from the source code, but we will add documentation when requested.
This is a partial documentation of our source code. We are not aiming to
document every item from the source code, but we will add documentation when
requested.
.. toctree::
:maxdepth: 2

@ -9,8 +9,7 @@ import logging
import searx.unixthreadname
import searx.settings_loader
from searx.settings_defaults import settings_set_defaults
from searx.settings_defaults import SCHEMA, apply_schema
# Debug
LOG_FORMAT_DEBUG = '%(levelname)-7s %(name)-30.30s: %(message)s'
@ -21,14 +20,52 @@ LOG_LEVEL_PROD = logging.WARNING
searx_dir = abspath(dirname(__file__))
searx_parent_dir = abspath(dirname(dirname(__file__)))
settings, settings_load_message = searx.settings_loader.load_settings()
if settings is not None:
settings = settings_set_defaults(settings)
settings = {}
searx_debug = False
logger = logging.getLogger('searx')
_unset = object()
def init_settings():
"""Initialize global ``settings`` and ``searx_debug`` variables and
``logger`` from ``SEARXNG_SETTINGS_PATH``.
"""
global settings, searx_debug # pylint: disable=global-variable-not-assigned
cfg, msg = searx.settings_loader.load_settings(load_user_settings=True)
cfg = cfg or {}
apply_schema(cfg, SCHEMA, [])
settings.clear()
settings.update(cfg)
searx_debug = settings['general']['debug']
if searx_debug:
_logging_config_debug()
else:
logging.basicConfig(level=LOG_LEVEL_PROD, format=LOG_FORMAT_PROD)
logging.root.setLevel(level=LOG_LEVEL_PROD)
logging.getLogger('werkzeug').setLevel(level=LOG_LEVEL_PROD)
logger.info(msg)
# log max_request_timeout
max_request_timeout = settings['outgoing']['max_request_timeout']
if max_request_timeout is None:
logger.info('max_request_timeout=%s', repr(max_request_timeout))
else:
logger.info('max_request_timeout=%i second(s)', max_request_timeout)
if settings['server']['public_instance']:
logger.warning(
"Be aware you have activated features intended only for public instances. "
"This force the usage of the limiter and link_token / "
"see https://docs.searxng.org/admin/searx.limiter.html"
)
def get_setting(name, default=_unset):
"""Returns the value to which ``name`` point. If there is no such name in the
settings and the ``default`` is unset, a :py:obj:`KeyError` is raised.
@ -50,20 +87,20 @@ def get_setting(name, default=_unset):
return value
def is_color_terminal():
def _is_color_terminal():
if os.getenv('TERM') in ('dumb', 'unknown'):
return False
return sys.stdout.isatty()
def logging_config_debug():
def _logging_config_debug():
try:
import coloredlogs # pylint: disable=import-outside-toplevel
except ImportError:
coloredlogs = None
log_level = os.environ.get('SEARXNG_DEBUG_LOG_LEVEL', 'DEBUG')
if coloredlogs and is_color_terminal():
if coloredlogs and _is_color_terminal():
level_styles = {
'spam': {'color': 'green', 'faint': True},
'debug': {},
@ -87,26 +124,4 @@ def logging_config_debug():
logging.basicConfig(level=logging.getLevelName(log_level), format=LOG_FORMAT_DEBUG)
searx_debug = settings['general']['debug']
if searx_debug:
logging_config_debug()
else:
logging.basicConfig(level=LOG_LEVEL_PROD, format=LOG_FORMAT_PROD)
logging.root.setLevel(level=LOG_LEVEL_PROD)
logging.getLogger('werkzeug').setLevel(level=LOG_LEVEL_PROD)
logger = logging.getLogger('searx')
logger.info(settings_load_message)
# log max_request_timeout
max_request_timeout = settings['outgoing']['max_request_timeout']
if max_request_timeout is None:
logger.info('max_request_timeout=%s', repr(max_request_timeout))
else:
logger.info('max_request_timeout=%i second(s)', max_request_timeout)
if settings['server']['public_instance']:
logger.warning(
"Be aware you have activated features intended only for public instances. "
"This force the usage of the limiter and link_token / "
"see https://docs.searxng.org/admin/searx.limiter.html"
)
init_settings()
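Note on the design: ``init_settings()`` refills the existing ``settings`` dict with ``clear()`` and ``update()`` instead of rebinding the name. A plausible motivation (an assumption, the patch does not state it) is that modules which already did ``from searx import settings`` keep a valid reference. A simplified, self-contained sketch of that behaviour:

```python
from __future__ import annotations

settings: dict = {}  # module-level object that other modules import


def init_settings(cfg: dict | None) -> None:
    """Refill the shared ``settings`` dict in place (simplified sketch)."""
    cfg = cfg or {}
    settings.clear()      # drop the old content ..
    settings.update(cfg)  # .. and fill the very same object again


# Simulates another module that imported the dict before (re)initialization.
alias = settings
init_settings({"general": {"debug": True}})
print(alias["general"]["debug"], alias is settings)
```

Rebinding the name (``settings = cfg``) would leave ``alias`` pointing at the stale, empty dict.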

@ -1,51 +1,49 @@
# SPDX-License-Identifier: AGPL-3.0-or-later
# pylint: disable=missing-module-docstring
"""The *answerers* give instant answers related to the search query, they
usually provide answers of type :py:obj:`Answer <searx.result_types.Answer>`.
import sys
from os import listdir
from os.path import realpath, dirname, join, isdir
from collections import defaultdict
Here is an example of a very simple answerer that adds a "Hello" into the answer
area:
from searx.utils import load_module
.. code::
answerers_dir = dirname(realpath(__file__))
from flask_babel import gettext as _
from searx.answerers import Answerer, AnswererInfo
from searx.result_types import Answer
class MyAnswerer(Answerer):
    keywords = [ "hello", "hello world" ]
    def info(self):
        return AnswererInfo(
            name=_("Hello"), description=_("lorem .."), examples=["hello"], keywords=self.keywords
        )
    def answer(self, query):
        return [ Answer(answer="Hello") ]
----
.. autoclass:: Answerer
:members:
.. autoclass:: AnswererInfo
:members:
.. autoclass:: AnswerStorage
:members:
.. autoclass:: searx.answerers._core.ModuleAnswerer
:members:
:show-inheritance:
"""
from __future__ import annotations
__all__ = ["AnswererInfo", "Answerer", "AnswerStorage"]
def load_answerers():
answerers = [] # pylint: disable=redefined-outer-name
from ._core import AnswererInfo, Answerer, AnswerStorage
for filename in listdir(answerers_dir):
if not isdir(join(answerers_dir, filename)) or filename.startswith('_'):
continue
module = load_module('answerer.py', join(answerers_dir, filename))
if not hasattr(module, 'keywords') or not isinstance(module.keywords, tuple) or not module.keywords:
sys.exit(2)
answerers.append(module)
return answerers
def get_answerers_by_keywords(answerers): # pylint:disable=redefined-outer-name
by_keyword = defaultdict(list)
for answerer in answerers:
for keyword in answerer.keywords:
for keyword in answerer.keywords:
by_keyword[keyword].append(answerer.answer)
return by_keyword
def ask(query):
results = []
query_parts = list(filter(None, query.query.split()))
if not query_parts or query_parts[0] not in answerers_by_keywords:
return results
for answerer in answerers_by_keywords[query_parts[0]]:
result = answerer(query)
if result:
results.append(result)
return results
answerers = load_answerers()
answerers_by_keywords = get_answerers_by_keywords(answerers)
STORAGE: AnswerStorage = AnswerStorage()
STORAGE.load_builtins()

169
searx/answerers/_core.py Normal file
@ -0,0 +1,169 @@
# SPDX-License-Identifier: AGPL-3.0-or-later
# pylint: disable=too-few-public-methods, missing-module-docstring
from __future__ import annotations
import abc
import importlib
import logging
import pathlib
import warnings
from dataclasses import dataclass
from searx.utils import load_module
from searx.result_types.answer import BaseAnswer
_default = pathlib.Path(__file__).parent
log: logging.Logger = logging.getLogger("searx.answerers")
@dataclass
class AnswererInfo:
"""Object that holds information about an answerer; this information is shown
to the user in the Preferences menu.
To be able to translate the information into other languages, the text must
be written in English and translated with :py:obj:`flask_babel.gettext`.
"""
name: str
"""Name of the *answerer*."""
description: str
"""Short description of the *answerer*."""
examples: list[str]
"""List of short examples of the usage / of query terms."""
keywords: list[str]
"""See :py:obj:`Answerer.keywords`"""
class Answerer(abc.ABC):
"""Abstract base class of answerers."""
keywords: list[str]
"""Keywords to which the answerer has *answers*."""
@abc.abstractmethod
def answer(self, query: str) -> list[BaseAnswer]:
"""Function that returns a list of answers to the question/query."""
@abc.abstractmethod
def info(self) -> AnswererInfo:
"""Information about the *answerer*, see :py:obj:`AnswererInfo`."""
class ModuleAnswerer(Answerer):
"""A wrapper class for legacy *answerers* where the names (keywords, answer,
info) are implemented on the module level (not in a class).
.. note::
For internal use only!
"""
def __init__(self, mod):
for name in ["keywords", "self_info", "answer"]:
if not getattr(mod, name, None):
raise SystemExit(2)
if not isinstance(mod.keywords, tuple):
raise SystemExit(2)
self.module = mod
self.keywords = mod.keywords # type: ignore
def answer(self, query: str) -> list[BaseAnswer]:
return self.module.answer(query)
def info(self) -> AnswererInfo:
kwargs = self.module.self_info()
kwargs["keywords"] = self.keywords
return AnswererInfo(**kwargs)
class AnswerStorage(dict):
"""A storage for managing the *answerers* of SearXNG. With the
:py:obj:`AnswerStorage.ask` method, a caller can ask questions to all
*answerers* and receives a list of the results."""
answerer_list: set[Answerer]
"""The list of :py:obj:`Answerer` in this storage."""
def __init__(self):
super().__init__()
self.answerer_list = set()
def load_builtins(self):
"""Loads ``answerer.py`` modules from the python packages in
:origin:`searx/answerers`. The python modules are wrapped by
:py:obj:`ModuleAnswerer`."""
for f in _default.iterdir():
if f.name.startswith("_"):
continue
if f.is_file() and f.suffix == ".py":
self.register_by_fqn(f"searx.answerers.{f.stem}.SXNGAnswerer")
continue
# for backward compatibility (if a fork has additional answerers)
if f.is_dir() and (f / "answerer.py").exists():
warnings.warn(
f"answerer module {f} is deprecated / migrate to searx.answerers.Answerer", DeprecationWarning
)
mod = load_module("answerer.py", str(f))
self.register(ModuleAnswerer(mod))
def register_by_fqn(self, fqn: str):
"""Register a :py:obj:`Answerer` via its fully qualified class name (FQN)."""
mod_name, _, obj_name = fqn.rpartition('.')
mod = importlib.import_module(mod_name)
code_obj = getattr(mod, obj_name, None)
if code_obj is None:
msg = f"answerer {fqn} is not implemented"
log.critical(msg)
raise ValueError(msg)
self.register(code_obj())
def register(self, answerer: Answerer):
"""Register a :py:obj:`Answerer`."""
self.answerer_list.add(answerer)
for _kw in answerer.keywords:
self[_kw] = self.get(_kw, [])
self[_kw].append(answerer)
def ask(self, query: str) -> list[BaseAnswer]:
"""An answerer is identified via keywords, if there is a keyword at the
first position in the ``query`` for which there is one or more
answerers, then these are called, whereby the entire ``query`` is passed
as argument to the answerer function."""
results = []
keyword = None
for keyword in query.split():
if keyword:
break
if not keyword or keyword not in self:
return results
for answerer in self[keyword]:
for answer in answerer.answer(query):
# In case of *answers* prefix ``answerer:`` is set, see searx.result_types.Result
answer.engine = f"answerer: {keyword}"
results.append(answer)
return results
@property
def info(self) -> list[AnswererInfo]:
return [a.info() for a in self.answerer_list]
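The keyword dispatch of ``AnswerStorage`` can be reduced to the following standalone sketch. ``KeywordStorage`` and its plain-function handlers are simplifications invented for this example; the class in the patch stores ``Answerer`` objects and returns ``BaseAnswer`` items.

```python
from __future__ import annotations

from typing import Callable


class KeywordStorage(dict):
    """Maps a keyword to the list of handlers registered for it (sketch)."""

    def register(self, keywords: list[str], handler: Callable[[str], list[str]]) -> None:
        for kw in keywords:
            self.setdefault(kw, []).append(handler)

    def ask(self, query: str) -> list[str]:
        parts = query.split()
        # Only the first word of the query is looked up as keyword.
        if not parts or parts[0] not in self:
            return []
        answers: list[str] = []
        for handler in self[parts[0]]:
            # Each handler receives the entire query, not just the keyword.
            answers.extend(handler(query))
        return answers


storage = KeywordStorage()
storage.register(["hello"], lambda query: ["Hello"])
print(storage.ask("hello world"))
print(storage.ask("world hello"))
```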

80
searx/answerers/random.py Normal file
@ -0,0 +1,80 @@
# SPDX-License-Identifier: AGPL-3.0-or-later
# pylint: disable=missing-module-docstring
from __future__ import annotations
import hashlib
import random
import string
import uuid
from flask_babel import gettext
from searx.result_types import Answer
from searx.result_types.answer import BaseAnswer
from . import Answerer, AnswererInfo
def random_characters():
random_string_letters = string.ascii_lowercase + string.digits + string.ascii_uppercase
return [random.choice(random_string_letters) for _ in range(random.randint(8, 32))]
def random_string():
return ''.join(random_characters())
def random_float():
return str(random.random())
def random_int():
random_int_max = 2**31
return str(random.randint(-random_int_max, random_int_max))
def random_sha256():
m = hashlib.sha256()
m.update(''.join(random_characters()).encode())
return str(m.hexdigest())
def random_uuid():
return str(uuid.uuid4())
def random_color():
color = "%06x" % random.randint(0, 0xFFFFFF)
return f"#{color.upper()}"
class SXNGAnswerer(Answerer):
"""Random value generator"""
keywords = ["random"]
random_types = {
"string": random_string,
"int": random_int,
"float": random_float,
"sha256": random_sha256,
"uuid": random_uuid,
"color": random_color,
}
def info(self):
return AnswererInfo(
name=gettext(self.__doc__),
description=gettext("Generate different random values"),
keywords=self.keywords,
examples=[f"random {x}" for x in self.random_types],
)
def answer(self, query: str) -> list[BaseAnswer]:
parts = query.split()
if len(parts) != 2 or parts[1] not in self.random_types:
return []
return [Answer(answer=self.random_types[parts[1]]())]

@ -1,2 +0,0 @@
# SPDX-License-Identifier: AGPL-3.0-or-later
# pylint: disable=missing-module-docstring

@ -1,79 +0,0 @@
# SPDX-License-Identifier: AGPL-3.0-or-later
# pylint: disable=missing-module-docstring
import hashlib
import random
import string
import uuid
from flask_babel import gettext
# required answerer attribute
# specifies which search query keywords triggers this answerer
keywords = ('random',)
random_int_max = 2**31
random_string_letters = string.ascii_lowercase + string.digits + string.ascii_uppercase
def random_characters():
return [random.choice(random_string_letters) for _ in range(random.randint(8, 32))]
def random_string():
return ''.join(random_characters())
def random_float():
return str(random.random())
def random_int():
return str(random.randint(-random_int_max, random_int_max))
def random_sha256():
m = hashlib.sha256()
m.update(''.join(random_characters()).encode())
return str(m.hexdigest())
def random_uuid():
return str(uuid.uuid4())
def random_color():
color = "%06x" % random.randint(0, 0xFFFFFF)
return f"#{color.upper()}"
random_types = {
'string': random_string,
'int': random_int,
'float': random_float,
'sha256': random_sha256,
'uuid': random_uuid,
'color': random_color,
}
# required answerer function
# can return a list of results (any result type) for a given query
def answer(query):
parts = query.query.split()
if len(parts) != 2:
return []
if parts[1] not in random_types:
return []
return [{'answer': random_types[parts[1]]()}]
# required answerer function
# returns information about the answerer
def self_info():
return {
'name': gettext('Random value generator'),
'description': gettext('Generate different random values'),
'examples': ['random {}'.format(x) for x in random_types],
}

@ -0,0 +1,64 @@
# SPDX-License-Identifier: AGPL-3.0-or-later
# pylint: disable=missing-module-docstring
from __future__ import annotations
from functools import reduce
from operator import mul
import babel
import babel.numbers
from flask_babel import gettext
from searx.extended_types import sxng_request
from searx.result_types import Answer
from searx.result_types.answer import BaseAnswer
from . import Answerer, AnswererInfo
kw2func = [
("min", min),
("max", max),
("avg", lambda args: sum(args) / len(args)),
("sum", sum),
("prod", lambda args: reduce(mul, args, 1)),
]
class SXNGAnswerer(Answerer):
"""Statistics functions"""
keywords = [kw for kw, _ in kw2func]
def info(self):
return AnswererInfo(
name=gettext(self.__doc__),
description=gettext(f"Compute {'/'.join(self.keywords)} of the arguments"),
keywords=self.keywords,
examples=["avg 123 548 2.04 24.2"],
)
def answer(self, query: str) -> list[BaseAnswer]:
results = []
parts = query.split()
if len(parts) < 2:
return results
ui_locale = babel.Locale.parse(sxng_request.preferences.get_value('locale'), sep='-')
try:
args = [babel.numbers.parse_decimal(num, ui_locale, numbering_system="latn") for num in parts[1:]]
except: # pylint: disable=bare-except
# seems one of the args is not a float type, can't be converted to float
return results
for k, func in kw2func:
if k == parts[0]:
res = func(args)
res = babel.numbers.format_decimal(res, locale=ui_locale)
f_str = ', '.join(babel.numbers.format_decimal(arg, locale=ui_locale) for arg in args)
results.append(Answer(answer=f"[{ui_locale}] {k}({f_str}) = {res} "))
break
return results
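The table-driven dispatch of the statistics answerer can be tried out without SearXNG or Babel using the sketch below. It is a simplification under stated assumptions: ``decimal.Decimal`` replaces the locale-aware ``babel.numbers.parse_decimal``, the result is not formatted for a locale, and the names ``KW2FUNC`` and ``compute`` are made up for this example.

```python
from __future__ import annotations

from decimal import Decimal, InvalidOperation
from functools import reduce
from operator import mul

# keyword -> function applied to the list of parsed arguments
KW2FUNC = {
    "min": min,
    "max": max,
    "avg": lambda args: sum(args) / len(args),
    "sum": sum,
    "prod": lambda args: reduce(mul, args, 1),
}


def compute(query: str) -> Decimal | None:
    """Evaluate queries like ``avg 1 2 3``; return None if not applicable."""
    parts = query.split()
    if len(parts) < 2 or parts[0] not in KW2FUNC:
        return None
    try:
        args = [Decimal(num) for num in parts[1:]]
    except InvalidOperation:
        # at least one argument is not a number
        return None
    return KW2FUNC[parts[0]](args)


for q in ("sum 1 2.5", "avg 1 2 3", "prod 2 3 4", "avg 1 x"):
    print(q, "->", compute(q))
```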

@ -1,2 +0,0 @@
# SPDX-License-Identifier: AGPL-3.0-or-later
# pylint: disable=missing-module-docstring

@ -1,53 +0,0 @@
# SPDX-License-Identifier: AGPL-3.0-or-later
# pylint: disable=missing-module-docstring
from functools import reduce
from operator import mul
from flask_babel import gettext
keywords = ('min', 'max', 'avg', 'sum', 'prod')
# required answerer function
# can return a list of results (any result type) for a given query
def answer(query):
parts = query.query.split()
if len(parts) < 2:
return []
try:
args = list(map(float, parts[1:]))
except: # pylint: disable=bare-except
return []
func = parts[0]
_answer = None
if func == 'min':
_answer = min(args)
elif func == 'max':
_answer = max(args)
elif func == 'avg':
_answer = sum(args) / len(args)
elif func == 'sum':
_answer = sum(args)
elif func == 'prod':
_answer = reduce(mul, args, 1)
if _answer is None:
return []
return [{'answer': str(_answer)}]
# required answerer function
# returns information about the answerer
def self_info():
return {
'name': gettext('Statistics functions'),
'description': gettext('Compute {functions} of the arguments').format(functions='/'.join(keywords)),
'examples': ['avg 123 548 2.04 24.2'],
}

@ -8,9 +8,11 @@ import json
import html
from urllib.parse import urlencode, quote_plus
import lxml
import lxml.etree
import lxml.html
from httpx import HTTPError
from searx.extended_types import SXNG_Response
from searx import settings
from searx.engines import (
engines,
@ -26,12 +28,12 @@ def update_kwargs(**kwargs):
kwargs['raise_for_httperror'] = True
def get(*args, **kwargs):
def get(*args, **kwargs) -> SXNG_Response:
update_kwargs(**kwargs)
return http_get(*args, **kwargs)
def post(*args, **kwargs):
def post(*args, **kwargs) -> SXNG_Response:
update_kwargs(**kwargs)
return http_post(*args, **kwargs)
@ -126,7 +128,7 @@ def google_complete(query, sxng_locale):
)
results = []
resp = get(url.format(subdomain=google_info['subdomain'], args=args))
if resp.ok:
if resp and resp.ok:
json_txt = resp.text[resp.text.find('[') : resp.text.find(']', -3) + 1]
data = json.loads(json_txt)
for item in data[0]:
@ -220,7 +222,7 @@ def wikipedia(query, sxng_locale):
results = []
eng_traits = engines['wikipedia'].traits
wiki_lang = eng_traits.get_language(sxng_locale, 'en')
wiki_netloc = eng_traits.custom['wiki_netloc'].get(wiki_lang, 'en.wikipedia.org')
wiki_netloc = eng_traits.custom['wiki_netloc'].get(wiki_lang, 'en.wikipedia.org') # type: ignore
url = 'https://{wiki_netloc}/w/api.php?{args}'
args = urlencode(

@ -13,12 +13,14 @@ import flask
import werkzeug
from searx import logger
from searx.extended_types import SXNG_Request
from . import config
logger = logger.getChild('botdetection')
def dump_request(request: flask.Request):
def dump_request(request: SXNG_Request):
return (
request.path
+ " || X-Forwarded-For: %s" % request.headers.get('X-Forwarded-For')
@ -66,7 +68,7 @@ def _log_error_only_once(err_msg):
_logged_errors.append(err_msg)
def get_real_ip(request: flask.Request) -> str:
def get_real_ip(request: SXNG_Request) -> str:
"""Returns real IP of the request. Since not all proxies set all the HTTP
headers and incoming headers can be faked it may happen that the IP cannot
be determined correctly.

@ -12,7 +12,6 @@ Accept_ header ..
https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Accept
"""
# pylint: disable=unused-argument
from __future__ import annotations
from ipaddress import (
@ -20,17 +19,18 @@ from ipaddress import (
IPv6Network,
)
import flask
import werkzeug
from searx.extended_types import SXNG_Request
from . import config
from ._helpers import too_many_requests
def filter_request(
network: IPv4Network | IPv6Network,
request: flask.Request,
cfg: config.Config,
request: SXNG_Request,
cfg: config.Config, # pylint: disable=unused-argument
) -> werkzeug.Response | None:
if 'text/html' not in request.accept_mimetypes:

@ -13,7 +13,6 @@ bot if the Accept-Encoding_ header ..
https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Accept-Encoding
"""
# pylint: disable=unused-argument
from __future__ import annotations
from ipaddress import (
@ -21,17 +20,18 @@ from ipaddress import (
IPv6Network,
)
import flask
import werkzeug
from searx.extended_types import SXNG_Request
from . import config
from ._helpers import too_many_requests
def filter_request(
network: IPv4Network | IPv6Network,
request: flask.Request,
cfg: config.Config,
request: SXNG_Request,
cfg: config.Config, # pylint: disable=unused-argument
) -> werkzeug.Response | None:
accept_list = [l.strip() for l in request.headers.get('Accept-Encoding', '').split(',')]

@ -10,24 +10,25 @@ if the Accept-Language_ header is unset.
https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/User-Agent
"""
# pylint: disable=unused-argument
from __future__ import annotations
from ipaddress import (
IPv4Network,
IPv6Network,
)
import flask
import werkzeug
from searx.extended_types import SXNG_Request
from . import config
from ._helpers import too_many_requests
def filter_request(
network: IPv4Network | IPv6Network,
request: flask.Request,
cfg: config.Config,
request: SXNG_Request,
cfg: config.Config, # pylint: disable=unused-argument
) -> werkzeug.Response | None:
if request.headers.get('Accept-Language', '').strip() == '':
return too_many_requests(network, "missing HTTP header Accept-Language")

@ -10,7 +10,6 @@ the Connection_ header is set to ``close``.
https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Connection
"""
# pylint: disable=unused-argument
from __future__ import annotations
from ipaddress import (
@ -18,17 +17,18 @@ from ipaddress import (
IPv6Network,
)
import flask
import werkzeug
from searx.extended_types import SXNG_Request
from . import config
from ._helpers import too_many_requests
def filter_request(
network: IPv4Network | IPv6Network,
request: flask.Request,
cfg: config.Config,
request: SXNG_Request,
cfg: config.Config, # pylint: disable=unused-argument
) -> werkzeug.Response | None:
if request.headers.get('Connection', '').strip() == 'close':

@ -11,7 +11,6 @@ the User-Agent_ header is unset or matches the regular expression
https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/User-Agent
"""
# pylint: disable=unused-argument
from __future__ import annotations
import re
@ -20,9 +19,10 @@ from ipaddress import (
IPv6Network,
)
import flask
import werkzeug
from searx.extended_types import SXNG_Request
from . import config
from ._helpers import too_many_requests
@ -56,8 +56,8 @@ def regexp_user_agent():
def filter_request(
network: IPv4Network | IPv6Network,
request: flask.Request,
cfg: config.Config,
request: SXNG_Request,
cfg: config.Config, # pylint: disable=unused-argument
) -> werkzeug.Response | None:
user_agent = request.headers.get('User-Agent', 'unknown')

@ -45,6 +45,7 @@ from ipaddress import (
import flask
import werkzeug
from searx.extended_types import SXNG_Request
from searx import redisdb
from searx.redislib import incr_sliding_window, drop_counter
@ -91,7 +92,7 @@ SUSPICIOUS_IP_MAX = 3
def filter_request(
network: IPv4Network | IPv6Network,
request: flask.Request,
request: SXNG_Request,
cfg: config.Config,
) -> werkzeug.Response | None:

@ -43,11 +43,11 @@ from ipaddress import (
import string
import random
import flask
from searx import logger
from searx import redisdb
from searx.redislib import secret_hash
from searx.extended_types import SXNG_Request
from ._helpers import (
get_network,
@ -69,7 +69,7 @@ TOKEN_KEY = 'SearXNG_limiter.token'
logger = logger.getChild('botdetection.link_token')
def is_suspicious(network: IPv4Network | IPv6Network, request: flask.Request, renew: bool = False):
def is_suspicious(network: IPv4Network | IPv6Network, request: SXNG_Request, renew: bool = False):
"""Checks whether a valid ping exists for this (client) network; if not,
this request is rated as *suspicious*. If a valid ping exists and argument
``renew`` is ``True`` the expire time of this ping is reset to
@ -92,7 +92,7 @@ def is_suspicious(network: IPv4Network | IPv6Network, request: flask.Request, re
return False
def ping(request: flask.Request, token: str):
def ping(request: SXNG_Request, token: str):
"""This function is called by a request to URL ``/client<token>.css``. If
``token`` is valid a :py:obj:`PING_KEY` for the client is stored in the DB.
The expire time of this ping-key is :py:obj:`PING_LIVE_TIME`.
@ -113,7 +113,7 @@ def ping(request: flask.Request, token: str):
redis_client.set(ping_key, 1, ex=PING_LIVE_TIME)
def get_ping_key(network: IPv4Network | IPv6Network, request: flask.Request) -> str:
def get_ping_key(network: IPv4Network | IPv6Network, request: SXNG_Request) -> str:
"""Generates a hashed key that fits (more or less) to a *WEB-browser
session* in a network."""
return (

@ -139,6 +139,7 @@ from searx.utils import (
get_embeded_stream_url,
)
from searx.enginelib.traits import EngineTraits
from searx.result_types import Answer
if TYPE_CHECKING:
import logging
@ -274,10 +275,14 @@ def _parse_search(resp):
result_list = []
dom = html.fromstring(resp.text)
# I doubt that Brave is still providing the "answer" class / I haven't seen
# answers in brave for a long time.
answer_tag = eval_xpath_getindex(dom, '//div[@class="answer"]', 0, default=None)
if answer_tag:
url = eval_xpath_getindex(dom, '//div[@id="featured_snippet"]/a[@class="result-header"]/@href', 0, default=None)
result_list.append({'answer': extract_text(answer_tag), 'url': url})
answer = extract_text(answer_tag)
if answer is not None:
Answer(results=result_list, answer=answer, url=url)
# xpath_results = '//div[contains(@class, "snippet fdb") and @data-type="web"]'
xpath_results = '//div[contains(@class, "snippet ")]'

@ -1,6 +1,8 @@
# SPDX-License-Identifier: AGPL-3.0-or-later
"""Deepl translation engine"""
from searx.result_types import Translations
about = {
"website": 'https://deepl.com',
"wikidata_id": 'Q43968444',
@ -45,8 +47,7 @@ def response(resp):
if not result.get('translations'):
return results
translations = [{'text': translation['text']} for translation in result['translations']]
results.append({'answer': translations[0]['text'], 'answer_type': 'translations', 'translations': translations})
translations = [Translations.Item(text=t['text']) for t in result['translations']]
Translations(results=results, translations=translations)
return results
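Note on the pattern in this hunk: the return value of `Translations(results=results, ...)` is discarded, so the constructor itself presumably registers the new item in `results`. The sketch below illustrates that idea with the standard library only. `SketchItem`, `SketchTranslations` and the `__post_init__` hook are hypothetical stand-ins and are not taken from `searx.result_types`.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class SketchItem:
    """Hypothetical stand-in for ``Translations.Item``."""

    text: str
    synonyms: list[str] = field(default_factory=list)


@dataclass
class SketchTranslations:
    """Hypothetical stand-in for ``Translations``."""

    results: list
    translations: list[SketchItem]
    url: str | None = None

    def __post_init__(self):
        # Assumption: building the result object adds it to the result list,
        # which is why the engines no longer call results.append(...).
        self.results.append(self)


results: list = []
SketchTranslations(results=results, translations=[SketchItem(text="Hund")])
```

With such a hook, an engine only constructs the typed result and returns the list it passed in.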

@ -3,8 +3,12 @@
Dictzone
"""
import urllib.parse
from lxml import html
from searx.utils import eval_xpath
from searx.utils import eval_xpath, extract_text
from searx.result_types import Translations
from searx.network import get as http_get # https://github.com/searxng/searxng/issues/762
# about
about = {
@ -18,46 +22,83 @@ about = {
engine_type = 'online_dictionary'
categories = ['general', 'translate']
url = 'https://dictzone.com/{from_lang}-{to_lang}-dictionary/{query}'
base_url = "https://dictzone.com"
weight = 100
results_xpath = './/table[@id="r"]/tr'
https_support = True
def request(query, params): # pylint: disable=unused-argument
params['url'] = url.format(from_lang=params['from_lang'][2], to_lang=params['to_lang'][2], query=params['query'])
from_lang = params["from_lang"][2] # "english"
to_lang = params["to_lang"][2] # "german"
query = params["query"]
params["url"] = f"{base_url}/{from_lang}-{to_lang}-dictionary/{urllib.parse.quote_plus(query)}"
return params
def _clean_up_node(node):
for x in ["./i", "./span", "./button"]:
for n in node.xpath(x):
n.getparent().remove(n)
def response(resp):
results = []
item_list = []
if not resp.ok:
return results
dom = html.fromstring(resp.text)
translations = []
for result in eval_xpath(dom, results_xpath)[1:]:
try:
from_result, to_results_raw = eval_xpath(result, './td')
except: # pylint: disable=bare-except
for result in eval_xpath(dom, ".//table[@id='r']//tr"):
# each row is a Translations.Item
td_list = result.xpath("./td")
if len(td_list) != 2:
# ignore header columns "tr/th"
continue
to_results = []
for to_result in eval_xpath(to_results_raw, './p/a'):
t = to_result.text_content()
if t.strip():
to_results.append(to_result.text_content())
col_from, col_to = td_list
_clean_up_node(col_from)
translations.append(
{
'text': f"{from_result.text_content()} - {'; '.join(to_results)}",
}
)
text = f"{extract_text(col_from)}"
if translations:
result = {
'answer': translations[0]['text'],
'translations': translations,
'answer_type': 'translations',
}
synonyms = []
p_list = col_to.xpath(".//p")
return [result]
for i, p_item in enumerate(p_list):
smpl: str = extract_text(p_list[i].xpath("./i[@class='smpl']")) # type: ignore
_clean_up_node(p_item)
p_text: str = extract_text(p_item) # type: ignore
if smpl:
p_text += " // " + smpl
if i == 0:
text += f" : {p_text}"
continue
synonyms.append(p_text)
item = Translations.Item(text=text, synonyms=synonyms)
item_list.append(item)
# the "autotranslate" of dictzone is loaded by the JS from URL:
# https://dictzone.com/trans/hello%20world/en_de
from_lang = resp.search_params["from_lang"][1] # "en"
to_lang = resp.search_params["to_lang"][1] # "de"
query = resp.search_params["query"]
# works only sometimes?
autotranslate = http_get(f"{base_url}/trans/{query}/{from_lang}_{to_lang}", timeout=1.0)
if autotranslate.ok and autotranslate.text:
item_list.insert(0, Translations.Item(text=autotranslate.text))
Translations(results=results, translations=item_list, url=resp.search_params["url"])
return results

@ -27,6 +27,7 @@ from searx.network import get # see https://github.com/searxng/searxng/issues/7
from searx import redisdb
from searx.enginelib.traits import EngineTraits
from searx.exceptions import SearxEngineCaptchaException
from searx.result_types import Answer
if TYPE_CHECKING:
import logging
@ -398,12 +399,7 @@ def response(resp):
):
current_query = resp.search_params["data"].get("q")
results.append(
{
'answer': zero_click,
'url': "https://duckduckgo.com/?" + urlencode({"q": current_query}),
}
)
Answer(results=results, answer=zero_click, url="https://duckduckgo.com/?" + urlencode({"q": current_query}))
return results

@ -21,6 +21,7 @@ from lxml import html
from searx.data import WIKIDATA_UNITS
from searx.utils import extract_text, html_to_text, get_string_replaces_function
from searx.external_urls import get_external_url, get_earth_coordinates_url, area_to_osm_zoom
from searx.result_types import Answer
if TYPE_CHECKING:
import logging
@ -99,9 +100,10 @@ def response(resp):
# add answer if there is one
answer = search_res.get('Answer', '')
if answer:
logger.debug('AnswerType="%s" Answer="%s"', search_res.get('AnswerType'), answer)
if search_res.get('AnswerType') not in ['calc', 'ip']:
results.append({'answer': html_to_text(answer), 'url': search_res.get('AbstractURL', '')})
answer_type = search_res.get('AnswerType')
logger.debug('AnswerType="%s" Answer="%s"', answer_type, answer)
if isinstance(answer, str) and answer_type not in ['calc', 'ip']:
Answer(results=results, answer=html_to_text(answer), url=search_res.get('AbstractURL', ''))
# add infobox
if 'Definition' in search_res:

@ -25,6 +25,7 @@ from searx.locales import language_tag, region_tag, get_official_locales
from searx.network import get # see https://github.com/searxng/searxng/issues/762
from searx.exceptions import SearxEngineCaptchaException
from searx.enginelib.traits import EngineTraits
from searx.result_types import Answer
if TYPE_CHECKING:
import logging
@ -331,12 +332,7 @@ def response(resp):
for item in answer_list:
for bubble in eval_xpath(item, './/div[@class="nnFGuf"]'):
bubble.drop_tree()
results.append(
{
'answer': extract_text(item),
'url': (eval_xpath(item, '../..//a/@href') + [None])[0],
}
)
Answer(results=results, answer=extract_text(item), url=(eval_xpath(item, '../..//a/@href') + [None])[0])
# parse results

@ -2,7 +2,8 @@
"""LibreTranslate (Free and Open Source Machine Translation API)"""
import random
from json import dumps
import json
from searx.result_types import Translations
about = {
"website": 'https://libretranslate.com',
@ -16,19 +17,27 @@ about = {
engine_type = 'online_dictionary'
categories = ['general', 'translate']
base_url = "https://translate.terraprint.co"
api_key = ''
base_url = "https://libretranslate.com/translate"
api_key = ""
def request(_query, params):
request_url = random.choice(base_url) if isinstance(base_url, list) else base_url
if request_url.startswith("https://libretranslate.com") and not api_key:
return None
params['url'] = f"{request_url}/translate"
args = {'source': params['from_lang'][1], 'target': params['to_lang'][1], 'q': params['query'], 'alternatives': 3}
args = {
'q': params['query'],
'source': params['from_lang'][1],
'target': params['to_lang'][1],
'alternatives': 3,
}
if api_key:
args['api_key'] = api_key
params['data'] = dumps(args)
params['data'] = json.dumps(args)
params['method'] = 'POST'
params['headers'] = {'Content-Type': 'application/json'}
params['req_url'] = request_url
@ -41,12 +50,10 @@ def response(resp):
json_resp = resp.json()
text = json_resp.get('translatedText')
if not text:
return results
translations = [{'text': text}] + [{'text': alternative} for alternative in json_resp.get('alternatives', [])]
results.append({'answer': text, 'answer_type': 'translations', 'translations': translations})
item = Translations.Item(text=text, examples=json_resp.get('alternatives', []))
Translations(results=results, translations=[item])
return results

@ -1,6 +1,8 @@
# SPDX-License-Identifier: AGPL-3.0-or-later
"""Lingva (alternative Google Translate frontend)"""
from searx.result_types import Translations
about = {
"website": 'https://lingva.ml',
"wikidata_id": None,
@ -14,13 +16,10 @@ engine_type = 'online_dictionary'
categories = ['general', 'translate']
url = "https://lingva.thedaviddelta.com"
search_url = "{url}/api/v1/{from_lang}/{to_lang}/{query}"
def request(_query, params):
params['url'] = search_url.format(
url=url, from_lang=params['from_lang'][1], to_lang=params['to_lang'][1], query=params['query']
)
params['url'] = f"{url}/api/v1/{params['from_lang'][1]}/{params['to_lang'][1]}/{params['query']}"
return params
@ -45,32 +44,30 @@ def response(resp):
for definition in info['definitions']:
for translation in definition['list']:
data.append(
{
'text': result['translation'],
'definitions': [translation['definition']] if translation['definition'] else [],
'examples': [translation['example']] if translation['example'] else [],
'synonyms': translation['synonyms'],
}
Translations.Item(
text=result['translation'],
definitions=[translation['definition']] if translation['definition'] else [],
examples=[translation['example']] if translation['example'] else [],
synonyms=translation['synonyms'],
)
)
for translation in info["extraTranslations"]:
for word in translation["list"]:
data.append(
{
'text': word['word'],
'definitions': word['meanings'],
}
Translations.Item(
text=word['word'],
definitions=word['meanings'],
)
)
if not data and result['translation']:
data.append({'text': result['translation']})
data.append(Translations.Item(text=result['translation']))
results.append(
{
'answer': data[0]['text'],
'answer_type': 'translations',
'translations': data,
}
params = resp.search_params
Translations(
results=results,
translations=data,
url=f"{url}/{params['from_lang'][1]}/{params['to_lang'][1]}/{params['query']}",
)
return results

@ -3,7 +3,9 @@
import random
import re
from urllib.parse import urlencode
import urllib.parse
from searx.result_types import Translations
about = {
"website": 'https://codeberg.org/aryak/mozhi',
@ -27,34 +29,33 @@ def request(_query, params):
request_url = random.choice(base_url) if isinstance(base_url, list) else base_url
args = {'from': params['from_lang'][1], 'to': params['to_lang'][1], 'text': params['query'], 'engine': mozhi_engine}
params['url'] = f"{request_url}/api/translate?{urlencode(args)}"
params['url'] = f"{request_url}/api/translate?{urllib.parse.urlencode(args)}"
return params
def response(resp):
results = []
translation = resp.json()
data = {'text': translation['translated-text'], 'definitions': [], 'examples': []}
item = Translations.Item(text=translation['translated-text'])
if translation['target_transliteration'] and not re.match(
re_transliteration_unsupported, translation['target_transliteration']
):
data['transliteration'] = translation['target_transliteration']
item.transliteration = translation['target_transliteration']
if translation['word_choices']:
for word in translation['word_choices']:
if word.get('definition'):
data['definitions'].append(word['definition'])
item.definitions.append(word['definition'])
for example in word.get('examples_target', []):
data['examples'].append(re.sub(r"<|>", "", example).lstrip('- '))
item.examples.append(re.sub(r"<|>", "", example).lstrip('- '))
data['synonyms'] = translation.get('source_synonyms', [])
item.synonyms = translation.get('source_synonyms', [])
result = {
'answer': translation['translated-text'],
'answer_type': 'translations',
'translations': [data],
}
return [result]
url = urllib.parse.urlparse(resp.search_params["url"])
# remove the api path
url = url._replace(path="", fragment="").geturl()
Translations(results=results, translations=[item], url=url)
return results
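Note on the URL handling in `response()`: `_replace(path="", fragment="")` clears the path and the fragment of the parsed URL, but the query string is left in place. The sketch below shows this with a hypothetical request URL (the host `mozhi.example.org` is an assumption for illustration) and a variant that also clears the query.

```python
import urllib.parse

# Hypothetical request URL, in the shape built by request() above
api_url = "https://mozhi.example.org/api/translate?from=en&to=de&text=dog"

parsed = urllib.parse.urlparse(api_url)

# Same call as in response(): drops the path, keeps scheme, host and query
without_path = parsed._replace(path="", fragment="").geturl()

# Variant that also drops the query string, leaving only scheme and host
base_only = parsed._replace(path="", params="", query="", fragment="").geturl()
```

Whether the query should remain in the link shown to the user is a design choice of the engine; the variant is only meant to make the difference visible.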

@ -4,16 +4,16 @@
"""
import re
from json import loads
from urllib.parse import urlencode
import urllib.parse
from functools import partial
from flask_babel import gettext
from searx.data import OSM_KEYS_TAGS, CURRENCIES
from searx.utils import searx_useragent
from searx.external_urls import get_external_url
from searx.engines.wikidata import send_wikidata_query, sparql_string_escape, get_thumbnail
from searx.result_types import Answer
# about
about = {
@ -37,8 +37,7 @@ search_string = 'search?{query}&polygon_geojson=1&format=jsonv2&addressdetails=1
result_id_url = 'https://openstreetmap.org/{osm_type}/{osm_id}'
result_lat_lon_url = 'https://www.openstreetmap.org/?mlat={lat}&mlon={lon}&zoom={zoom}&layers=M'
route_url = 'https://graphhopper.com/maps/?point={}&point={}&locale=en-US&vehicle=car&weighting=fastest&turn_costs=true&use_miles=false&layer=Omniscale' # pylint: disable=line-too-long
route_re = re.compile('(?:from )?(.+) to (.+)')
route_url = 'https://graphhopper.com/maps'
wikidata_image_sparql = """
select ?item ?itemLabel ?image ?sign ?symbol ?website ?wikipediaName
@ -138,27 +137,25 @@ KEY_RANKS = {k: i for i, k in enumerate(KEY_ORDER)}
def request(query, params):
"""do search-request"""
params['url'] = base_url + search_string.format(query=urlencode({'q': query}))
params['route'] = route_re.match(query)
params['headers']['User-Agent'] = searx_useragent()
if 'Accept-Language' not in params['headers']:
params['headers']['Accept-Language'] = 'en'
params['url'] = base_url + search_string.format(query=urllib.parse.urlencode({'q': query}))
return params
def response(resp):
"""get response from search-request"""
results = []
nominatim_json = loads(resp.text)
nominatim_json = resp.json()
user_language = resp.search_params['language']
if resp.search_params['route']:
results.append(
{
'answer': gettext('Get directions'),
'url': route_url.format(*resp.search_params['route'].groups()),
}
l = re.findall(r"from\s+(.*)\s+to\s+(.+)", resp.search_params["query"])
if not l:
l = re.findall(r"\s*(.*)\s+to\s+(.+)", resp.search_params["query"])
if l:
point1, point2 = [urllib.parse.quote_plus(p) for p in l[0]]
Answer(
results=results,
answer=gettext('Show route in map ..'),
url=f"{route_url}/?point={point1}&point={point2}",
)
# simplify the code below: make sure extratags is a dictionary
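Note on the route detection in this hunk: the two `re.findall` calls and the `quote_plus` step can be tried in isolation. The sketch below copies the two patterns and the `route_url` value from the diff. The helper names `route_points` and `route_link` are hypothetical and exist only for this illustration.

```python
import re
import urllib.parse


def route_points(query: str):
    """Return the two URL-quoted route points, or None if no route is found."""
    # First try the explicit form "from A to B" ...
    found = re.findall(r"from\s+(.*)\s+to\s+(.+)", query)
    if not found:
        # ... then fall back to the short form "A to B"
        found = re.findall(r"\s*(.*)\s+to\s+(.+)", query)
    if not found:
        return None
    point1, point2 = [urllib.parse.quote_plus(p) for p in found[0]]
    return point1, point2


def route_link(query: str, route_url: str = "https://graphhopper.com/maps"):
    """Build the map link for a route query, or None if the query has no route."""
    points = route_points(query)
    if points is None:
        return None
    return f"{route_url}/?point={points[0]}&point={points[1]}"
```

Because `(.*)` is greedy, a query with several occurrences of " to " is split at the last one.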

@ -156,6 +156,7 @@ def parse_tineye_match(match_json):
def response(resp):
"""Parse HTTP response from TinEye."""
results = []
# handle the 422 client side errors, and the possible 400 status code error
if resp.status_code in (400, 422):
@ -182,14 +183,14 @@ def response(resp):
message = ','.join(description)
# see https://github.com/searxng/searxng/pull/1456#issuecomment-1193105023
# results.append({'answer': message})
logger.error(message)
return []
# from searx.result_types import Answer
# Answer(results=results, answer=message)
logger.info(message)
return results
# Raise for all other responses
resp.raise_for_status()
results = []
json_data = resp.json()
for match_json in json_data['matches']:

@ -3,6 +3,10 @@
"""
import urllib.parse
from searx.result_types import Translations
# about
about = {
"website": 'https://mymemory.translated.net/',
@ -15,8 +19,8 @@ about = {
engine_type = 'online_dictionary'
categories = ['general', 'translate']
url = 'https://api.mymemory.translated.net/get?q={query}&langpair={from_lang}|{to_lang}{key}'
web_url = 'https://mymemory.translated.net/en/{from_lang}/{to_lang}/{query}'
api_url = "https://api.mymemory.translated.net"
web_url = "https://mymemory.translated.net"
weight = 100
https_support = True
@ -24,27 +28,32 @@ api_key = ''
def request(query, params): # pylint: disable=unused-argument
args = {"q": params["query"], "langpair": f"{params['from_lang'][1]}|{params['to_lang'][1]}"}
if api_key:
key_form = '&key=' + api_key
else:
key_form = ''
params['url'] = url.format(
from_lang=params['from_lang'][1], to_lang=params['to_lang'][1], query=params['query'], key=key_form
)
args["key"] = api_key
params['url'] = f"{api_url}/get?{urllib.parse.urlencode(args)}"
return params
def response(resp):
json_resp = resp.json()
text = json_resp['responseData']['translatedText']
results = []
data = resp.json()
alternatives = [match['translation'] for match in json_resp['matches'] if match['translation'] != text]
translations = [{'text': translation} for translation in [text] + alternatives]
result = {
'answer': translations[0]['text'],
'answer_type': 'translations',
'translations': translations,
args = {
"q": resp.search_params["query"],
"lang": resp.search_params.get("searxng_locale", "en"), # ui language
"sl": resp.search_params['from_lang'][1],
"tl": resp.search_params['to_lang'][1],
}
return [result]
link = f"{web_url}/search.php?{urllib.parse.urlencode(args)}"
text = data['responseData']['translatedText']
examples = [f"{m['segment']} : {m['translation']}" for m in data['matches'] if m['translation'] != text]
item = Translations.Item(text=text, examples=examples)
Translations(results=results, translations=[item], url=link)
return results
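Note on the change in `request()`: building the query string with `urllib.parse.urlencode` instead of `str.format` means that spaces, `&` and the `|` of the language pair are percent-encoded. A minimal sketch of the same construction follows. The function name `build_url` is hypothetical, while `api_url` and the argument names are taken from the diff.

```python
import urllib.parse

api_url = "https://api.mymemory.translated.net"


def build_url(query: str, from_lang: str, to_lang: str, api_key: str = "") -> str:
    """Build the API URL the way request() does in this hunk."""
    args = {"q": query, "langpair": f"{from_lang}|{to_lang}"}
    if api_key:
        # the key is only sent when one is configured
        args["key"] = api_key
    return f"{api_url}/get?{urllib.parse.urlencode(args)}"
```

For example, `build_url("hello world", "en", "de")` encodes the space as `+` and the `|` as `%7C`.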

@ -262,7 +262,7 @@ def request(query, params):
def response(resp): # pylint: disable=too-many-branches
'''Scrap *results* from the response (see :ref:`engine results`).'''
'''Scrape *results* from the response (see :ref:`result types`).'''
if no_result_for_http_status and resp.status_code in no_result_for_http_status:
return []

searx/extended_types.py Normal file
@ -0,0 +1,82 @@
# SPDX-License-Identifier: AGPL-3.0-or-later
"""This module implements the type extensions applied by SearXNG.
- :py:obj:`flask.request` is replaced by :py:obj:`sxng_request`
- :py:obj:`flask.Request` is replaced by :py:obj:`SXNG_Request`
- :py:obj:`httpx.Response` is replaced by :py:obj:`SXNG_Response`
----
.. py:attribute:: sxng_request
:type: SXNG_Request
A replacement for :py:obj:`flask.request` with type cast :py:obj:`SXNG_Request`.
.. autoclass:: SXNG_Request
:members:
.. autoclass:: SXNG_Response
:members:
"""
# pylint: disable=invalid-name
from __future__ import annotations
__all__ = ["SXNG_Request", "sxng_request", "SXNG_Response"]
import typing
import flask
import httpx
if typing.TYPE_CHECKING:
import searx.preferences
import searx.results
class SXNG_Request(flask.Request):
"""SearXNG extends the class :py:obj:`flask.Request` with properties from
*this* class definition, see type cast :py:obj:`sxng_request`.
"""
user_plugins: list[str]
"""list of searx.plugins.Plugin.id (the id of the plugins)"""
preferences: "searx.preferences.Preferences"
"""The preferences of the request."""
errors: list[str]
"""A list of errors (translated text) added by :py:obj:`searx.webapp` in
case of errors."""
# request.form is of type werkzeug.datastructures.ImmutableMultiDict
# form: dict[str, str]
start_time: float
"""Start time of the request, :py:obj:`timeit.default_timer` added by
:py:obj:`searx.webapp` to calculate the total time of the request."""
render_time: float
"""Duration of the rendering, calculated and added by
:py:obj:`searx.webapp`."""
timings: list["searx.results.Timing"]
"""A list of :py:obj:`searx.results.Timing` of the engines, calculated in
and held by :py:obj:`searx.results.ResultContainer.timings`."""
#: A replacement for :py:obj:`flask.request` with type cast :py:obj:`SXNG_Request`.
sxng_request = typing.cast(SXNG_Request, flask.request)
class SXNG_Response(httpx.Response):
"""SearXNG extends the class :py:obj:`httpx.Response` with properties from
*this* class (type cast of :py:obj:`httpx.Response`).
.. code:: python
response = httpx.get("https://example.org")
response = typing.cast(SXNG_Response, response)
if response.ok:
...
"""
ok: bool
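Note on the type-cast approach of this module: `typing.cast` only informs the type checker and returns the very same object at runtime, so attributes such as `ok` still have to be set on the instance (as `patch_response` does in `searx.network`). The sketch below shows the mechanism with the standard library only. `BaseResponse` and `PatchedResponse` are hypothetical stand-ins for `httpx.Response` and `SXNG_Response`.

```python
import typing


class BaseResponse:
    """Hypothetical stand-in for ``httpx.Response``."""

    def __init__(self, status_code: int):
        self.status_code = status_code

    @property
    def is_error(self) -> bool:
        return self.status_code >= 400


class PatchedResponse(BaseResponse):
    """Declares the extra attribute for the type checker only."""

    ok: bool


def patch_response(response: BaseResponse) -> PatchedResponse:
    # typing.cast() does nothing at runtime; it returns the same object
    patched = typing.cast(PatchedResponse, response)
    # the attribute must still be assigned on the instance
    patched.ok = not patched.is_error
    return patched
```

The object keeps its original class after the cast; only the static type seen by the checker changes.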

@ -18,6 +18,7 @@ from searx import get_setting
from searx.webutils import new_hmac, is_hmac_of
from searx.exceptions import SearxEngineResponseException
from searx.extended_types import sxng_request
from .resolvers import DEFAULT_RESOLVER_MAP
from . import cache
@ -124,7 +125,7 @@ def favicon_proxy():
server>` setting.
"""
authority = flask.request.args.get('authority')
authority = sxng_request.args.get('authority')
# malformed request or RFC 3986 authority
if not authority or "/" in authority:
@ -134,11 +135,11 @@ def favicon_proxy():
if not is_hmac_of(
CFG.secret_key,
authority.encode(),
flask.request.args.get('h', ''),
sxng_request.args.get('h', ''),
):
return '', 400
resolver = flask.request.preferences.get_value('favicon_resolver') # type: ignore
resolver = sxng_request.preferences.get_value('favicon_resolver') # type: ignore
# if resolver is empty or not valid, just return HTTP 400.
if not resolver or resolver not in CFG.resolver_map.keys():
return "", 400
@ -151,7 +152,7 @@ def favicon_proxy():
return resp
# return default favicon from static path
theme = flask.request.preferences.get_value("theme") # type: ignore
theme = sxng_request.preferences.get_value("theme") # type: ignore
fav, mimetype = CFG.favicon(theme=theme)
return flask.send_from_directory(fav.parent, fav.name, mimetype=mimetype)
@ -215,7 +216,7 @@ def favicon_url(authority: str) -> str:
"""
resolver = flask.request.preferences.get_value('favicon_resolver') # type: ignore
resolver = sxng_request.preferences.get_value('favicon_resolver') # type: ignore
# if resolver is empty or not valid, just return nothing.
if not resolver or resolver not in CFG.resolver_map.keys():
return ""
@ -224,7 +225,7 @@ def favicon_url(authority: str) -> str:
if data_mime == (None, None):
# we have already checked, the resolver does not have a favicon
theme = flask.request.preferences.get_value("theme") # type: ignore
theme = sxng_request.preferences.get_value("theme") # type: ignore
return CFG.favicon_data_url(theme=theme)
if data_mime is not None:

@ -6,13 +6,14 @@ Usage in a Flask app route:
.. code:: python
from searx import infopage
from searx.extended_types import sxng_request
_INFO_PAGES = infopage.InfoPageSet(infopage.MistletoePage)
@app.route('/info/<pagename>', methods=['GET'])
def info(pagename):
locale = request.preferences.get_value('locale')
locale = sxng_request.preferences.get_value('locale')
page = _INFO_PAGES.get_page(pagename, locale)
"""

@ -105,6 +105,7 @@ from searx import (
redisdb,
)
from searx import botdetection
from searx.extended_types import SXNG_Request, sxng_request
from searx.botdetection import (
config,
http_accept,
@ -144,7 +145,7 @@ def get_cfg() -> config.Config:
return CFG
def filter_request(request: flask.Request) -> werkzeug.Response | None:
def filter_request(request: SXNG_Request) -> werkzeug.Response | None:
# pylint: disable=too-many-return-statements
cfg = get_cfg()
@ -201,13 +202,13 @@ def filter_request(request: flask.Request) -> werkzeug.Response | None:
val = func.filter_request(network, request, cfg)
if val is not None:
return val
logger.debug(f"OK {network}: %s", dump_request(flask.request))
logger.debug(f"OK {network}: %s", dump_request(sxng_request))
return None
def pre_request():
"""See :py:obj:`flask.Flask.before_request`"""
return filter_request(flask.request)
return filter_request(sxng_request)
def is_installed():

@ -35,13 +35,14 @@ from babel.support import Translations
import babel.languages
import babel.core
import flask_babel
import flask
from flask.ctx import has_request_context
from searx import (
data,
logger,
searx_dir,
)
from searx.extended_types import sxng_request
logger = logger.getChild('locales')
@ -85,13 +86,13 @@ Kong."""
def localeselector():
locale = 'en'
if has_request_context():
value = flask.request.preferences.get_value('locale')
value = sxng_request.preferences.get_value('locale')
if value:
locale = value
# first, set the language that is not supported by babel
if locale in ADDITIONAL_TRANSLATIONS:
flask.request.form['use-translation'] = locale
sxng_request.form['use-translation'] = locale
# second, map locale to a value python-babel supports
locale = LOCALE_BEST_MATCH.get(locale, locale)
@ -109,7 +110,7 @@ def localeselector():
def get_translations():
"""Monkey patch of :py:obj:`flask_babel.get_translations`"""
if has_request_context():
use_translation = flask.request.form.get('use-translation')
use_translation = sxng_request.form.get('use-translation')
if use_translation in ADDITIONAL_TRANSLATIONS:
babel_ext = flask_babel.current_app.extensions['babel']
return Translations.load(babel_ext.translation_directories[0], use_translation)

@ -13,6 +13,7 @@ from contextlib import contextmanager
import httpx
import anyio
from searx.extended_types import SXNG_Response
from .network import get_network, initialize, check_network_configuration # pylint:disable=cyclic-import
from .client import get_loop
from .raise_for_httperror import raise_for_httperror
@ -85,7 +86,7 @@ def _get_timeout(start_time, kwargs):
return timeout
def request(method, url, **kwargs):
def request(method, url, **kwargs) -> SXNG_Response:
"""same as requests/requests/api.py request(...)"""
with _record_http_time() as start_time:
network = get_context_network()
@ -159,34 +160,34 @@ class Request(NamedTuple):
return Request('DELETE', url, kwargs)
def get(url, **kwargs):
def get(url, **kwargs) -> SXNG_Response:
kwargs.setdefault('allow_redirects', True)
return request('get', url, **kwargs)
def options(url, **kwargs):
def options(url, **kwargs) -> SXNG_Response:
kwargs.setdefault('allow_redirects', True)
return request('options', url, **kwargs)
def head(url, **kwargs):
def head(url, **kwargs) -> SXNG_Response:
kwargs.setdefault('allow_redirects', False)
return request('head', url, **kwargs)
def post(url, data=None, **kwargs):
def post(url, data=None, **kwargs) -> SXNG_Response:
return request('post', url, data=data, **kwargs)
def put(url, data=None, **kwargs):
def put(url, data=None, **kwargs) -> SXNG_Response:
return request('put', url, data=data, **kwargs)
def patch(url, data=None, **kwargs):
def patch(url, data=None, **kwargs) -> SXNG_Response:
return request('patch', url, data=data, **kwargs)
def delete(url, **kwargs):
def delete(url, **kwargs) -> SXNG_Response:
return request('delete', url, **kwargs)

@ -1,7 +1,9 @@
# SPDX-License-Identifier: AGPL-3.0-or-later
# pylint: disable=global-statement
# pylint: disable=missing-module-docstring, missing-class-docstring
from __future__ import annotations
import typing
import atexit
import asyncio
import ipaddress
@ -11,6 +13,7 @@ from typing import Dict
import httpx
from searx import logger, searx_debug
from searx.extended_types import SXNG_Response
from .client import new_client, get_loop, AsyncHTTPTransportNoHttp
from .raise_for_httperror import raise_for_httperror
@ -233,8 +236,9 @@ class Network:
del kwargs['raise_for_httperror']
return do_raise_for_httperror
def patch_response(self, response, do_raise_for_httperror):
def patch_response(self, response, do_raise_for_httperror) -> SXNG_Response:
if isinstance(response, httpx.Response):
response = typing.cast(SXNG_Response, response)
# requests compatibility (response is not streamed)
# see also https://www.python-httpx.org/compatibility/#checking-for-4xx5xx-responses
response.ok = not response.is_error
@ -258,7 +262,7 @@ class Network:
return False
return True
async def call_client(self, stream, method, url, **kwargs):
async def call_client(self, stream, method, url, **kwargs) -> SXNG_Response:
retries = self.retries
was_disconnected = False
do_raise_for_httperror = Network.extract_do_raise_for_httperror(kwargs)
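The `typing.cast` call in `patch_response` above can be illustrated with a minimal, self-contained sketch. The classes `Response` and `PatchedResponse` are hypothetical stand-ins for `httpx.Response` and `SXNG_Response`; they are assumptions made for illustration and not the types used in the patch.

```python
from __future__ import annotations

import typing


class Response:
    """Hypothetical stand-in for httpx.Response."""

    def __init__(self, status_code: int) -> None:
        self.status_code = status_code

    @property
    def is_error(self) -> bool:
        return self.status_code >= 400


class PatchedResponse(Response):
    """Hypothetical stand-in for SXNG_Response: declares the extra attribute."""

    ok: bool


def patch_response(response: Response) -> PatchedResponse:
    # typing.cast() does nothing at runtime: it returns the very same object
    # and only tells the type checker to treat it as a PatchedResponse.
    patched = typing.cast(PatchedResponse, response)
    # requests-style convenience attribute, added to the existing object
    patched.ok = not patched.is_error
    return patched


raw = Response(404)
patched = patch_response(raw)
```

Because the cast returns the original object, callers that still hold `raw` see the added `ok` attribute as well; only the static type changes.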


@ -1,232 +1,68 @@
# SPDX-License-Identifier: AGPL-3.0-or-later
# pylint: disable=missing-module-docstring, missing-class-docstring
""".. sidebar:: Further reading ..
import sys
from hashlib import sha256
from importlib import import_module
from os import listdir, makedirs, remove, stat, utime
from os.path import abspath, basename, dirname, exists, join
from shutil import copyfile
from pkgutil import iter_modules
from logging import getLogger
from typing import List, Tuple
- :ref:`plugins admin`
- :ref:`SearXNG settings <settings plugins>`
- :ref:`builtin plugins`
from searx import logger, settings
Plugins can extend or replace functionality of various components of SearXNG.
Here is an example of a very simple plugin that adds a "Hello" into the answer
area:
.. code:: python
class Plugin: # pylint: disable=too-few-public-methods
"""This class is currently never initialized and only used for type hinting."""
from flask_babel import gettext as _
from searx.plugins import Plugin, PluginInfo
from searx.result_types import Answer
id: str
name: str
description: str
default_on: bool
js_dependencies: Tuple[str]
css_dependencies: Tuple[str]
preference_section: str
class MyPlugin(Plugin):
id = "self_info"
default_on = True
logger = logger.getChild("plugins")
def __init__(self):
super().__init__()
self.info = PluginInfo(id=self.id, name=_("Hello"), description=_("demo plugin"))
required_attrs = (
# fmt: off
("name", str),
("description", str),
("default_on", bool)
# fmt: on
)
def post_search(self, request, search):
return [ Answer(answer="Hello") ]
optional_attrs = (
# fmt: off
("js_dependencies", tuple),
("css_dependencies", tuple),
("preference_section", str),
# fmt: on
)
Entry points (hooks) define when a plugin runs.  Right now only three hooks are
implemented, so feel free to implement a new hook if it fits the behaviour of
your plugin.  A plugin doesn't need to implement all the hooks.
- pre search: :py:obj:`Plugin.pre_search`
- post search: :py:obj:`Plugin.post_search`
- on each result item: :py:obj:`Plugin.on_result`
def sha_sum(filename):
with open(filename, "rb") as f:
file_content_bytes = f.read()
return sha256(file_content_bytes).hexdigest()
For a coding example have a look at :ref:`self_info plugin`.
----
def sync_resource(base_path, resource_path, name, target_dir, plugin_dir):
dep_path = join(base_path, resource_path)
file_name = basename(dep_path)
resource_path = join(target_dir, file_name)
if not exists(resource_path) or sha_sum(dep_path) != sha_sum(resource_path):
try:
copyfile(dep_path, resource_path)
# copy atime_ns and mtime_ns, so the weak ETags (generated by
# the HTTP server) do not change
dep_stat = stat(dep_path)
utime(resource_path, ns=(dep_stat.st_atime_ns, dep_stat.st_mtime_ns))
except IOError:
logger.critical("failed to copy plugin resource {0} for plugin {1}".format(file_name, name))
sys.exit(3)
.. autoclass:: Plugin
:members:
# returning with the web path of the resource
return join("plugins/external_plugins", plugin_dir, file_name)
.. autoclass:: PluginInfo
:members:
.. autoclass:: PluginStorage
:members:
def prepare_package_resources(plugin, plugin_module_name):
plugin_base_path = dirname(abspath(plugin.__file__))
.. autoclass:: searx.plugins._core.ModulePlugin
:members:
:show-inheritance:
plugin_dir = plugin_module_name
target_dir = join(settings["ui"]["static_path"], "plugins/external_plugins", plugin_dir)
try:
makedirs(target_dir, exist_ok=True)
except IOError:
logger.critical("failed to create resource directory {0} for plugin {1}".format(target_dir, plugin_module_name))
sys.exit(3)
"""
resources = []
from __future__ import annotations
if hasattr(plugin, "js_dependencies"):
resources.extend(map(basename, plugin.js_dependencies))
plugin.js_dependencies = [
sync_resource(plugin_base_path, x, plugin_module_name, target_dir, plugin_dir)
for x in plugin.js_dependencies
]
__all__ = ["PluginInfo", "Plugin", "PluginStorage"]
if hasattr(plugin, "css_dependencies"):
resources.extend(map(basename, plugin.css_dependencies))
plugin.css_dependencies = [
sync_resource(plugin_base_path, x, plugin_module_name, target_dir, plugin_dir)
for x in plugin.css_dependencies
]
from ._core import PluginInfo, Plugin, PluginStorage
for f in listdir(target_dir):
if basename(f) not in resources:
resource_path = join(target_dir, basename(f))
try:
remove(resource_path)
except IOError:
logger.critical(
"failed to remove unused resource file {0} for plugin {1}".format(resource_path, plugin_module_name)
)
sys.exit(3)
def load_plugin(plugin_module_name, external):
# pylint: disable=too-many-branches
try:
plugin = import_module(plugin_module_name)
except (
SyntaxError,
KeyboardInterrupt,
SystemExit,
SystemError,
ImportError,
RuntimeError,
) as e:
logger.critical("%s: fatal exception", plugin_module_name, exc_info=e)
sys.exit(3)
except BaseException:
logger.exception("%s: exception while loading, the plugin is disabled", plugin_module_name)
return None
# difference with searx: use module name instead of the user name
plugin.id = plugin_module_name
#
plugin.logger = getLogger(plugin_module_name)
for plugin_attr, plugin_attr_type in required_attrs:
if not hasattr(plugin, plugin_attr):
logger.critical('%s: missing attribute "%s", cannot load plugin', plugin, plugin_attr)
sys.exit(3)
attr = getattr(plugin, plugin_attr)
if not isinstance(attr, plugin_attr_type):
type_attr = str(type(attr))
logger.critical(
'{1}: attribute "{0}" is of type {2}, must be of type {3}, cannot load plugin'.format(
plugin, plugin_attr, type_attr, plugin_attr_type
)
)
sys.exit(3)
for plugin_attr, plugin_attr_type in optional_attrs:
if not hasattr(plugin, plugin_attr) or not isinstance(getattr(plugin, plugin_attr), plugin_attr_type):
setattr(plugin, plugin_attr, plugin_attr_type())
if not hasattr(plugin, "preference_section"):
plugin.preference_section = "general"
# query plugin
if plugin.preference_section == "query":
for plugin_attr in ("query_keywords", "query_examples"):
if not hasattr(plugin, plugin_attr):
logger.critical('missing attribute "{0}", cannot load plugin: {1}'.format(plugin_attr, plugin))
sys.exit(3)
if settings.get("enabled_plugins"):
# searx compatibility: plugin.name in settings['enabled_plugins']
plugin.default_on = plugin.name in settings["enabled_plugins"] or plugin.id in settings["enabled_plugins"]
# copy resources if this is an external plugin
if external:
prepare_package_resources(plugin, plugin_module_name)
logger.debug("%s: loaded", plugin_module_name)
return plugin
def load_and_initialize_plugin(plugin_module_name, external, init_args):
plugin = load_plugin(plugin_module_name, external)
if plugin and hasattr(plugin, 'init'):
try:
return plugin if plugin.init(*init_args) else None
except Exception: # pylint: disable=broad-except
plugin.logger.exception("Exception while calling init, the plugin is disabled")
return None
return plugin
class PluginStore:
def __init__(self):
self.plugins: List[Plugin] = []
def __iter__(self):
yield from self.plugins
def register(self, plugin):
self.plugins.append(plugin)
def call(self, ordered_plugin_list, plugin_type, *args, **kwargs):
ret = True
for plugin in ordered_plugin_list:
if hasattr(plugin, plugin_type):
try:
ret = getattr(plugin, plugin_type)(*args, **kwargs)
if not ret:
break
except Exception: # pylint: disable=broad-except
plugin.logger.exception("Exception while calling %s", plugin_type)
return ret
plugins = PluginStore()
def plugin_module_names():
yield_plugins = set()
# embedded plugins
for module in iter_modules(path=[dirname(__file__)]):
yield (__name__ + "." + module.name, False)
yield_plugins.add(module.name)
# external plugins
for module_name in settings['plugins']:
if module_name not in yield_plugins:
yield (module_name, True)
yield_plugins.add(module_name)
STORAGE: PluginStorage = PluginStorage()
def initialize(app):
for module_name, external in plugin_module_names():
plugin = load_and_initialize_plugin(module_name, external, (app, settings))
if plugin:
plugins.register(plugin)
STORAGE.load_builtins()
STORAGE.init(app)
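The three hooks described in the module docstring above can be sketched without any SearXNG imports. The base class below is a simplified, hypothetical stand-in for `searx.plugins.Plugin`, and `DropInsecure` and `run_on_result` are names invented for this illustration; the sketch only shows the idea that each hook has a permissive default and a plugin overrides the ones it needs.

```python
from __future__ import annotations


class Plugin:
    """Hypothetical stand-in base class: every hook has a permissive default."""

    def pre_search(self, request, search) -> bool:
        return True  # True: continue the search

    def on_result(self, request, search, result) -> bool:
        return True  # True: keep the result

    def post_search(self, request, search) -> list | None:
        return None  # no additional results


class DropInsecure(Plugin):
    """Hypothetical plugin that overrides a single hook."""

    def on_result(self, request, search, result) -> bool:
        # False removes the result from the result list
        return not result["url"].startswith("http://")


def run_on_result(plugins, request, search, results):
    """Keep a result only if the on_result hook of every plugin returns True."""
    return [r for r in results if all(p.on_result(request, search, r) for p in plugins)]


results = [{"url": "http://a.example"}, {"url": "https://b.example"}]
kept = run_on_result([DropInsecure()], None, None, results)
```

The hooks that `DropInsecure` does not override keep the defaults of the base class, which is why a plugin does not need to implement all of them.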

searx/plugins/_core.py Normal file

@ -0,0 +1,394 @@
# SPDX-License-Identifier: AGPL-3.0-or-later
# pylint: disable=too-few-public-methods,missing-module-docstring
from __future__ import annotations
__all__ = ["PluginInfo", "Plugin", "PluginStorage"]
import abc
import importlib
import logging
import pathlib
import types
import typing
import warnings
from dataclasses import dataclass, field
import flask
import searx
from searx.utils import load_module
from searx.extended_types import SXNG_Request
from searx.result_types import Result
if typing.TYPE_CHECKING:
from searx.search import SearchWithPlugins
_default = pathlib.Path(__file__).parent
log: logging.Logger = logging.getLogger("searx.plugins")
@dataclass
class PluginInfo:
"""Object that holds information about a *plugin*; this information is shown
to the user in the Preferences menu.
To be able to translate the information into other languages, the text must
be written in English and translated with :py:obj:`flask_babel.gettext`.
"""
id: str
"""The ID-selector in HTML/CSS `#<id>`."""
name: str
"""Name of the *plugin*."""
description: str
"""Short description of the *plugin*."""
preference_section: typing.Literal["general", "ui", "privacy", "query"] | None = "general"
"""Section (tab/group) in the preferences where this plugin is shown to the
user.
The value ``query`` is reserved for plugins that are activated via a
*keyword* as part of a search query, see:
- :py:obj:`PluginInfo.examples`
- :py:obj:`Plugin.keywords`
Those plugins are shown in the preferences in tab *Special Queries*.
"""
examples: list[str] = field(default_factory=list)
"""List of short examples of the usage / of query terms."""
keywords: list[str] = field(default_factory=list)
"""See :py:obj:`Plugin.keywords`"""
class Plugin(abc.ABC):
"""Abstract base class of all Plugins."""
id: typing.ClassVar[str]
"""The ID (suffix) in the HTML form."""
default_on: typing.ClassVar[bool]
"""Plugin is enabled/disabled by default."""
keywords: list[str] = []
"""Keywords in the search query that activate the plugin. The *keyword* is
the first word in a search query. If a plugin should be executed regardless
of the search query, the list of keywords should be empty (which is also the
default in the base class for Plugins)."""
log: logging.Logger
"""A logger object, is automatically initialized when calling the
constructor (if not already set in the subclass)."""
info: PluginInfo
"""Information about the *plugin*, see :py:obj:`PluginInfo`."""
def __init__(self) -> None:
super().__init__()
for attr in ["id", "default_on"]:
if getattr(self, attr, None) is None:
raise NotImplementedError(f"plugin {self} is missing attribute {attr}")
if not self.id:
self.id = f"{self.__class__.__module__}.{self.__class__.__name__}"
if not getattr(self, "log", None):
self.log = log.getChild(self.id)
def __hash__(self) -> int:
"""The hash value is used in :py:obj:`set`, for example, when an object
is added to the set. The hash value is also used in other contexts,
e.g. when checking for equality to identify identical plugins from
different sources (name collisions)."""
return id(self)
def __eq__(self, other):
""":py:obj:`Plugin` objects are equal if the hash values of the two
objects are equal."""
return hash(self) == hash(other)
def init(self, app: flask.Flask) -> bool: # pylint: disable=unused-argument
"""Initialization of the plugin, the return value decides whether this
plugin is active or not. Initialization only takes place once, at the
time the WEB application is set up.  The base method always returns
``True``; subclasses can override the method:
- ``True`` plugin is active
- ``False`` plugin is inactive
"""
return True
# pylint: disable=unused-argument
def pre_search(self, request: SXNG_Request, search: "SearchWithPlugins") -> bool:
"""Runs BEFORE the search request and returns a boolean:
- ``True`` to continue the search
- ``False`` to stop the search
"""
return True
def on_result(self, request: SXNG_Request, search: "SearchWithPlugins", result: Result) -> bool:
"""Runs for each result of each engine and returns a boolean:
- ``True`` to keep the result
- ``False`` to remove the result from the result list
The ``result`` can be modified as needed.
.. hint::
If :py:obj:`Result.url` is modified, :py:obj:`Result.parsed_url` must
be changed accordingly:
.. code:: python
result["parsed_url"] = urlparse(result["url"])
"""
return True
def post_search(self, request: SXNG_Request, search: "SearchWithPlugins") -> None | typing.Sequence[Result]:
"""Runs AFTER the search request. Can return a list of :py:obj:`Result`
objects to be added to the final result list."""
return
class ModulePlugin(Plugin):
"""A wrapper class for legacy *plugins*.
.. note::
For internal use only!
In a module plugin, the following names are mapped:
- `module.query_keywords` --> :py:obj:`Plugin.keywords`
- `module.plugin_id` --> :py:obj:`Plugin.id`
- `module.logger` --> :py:obj:`Plugin.log`
"""
_required_attrs = (("name", str), ("description", str), ("default_on", bool))
def __init__(self, mod: types.ModuleType):
"""If attributes are missing in the module or have the wrong type, a
:py:obj:`TypeError` exception is raised."""
self.module = mod
self.id = getattr(self.module, "plugin_id", self.module.__name__)
self.log = logging.getLogger(self.module.__name__)
self.keywords = getattr(self.module, "query_keywords", [])
for attr, attr_type in self._required_attrs:
if not hasattr(self.module, attr):
msg = f"missing attribute {attr}, cannot load plugin"
self.log.critical(msg)
raise TypeError(msg)
if not isinstance(getattr(self.module, attr), attr_type):
msg = f"attribute {attr} is not of type {attr_type}"
self.log.critical(msg)
raise TypeError(msg)
self.default_on = mod.default_on
self.info = PluginInfo(
id=self.id,
name=self.module.name,
description=self.module.description,
preference_section=getattr(self.module, "preference_section", None),
examples=getattr(self.module, "query_examples", []),
keywords=self.keywords,
)
# monkeypatch module
self.module.logger = self.log # type: ignore
super().__init__()
def init(self, app: flask.Flask) -> bool:
if not hasattr(self.module, "init"):
return True
return self.module.init(app)
def pre_search(self, request: SXNG_Request, search: "SearchWithPlugins") -> bool:
if not hasattr(self.module, "pre_search"):
return True
return self.module.pre_search(request, search)
def on_result(self, request: SXNG_Request, search: "SearchWithPlugins", result: Result) -> bool:
if not hasattr(self.module, "on_result"):
return True
return self.module.on_result(request, search, result)
def post_search(self, request: SXNG_Request, search: "SearchWithPlugins") -> None | list[Result]:
if not hasattr(self.module, "post_search"):
return None
return self.module.post_search(request, search)
class PluginStorage:
"""A storage for managing the *plugins* of SearXNG."""
plugin_list: set[Plugin]
"""The set of :py:obj:`Plugin` objects in this storage."""
legacy_plugins = [
"ahmia_filter",
"calculator",
"hostnames",
"oa_doi_rewrite",
"tor_check",
"tracker_url_remover",
"unit_converter",
]
"""Internal plugins implemented in the legacy style (as module / deprecated!)."""
def __init__(self):
self.plugin_list = set()
def __iter__(self):
yield from self.plugin_list
def __len__(self):
return len(self.plugin_list)
@property
def info(self) -> list[PluginInfo]:
return [p.info for p in self.plugin_list]
def load_builtins(self):
"""Load plugin modules from:
- the python packages in :origin:`searx/plugins` and
- the external plugins from :ref:`settings plugins`.
"""
for f in _default.iterdir():
if f.name.startswith("_"):
continue
if f.stem not in self.legacy_plugins:
self.register_by_fqn(f"searx.plugins.{f.stem}.SXNGPlugin")
continue
# for backward compatibility
mod = load_module(f.name, str(f.parent))
self.register(ModulePlugin(mod))
for fqn in searx.get_setting("plugins"): # type: ignore
self.register_by_fqn(fqn)
def register(self, plugin: Plugin):
"""Register a :py:obj:`Plugin`. In case of name collision (if two
plugins have same ID) a :py:obj:`KeyError` exception is raised.
"""
if plugin in self.plugin_list:
msg = f"name collision '{plugin.id}'"
plugin.log.critical(msg)
raise KeyError(msg)
self.plugin_list.add(plugin)
plugin.log.debug("plugin has been loaded")
def register_by_fqn(self, fqn: str):
"""Register a :py:obj:`Plugin` via its fully qualified class name (FQN).
The FQNs of external plugins could be read from a configuration, for
example, and registered using this method.
"""
mod_name, _, obj_name = fqn.rpartition('.')
if not mod_name:
# for backward compatibility
code_obj = importlib.import_module(fqn)
else:
mod = importlib.import_module(mod_name)
code_obj = getattr(mod, obj_name, None)
if code_obj is None:
msg = f"plugin {fqn} is not implemented"
log.critical(msg)
raise ValueError(msg)
if isinstance(code_obj, types.ModuleType):
# for backward compatibility
warnings.warn(
f"plugin {fqn} is implemented in a legacy module / migrate to searx.plugins.Plugin", DeprecationWarning
)
self.register(ModulePlugin(code_obj))
return
self.register(code_obj())
def init(self, app: flask.Flask) -> None:
"""Calls the method :py:obj:`Plugin.init` of each plugin in this
storage. Depending on its return value, the plugin is removed from
*this* storage or not."""
for plg in self.plugin_list.copy():
if not plg.init(app):
self.plugin_list.remove(plg)
def pre_search(self, request: SXNG_Request, search: "SearchWithPlugins") -> bool:
ret = True
for plugin in [p for p in self.plugin_list if p.id in search.user_plugins]:
try:
ret = bool(plugin.pre_search(request=request, search=search))
except Exception: # pylint: disable=broad-except
plugin.log.exception("Exception while calling pre_search")
continue
if not ret:
# skip this search on the first False from a plugin
break
return ret
def on_result(self, request: SXNG_Request, search: "SearchWithPlugins", result: Result) -> bool:
ret = True
for plugin in [p for p in self.plugin_list if p.id in search.user_plugins]:
try:
ret = bool(plugin.on_result(request=request, search=search, result=result))
except Exception: # pylint: disable=broad-except
plugin.log.exception("Exception while calling on_result")
continue
if not ret:
# ignore this result item on the first False from a plugin
break
return ret
def post_search(self, request: SXNG_Request, search: "SearchWithPlugins") -> None:
"""Extend :py:obj:`search.result_container
<searx.results.ResultContainer>` with result items from plugins listed
in :py:obj:`search.user_plugins <SearchWithPlugins.user_plugins>`.
"""
keyword = None
for keyword in search.search_query.query.split():
if keyword:
break
for plugin in [p for p in self.plugin_list if p.id in search.user_plugins]:
if plugin.keywords:
# plugin with keywords: skip plugin if no keyword match
if keyword and keyword not in plugin.keywords:
continue
try:
results = plugin.post_search(request=request, search=search) or []
except Exception: # pylint: disable=broad-except
plugin.log.exception("Exception while calling post_search")
continue
# In case of *plugins* prefix ``plugin:`` is set, see searx.result_types.Result
search.result_container.extend(f"plugin: {plugin.id}", results)
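The lookup performed by `register_by_fqn` above (split the fully qualified name, import the module, fetch the object) can be sketched with the standard library only. The helper name `load_by_fqn` is hypothetical and the sketch leaves out the registration and deprecation handling of the patch.

```python
from __future__ import annotations

import importlib
import types


def load_by_fqn(fqn: str):
    """Return the object named by a fully qualified name such as 'pkg.mod.Obj'."""
    mod_name, _, obj_name = fqn.rpartition(".")
    if not mod_name:
        # no dot in the name: the FQN names a (legacy style) module
        return importlib.import_module(fqn)
    mod = importlib.import_module(mod_name)
    code_obj = getattr(mod, obj_name, None)
    if code_obj is None:
        raise ValueError(f"{fqn} is not implemented")
    return code_obj


cls = load_by_fqn("collections.OrderedDict")  # a class
mod = load_by_fqn("json")  # a module (no dot in the name)
is_module = isinstance(mod, types.ModuleType)
```

A caller can then branch on `isinstance(obj, types.ModuleType)` to decide whether the loaded object needs a wrapper such as `ModulePlugin` or can be instantiated directly.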


@ -1,27 +1,33 @@
# SPDX-License-Identifier: AGPL-3.0-or-later
# pylint: disable=missing-module-docstring
from __future__ import annotations
from hashlib import md5
import flask
from searx.data import ahmia_blacklist_loader
from searx import get_setting
name = "Ahmia blacklist"
description = "Filter out onion results that appear in Ahmia's blacklist. (See https://ahmia.fi/blacklist)"
default_on = True
preference_section = 'onions'
ahmia_blacklist = None
ahmia_blacklist: list = []
def on_result(_request, _search, result):
def on_result(_request, _search, result) -> bool:
if not result.get('is_onion') or not result.get('parsed_url'):
return True
result_hash = md5(result['parsed_url'].hostname.encode()).hexdigest()
return result_hash not in ahmia_blacklist
def init(_app, settings):
def init(app: flask.Flask) -> bool: # pylint: disable=unused-argument
global ahmia_blacklist # pylint: disable=global-statement
if not settings['outgoing']['using_tor_proxy']:
if not get_setting("outgoing.using_tor_proxy"):
# disable the plugin
return False
ahmia_blacklist = ahmia_blacklist_loader()


@ -1,28 +1,27 @@
# SPDX-License-Identifier: AGPL-3.0-or-later
"""Calculate mathematical expressions using ack#eval
"""Calculate mathematical expressions using :py:obj:`ast.parse` (mode="eval").
"""
from __future__ import annotations
from typing import Callable
import ast
import re
import operator
from multiprocessing import Process, Queue
from typing import Callable
import multiprocessing
import flask
import babel
import babel.numbers
from flask_babel import gettext
from searx.plugins import logger
from searx.result_types import Answer
name = "Basic Calculator"
description = gettext("Calculate mathematical expressions via the search bar")
default_on = True
preference_section = 'general'
plugin_id = 'calculator'
logger = logger.getChild(plugin_id)
operators: dict[type, Callable] = {
ast.Add: operator.add,
ast.Sub: operator.sub,
@ -33,11 +32,17 @@ operators: dict[type, Callable] = {
ast.USub: operator.neg,
}
# With multiprocessing.get_context("fork") we are ready for Py3.14 (by emulating
# the old "fork" behavior), but this neither solves the core problem of fork nor
# removes the deprecation warnings in py3.12 & py3.13.  The issue is discussed
# here: https://github.com/searxng/searxng/issues/4159
mp_fork = multiprocessing.get_context("fork")
def _eval_expr(expr):
"""
>>> _eval_expr('2^6')
4
64
>>> _eval_expr('2**6')
64
>>> _eval_expr('1 + 2*3**(4^5) / (6 + -7)')
@ -63,46 +68,49 @@ def _eval(node):
raise TypeError(node)
def handler(q: multiprocessing.Queue, func, args, **kwargs): # pylint:disable=invalid-name
try:
q.put(func(*args, **kwargs))
except:
q.put(None)
raise
def timeout_func(timeout, func, *args, **kwargs):
def handler(q: Queue, func, args, **kwargs): # pylint:disable=invalid-name
try:
q.put(func(*args, **kwargs))
except:
q.put(None)
raise
que = Queue()
p = Process(target=handler, args=(que, func, args), kwargs=kwargs)
que = mp_fork.Queue()
p = mp_fork.Process(target=handler, args=(que, func, args), kwargs=kwargs)
p.start()
p.join(timeout=timeout)
ret_val = None
# pylint: disable=used-before-assignment,undefined-variable
if not p.is_alive():
ret_val = que.get()
else:
logger.debug("terminate function after timeout is exceeded")
logger.debug("terminate function after timeout is exceeded") # type: ignore
p.terminate()
p.join()
p.close()
return ret_val
def post_search(_request, search):
def post_search(request, search) -> list[Answer]:
results = []
# only show the result of the expression on the first page
if search.search_query.pageno > 1:
return True
return results
query = search.search_query.query
# in order to avoid DoS attacks with long expressions, ignore long expressions
if len(query) > 100:
return True
return results
# replace commonly used math operators with their proper Python operator
query = query.replace("x", "*").replace(":", "/")
# use UI language
ui_locale = babel.Locale.parse(flask.request.preferences.get_value('locale'), sep='-')
ui_locale = babel.Locale.parse(request.preferences.get_value('locale'), sep='-')
# parse the number system in a localized way
def _decimal(match: re.Match) -> str:
@ -116,15 +124,17 @@ def post_search(_request, search):
# only numbers and math operators are accepted
if any(str.isalpha(c) for c in query):
return True
return results
# in python, powers are calculated via **
query_py_formatted = query.replace("^", "**")
# Prevent the runtime from being longer than 50 ms
result = timeout_func(0.05, _eval_expr, query_py_formatted)
if result is None or result == "":
return True
result = babel.numbers.format_decimal(result, locale=ui_locale)
search.result_container.answers['calculate'] = {'answer': f"{search.search_query.query} = {result}"}
return True
res = timeout_func(0.05, _eval_expr, query_py_formatted)
if res is None or res == "":
return results
res = babel.numbers.format_decimal(res, locale=ui_locale)
Answer(results=results, answer=f"{search.search_query.query} = {res}")
return results
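The timeout mechanism used by the calculator above (run a function in a forked child process and give up after a deadline) can be tried in isolation. This is a minimal sketch that assumes a POSIX system where the "fork" start method is available; the name `_handler` and the timeouts in the usage lines are chosen for illustration and are not the values of the patch.

```python
import multiprocessing
import time

# Assumption: POSIX only, the "fork" start method does not exist on Windows.
mp_fork = multiprocessing.get_context("fork")


def _handler(q, func, args, kwargs):
    """Runs in the child: put the result (or None on failure) on the queue."""
    try:
        q.put(func(*args, **kwargs))
    except Exception:
        q.put(None)
        raise


def timeout_func(timeout, func, *args, **kwargs):
    """Return func(*args, **kwargs), or None if it fails or exceeds the timeout."""
    que = mp_fork.Queue()
    p = mp_fork.Process(target=_handler, args=(que, func, args, kwargs))
    p.start()
    p.join(timeout=timeout)
    ret_val = None
    if not p.is_alive():
        # the child has finished: its result is waiting on the queue
        ret_val = que.get()
    else:
        # deadline exceeded: stop the child and return None
        p.terminate()
        p.join()
    p.close()
    return ret_val


fast = timeout_func(5, pow, 2, 6)
slow = timeout_func(0.2, time.sleep, 10)
```

Running the evaluation in a separate process is what makes the hard deadline possible: a thread could not be terminated from the outside in the same way.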


@ -1,43 +1,66 @@
# SPDX-License-Identifier: AGPL-3.0-or-later
# pylint: disable=missing-module-docstring
# pylint: disable=missing-module-docstring, missing-class-docstring
from __future__ import annotations
import typing
import hashlib
import re
import hashlib
from flask_babel import gettext
name = "Hash plugin"
description = gettext("Converts strings to different hash digests.")
default_on = True
preference_section = 'query'
query_keywords = ['md5', 'sha1', 'sha224', 'sha256', 'sha384', 'sha512']
query_examples = 'sha512 The quick brown fox jumps over the lazy dog'
from searx.plugins import Plugin, PluginInfo
from searx.result_types import Answer
parser_re = re.compile('(md5|sha1|sha224|sha256|sha384|sha512) (.*)', re.I)
if typing.TYPE_CHECKING:
from searx.search import SearchWithPlugins
from searx.extended_types import SXNG_Request
def post_search(_request, search):
# process only on first page
if search.search_query.pageno > 1:
return True
m = parser_re.match(search.search_query.query)
if not m:
# wrong query
return True
class SXNGPlugin(Plugin):
"""Plugin that converts strings to different hash digests.  The results are
displayed in the "answers" area.
"""
function, string = m.groups()
if not string.strip():
# end if the string is empty
return True
id = "hash_plugin"
default_on = True
keywords = ["md5", "sha1", "sha224", "sha256", "sha384", "sha512"]
# select hash function
f = hashlib.new(function.lower())
def __init__(self):
super().__init__()
# make digest from the given string
f.update(string.encode('utf-8').strip())
answer = function + " " + gettext('hash digest') + ": " + f.hexdigest()
self.parser_re = re.compile(f"({'|'.join(self.keywords)}) (.*)", re.I)
self.info = PluginInfo(
id=self.id,
name=gettext("Hash plugin"),
description=gettext("Converts strings to different hash digests."),
examples=["sha512 The quick brown fox jumps over the lazy dog"],
preference_section="query",
)
# print result
search.result_container.answers.clear()
search.result_container.answers['hash'] = {'answer': answer}
return True
def post_search(self, request: "SXNG_Request", search: "SearchWithPlugins") -> list[Answer]:
"""Returns a result list only for the first page."""
results = []
if search.search_query.pageno > 1:
return results
m = self.parser_re.match(search.search_query.query)
if not m:
# wrong query
return results
function, string = m.groups()
if not string.strip():
# end if the string is empty
return results
# select hash function
f = hashlib.new(function.lower())
# make digest from the given string
f.update(string.encode("utf-8").strip())
answer = function + " " + gettext("hash digest") + ": " + f.hexdigest()
Answer(results=results, answer=answer)
return results
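The parsing and hashing steps of the hash plugin above can be exercised without SearXNG. The sketch below is an assumption-laden reduction: the function name `hash_answer` is hypothetical, the translated label is replaced by a fixed English string, and a plain string (or `None`) is returned instead of an `Answer` object.

```python
from __future__ import annotations

import hashlib
import re

KEYWORDS = ["md5", "sha1", "sha224", "sha256", "sha384", "sha512"]
# same pattern shape as in the plugin: "<hash function> <string to hash>"
PARSER_RE = re.compile(f"({'|'.join(KEYWORDS)}) (.*)", re.I)


def hash_answer(query: str) -> str | None:
    """Return '<function> hash digest: <hex>' or None if the query does not match."""
    m = PARSER_RE.match(query)
    if not m:
        return None
    function, string = m.groups()
    if not string.strip():
        # nothing to hash
        return None
    # select the hash function by name and digest the UTF-8 encoded string
    f = hashlib.new(function.lower())
    f.update(string.encode("utf-8").strip())
    return f"{function} hash digest: {f.hexdigest()}"


answer = hash_answer("md5 hello")
```

Building the pattern from the keyword list keeps the regular expression and the list of supported hash functions in sync, which is also what the plugin does with `self.keywords`.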


@ -91,15 +91,17 @@ something like this:
"""
from __future__ import annotations
import re
from urllib.parse import urlunparse, urlparse
from flask_babel import gettext
from searx import settings
from searx.plugins import logger
from searx.settings_loader import get_yaml_cfg
name = gettext('Hostnames plugin')
description = gettext('Rewrite hostnames, remove results or prioritize them based on the hostname')
default_on = False
@ -107,16 +109,15 @@ preference_section = 'general'
plugin_id = 'hostnames'
logger = logger.getChild(plugin_id)
parsed = 'parsed_url'
_url_fields = ['iframe_src', 'audio_src']
def _load_regular_expressions(settings_key):
def _load_regular_expressions(settings_key) -> dict | set | None:
setting_value = settings.get(plugin_id, {}).get(settings_key)
if not setting_value:
return {}
return None
# load external file with configuration
if isinstance(setting_value, str):
@ -128,20 +129,20 @@ def _load_regular_expressions(settings_key):
if isinstance(setting_value, dict):
return {re.compile(p): r for (p, r) in setting_value.items()}
return {}
return None
replacements = _load_regular_expressions('replace')
removables = _load_regular_expressions('remove')
high_priority = _load_regular_expressions('high_priority')
low_priority = _load_regular_expressions('low_priority')
replacements: dict = _load_regular_expressions('replace') or {} # type: ignore
removables: set = _load_regular_expressions('remove') or set() # type: ignore
high_priority: set = _load_regular_expressions('high_priority') or set() # type: ignore
low_priority: set = _load_regular_expressions('low_priority') or set() # type: ignore
def _matches_parsed_url(result, pattern):
return parsed in result and pattern.search(result[parsed].netloc)
def on_result(_request, _search, result):
def on_result(_request, _search, result) -> bool:
for pattern, replacement in replacements.items():
if _matches_parsed_url(result, pattern):
# logger.debug(result['url'])


@ -1,18 +1,21 @@
# SPDX-License-Identifier: AGPL-3.0-or-later
# pylint: disable=missing-module-docstring
from __future__ import annotations
import re
from urllib.parse import urlparse, parse_qsl
from flask_babel import gettext
from searx import settings
regex = re.compile(r'10\.\d{4,9}/[^\s]+')
name = gettext('Open Access DOI rewrite')
description = gettext('Avoid paywalls by redirecting to open-access versions of publications when available')
default_on = False
preference_section = 'general'
preference_section = 'general/doi_resolver'
def extract_doi(url):
@ -34,8 +37,9 @@ def get_doi_resolver(preferences):
return doi_resolvers[selected_resolver]
def on_result(request, _search, result):
if 'parsed_url' not in result:
def on_result(request, _search, result) -> bool:
if not result.parsed_url:
return True
doi = extract_doi(result['parsed_url'])


@ -1,32 +1,57 @@
# SPDX-License-Identifier: AGPL-3.0-or-later
# pylint: disable=missing-module-docstring,invalid-name
# pylint: disable=missing-module-docstring, missing-class-docstring
from __future__ import annotations
import typing
import re
from flask_babel import gettext
from searx.botdetection._helpers import get_real_ip
from searx.result_types import Answer
name = gettext('Self Information')
description = gettext('Displays your IP if the query is "ip" and your user agent if the query contains "user agent".')
default_on = True
preference_section = 'query'
query_keywords = ['user-agent']
query_examples = ''
from . import Plugin, PluginInfo
# "ip" or "my ip" regex
ip_regex = re.compile('^ip$|my ip', re.IGNORECASE)
# Self User Agent regex
ua_regex = re.compile('.*user[ -]agent.*', re.IGNORECASE)
if typing.TYPE_CHECKING:
from searx.search import SearchWithPlugins
from searx.extended_types import SXNG_Request
def post_search(request, search):
if search.search_query.pageno > 1:
return True
if ip_regex.search(search.search_query.query):
ip = get_real_ip(request)
search.result_container.answers['ip'] = {'answer': gettext('Your IP is: ') + ip}
elif ua_regex.match(search.search_query.query):
ua = request.user_agent
search.result_container.answers['user-agent'] = {'answer': gettext('Your user-agent is: ') + ua.string}
return True
class SXNGPlugin(Plugin):
"""Simple plugin that displays information about the user's request,
including the IP or HTTP User-Agent.  The information is displayed in the
"answers" area.
"""
id = "self_info"
default_on = True
keywords = ["ip", "user-agent"]
def __init__(self):
super().__init__()
self.ip_regex = re.compile(r"^ip", re.IGNORECASE)
self.ua_regex = re.compile(r"^user-agent", re.IGNORECASE)
self.info = PluginInfo(
id=self.id,
name=gettext("Self Information"),
description=gettext(
"""Displays your IP if the query is "ip" and your user agent if the query is "user-agent"."""
),
preference_section="query",
)
def post_search(self, request: "SXNG_Request", search: "SearchWithPlugins") -> list[Answer]:
"""Returns a result list only for the first page."""
results = []
if search.search_query.pageno > 1:
return results
if self.ip_regex.search(search.search_query.query):
Answer(results=results, answer=gettext("Your IP is: ") + get_real_ip(request))
if self.ua_regex.match(search.search_query.query):
Answer(results=results, answer=gettext("Your user-agent is: ") + str(request.user_agent))
return results
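
A minimal sketch of the pattern the rewritten plugin relies on: a result object appends itself to the list passed in `results`, so `post_search` only has to build that list and return it. `FakeAnswer` and the free function `post_search` below are hypothetical stand-ins (a plain dataclass instead of `searx.result_types.Answer`, no `Plugin` class, no Flask request), so the sketch runs without SearXNG.

```python
from __future__ import annotations

import re
from dataclasses import dataclass, field


@dataclass
class FakeAnswer:
    """Hypothetical stand-in for searx.result_types.Answer."""

    answer: str
    # excluded from repr/compare because the list will contain the object itself
    results: list = field(default_factory=list, repr=False, compare=False)

    def __post_init__(self):
        # same idea as Result.__post_init__: register this answer in the list
        self.results.append(self)


IP_REGEX = re.compile(r"^ip", re.IGNORECASE)
UA_REGEX = re.compile(r"^user-agent", re.IGNORECASE)


def post_search(query: str, pageno: int, ip: str, user_agent: str) -> list[FakeAnswer]:
    """Return answers only for the first result page."""
    results: list[FakeAnswer] = []
    if pageno > 1:
        return results
    if IP_REGEX.search(query):
        FakeAnswer(results=results, answer="Your IP is: " + ip)
    if UA_REGEX.match(query):
        FakeAnswer(results=results, answer="Your user-agent is: " + user_agent)
    return results
```

The caller never touches a result container directly; it receives a (possibly empty) list and decides what to do with it.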


@ -1,8 +1,8 @@
# SPDX-License-Identifier: AGPL-3.0-or-later
"""A plugin to check if the ip address of the request is a Tor exit-node if the
user searches for ``tor-check``. It fetches the tor exit node list from
https://check.torproject.org/exit-addresses and parses all the IPs into a list,
then checks if the user's IP address is in it.
:py:obj:`url_exit_list` and parses all the IPs into a list, then checks if the
user's IP address is in it.
Enable in ``settings.yml``:
@ -14,10 +14,15 @@ Enable in ``settings.yml``:
"""
from __future__ import annotations
import re
from flask_babel import gettext
from httpx import HTTPError
from searx.network import get
from searx.result_types import Answer
default_on = False
@ -42,27 +47,28 @@ query_examples = ''
# Regex for exit node addresses in the list.
reg = re.compile(r"(?<=ExitAddress )\S+")
url_exit_list = "https://check.torproject.org/exit-addresses"
"""URL to load Tor exit list from."""
def post_search(request, search):
def post_search(request, search) -> list[Answer]:
results = []
if search.search_query.pageno > 1:
return True
return results
if search.search_query.query.lower() == "tor-check":
# Request the list of tor exit nodes.
try:
resp = get("https://check.torproject.org/exit-addresses")
node_list = re.findall(reg, resp.text)
resp = get(url_exit_list)
node_list = re.findall(reg, resp.text) # type: ignore
except HTTPError:
# No answer, return error
search.result_container.answers["tor"] = {
"answer": gettext(
"Could not download the list of Tor exit-nodes from: https://check.torproject.org/exit-addresses"
)
}
return True
msg = gettext("Could not download the list of Tor exit-nodes from")
Answer(results=results, answer=f"{msg} {url_exit_list}")
return results
x_forwarded_for = request.headers.getlist("X-Forwarded-For")
@ -72,20 +78,11 @@ def post_search(request, search):
ip_address = request.remote_addr
if ip_address in node_list:
search.result_container.answers["tor"] = {
"answer": gettext(
"You are using Tor and it looks like you have this external IP address: {ip_address}".format(
ip_address=ip_address
)
)
}
else:
search.result_container.answers["tor"] = {
"answer": gettext(
"You are not using Tor and you have this external IP address: {ip_address}".format(
ip_address=ip_address
)
)
}
msg = gettext("You are using Tor and it looks like you have the external IP address")
Answer(results=results, answer=f"{msg} {ip_address}")
return True
else:
msg = gettext("You are not using Tor and you have the external IP address")
Answer(results=results, answer=f"{msg} {ip_address}")
return results
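
The lookup itself can be sketched without any network access. The regular expression is the one shown in the hunk above; `SAMPLE_EXIT_LIST` is a hypothetical excerpt in the layout of the exit list (using documentation-only IP addresses), and `is_tor_exit` is a helper name invented for this illustration.

```python
import re

# same pattern as `reg` in the plugin: the token that follows "ExitAddress "
EXIT_ADDRESS_RE = re.compile(r"(?<=ExitAddress )\S+")

# hypothetical sample data, not a real download
SAMPLE_EXIT_LIST = """\
ExitNode 0011BD2485AD45D984EC4159C88FC066E5E3300E
Published 2024-01-01 00:00:00
LastStatus 2024-01-01 01:00:00
ExitAddress 203.0.113.10 2024-01-01 01:10:00
ExitNode 0020D8A204A2A5B2B0A9A5E5B0E6F8A1C3D4E5F6
ExitAddress 198.51.100.23 2024-01-01 02:10:00
"""


def is_tor_exit(ip_address: str, exit_list_text: str) -> bool:
    """Check whether ip_address appears as an ExitAddress in the list text."""
    node_list = EXIT_ADDRESS_RE.findall(exit_list_text)
    return ip_address in node_list
```

The lookbehind keeps only the address and drops the timestamp that follows it on the same line.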


@ -1,6 +1,8 @@
# SPDX-License-Identifier: AGPL-3.0-or-later
# pylint: disable=missing-module-docstring
from __future__ import annotations
import re
from urllib.parse import urlunparse, parse_qsl, urlencode
@ -19,24 +21,24 @@ default_on = True
preference_section = 'privacy'
def on_result(_request, _search, result):
if 'parsed_url' not in result:
def on_result(_request, _search, result) -> bool:
parsed_url = getattr(result, "parsed_url", None)
if not parsed_url:
return True
query = result['parsed_url'].query
if query == "":
if parsed_url.query == "":
return True
parsed_query = parse_qsl(query)
parsed_query = parse_qsl(parsed_url.query)
changes = 0
for i, (param_name, _) in enumerate(list(parsed_query)):
for reg in regexes:
if reg.match(param_name):
parsed_query.pop(i - changes)
changes += 1
result['parsed_url'] = result['parsed_url']._replace(query=urlencode(parsed_query))
result['url'] = urlunparse(result['parsed_url'])
result.parsed_url = result.parsed_url._replace(query=urlencode(parsed_query))
result.url = urlunparse(result.parsed_url)
break
return True
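
For illustration, the same idea (drop query arguments whose name matches a tracker pattern, then rebuild the URL) can be sketched on plain strings. `TRACKER_PATTERNS` is a hypothetical subset, since the plugin's own `regexes` list lies outside this hunk, and `strip_tracker_params` is a name made up for the sketch.

```python
import re
from urllib.parse import parse_qsl, urlencode, urlparse, urlunparse

# hypothetical patterns, for illustration only
TRACKER_PATTERNS = [re.compile(r"^utm_[^&]+"), re.compile(r"^fbclid$")]


def strip_tracker_params(url: str) -> str:
    """Return url without query arguments that match a tracker pattern."""
    parsed_url = urlparse(url)
    if parsed_url.query == "":
        return url
    # keep every (name, value) pair whose name matches none of the patterns
    kept = [
        (name, value)
        for name, value in parse_qsl(parsed_url.query)
        if not any(reg.match(name) for reg in TRACKER_PATTERNS)
    ]
    return urlunparse(parsed_url._replace(query=urlencode(kept)))
```

Building a filtered list is a design choice of this sketch: it avoids the index bookkeeping that in-place `pop()` calls need while iterating.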


@ -18,11 +18,14 @@ Enable in ``settings.yml``:
"""
from __future__ import annotations
import re
import babel.numbers
from flask_babel import gettext, get_locale
from searx import data
from searx.result_types import Answer
name = "Unit converter plugin"
@ -171,16 +174,16 @@ def symbol_to_si():
return SYMBOL_TO_SI
def _parse_text_and_convert(search, from_query, to_query):
def _parse_text_and_convert(from_query, to_query) -> str | None:
# pylint: disable=too-many-branches, too-many-locals
if not (from_query and to_query):
return
return None
measured = re.match(RE_MEASURE, from_query, re.VERBOSE)
if not (measured and measured.group('number') and measured.group('unit')):
return
return None
# Symbols are not unique, if there are several hits for the from-unit, then
# the correct one must be determined by comparing it with the to-unit
@ -198,7 +201,7 @@ def _parse_text_and_convert(search, from_query, to_query):
target_list.append((si_name, from_si, orig_symbol))
if not (source_list and target_list):
return
return None
source_to_si = target_from_si = target_symbol = None
@ -212,7 +215,7 @@ def _parse_text_and_convert(search, from_query, to_query):
target_symbol = target[2]
if not (source_to_si and target_from_si):
return
return None
_locale = get_locale() or 'en_US'
@ -239,25 +242,28 @@ def _parse_text_and_convert(search, from_query, to_query):
else:
result = babel.numbers.format_decimal(value, locale=_locale, format='#,##0.##########;-#')
search.result_container.answers['conversion'] = {'answer': f'{result} {target_symbol}'}
return f'{result} {target_symbol}'
def post_search(_request, search):
def post_search(_request, search) -> list[Answer]:
results = []
# only convert between units on the first page
if search.search_query.pageno > 1:
return True
return results
query = search.search_query.query
query_parts = query.split(" ")
if len(query_parts) < 3:
return True
return results
for query_part in query_parts:
for keyword in CONVERT_KEYWORDS:
if query_part == keyword:
from_query, to_query = query.split(keyword, 1)
_parse_text_and_convert(search, from_query.strip(), to_query.strip())
return True
target_val = _parse_text_and_convert(from_query.strip(), to_query.strip())
if target_val:
Answer(results=results, answer=target_val)
return True
return results
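
The query splitting in `post_search` can be tried in isolation. This is a sketch under assumptions: `CONVERT_KEYWORDS` below is a hypothetical list (the plugin defines its own outside this hunk) and `split_conversion_query` is an invented helper that returns the two halves instead of converting them.

```python
from __future__ import annotations

# hypothetical keyword list, for illustration only
CONVERT_KEYWORDS = ["in", "to", "as"]


def split_conversion_query(query: str) -> tuple[str, str] | None:
    """Split '<from> <keyword> <to>' into its two halves, or return None."""
    query_parts = query.split(" ")
    if len(query_parts) < 3:
        return None
    for query_part in query_parts:
        for keyword in CONVERT_KEYWORDS:
            if query_part == keyword:
                # split on the first occurrence of the keyword *substring*
                from_query, to_query = query.split(keyword, 1)
                return from_query.strip(), to_query.strip()
    return None
```

Note that the keyword is detected as a whole word but the split works on substrings: in this sketch `"5 min in s"` is cut inside `min`, which yields `("5 m", "in s")`.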


@ -1,6 +1,7 @@
# SPDX-License-Identifier: AGPL-3.0-or-later
"""Searx preferences implementation.
"""
from __future__ import annotations
# pylint: disable=useless-object-inheritance
@ -13,12 +14,14 @@ from collections import OrderedDict
import flask
import babel
import searx.plugins
from searx import settings, autocomplete, favicons
from searx.enginelib import Engine
from searx.plugins import Plugin
from searx.engines import DEFAULT_CATEGORY
from searx.extended_types import SXNG_Request
from searx.locales import LOCALE_NAMES
from searx.webutils import VALID_LANGUAGE_CODE
from searx.engines import DEFAULT_CATEGORY
COOKIE_MAX_AGE = 60 * 60 * 24 * 365 * 5 # 5 years
@ -312,7 +315,7 @@ class EnginesSetting(BooleanChoices):
class PluginsSetting(BooleanChoices):
"""Plugin settings"""
def __init__(self, default_value, plugins: Iterable[Plugin]):
def __init__(self, default_value, plugins: Iterable[searx.plugins.Plugin]):
super().__init__(default_value, {plugin.id: plugin.default_on for plugin in plugins})
def transform_form_items(self, items):
@ -340,7 +343,7 @@ class ClientPref:
return tag
@classmethod
def from_http_request(cls, http_request: flask.Request):
def from_http_request(cls, http_request: SXNG_Request):
"""Build ClientPref object from HTTP request.
- `Accept-Language used for locale setting
@ -375,11 +378,11 @@ class Preferences:
def __init__(
self,
themes: List[str],
categories: List[str],
engines: Dict[str, Engine],
plugins: Iterable[Plugin],
client: Optional[ClientPref] = None,
themes: list[str],
categories: list[str],
engines: dict[str, Engine],
plugins: searx.plugins.PluginStorage,
client: ClientPref | None = None,
):
super().__init__()


@ -0,0 +1,18 @@
# SPDX-License-Identifier: AGPL-3.0-or-later
"""Typification of the result items generated by the *engines*, *answerers* and
*plugins*.
.. note::
We are at the beginning of typing the results. Further typing will follow,
but this is a very large task that we will only be able to implement
gradually. For more, please read :ref:`result types`.
"""
from __future__ import annotations
__all__ = ["Result", "AnswerSet", "Answer", "Translations"]
from ._base import Result, LegacyResult
from .answer import AnswerSet, Answer, Translations

searx/result_types/_base.py

@ -0,0 +1,223 @@
# SPDX-License-Identifier: AGPL-3.0-or-later
# pylint: disable=too-few-public-methods, missing-module-docstring
"""Basic types for the typification of results.
- :py:obj:`Result` base class
- :py:obj:`LegacyResult` for internal use only
----
.. autoclass:: Result
:members:
.. autoclass:: LegacyResult
:members:
"""
from __future__ import annotations
__all__ = ["Result"]
import re
import urllib.parse
import warnings
import msgspec
class Result(msgspec.Struct, kw_only=True):
"""Base class of all result types :ref:`result types`."""
url: str | None = None
"""A link related to this *result*"""
template: str = "default.html"
"""Name of the template used to render the result.
By default :origin:`result_templates/default.html
<searx/templates/simple/result_templates/default.html>` is used.
"""
engine: str | None = ""
"""Name of the engine *this* result comes from. In case of *plugins* a
prefix ``plugin:`` is set, in case of *answerer* prefix ``answerer:`` is
set.
The field is optional and is initialized from the context if necessary.
"""
parsed_url: urllib.parse.ParseResult | None = None
""":py:obj:`urllib.parse.ParseResult` of :py:obj:`Result.url`.
The field is optional and is initialized from the context if necessary.
"""
results: list = [] # https://jcristharif.com/msgspec/structs.html#default-values
"""Result list of an :origin:`engine <searx/engines>` response or a
:origin:`answerer <searx/answerers>` to which the answer should be added.
This field is only present for the sake of simplicity. Typically, the
response function of an engine has a result list that is returned at the
end. By specifying the result list in the constructor of the result, this
result is then immediately added to the list (this parameter does not have
another function).
.. code:: python
def response(resp):
results = []
...
Answer(results=results, answer=answer, url=url)
...
return results
"""
def normalize_result_fields(self):
"""Normalize a result ..
- if field ``url`` is set and field ``parsed_url`` is unset, init
``parsed_url`` from field ``url``. This method can be extended in
subclasses.
"""
if not self.parsed_url and self.url:
self.parsed_url = urllib.parse.urlparse(self.url)
# if the result has no scheme, use http as default
if not self.parsed_url.scheme:
self.parsed_url = self.parsed_url._replace(scheme="http")
self.url = self.parsed_url.geturl()
def __post_init__(self):
"""Add *this* result to the result list."""
self.results.append(self)
def __hash__(self) -> int:
"""Generates a hash value that uniquely identifies the content of *this*
result. The method can be adapted in the inheritance to compare results
from different sources.
If two result objects are not identical but have the same content, their
hash values should also be identical.
The hash value is used in contexts, e.g. when checking for equality to
identify identical results from different sources (engines).
"""
return id(self)
def __eq__(self, other):
""":py:obj:`Result` objects are equal if the hash values of the two
objects are equal. If needed, it's recommended to override
:py:obj:`Result.__hash__`."""
return hash(self) == hash(other)
# for legacy code where a result is treated as a Python dict
def __setitem__(self, field_name, value):
return setattr(self, field_name, value)
def __getitem__(self, field_name):
if field_name not in self.__struct_fields__:
raise KeyError(f"{field_name}")
return getattr(self, field_name)
def __iter__(self):
return iter(self.__struct_fields__)
class LegacyResult(dict):
"""A wrapper around a legacy result item. The SearXNG core uses this class
for untyped dictionaries / to be backward compatible.
This class is needed until we have implemented a :py:obj:`Result` class for
each result type and the old usages in the codebase have been fully
migrated.
There is only one place where this class is used, in the
:py:obj:`searx.results.ResultContainer`.
.. attention::
Do not use this class in your own implementations!
"""
UNSET = object()
WHITESPACE_REGEX = re.compile('( |\t|\n)+', re.M | re.U)
def __init__(self, *args, **kwargs):
super().__init__(*args, **kwargs)
self.__dict__ = self
# Init fields with defaults / compare with defaults of the fields in class Result
self.engine = self.get("engine", "")
self.template = self.get("template", "default.html")
self.url = self.get("url", None)
self.parsed_url = self.get("parsed_url", None)
self.content = self.get("content", "")
self.title = self.get("title", "")
# Legacy types that have already been ported to a type ..
if "answer" in self:
warnings.warn(
f"engine {self.engine} is using deprecated `dict` for answers"
f" / use a class from searx.result_types.answer",
DeprecationWarning,
)
self.template = "answer/legacy.html"
def __hash__(self) -> int: # type: ignore
if "answer" in self:
return hash(self["answer"])
if not any(cls in self for cls in ["suggestion", "correction", "infobox", "number_of_results", "engine_data"]):
# it is a common url-result ..
return hash(self.url)
return id(self)
def __eq__(self, other):
return hash(self) == hash(other)
def __repr__(self) -> str:
return f"LegacyResult: {super().__repr__()}"
def __getattr__(self, name: str, default=UNSET):
if default == self.UNSET and name not in self:
raise AttributeError(f"LegacyResult object has no field named: {name}")
return self[name]
def __setattr__(self, name: str, val):
self[name] = val
def normalize_result_fields(self):
self.title = self.WHITESPACE_REGEX.sub(" ", self.title)
if not self.parsed_url and self.url:
self.parsed_url = urllib.parse.urlparse(self.url)
# if the result has no scheme, use http as default
if not self.parsed_url.scheme:
self.parsed_url = self.parsed_url._replace(scheme="http")
self.url = self.parsed_url.geturl()
if self.content:
self.content = self.WHITESPACE_REGEX.sub(" ", self.content)
if self.content == self.title:
# avoid duplicate content between the content and title fields
self.content = ""
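
The dual access that `LegacyResult` offers (a result can be read as a dict and as an object) can be illustrated with a reduced sketch. `AttrDict` is a hypothetical class written for this note; it leaves out the custom `__getattr__` / `__setattr__`, the deprecation warning and the hashing of the real class.

```python
class AttrDict(dict):
    """Hypothetical, reduced stand-in for LegacyResult."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        # attribute access and item access share one storage
        self.__dict__ = self
        # defaults comparable to the fields of class Result
        self.setdefault("engine", "")
        self.setdefault("template", "default.html")
        self.setdefault("url", None)


legacy = AttrDict({"url": "https://example.org", "title": "Example"})
legacy.content = "set as attribute"
legacy["position"] = 1
```

After these lines `legacy.title`, `legacy["content"]` and `legacy.position` all resolve to the same underlying dictionary entries, which is what lets legacy code and typed code read the same item.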


@ -0,0 +1,141 @@
# SPDX-License-Identifier: AGPL-3.0-or-later
"""
Typification of the *answer* results. Results of this type are rendered in
the :origin:`answers.html <searx/templates/simple/elements/answers.html>`
template.
----
.. autoclass:: BaseAnswer
:members:
:show-inheritance:
.. autoclass:: Answer
:members:
:show-inheritance:
.. autoclass:: Translations
:members:
:show-inheritance:
.. autoclass:: AnswerSet
:members:
:show-inheritance:
"""
# pylint: disable=too-few-public-methods
from __future__ import annotations
__all__ = ["AnswerSet", "Answer", "Translations"]
import msgspec
from ._base import Result
class BaseAnswer(Result, kw_only=True):
"""Base class of all answer types. It is not intended to build instances of
this class (aka *abstract*)."""
class AnswerSet:
"""Aggregator for :py:obj:`BaseAnswer` items in a result container."""
def __init__(self):
self._answerlist = []
def __len__(self):
return len(self._answerlist)
def __bool__(self):
return bool(self._answerlist)
def add(self, answer: BaseAnswer) -> None:
a_hash = hash(answer)
for i in self._answerlist:
if hash(i) == a_hash:
return
self._answerlist.append(answer)
def __iter__(self):
"""Sort items in this set and iterate over the items."""
self._answerlist.sort(key=lambda answer: answer.template)
yield from self._answerlist
def __contains__(self, answer: BaseAnswer) -> bool:
a_hash = hash(answer)
for i in self._answerlist:
if hash(i) == a_hash:
return True
return False
class Answer(BaseAnswer, kw_only=True):
"""Simple answer type where the *answer* is a simple string with an optional
:py:obj:`url <Result.url>` field to link a resource (article, map, ..)
related to the answer."""
template: str = "answer/legacy.html"
answer: str
"""Text of the answer."""
def __hash__(self):
"""The hash value of field *answer* is the hash value of the
:py:obj:`Answer` object. :py:obj:`Answer <Result.__eq__>` objects are
equal, when the hash values of both objects are equal."""
return hash(self.answer)
class Translations(BaseAnswer, kw_only=True):
"""Answer type with a list of translations.
The items in the list of :py:obj:`Translations.translations` are of type
:py:obj:`Translations.Item`:
.. code:: python
def response(resp):
results = []
...
foo_1 = Translations.Item(
text="foobar",
synonyms=["bar", "foo"],
examples=["foo and bar are placeholders"],
)
foo_url = "https://www.deepl.com/de/translator#en/de/foo"
...
Translations(results=results, translations=[foo_1], url=foo_url)
"""
template: str = "answer/translations.html"
"""The template in :origin:`answer/translations.html
<searx/templates/simple/answer/translations.html>`"""
translations: list[Translations.Item]
"""List of translations."""
class Item(msgspec.Struct, kw_only=True):
"""A single element of the translations / a translation. A translation
consists of at least a mandatory ``text`` property (the translation);
optional properties such as *definitions*, *synonyms* and *examples* are
possible."""
text: str
"""Translated text."""
transliteration: str = ""
"""Transliteration_ of the requested translation.
.. _Transliteration: https://en.wikipedia.org/wiki/Transliteration
"""
examples: list[str] = []
"""List of examples for the requested translation."""
definitions: list[str] = []
"""List of definitions for the requested translation."""
synonyms: list[str] = []
"""List of synonyms for the requested translation."""

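How `AnswerSet` deduplicates by hash value can be sketched with standard-library types only. `TextAnswer` and `SimpleAnswerSet` are hypothetical stand-ins for `Answer` and `AnswerSet`; a dataclass replaces `msgspec.Struct` so the example has no third-party dependency.

```python
from dataclasses import dataclass


@dataclass(eq=False)
class TextAnswer:
    """Hypothetical stand-in for Answer: hashed by its text only."""

    answer: str
    template: str = "answer/legacy.html"

    def __hash__(self) -> int:
        return hash(self.answer)


class SimpleAnswerSet:
    """Keeps the first answer for each hash value."""

    def __init__(self):
        self._answerlist = []

    def __len__(self):
        return len(self._answerlist)

    def add(self, answer) -> None:
        a_hash = hash(answer)
        if any(hash(item) == a_hash for item in self._answerlist):
            return  # same content is already present
        self._answerlist.append(answer)

    def __iter__(self):
        # iterate grouped by template, as AnswerSet.__iter__ does
        yield from sorted(self._answerlist, key=lambda a: a.template)


answers = SimpleAnswerSet()
answers.add(TextAnswer(answer="42", template="answer/translations.html"))
answers.add(TextAnswer(answer="hello"))
answers.add(TextAnswer(answer="hello"))  # duplicate content, ignored
```

Two engines that return the same answer text therefore contribute a single entry, and iteration yields the entries ordered by template name.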

@ -1,6 +1,8 @@
# SPDX-License-Identifier: AGPL-3.0-or-later
# pylint: disable=missing-module-docstring
from __future__ import annotations
import warnings
import re
from collections import defaultdict
from operator import itemgetter
@ -12,8 +14,10 @@ from searx import logger
from searx.engines import engines
from searx.metrics import histogram_observe, counter_add, count_error
from searx.result_types import Result, LegacyResult
from searx.result_types.answer import AnswerSet, BaseAnswer
CONTENT_LEN_IGNORED_CHARS_REGEX = re.compile(r'[,;:!?\./\\\\ ()-_]', re.M | re.U)
WHITESPACE_REGEX = re.compile('( |\t|\n)+', re.M | re.U)
# return the meaningful length of the content for a result
@ -183,56 +187,76 @@ class ResultContainer:
def __init__(self):
super().__init__()
self._merged_results = []
self.infoboxes = []
self.suggestions = set()
self.answers = {}
self._merged_results: list[LegacyResult] = []
self.infoboxes: list[dict] = []
self.suggestions: set[str] = set()
self.answers = AnswerSet()
self.corrections = set()
self._number_of_results = []
self.engine_data = defaultdict(dict)
self._closed = False
self.paging = False
self._number_of_results: list[int] = []
self.engine_data: dict[str, str | dict] = defaultdict(dict)
self._closed: bool = False
self.paging: bool = False
self.unresponsive_engines: Set[UnresponsiveEngine] = set()
self.timings: List[Timing] = []
self.redirect_url = None
self.on_result = lambda _: True
self._lock = RLock()
def extend(self, engine_name, results): # pylint: disable=too-many-branches
def extend(self, engine_name: str | None, results): # pylint: disable=too-many-branches
if self._closed:
return
standard_result_count = 0
error_msgs = set()
for result in list(results):
result['engine'] = engine_name
if 'suggestion' in result and self.on_result(result):
self.suggestions.add(result['suggestion'])
elif 'answer' in result and self.on_result(result):
self.answers[result['answer']] = result
elif 'correction' in result and self.on_result(result):
self.corrections.add(result['correction'])
elif 'infobox' in result and self.on_result(result):
self._merge_infobox(result)
elif 'number_of_results' in result and self.on_result(result):
self._number_of_results.append(result['number_of_results'])
elif 'engine_data' in result and self.on_result(result):
self.engine_data[engine_name][result['key']] = result['engine_data']
elif 'url' in result:
# standard result (url, title, content)
if not self._is_valid_url_result(result, error_msgs):
continue
# normalize the result
self._normalize_url_result(result)
# call on_result call searx.search.SearchWithPlugins._on_result
# which calls the plugins
if not self.on_result(result):
continue
self.__merge_url_result(result, standard_result_count + 1)
standard_result_count += 1
elif self.on_result(result):
self.__merge_result_no_url(result, standard_result_count + 1)
standard_result_count += 1
if isinstance(result, Result):
result.engine = result.engine or engine_name
result.normalize_result_fields()
if isinstance(result, BaseAnswer) and self.on_result(result):
self.answers.add(result)
else:
# more types need to be implemented in the future ..
raise NotImplementedError(f"no handler implemented to process the result of type {result}")
else:
result['engine'] = result.get('engine') or engine_name or ""
result = LegacyResult(result) # for backward compatibility, will be removed one day
if 'suggestion' in result and self.on_result(result):
self.suggestions.add(result['suggestion'])
elif 'answer' in result and self.on_result(result):
warnings.warn(
f"answer results from engine {result.engine}"
" are without typification / migrate to Answer class.",
DeprecationWarning,
)
self.answers.add(result)
elif 'correction' in result and self.on_result(result):
self.corrections.add(result['correction'])
elif 'infobox' in result and self.on_result(result):
self._merge_infobox(result)
elif 'number_of_results' in result and self.on_result(result):
self._number_of_results.append(result['number_of_results'])
elif 'engine_data' in result and self.on_result(result):
self.engine_data[result.engine][result['key']] = result['engine_data']
elif result.url:
# standard result (url, title, content)
if not self._is_valid_url_result(result, error_msgs):
continue
# normalize the result
result.normalize_result_fields()
# call on_result call searx.search.SearchWithPlugins._on_result
# which calls the plugins
if not self.on_result(result):
continue
self.__merge_url_result(result, standard_result_count + 1)
standard_result_count += 1
elif self.on_result(result):
self.__merge_result_no_url(result, standard_result_count + 1)
standard_result_count += 1
if len(error_msgs) > 0:
for msg in error_msgs:
@ -279,27 +303,6 @@ class ResultContainer:
return True
def _normalize_url_result(self, result):
"""Return True if the result is valid"""
result['parsed_url'] = urlparse(result['url'])
# if the result has no scheme, use http as default
if not result['parsed_url'].scheme:
result['parsed_url'] = result['parsed_url']._replace(scheme="http")
result['url'] = result['parsed_url'].geturl()
# avoid duplicate content between the content and title fields
if result.get('content') == result.get('title'):
del result['content']
# make sure there is a template
if 'template' not in result:
result['template'] = 'default.html'
# strip multiple spaces and carriage returns from content
if result.get('content'):
result['content'] = WHITESPACE_REGEX.sub(' ', result['content'])
def __merge_url_result(self, result, position):
result['engines'] = set([result['engine']])
with self._lock:
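
The URL handling that moves from `_normalize_url_result` into `normalize_result_fields` can be exercised on its own. The function name `normalize_url` is an assumption made for this sketch; it only covers the scheme default and not the title / content clean-up.

```python
from __future__ import annotations

import urllib.parse


def normalize_url(url: str) -> tuple[str, urllib.parse.ParseResult]:
    """Parse url and fall back to the http scheme if none is given."""
    parsed_url = urllib.parse.urlparse(url)
    if not parsed_url.scheme:
        # if the URL has no scheme, use http as default
        parsed_url = parsed_url._replace(scheme="http")
        url = parsed_url.geturl()
    return url, parsed_url
```

A protocol-relative URL such as `//example.org/path` comes back as `http://example.org/path`, while a URL that already carries a scheme is returned unchanged.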


@ -1,28 +1,30 @@
# SPDX-License-Identifier: AGPL-3.0-or-later
# pylint: disable=missing-module-docstring, too-few-public-methods
# the public namespace has not yet been finally defined ..
# __all__ = ["EngineRef", "SearchQuery"]
import threading
from copy import copy
from timeit import default_timer
from uuid import uuid4
import flask
from flask import copy_current_request_context
import babel
from searx import settings
from searx.answerers import ask
from searx.external_bang import get_bang_url
from searx.results import ResultContainer
from searx import logger
from searx.plugins import plugins
from searx.search.models import EngineRef, SearchQuery
from searx import settings
import searx.answerers
import searx.plugins
from searx.engines import load_engines
from searx.network import initialize as initialize_network, check_network_configuration
from searx.extended_types import SXNG_Request
from searx.external_bang import get_bang_url
from searx.metrics import initialize as initialize_metrics, counter_inc, histogram_observe_time
from searx.search.processors import PROCESSORS, initialize as initialize_processors
from searx.network import initialize as initialize_network, check_network_configuration
from searx.results import ResultContainer
from searx.search.checker import initialize as initialize_checker
from searx.search.models import SearchQuery
from searx.search.processors import PROCESSORS, initialize as initialize_processors
from .models import EngineRef, SearchQuery
logger = logger.getChild('search')
@ -68,17 +70,10 @@ class Search:
return False
def search_answerers(self):
"""
Check if an answer return a result.
If yes, update self.result_container and return True
"""
answerers_results = ask(self.search_query)
if answerers_results:
for results in answerers_results:
self.result_container.extend('answer', results)
return True
return False
results = searx.answerers.STORAGE.ask(self.search_query.query)
self.result_container.extend(None, results)
return bool(results)
# do search-request
def _get_requests(self):
@ -184,11 +179,11 @@ class Search:
class SearchWithPlugins(Search):
"""Inherit from the Search class, add calls to the plugins."""
__slots__ = 'ordered_plugin_list', 'request'
__slots__ = 'user_plugins', 'request'
def __init__(self, search_query: SearchQuery, ordered_plugin_list, request: flask.Request):
def __init__(self, search_query: SearchQuery, request: SXNG_Request, user_plugins: list[str]):
super().__init__(search_query)
self.ordered_plugin_list = ordered_plugin_list
self.user_plugins = user_plugins
self.result_container.on_result = self._on_result
# pylint: disable=line-too-long
# get the "real" request to use it outside the Flask context.
@ -200,14 +195,14 @@ class SearchWithPlugins(Search):
self.request = request._get_current_object()
def _on_result(self, result):
return plugins.call(self.ordered_plugin_list, 'on_result', self.request, self, result)
return searx.plugins.STORAGE.on_result(self.request, self, result)
def search(self) -> ResultContainer:
if plugins.call(self.ordered_plugin_list, 'pre_search', self.request, self):
if searx.plugins.STORAGE.pre_search(self.request, self):
super().search()
plugins.call(self.ordered_plugin_list, 'post_search', self.request, self)
searx.plugins.STORAGE.post_search(self.request, self)
self.result_container.close()
return self.result_container


@ -158,6 +158,7 @@ class EngineProcessor(ABC):
return None
params = {}
params["query"] = search_query.query
params['category'] = engine_category
params['pageno'] = search_query.pageno
params['safesearch'] = search_query.safesearch


@ -148,14 +148,24 @@ ui:
# URL formatting: pretty, full or host
url_formatting: pretty
# Lock arbitrary settings on the preferences page. To find the ID of the user
# setting you want to lock, check the ID of the form on the page "preferences".
# Lock arbitrary settings on the preferences page.
#
# preferences:
# lock:
# - categories
# - language
# - autocomplete
# - favicon
# - safesearch
# - method
# - doi_resolver
# - locale
# - theme
# - results_on_new_tab
# - infinite_scroll
# - search_on_category_select
# - method
# - image_proxy
# - query_in_title
# searx supports result proxification using an external service:
@ -217,14 +227,15 @@ outgoing:
# - fe80::/126
# External plugin configuration, for more details see
# https://docs.searxng.org/dev/plugins.html
# https://docs.searxng.org/admin/settings/settings_plugins.html
#
# plugins:
# - plugin1
# - plugin2
# - mypackage.mymodule.MyPlugin
# - mypackage.mymodule.MyOtherPlugin
# - ...
# Comment or un-comment plugin to activate / deactivate by default.
# https://docs.searxng.org/admin/settings/settings_plugins.html
#
# enabled_plugins:
# # these plugins are enabled if nothing is configured ..
@ -1159,8 +1170,7 @@ engines:
engine: libretranslate
# https://github.com/LibreTranslate/LibreTranslate?tab=readme-ov-file#mirrors
base_url:
- https://translate.terraprint.co
- https://trans.zillyhuhn.com
- https://libretranslate.com/translate
# api_key: abc123
shortcut: lt
disabled: true


@ -245,8 +245,3 @@ SCHEMA = {
'engines': SettingsValue(list, []),
'doi_resolvers': {},
}
def settings_set_defaults(settings):
apply_schema(settings, SCHEMA, [])
return settings


@ -1,3 +0,0 @@
*
*/
!.gitignore


@ -48,6 +48,14 @@ table {
}
}
div.pref-group {
width: 100%;
font-weight: normal;
padding: 1rem 0.5rem;
.ltr-text-align-left();
background: var(--color-settings-table-group-background);
}
.value {
margin: 0;
padding: 0;


@ -0,0 +1,8 @@
<span>{{ answer.answer }}</span>
{%- if answer.url -%}
<a href="{{ answer.url }}" class="answer-url"
{%- if results_on_new_tab %} target="_blank" rel="noopener noreferrer"
{%- else -%} rel="noreferrer"
{%- endif -%}
>{{ urlparse(answer.url).hostname }}</a>
{% endif -%}


@ -0,0 +1,52 @@
<details class="answer-translations">
<summary>{{ answer.translations[0].text }}</summary>
<dl>
{%- for item in answer.translations -%}
<dt>{{ item.text }}</dt>
<dd>
{%- if item.transliteration -%}
<div class="item-transliteration">{{ item.transliteration }}</div>
{%- endif -%}
{%- if item.examples -%}
<div>{{ _('Examples') }}</div>
<ul>
{%- for i in item.examples -%}
<li>{{ i }}</li>
{%- endfor -%}
</ul>
{%- endif -%}
{%- if item.definitions -%}
<div>{{ _('Definitions') }}</div>
<ul>
{%- for i in item.definitions -%}
<li>{{ i }}</li>
{%- endfor -%}
</ul>
{%- endif -%}
{%- if item.synonyms -%}
<div>{{ _('Synonyms') }}</div>
<ul>
{%- for i in item.synonyms -%}
<li>{{ i }}</li>
{%- endfor -%}
</ul>
{%- endif -%}
</dd>
{%- endfor -%}
</dl>
</details>
{%- if answer.url -%}
<a href="{{ answer.url }}" class="answer-url"
{%- if results_on_new_tab %}
target="_blank" rel="noopener noreferrer"
{%- else -%}
rel="noreferrer"
{%- endif -%}
>{{ answer.engine }}</a>
{%- else -%}
<span class="answer-url">{{ answer.engine }}</span>
{% endif -%}


@ -1,38 +0,0 @@
<div class="answer-translations">
{% for translation in translations %}
{% if loop.index > 1 %}
<hr />
{% endif %}
<h3>{{ translation.text }}</h3>
{% if translation.transliteration %}
<b>translation.transliteration</b>
{% endif %} {% if translation.definitions %}
<dl>
<dt>{{ _('Definitions') }}</dt>
<ul>
{% for definition in translation.definitions %}
<li>{{ definition }}</li>
{% endfor %}
<ul>
</dl>
{% endif %} {% if translation.examples %}
<dl>
<dt>{{ _('Examples') }}</dt>
<ul>
{% for example in translation.examples %}
<li>{{ example }}</li>
{% endfor %}
</ul>
</dl>
{% endif %} {% if translation.synonyms %}
<dl>
<dt>{{ _('Synonyms') }}</dt>
<ul>
{% for synonym in translation.synonyms %}
<li>{{ synonym }}</li>
{% endfor %}
</ul>
</dl>
{% endif %}
{% endfor %}
</div>


@ -20,7 +20,6 @@
{% if get_setting('server.limiter') or get_setting('server.public_instance') %}
<link rel="stylesheet" href="{{ url_for('client_token', token=link_token) }}" type="text/css">
{% endif %}
{% block styles %}{% endblock %}
<!--[if gte IE 9]>-->
<script src="{{ url_for('static', filename='js/searxng.head.min.js') }}" client_settings="{{ client_settings }}"></script>
<!--<![endif]-->

@ -0,0 +1,8 @@
<div id="answers" role="complementary" aria-labelledby="answers-title">
<h4 class="title" id="answers-title">{{ _('Answers') }} : </h4>
{%- for answer in answers -%}
<div class="answer">
{%- include ("simple/" + (answer.template or "answer/legacy.html")) -%}
</div>
{%- endfor -%}
</div>

@ -0,0 +1,19 @@
<div id="corrections" role="complementary" aria-labelledby="corrections-title">
<h4 id="corrections-title">{{ _('Try searching for:') }}</h4>
{% for correction in corrections %}
<div class="left">
<form method="{{ method or 'POST' }}" action="{{ url_for('search') }}" role="navigation">
{% for category in selected_categories %}
<input type="hidden" name="category_{{ category }}" value="1">
{% endfor %}
<input type="hidden" name="q" value="{{ correction.url }}">
<input type="hidden" name="language" value="{{ current_language }}">
<input type="hidden" name="time_range" value="{{ time_range }}">
<input type="hidden" name="safesearch" value="{{ safesearch }}">
<input type="hidden" name="theme" value="{{ theme }}">
{% if timeout_limit %}<input type="hidden" name="timeout_limit" value="{{ timeout_limit }}" >{% endif %}
<input type="submit" role="link" value="{{ correction.title }}">
</form>
</div>
{% endfor %}
</div>

@ -37,19 +37,19 @@
{%- endmacro -%}
{%- macro plugin_preferences(section) -%}
{%- for plugin in plugins -%}
{%- if plugin.preference_section == section -%}
<fieldset>{{- '' -}}
<legend>{{ _(plugin.name) }}</legend>{{- '' -}}
<div class="value">
{{- checkbox_onoff_reversed('plugin_' + plugin.id, plugin.id not in allowed_plugins, 'plugin_labelledby' + plugin.id) -}}
</div>{{- '' -}}
<div class="description" id="{{ 'plugin_labelledby' + plugin.id }}">
{{- _(plugin.description) -}}
</div>{{- '' -}}
</fieldset>
{%- endif -%}
{%- endfor -%}
{%- for plugin in plugins_storage -%}
{%- if plugin.preference_section == section -%}
<fieldset>{{- '' -}}
<legend>{{ _(plugin.name) }}</legend>{{- '' -}}
<div class="value">
{{- checkbox_onoff_reversed('plugin_' + plugin.id, plugin.id not in allowed_plugins, 'plugin_labelledby' + plugin.id) -}}
</div>{{- '' -}}
<div class="description" id="{{ 'plugin_labelledby' + plugin.id }}">
{{- _(plugin.description) -}}
</div>{{- '' -}}
</fieldset>
{%- endif -%}
{%- endfor -%}
{%- endmacro -%}
{%- macro engine_about(search_engine) -%}
@ -158,6 +158,7 @@
<form id="search_form" method="post" action="{{ url_for('preferences') }}" autocomplete="off">
{{- tabs_open() -}}
{# tab: general #}
{{- tab_header('maintab', 'general', _('General'), True) -}}
{%- if 'categories' not in locked_preferences -%}
@ -179,13 +180,16 @@
{% if 'safesearch' not in locked_preferences %}
{%- include 'simple/preferences/safesearch.html' -%}
{%- endif -%}
{%- include 'simple/preferences/tokens.html' -%}
{{- plugin_preferences('general') -}}
{%- if 'doi_resolver' not in locked_preferences %}
{%- include 'simple/preferences/doi_resolver.html' -%}
{%- endif -%}
{%- include 'simple/preferences/tokens.html' -%}
{{- tab_footer() -}}
{# tab: ui #}
{{- tab_header('maintab', 'ui', _('User interface')) -}}
{%- if 'locale' not in locked_preferences -%}
@ -208,6 +212,7 @@
{{- plugin_preferences('ui') -}}
{{- tab_footer() -}}
{# tab: privacy #}
{{- tab_header('maintab', 'privacy', _('Privacy')) -}}
{%- if 'method' not in locked_preferences -%}
@ -222,6 +227,8 @@
{{- plugin_preferences('privacy') -}}
{{- tab_footer() -}}
{# tab: enignes #}
{{- tab_header('maintab', 'engines', _('Engines')) -}}
<p>
{{- _('Currently used search engines') -}}
@ -231,18 +238,23 @@
{{- tabs_close() -}}
{{- tab_footer() -}}
{# tab: query #}
{{- tab_header('maintab', 'query', _('Special Queries')) -}}
{%- if answerers -%}
{%- if answer_storage -%}
{%- include 'simple/preferences/answerers.html' -%}
{%- endif -%}
{{- tab_footer() -}}
{# tab: cookies #}
{{- tab_header('maintab', 'cookies', _('Cookies')) -}}
{%- include 'simple/preferences/cookies.html' -%}
{{- tab_footer() -}}
{{- tabs_close() -}}
{# footer with save & reset buttons #}
{%- include 'simple/preferences/footer.html' -%}
</form>{{- '' -}}

Some files were not shown because too many files have changed in this diff.