[mod] hardening xpath engine: ignore empty results

A SearXNG maintainer on Matrix reported a traceback::

    File "searxng-src/searx/engines/xpath.py", line 272, in response
      dom = html.fromstring(resp.text)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^
    File "searx-pyenv/lib/python3.11/site-packages/lxml/html/__init__.py", line 850, in fromstring
      doc = document_fromstring(html, parser=parser, base_url=base_url, **kw)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    File "searx-pyenv/lib/python3.11/site-packages/lxml/html/__init__.py", line 738, in document_fromstring
      raise etree.ParserError(
    lxml.etree.ParserError: Document is empty

I don't have an example to reproduce the issue, but the issue and this patch are
clearly recognizable even without an example.

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
This commit is contained in:
Markus Heiser 2024-11-29 12:24:25 +01:00 committed by Markus Heiser
parent cf034488d5
commit 540323a4b0

View file

@ -269,6 +269,10 @@ def response(resp): # pylint: disable=too-many-branches
raise_for_httperror(resp)
results = []
if not resp.text:
return results
dom = html.fromstring(resp.text)
is_onion = 'onions' in categories