searxng/searx/plugins/tracker_url_remover.py
Markus Heiser 542f7d0d7b [mod] pylint all files with one profile / drop PYLINT_SEARXNG_DISABLE_OPTION
In the past, some files were tested with the standard profile, others with a
profile in which most of the messages were switched off ... some files were not
checked at all.

- ``PYLINT_SEARXNG_DISABLE_OPTION`` has been abolished
- the distinction ``# lint: pylint`` is no longer necessary
- the pylint tasks have been reduced from three to two

  1. ./searx/engines -> lint engines with additional builtins
  2. ./searx ./searxng_extra ./tests -> lint all other python files

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2024-03-11 14:55:38 +01:00

# SPDX-License-Identifier: AGPL-3.0-or-later
# pylint: disable=missing-module-docstring

import re
from urllib.parse import urlunparse, parse_qsl, urlencode
from flask_babel import gettext

# Patterns tested with re.match(), i.e. anchored at the start of each query parameter name.
regexes = {
    re.compile(r'utm_[^&]+'),
    re.compile(r'(wkey|wemail)[^&]*'),
    re.compile(r'(_hsenc|_hsmi|hsCtaTracking|__hssc|__hstc|__hsfp)[^&]*'),
    re.compile(r'&$'),
}

name = gettext('Tracker URL remover')
description = gettext('Remove trackers arguments from the returned URL')
default_on = True
preference_section = 'privacy'


def on_result(_request, _search, result):
    if 'parsed_url' not in result:
        return True

    query = result['parsed_url'].query

    if query == "":
        return True
    parsed_query = parse_qsl(query)

    # Iterate over a copy of the list; `changes` counts the items already popped,
    # so `i - changes` is the current index in the list being modified.
    changes = 0
    for i, (param_name, _) in enumerate(list(parsed_query)):
        for reg in regexes:
            if reg.match(param_name):
                parsed_query.pop(i - changes)
                changes += 1
                result['parsed_url'] = result['parsed_url']._replace(query=urlencode(parsed_query))
                result['url'] = urlunparse(result['parsed_url'])
                break

    return True
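
The filtering step in `on_result` can be tried outside SearXNG with a small standalone sketch. The helper name `strip_tracking_params` and the constant `TRACKER_PATTERNS` are hypothetical and not part of the plugin. The patterns are copied from the plugin's `regexes` set so that the block needs only the standard library. It builds a filtered list instead of popping by index, which is a different technique from the plugin's loop but should remove the same parameters.

```python
import re
from urllib.parse import parse_qsl, urlencode, urlparse, urlunparse

# Copied from the plugin's `regexes` set so the sketch runs without SearXNG or flask_babel.
TRACKER_PATTERNS = (
    re.compile(r'utm_[^&]+'),
    re.compile(r'(wkey|wemail)[^&]*'),
    re.compile(r'(_hsenc|_hsmi|hsCtaTracking|__hssc|__hstc|__hsfp)[^&]*'),
    re.compile(r'&$'),
)


def strip_tracking_params(url):
    """Hypothetical helper: return `url` without tracker query parameters."""
    parsed = urlparse(url)
    if parsed.query == "":
        return url
    # Keep every (name, value) pair whose name matches none of the patterns.
    kept = [
        (key, value)
        for key, value in parse_qsl(parsed.query)
        if not any(pattern.match(key) for pattern in TRACKER_PATTERNS)
    ]
    return urlunparse(parsed._replace(query=urlencode(kept)))
```

For example, `strip_tracking_params("https://example.com/page?utm_source=news&id=5&wkey=abc")` returns `"https://example.com/page?id=5"`. Unlike the plugin, which rewrites `result['url']` only when a parameter matched, this sketch re-encodes the query string of every URL that has one.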