115 commits

Author SHA1 Message Date
Alexandre Flament
eaa694fb7d [enh] replace requests by httpx 2021-04-10 15:38:33 +02:00
Alexandre Flament
ca93a01844 [mod] dynamically set language_support variable
The language_support variable is set to True by default,
and set to False in only 5 engines.

Except the documentation and the /config URL, this variable is not used.

This commit removes the variable definition from the engines and
sets the value according to the length of supported_languages: False when the
length is 0, True otherwise.

Close #2485
2021-02-01 17:10:37 +01:00
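A minimal sketch of the rule described in the commit above, not the code from the commit itself. Only the attribute names `supported_languages` and `language_support` come from the message; the `Engine` class and `set_language_support` helper are hypothetical stand-ins.

```python
# Hypothetical stand-in for an engine module; only the attribute names
# `supported_languages` and `language_support` appear in the commit message.
class Engine:
    def __init__(self, name, supported_languages=()):
        self.name = name
        self.supported_languages = list(supported_languages)


def set_language_support(engine):
    """Derive language_support from the length of supported_languages."""
    # False when the length is 0, True otherwise.
    engine.language_support = len(engine.supported_languages) > 0
    return engine


google = set_language_support(Engine("google", ["en", "fr", "de"]))
no_lang = set_language_support(Engine("example_without_languages"))
```

Deriving the flag in one place means an engine cannot declare language support while listing no languages.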
Markus Heiser
7f505bdc6f [fix] google: avoid unnecessary SearxEngineXPathException errors
Avoid SearxEngineXPathException errors when parsing invalid results::

    .//div[@class="yuRUbf"]//a/@href index 0 not found
    Traceback (most recent call last):
      File "./searx/engines/google.py", line 274, in response
        url = eval_xpath_getindex(result, href_xpath, 0)
      File "./searx/searx/utils.py", line 608, in eval_xpath_getindex
        raise SearxEngineXPathException(xpath_spec, 'index ' + str(index) + ' not found')
    searx.exceptions.SearxEngineXPathException: .//div[@class="yuRUbf"]//a/@href index 0 not found

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-01-28 10:08:50 +01:00
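One way to avoid this kind of error is to give the index lookup a default value and skip results that lack a link. The sketch below works on plain lists rather than XPath results; `get_index`, `XPathIndexError` and `collect_urls` are hypothetical names that only imitate the behaviour shown in the traceback, and this is an assumption about the approach, not the commit's actual diff.

```python
_NOTSET = object()  # sentinel: distinguishes "no default given" from default=None


class XPathIndexError(Exception):
    """Hypothetical stand-in for SearxEngineXPathException."""


def get_index(matches, index, default=_NOTSET):
    """Return matches[index]; raise unless a default is supplied."""
    if -len(matches) <= index < len(matches):
        return matches[index]
    if default is _NOTSET:
        raise XPathIndexError('index ' + str(index) + ' not found')
    return default


def collect_urls(results):
    """Keep only the results that actually contain a link."""
    urls = []
    for hrefs in results:
        url = get_index(hrefs, 0, None)  # None instead of an exception
        if url is None:
            continue  # invalid result: skip it silently
        urls.append(url)
    return urls


urls = collect_urls([["https://example.org/a"], [], ["https://example.org/b"]])
```

Callers that still want a hard failure simply omit the default.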
Markus Heiser
b1fefec40d [fix] normalize the language & region aspects of all google engines
BTW: make the engines ready for search.checker:

- replace eval_xpath by eval_xpath_getindex and eval_xpath_list
- google_images: remove outer try/except block

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-01-28 10:08:46 +01:00
Markus Heiser
baec54c492 [fix] revise the google-news engine
This revision is based on the methods developed in the revision of the google engine
(see commit 410c2f9).

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2021-01-22 18:49:45 +01:00
Alexandre Flament
a4dcfa025c [enh] engines: add about variable
Move meta information from comments to the about variable
so that the preferences and the documentation can show this information.
2021-01-14 20:57:17 +01:00
Alexandre Flament
64cccae99e [mod] various engines: use eval_xpath* functions and searx.exceptions.*
Engine list: ahmia, duckduckgo_images, elasticsearch, google, google_images, google_videos, youtube_api
2020-12-03 10:22:48 +01:00
Alexandre Flament
2006eb4680 [mod] move extract_text, extract_url to searx.utils 2020-10-02 18:13:56 +02:00
Markus Heiser
8162d7aff4 [fix] google engine - div classes have been renamed in HTML result
As of 1 October 2020, Google has changed the 'class' attribute in the HTML
result page.

Fix the xpath expressions and ignore <div class="g" ../> sections which do not
match the title's xpath expression.

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2020-10-01 09:44:29 +02:00
Marc Abonce Seguin
ecf5899153 fetch google's search langs rather than ui langs 2020-09-22 11:37:44 +02:00
Dalf
1022228d95 Drop Python 2 (1/n): remove unicode string and url_utils 2020-09-10 10:39:04 +02:00
Adam Tauber
52eba0c721 [fix] pep8 2020-07-08 00:46:03 +02:00
Markus Heiser
410c2f903d [fix] revise google engine
this commit is picked from #1985
2020-07-07 21:50:59 +02:00
Marc Abonce Seguin
ccaf6ca02c [fix] update xpaths for new google results page 2019-12-07 16:37:24 -07:00
Adam Tauber
731e34299d
Merge pull request #1744 from dalf/optimizations
[mod] speed optimization
2019-12-02 13:39:58 +00:00
Emilien Devos
8f51430f5c [fix] Force Google old UI with a new user agent 2019-11-22 23:01:41 +01:00
Dalf
85b3723345 [mod] speed optimization
compile XPath only once
avoid redundant call to urlparse
get_locale(webapp.py): avoid useless call to request.accept_languages.best_match
2019-11-15 09:33:15 +01:00
Emilien Devos
cbd1ebdce8 [fix] Force Google old UI (#1597) 2019-05-29 10:05:57 +09:00
Noémi Ványi
b63d645a52 Revert "remove 'all' option from search languages"
This reverts commit 4d1770398a.
2019-01-07 21:19:00 +01:00
Marc Abonce Seguin
0169b63e84 [fix] fetch google's supported languages 2019-01-06 21:31:45 -06:00
Marc Abonce Seguin
5568f24d6c [fix] check language aliases when setting search language 2019-01-06 20:31:57 -06:00
Marc Abonce Seguin
f7f9c50393 [fix] force English results in Google when using en-US 2018-04-18 23:29:48 -05:00
Marc Abonce Seguin
772c048d01 refactor engine's search language handling
Add match_language function in utils to match any user given
language code with a list of engine's supported languages.

Also add language_aliases dict on each engine to translate
standard language codes into the custom codes used by the engine.
2018-03-27 00:08:03 -06:00
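The matching idea described in the commit above can be sketched as follows. This is a simplified illustration under stated assumptions, not the `match_language` implementation in searx: the signature, the fallback value, the matching order and the example alias (`nb` to `no`) are all hypothetical.

```python
def match_language(code, supported, aliases=None, fallback="en-US"):
    """Map a user-given language code onto one the engine supports."""
    aliases = aliases or {}
    # 1. Exact match, directly or through the engine's alias table.
    if code in supported:
        return code
    if aliases.get(code) in supported:
        return aliases[code]
    # 2. Match on the bare language part ("fr-CA" -> "fr").
    lang = code.split("-")[0]
    if lang in supported:
        return lang
    if aliases.get(lang) in supported:
        return aliases[lang]
    # 3. Any regional variant of the same language ("pt-PT" -> "pt-BR").
    for candidate in supported:
        if candidate.split("-")[0] == lang:
            return candidate
    return fallback


supported = ["en", "fr", "no", "pt-BR"]
language_aliases = {"nb": "no"}  # standard code -> engine-specific code
```

For example, `match_language("nb-NO", supported, language_aliases)` resolves through the alias table, while an unknown code falls back to the default.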
Marc Abonce Seguin
d1eae9359f fix fetch_languages to be more accurate
Add languages supported by either all default general engines or 10 engines.
2018-03-20 17:58:20 -06:00
Noémi Ványi
2d5eed9b59 send constant cookie with query to Google 2017-12-18 21:38:52 +01:00
marc
4d1770398a remove 'all' option from search languages 2017-12-06 01:20:15 -06:00
Adam Tauber
1613c6319e [fix] handle /sorry redirects 2017-12-05 20:38:34 +01:00
Adam Tauber
6eb9503896 [fix] use english in google engine if no language was set - this prevents guessing the language by the IP of the instance 2017-11-22 22:56:47 +01:00
Adam Tauber
6fdb6640d9 [fix] revert language changes to prevent CAPTCHAs 2017-11-22 22:50:48 +01:00
Adam Tauber
9ab8536479 [fix] fix language support of google 2017-11-21 16:28:53 +01:00
Adam Tauber
52e615dede [enh] py3 compatibility 2017-05-15 12:02:30 +02:00
Adam Tauber
52d1087202 [enh] add result number parsing to google engine 2017-01-27 00:18:46 +01:00
David A Roberts
1d30141c20 [enh] show spelling corrections 2017-01-16 13:31:16 +10:00
Adam Tauber
0d4da30c7f [enh] add instant answers to google engine 2017-01-05 17:20:12 +01:00
marc
af35eee10b tests for _fetch_supported_languages in engines
and refactor method to make it testable without making requests
2016-12-15 00:40:21 -06:00
marc
f62ce21f50 [mod] fetch supported languages for several engines
utils/fetch_languages.py gets languages supported by each engine and
generates engines_languages.json with each engine's supported language.
2016-12-13 19:58:10 -06:00
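A rough sketch of the generation step described in the commit above, assuming each engine exposes a callable that returns its language codes. The fetcher functions and their return values are invented for illustration, and the real utils/fetch_languages.py may be organised differently.

```python
import json


def collect_supported_languages(fetchers):
    """Call each engine's fetcher and gather the results by engine name."""
    return {name: sorted(fetch()) for name, fetch in fetchers.items()}


# Hypothetical fetchers; real ones would query each engine over HTTP.
fetchers = {
    "google": lambda: ["fr", "en", "de"],
    "bing": lambda: ["en-US", "en-GB"],
}

engines_languages = collect_supported_languages(fetchers)

# The serialized text is what would be written to engines_languages.json.
payload = json.dumps(engines_languages, indent=2, sort_keys=True)
```

Sorting the keys and the language lists keeps the generated file stable between runs, which keeps diffs small when it is regenerated.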
marc
c677aee58a filter languages 2016-12-13 19:32:00 -06:00
marc
149802c569 [enh] add supported_languages on engines and auto-generate languages.py 2016-12-13 19:32:00 -06:00
Noémi Ványi
c59c76e6ee add year to time range to engines which support "Last year"
Engines:
 * Bing images
 * Flickr (noapi)
 * Google
 * Google Images
 * Google News
2016-12-11 16:58:31 +01:00
Adam Tauber
16bdc0baf4 [mod] do not escape html content in engines 2016-12-09 18:59:19 +01:00
Adam Tauber
350a84520d [fix] time range detection 2016-07-26 00:28:48 +02:00
Noemi Vanyi
2e5839503f add time range search for google 2016-07-25 23:28:14 +02:00
stepshal
b3ab221b98 Fix anomalous backslash in string 2016-07-11 23:53:13 +07:00
Adam Tauber
85c0351dca Merge pull request #526 from ukwt/anime
Add a few search engines
2016-04-14 10:59:31 +02:00
Kirill Isakov
90c51cb449 Fix a few typos in Google search engine 2016-04-13 23:04:53 +06:00
Adam Tauber
6d55642ab4 [fix] no more redirect ++ explicitly specify search language to avoid Google's IP-based heuristics 2016-03-25 18:38:02 +01:00
Adam Tauber
09b7673fbd [fix] temporarily disable Google's inner links - #491 2016-01-18 13:10:21 +01:00
Adam Tauber
66f48c2bf5 [fix] google markup change - closes #489 2016-01-10 18:49:50 +01:00
Adam Tauber
5cea4f9445 [fix] prevent google engine from redirecting
nid/pref cookies are also removed
2015-12-22 20:05:42 +01:00
Adam Tauber
d8f8bdc951 [fix] quickfix for sometimes missing PREF cookie 2015-12-15 09:48:38 +01:00
Adam Tauber
5d49c15f79 [fix] google engine - ignore new useless result type 2015-10-29 12:47:12 +01:00
Adam Tauber
0ad272c5cb [fix] content escaping - closes #441
TODO check other engines too
2015-09-30 16:42:03 +02:00
Dalf
fc0ae0f907 google engine: code cleanup 2015-06-06 00:18:00 +02:00
Dalf
72c8de35a2 google engine: remove OSM map 2015-06-05 23:56:23 +02:00
Alexandre Flament
b8fc531b60 [enh] google engine : parse map links and more 2015-06-05 11:23:24 +02:00
Alexandre Flament
39ff21237c [enh] google engine: avoid some "sorry google" by adding another cookie: NID. This cookie is specific to each hostname.
This allows requests to be sent to google.* (according to the search language).
Before this commit, requests in languages other than English were sent to www.google.com, which redirected them to www.google.*
The PREF cookie is still used on the www.google.com domain.
2015-05-30 17:41:40 +02:00
Alexandre Flament
8a69ade875 Revert of #195 when the search language is not english
Sometimes there are two requests to Google (depending on the source IP): one to google.com, the second to google.fr (for instance).

Going to https://www.google.com/ncr and saving the PREF cookie for future use prevents this (there is no redirection).

But recently (or not?), doing this makes the search return English results even if Accept-Language is specified.

There is still a way to prevent this: go to the preferences and set the search language. I don't know if this can be done by searx.

For now, a quick fix is to disable the use of the PREF cookie when the search language is not English (the google engine will be slower but returns the expected results).
2015-05-01 21:20:09 +02:00
dalf
0a83be0ec9 [fix] google engine: depending on the IP of the searx instance, each searx request was making two HTTP requests (see https://support.google.com/websearch/answer/873?hl=en ) 2015-01-22 11:40:28 +01:00
Adam Tauber
0f4cb32bf1 [mod] image results removed from google engine 2014-12-09 00:53:09 +01:00
Adam Tauber
611f4e2a86 [fix] pep8 2014-12-05 20:03:16 +01:00
Dalf
5dc3eb3399 [fix] rewrite the google engine since Google Web Search API is about to expire 2014-09-14 14:40:55 +02:00
Thomas Pointhuber
144f89bf78 add comments to google-engines 2014-09-01 15:10:05 +02:00
asciimoo
2a788c8f29 [enh] search language support init 2014-01-31 04:35:23 +01:00
asciimoo
ca271fd861 [enh] bing, google paging support 2014-01-29 21:14:38 +01:00
asciimoo
3207a396bd [enh] google engine added 2014-01-29 19:28:38 +01:00