200 commits

Author SHA1 Message Date
Mouse Reeve
d149e57494 Split expand book data task into per-edition tasks
Loading every edition in one task takes ages, and produces a large task
that clogs up the queue. This will create more, smaller tasks that will
finish more quickly.
2022-05-31 12:41:57 -07:00
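
The split described in this commit can be sketched roughly as follows. This is a minimal illustration, not the project's code: `load_edition`, `queue_edition_tasks`, and the `enqueue` callback are hypothetical names, and a plain list stands in for the real task queue.

```python
from typing import Callable, Iterable


def load_edition(key: str) -> str:
    """Hypothetical per-edition job; stands in for the real loading work."""
    return f"loaded {key}"


def queue_edition_tasks(
    edition_keys: Iterable[str],
    enqueue: Callable[[Callable[[str], str], str], None],
) -> int:
    """Queue one small task per edition instead of one task for all editions.

    `enqueue` is an assumed callback that hands a job to the task queue.
    Returns the number of tasks that were queued.
    """
    count = 0
    for key in edition_keys:
        enqueue(load_edition, key)
        count += 1
    return count


# Usage: a list plays the role of the queue.
queue = []
queued = queue_edition_tasks(["ed1", "ed2", "ed3"], lambda job, key: queue.append((job, key)))
outputs = [job(key) for job, key in queue]
```

Each queued item is small and independent, so a worker can finish it quickly and move on rather than holding the queue for the whole work.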
Mouse Reeve
c3b35760a2 Updates test mocks for remote search 2022-05-31 09:37:54 -07:00
Mouse Reeve
969db13ff2 Safely return None in remote search return_first 2022-05-31 08:49:23 -07:00
Mouse Reeve
a053f20961 Re-implements return first option
Since we get all the results quickly now, this aggregates all the
results that came back and sorts them by confidence, and returns the
highest confidence result. The confidences aren't great on free text
search, but conceptually that's how it should work at least.

It may make sense to aggregate the search results in all contexts, but
I'll propose that in a separate PR.
2022-05-31 08:20:59 -07:00
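
A minimal sketch of the behaviour this commit describes, assuming each connector returns a list of results that carry a numeric confidence. The `SearchResult` fields and the `first_result` helper are hypothetical, not the project's actual names. An empty result set returns None, as in the follow-up commit 969db13ff2.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass
class SearchResult:
    """Hypothetical result shape: a title plus a confidence score."""
    title: str
    confidence: float


def first_result(result_sets: Iterable[List[SearchResult]]) -> Optional[SearchResult]:
    """Pool results from every connector and return the most confident one."""
    pooled = [result for results in result_sets for result in results]
    if not pooled:
        # Nothing came back from any connector
        return None
    # Sort by confidence, highest first, and take the top entry
    ranked = sorted(pooled, key=lambda result: result.confidence, reverse=True)
    return ranked[0]


best = first_result([
    [SearchResult("A", 0.4), SearchResult("B", 0.9)],
    [SearchResult("C", 0.7)],
])
```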
Mouse Reeve
98ed03b6b4 Python formatting and test update 2022-05-30 17:00:34 -07:00
Mouse Reeve
83ee5a756f Filter Inventaire results by confidence 2022-05-30 16:42:37 -07:00
Mouse Reeve
af19d728d2 Removes outdated unit tests 2022-05-30 16:16:10 -07:00
Mouse Reeve
87fe984462 Combines search formatter and parser function
The parser was extracting the list of search results from the json
object returned by the search endpoint, and the formatter was converting
an individual json entry into a SearchResult object. This just merged
them into one function, because they are never used separately.
2022-05-30 12:52:31 -07:00
Mouse Reeve
525e2a591d More error handling
Adds logging and error handling for some of the numerous ways a request
could fail (the remote site is down, the url is blocked, etc).

I also have the results boxes open by default, which makes it more
legible imo.
2022-05-30 12:40:13 -07:00
Mouse Reeve
45f2199c71 Gather and wait on async requests
This sends out the request tasks all at once and then aggregates the
results, instead of just running them one after another asynchronously.
2022-05-30 12:05:22 -07:00
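
The difference between awaiting requests one by one and gathering them can be sketched with the standard library alone. This is an illustration under assumptions: `fetch_results` is a hypothetical stand-in that sleeps instead of making an HTTP request, and the connector names are made up.

```python
import asyncio


async def fetch_results(name: str, delay: float) -> dict:
    """Hypothetical stand-in for one remote search request."""
    await asyncio.sleep(delay)
    return {"connector": name, "results": [f"{name}-result"]}


async def search_all(connectors):
    """Start every request at once, then wait for all of them to finish."""
    requests = [fetch_results(name, delay) for name, delay in connectors]
    # gather schedules the coroutines concurrently and returns their
    # results in the same order as the inputs
    return await asyncio.gather(*requests)


def run_search(connectors):
    """Synchronous entry point for the async search."""
    return asyncio.run(search_all(connectors))


responses = run_search([("first", 0.02), ("second", 0.01)])
```

With a loop that awaits each request in turn, the total wait is the sum of the delays; with `asyncio.gather` it is roughly the longest single delay.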
Mouse Reeve
5e81ec75fb Set request headers in async search get request
Gotta ask for json
2022-05-30 11:19:16 -07:00
Mouse Reeve
9a9cef7766 Verify url before async search
The database lookup doesn't work during the async process, so this change
loops through the connectors and grabs the formatted urls before sending
them to the async handler.
2022-05-30 11:16:05 -07:00
Mouse Reeve
0adda36da7 Remove search endpoints from Connector
Instead of having individual search functions that make individual
requests, the connectors will always be searched asynchronously
together. The process_search_response combines the parse and format
functions, which could probably be merged into one overridable
function.

The current to-do on this is to remove Inventaire search results that
are below the confidence threshold after search, which used to happen
in the `search` function.
2022-05-30 10:37:24 -07:00
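
The to-do above amounts to a post-search filter. A minimal sketch, assuming results are dictionaries with a numeric "confidence" key; the `filter_by_confidence` name and the 0.5 default threshold are hypothetical and not taken from the project.

```python
def filter_by_confidence(results, min_confidence=0.5):
    """Drop results whose confidence is below the threshold.

    The 0.5 default is an assumed value for illustration only.
    Results without a confidence value are treated as 0.
    """
    return [
        result for result in results
        if result.get("confidence", 0) >= min_confidence
    ]


kept = filter_by_confidence([
    {"title": "A", "confidence": 0.9},
    {"title": "B", "confidence": 0.2},
    {"title": "C"},
])
```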
Mouse Reeve
9c03bf782e Make an async request to all search connectors
This is the untested first pass at re-arranging remote search to work in
parallel rather than in sequence. It moves a couple functions around
(raise_not_valid_url, for example, needs to be in connector_manager.py
now to avoid circular imports). It adds a function to Connector objects
that generates a search result (either to the isbn endpoint or the free
text endpoint) based on the query, which was previously done as part of
the search.

I also lowered the timeout to 8 seconds by default.
2022-05-30 10:15:22 -07:00
Mouse Reeve
72d6a4ce52 Log info, not exception, for expected errors 2022-03-11 14:55:54 -08:00
Mouse Reeve
39691bed3a Merge branch 'main' into openlibrary-author-fields 2022-02-16 18:06:04 -08:00
Mouse Reeve
3e635f497e Adds some simple url validation 2022-02-03 15:11:01 -08:00
Mouse Reeve
194c69f512 Fixes return values of null responses 2022-02-02 07:09:35 -08:00
Mouse Reeve
754e24812b Check image extensions before saving 2022-02-01 21:18:25 -08:00
Mouse Reeve
9611815b44 Extract wikipedia and inventaire ids 2022-01-30 12:02:18 -08:00
Mouse Reeve
44dad43f36 Load new fields via connector 2022-01-30 11:41:33 -08:00
Mouse Reeve
b18c69e186 Make search timeouts configurable 2022-01-07 07:42:05 -08:00
Mouse Reeve
3545085a7d Fixes tests 2021-12-14 14:19:27 -08:00
Mouse Reeve
09f5218f9c Fixes accept header 2021-12-14 13:47:09 -08:00
Mouse Reeve
6e61e4d52c
Merge pull request #1578 from bookwyrm-social/improve-compatibility
Improve federation compatibility with Hubzilla and Zap
2021-12-09 11:06:04 -08:00
Mouse Reeve
02313f40b8 Adds update from inventaire link for books 2021-12-05 13:48:05 -08:00
Mouse Reeve
071da7d4fb Handle various link generation needs 2021-12-05 13:38:15 -08:00
Mouse Reeve
4085714764 Update openlibrary author with ISNI 2021-12-05 13:26:22 -08:00
Mouse Reeve
d7e4e6aa1e Adds openlibrary update for book 2021-12-05 13:02:42 -08:00
Mouse Reeve
b824841cb3 Adds update logic to connectors 2021-12-05 12:47:27 -08:00
Mouse Reeve
6dd7eebd98 Fixes tests 2021-11-16 10:16:28 -08:00
Mouse Reeve
d3e4c7e8d9 Removes change to boolean logic 2021-10-27 10:40:37 -07:00
Mouse Reeve
07446fa7d2 Adds more tests for the inventaire connector 2021-10-27 10:03:09 -07:00
Mouse Reeve
8ba875af4a Improve federation compatibility with Hubzilla and Zap
Co-authored-by: hubzilla <redmatrix@users.noreply.github.com>
Fixes #1564
2021-10-26 14:41:06 -07:00
Mouse Reeve
1033d3d045 Updates connector tests 2021-09-30 11:33:04 -07:00
Mouse Reeve
5dd2aac600 Merge branch 'main' into search-refactor 2021-09-30 10:41:30 -07:00
Mouse Reeve
d36ef2bcf1 Pylint change 2021-09-29 12:42:28 -07:00
Mouse Reeve
32391dd64d Python formatting 2021-09-29 12:38:31 -07:00
Mouse Reeve
0aef011258 Don't use the format detail if it maps directly 2021-09-29 12:29:17 -07:00
Mouse Reeve
123b23728f Infer format in openlibrary import 2021-09-29 12:21:19 -07:00
Mouse Reeve
08f6a97653 Python formatting 2021-09-18 11:33:43 -07:00
Mouse Reeve
acfb1bb376 Updating string format syntax part 2 2021-09-18 11:32:00 -07:00
Mouse Reeve
18591c7b56 Fixes circular import 2021-09-16 11:30:04 -07:00
Mouse Reeve
fbe05623ff Updates first_search_result functionality 2021-09-16 11:07:36 -07:00
Mouse Reeve
1f06d1a1d8 Removes local connector 2021-09-14 15:26:36 -07:00
Mouse Reeve
aa91361fe4 Fixes celery kwarg for queue 2021-09-07 17:09:44 -07:00
Mouse Reeve
de3f18655c Set priorities on tasks 2021-09-07 16:33:43 -07:00
Mouse Reeve
332a712d84 Safely handle work with no editions error 2021-08-23 15:59:58 -07:00
Mouse Reeve
ad0fff7030 Prevent overwriting data on import from outside data source 2021-08-17 10:08:07 -07:00
Mouse Reeve
55d84d50ee Fixes loading editions from inventaire 2021-08-08 15:55:49 -07:00