238 commits

Author SHA1 Message Date
Bart Schuurmans 609bc15406 Support http:// protocol in BookWyrm connector 2024-04-24 15:30:47 +02:00
Bart Schuurmans 3aefbb548e Allow serving BookWyrm on a non-standard port 2024-04-24 15:30:47 +02:00
Margaret Fero 91fe4ad535 Fix spacing for linter 2024-03-02 17:31:16 -08:00
Margaret Fero 9fa09d5ebe Add extra space required by linter 2024-03-02 17:30:37 -08:00
Margaret Fero 39da471f79 Disable Pylint Failure for imghdr deprecation for now 2024-03-02 15:59:17 -08:00
Joeri de Ruiter a901014e48 Change import of clean 2023-08-02 19:37:52 +02:00
Joeri de Ruiter ae5c27f3bb Sanitise description from Open Library 2023-08-02 19:30:40 +02:00
Joeri de Ruiter f4a4b59a14 Merge branch 'main' into markdown-import 2023-08-02 19:19:07 +02:00
Joeri de Ruiter 1a733746f2 Only remove surrounding p tags if there are no other p tags 2023-08-01 12:17:57 +02:00
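A minimal sketch of the rule named in the commit above, not the project's actual code: the wrapping paragraph tags are dropped only when the text holds a single paragraph. The function name `strip_outer_paragraph` and the plain string matching are assumptions made for illustration.

```python
def strip_outer_paragraph(html: str) -> str:
    """Remove a wrapping <p>...</p> only if it is the sole paragraph.

    Hypothetical helper: simple string checks stand in for whatever
    HTML handling the real importer uses.
    """
    text = html.strip()
    is_wrapped = text.startswith("<p>") and text.endswith("</p>")
    # More than one opening tag means the description has several
    # paragraphs, so the markup has to stay in place.
    if is_wrapped and text.count("<p>") == 1:
        return text[len("<p>"):-len("</p>")]
    return text
```

With this sketch a single-paragraph description loses its wrapper, while a multi-paragraph description comes back unchanged.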
Joeri de Ruiter 1a215e9b9e Convert description from Markdown to HTML when importing from Open Library 2023-08-01 11:45:46 +02:00
Joeri de Ruiter 2920973961 Some small improvements to annotations 2023-07-28 20:54:03 +02:00
Joeri de Ruiter f07d7b02f1 Type annotations and related changes for bookwyrm.connectors 2023-07-28 17:43:32 +02:00
Wesley Aptekar-Cassels 3e78e398c0 Switch from priority queues to function-based queues
Fixes: #2907
2023-07-20 12:25:30 -04:00
Mouse Reeve cbb027c56c Merge pull request #2778 from ranok/upstream_pr
Move the search request logic into the AbstractConnector
2023-04-25 16:20:24 -07:00
Jacob Torrey 84834eb5d3 Run bw-dev black to fix formatting
Signed-off-by: Jacob Torrey <jacob@jacobtorrey.com>
2023-04-17 15:06:41 +00:00
Wesley Aptekar-Cassels 1048638e30 Stop ignoring task results
This is essentially a revert of 9cbff312a. The commit was at the advice
of the Celery docs for optimization, but I've since decided that the
downsides in terms of making things harder to debug (it makes Flower
nearly useless, for instance) are bigger than the upsides in performance
gain (which seem extremely small in practice, given how long our tasks
take, and the number of tasks we have).
2023-04-07 21:51:44 -04:00
Josh Soref 7f8279fe54 spelling: format
Signed-off-by: Josh Soref <2119212+jsoref@users.noreply.github.com>
2023-04-04 20:02:54 -04:00
Josh Soref 8d4b69927b spelling: directly
Signed-off-by: Josh Soref <2119212+jsoref@users.noreply.github.com>
2023-04-04 20:02:54 -04:00
Josh Soref 9ea5a3b89c spelling: data
Signed-off-by: Josh Soref <2119212+jsoref@users.noreply.github.com>
2023-04-04 20:02:54 -04:00
Josh Soref 06fa1adc27 spelling: arbitrary
Signed-off-by: Josh Soref <2119212+jsoref@users.noreply.github.com>
2023-04-04 20:02:54 -04:00
Jacob Torrey f9c75a43ae Fixing pylint issues
Signed-off-by: Jacob Torrey <jacob@jacobtorrey.com>
2023-04-04 16:46:32 +00:00
Jacob Torrey 797d339132 Move the search request logic into the AbstractConnector to allow for more flexibility
Signed-off-by: Jacob Torrey <jacob@jacobtorrey.com>
2023-04-04 16:03:37 +00:00
Wesley Aptekar-Cassels 9cbff312a5 Ignore Celery task results
Since we don't use the results of our Celery tasks (all of them return
None implicitly), it's prudent to set the ignore_result flag, for a
potential performance improvement. See the Celery docs for details [1].

We could do this with the global CELERY_IGNORE_RESULT setting, but it
offers more flexibility if we want to use task results in the future to
set it on a per-task basis.

[1]: https://docs.celeryq.dev/en/stable/userguide/tasks.html#ignore-results-you-don-t-want
2023-03-08 02:12:13 -05:00
0x29a 22eeee7368 Urlencode search query 2023-02-02 21:02:57 +01:00
Hugh Rundle e8452011f7 handle get_data exceptions better
Makes exception handling more precise, only raising status for 401s.

Also fixes a string pylint was complaining about.
2023-01-20 19:55:38 +11:00
Hugh Rundle 4108238716 resolve SECURE_FETCH bugs
ERROR HANDLING FIXES

- use raise_for_status() to pass through response code
- handle exceptions where no response object is passed through

INSTANCE ACTOR

- models.User.objects.create_user function cannot take an ID
- allow instance admins to determine username and email for instance actor in settings.py
2023-01-20 16:32:17 +11:00
Mouse Reeve 0a12be8279 Appease pylint 2022-11-25 10:41:04 -08:00
Mouse Reeve 44d308abad Fixes error on importing from inventaire 2022-11-25 09:35:26 -08:00
Mouse Reeve b37a4322de Change log level to info for connector exceptions
These errors in resolve_remote_id aren't really errors, they're
routine problems that we can expect from dealing with the outside world,
like a connection timeout, a server being down, a server being blocked,
et cetera. It's cluttering up the logs and causing unnecessary worry.
2022-11-17 12:35:19 -08:00
André Jaenisch 530d7de309 Use variable instead of string
Signed-off-by: André Jaenisch <andre.jaenisch@posteo.de>
2022-11-13 16:59:05 +01:00
Hugh Rundle 1ee2ff4811 normalise isbn on local book search
- uppercase ISBN before checking it's a number to account for trailing 'x'
- check maybe_isbn for search_identifiers search. Without this we are only searching external connectors, not locally!
2022-08-30 20:00:09 +10:00
Hugh Rundle 18d3d2f85d linting 2022-08-28 17:30:46 +10:00
Hugh Rundle f219851f3a strip leading and following spaces from ISBN 2022-08-28 17:28:00 +10:00
Hugh Rundle da5fd32196 normalise isbn searching
ISBNs are always numeric except for when the check digit in ISBN-10s is a ten, indicated with a capital X.
These changes ensure that ISBNs are always upper-case so that a lower-case 'x' is not used when searching.

Additionally some ancient ISBNs have been printed without a leading zero (i.e. they only have 9 characters on the physical book). This change prepends a zero if something looks like an ISBN but only has 9 chars.
2022-08-28 11:05:40 +10:00
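The normalisation described in this commit and the follow-up commits above it (trimming spaces, upper-casing the check digit, prepending a zero to nine-character values) could look roughly like the sketch below. The function name `normalise_isbn` and the regular expression used to decide what "looks like an ISBN" are assumptions, not taken from the repository.

```python
import re

# Assumed shape of an ISBN-like query: 9 to 13 characters, all digits
# except that the last character may be an X check digit.
ISBN_LIKE = re.compile(r"\d{8,12}[\dX]")


def normalise_isbn(query: str) -> str:
    """Return a normalised ISBN, or the query unchanged if it is not ISBN-like."""
    candidate = query.strip().upper()
    if not ISBN_LIKE.fullmatch(candidate):
        return query
    if len(candidate) == 9:
        # Old nine-character numbers are treated as ISBN-10s that were
        # printed without their leading zero.
        candidate = "0" + candidate
    return candidate
```

Upper-casing before the pattern check matters: a query ending in a lower-case `x` would otherwise fail the ISBN test and be treated as free text.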
Mouse Reeve 5706028656 Log failing to connect as info instead of exception
These are normal, expected errors, and while we should probably
re-evaluate the connectors in some way, pending that, there's no need to
log these as unexpected errors, which causes confusion and clutters my
error logging.
2022-07-11 08:47:18 -07:00
Mouse Reeve 5d363da175 Handle getting edition data as dict or string 2022-07-03 11:05:20 -07:00
Mouse Reeve e7b0a84ded Merge pull request #2142 from bookwyrm-social/load-data-duration
Split expand book data task into per-edition tasks
2022-06-30 11:47:23 -07:00
Mouse Reeve d149e57494 Split expand book data task into per-edition tasks
Loading every edition in one task takes ages, and produces a large task
that clogs up the queue. This will create more, smaller tasks that will
finish more quickly.
2022-05-31 12:41:57 -07:00
Mouse Reeve 374fdcf467 Use relative list order ranking in openlibrary search
Set OpenLibrary search confidence based on the provided result order,
just using 1/(list index), so the first has rank 1, the second 0.5, the
third 0.33, et cetera.
2022-05-31 10:22:49 -07:00
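The 1/(list index) ranking from this commit message can be illustrated with a short sketch. The helper name `rank_by_position` and the use of plain strings as results are hypothetical; the connector itself builds search result objects.

```python
def rank_by_position(results: list[str]) -> list[tuple[str, float]]:
    """Pair each result with a confidence of 1 / (1-based position)."""
    return [
        (item, 1 / position)
        for position, item in enumerate(results, start=1)
    ]
```

Counting from 1 rather than 0 gives the first result a confidence of 1 and avoids a division by zero.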
Mouse Reeve c3b35760a2 Updates test mocks for remote search 2022-05-31 09:37:54 -07:00
Mouse Reeve 969db13ff2 Safely return None in remote search return_first 2022-05-31 08:49:23 -07:00
Mouse Reeve a053f20961 Re-implements return first option
Since we get all the results quickly now, this aggregates all the
results that came back and sorts them by confidence, and returns the
highest confidence result. The confidences aren't great on free text
search, but conceptually that's how it should work at least.

It may make sense to aggregate the search results in all contexts, but
I'll propose that in a separate PR.
2022-05-31 08:20:59 -07:00
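A rough sketch of the "return first" behaviour as this commit message describes it: pool the results from every connector, sort by confidence, and hand back the top one, or None when nothing came back (as the later "Safely return None" commit suggests). The `Result` class and `best_result` function are illustrative assumptions, not the project's types.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Result:
    """Hypothetical stand-in for a search result."""

    title: str
    confidence: float


def best_result(results_by_connector: list[list[Result]]) -> Optional[Result]:
    """Return the highest-confidence result across all connectors."""
    combined = [
        result
        for results in results_by_connector
        for result in results
    ]
    if not combined:
        return None
    combined.sort(key=lambda result: result.confidence, reverse=True)
    return combined[0]
```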
Mouse Reeve 98ed03b6b4 Python formatting and test update 2022-05-30 17:00:34 -07:00
Mouse Reeve 83ee5a756f Filter inventaire results by confidence 2022-05-30 16:42:37 -07:00
Mouse Reeve af19d728d2 Removes outdated unit tests 2022-05-30 16:16:10 -07:00
Mouse Reeve 87fe984462 Combines search formatter and parser function
The parser was extracting the list of search results from the json
object returned by the search endpoint, and the formatter was converting
an individual json entry into a SearchResult object. This just merged
them into one function, because they are never used separately.
2022-05-30 12:52:31 -07:00
Mouse Reeve 525e2a591d More error handling
Adds logging and error handling for some of the numerous ways a request
could fail (the remote site is down, the url is blocked, etc).

I also have the results boxes open by default, which makes it more
legible imo.
2022-05-30 12:40:13 -07:00
Mouse Reeve 45f2199c71 Gather and wait on async requests
This sends out the request tasks all at once and then aggregates the
results, instead of just running them one after another asynchronously.
2022-05-30 12:05:22 -07:00
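The difference between awaiting requests one after another and sending them all at once can be sketched with `asyncio.gather`. This is a simplified illustration under stated assumptions: `fetch_results` fakes the network call with `asyncio.sleep`, and the connector names and delays are invented.

```python
import asyncio


async def fetch_results(connector: str, delay: float) -> tuple[str, list[str]]:
    """Hypothetical request: sleeps instead of calling a remote server."""
    await asyncio.sleep(delay)
    return connector, [f"{connector} result"]


async def search_all(connectors: dict[str, float]) -> dict[str, list[str]]:
    """Start every request at once, then wait for all of them."""
    requests = [
        fetch_results(connector, delay)
        for connector, delay in connectors.items()
    ]
    # gather schedules all coroutines concurrently and returns their
    # results in the order the coroutines were passed in.
    finished = await asyncio.gather(*requests)
    return dict(finished)
```

Calling `asyncio.run(search_all({"a": 0.02, "b": 0.01}))` takes about as long as the slowest request rather than the sum of all delays.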
Mouse Reeve 5e81ec75fb Set request headers in async search get request
Gotta ask for json
2022-05-30 11:19:16 -07:00
Mouse Reeve 9a9cef7766 Verify url before async search
The database lookup doesn't work during the async process, so this change
loops through the connectors and grabs the formatted urls before sending
it to the async handler.
2022-05-30 11:16:05 -07:00