possible fix for #3372 - user export timeouts

This definitely needs to be tested on a large DB, but I believe it may fix the timeouts b.s. gets when running user exports.

Instead of a single gigantic DB query with heaps of joins, we build a series of simple queries and then use union()
to pull them into a de-duped queryset.
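
To illustrate the idea outside Django, here is a minimal sketch using the standard-library sqlite3 module. The tables and data are hypothetical stand-ins (not the real BookWyrm schema); the point is only that a SQL UNION of simple per-relation queries returns the same de-duplicated rows as one multi-join query with OR filters and DISTINCT. Django's QuerySet.union() emits UNION, which drops duplicates unless all=True is passed.

```python
import sqlite3

# Hypothetical, simplified schema: one "edition" table plus two relation
# tables standing in for shelves and reviews. Not the real BookWyrm models.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE edition (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE shelfbook (edition_id INTEGER, user_id INTEGER);
    CREATE TABLE review (edition_id INTEGER, user_id INTEGER);
    INSERT INTO edition VALUES (1, 'A'), (2, 'B'), (3, 'C');
    INSERT INTO shelfbook VALUES (1, 7), (2, 7), (3, 8);
    INSERT INTO review VALUES (2, 7), (2, 7);
    """
)

# One query with several joins, OR-ed filters and DISTINCT.
joined = conn.execute(
    """
    SELECT DISTINCT e.id, e.title FROM edition e
    LEFT JOIN shelfbook s ON s.edition_id = e.id
    LEFT JOIN review r ON r.edition_id = e.id
    WHERE s.user_id = ? OR r.user_id = ?
    ORDER BY 1
    """,
    (7, 7),
).fetchall()

# Several simple queries combined with UNION, which drops duplicate rows.
unioned = conn.execute(
    """
    SELECT e.id, e.title FROM edition e
    JOIN shelfbook s ON s.edition_id = e.id WHERE s.user_id = ?
    UNION
    SELECT e.id, e.title FROM edition e
    JOIN review r ON r.edition_id = e.id WHERE r.user_id = ?
    ORDER BY 1
    """,
    (7, 7),
).fetchall()

print(joined)
print(unioned)
```

Both statements return the same editions for user 7 in this toy data; whether the UNION form is actually cheaper on a real database still has to be measured there.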

If I understand the results from explain() correctly, this is a massive reduction in estimated DB work:

Unique  (cost=195899.15..198201.71 rows=11808 width=19220)

vs

Unique  (cost=150.28..153.44 rows=16 width=19220)
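
As a rough way to put a number on the difference, the sketch below parses the two plan lines quoted above and compares their total cost estimates. The helper name total_cost is made up for illustration. PostgreSQL costs are planner estimates in arbitrary units, not timings, and they depend on table statistics (note the differing row estimates), so the ratio is only indicative.

```python
import re

# The two planner lines quoted in the commit message.
before = "Unique  (cost=195899.15..198201.71 rows=11808 width=19220)"
after = "Unique  (cost=150.28..153.44 rows=16 width=19220)"


def total_cost(plan_line: str) -> float:
    """Return the upper (total) cost estimate from an EXPLAIN plan line."""
    match = re.search(r"cost=(\d+\.\d+)\.\.(\d+\.\d+)", plan_line)
    if match is None:
        raise ValueError(f"no cost found in: {plan_line!r}")
    # group(1) is the startup cost, group(2) the total cost.
    return float(match.group(2))


ratio = total_cost(before) / total_cost(after)
print(f"estimated total cost ratio: {ratio:.0f}x")
```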
Hugh Rundle 2024-06-09 10:34:22 +10:00
parent 3545a1c3b6
commit 261e794c1c

@@ -315,19 +315,28 @@ def export_book(user: User, edition: Edition):
 def get_books_for_user(user):
-    """Get all the books and editions related to a user"""
+    """
+    Get all the books and editions related to a user.
-    editions = (
-        Edition.objects.select_related("parent_work")
-        .filter(
-            Q(shelves__user=user)
-            | Q(readthrough__user=user)
-            | Q(review__user=user)
-            | Q(list__user=user)
-            | Q(comment__user=user)
-            | Q(quotation__user=user)
-        )
-        .distinct()
+    We use union() instead of Q objects because it creates
+    multiple simple queries instead of a much more complex DB query
+    that can time out.
+    """
+    shelf_eds = Edition.objects.select_related("parent_work").filter(shelves__user=user)
+    rt_eds = Edition.objects.select_related("parent_work").filter(
+        readthrough__user=user
     )
+    review_eds = Edition.objects.select_related("parent_work").filter(review__user=user)
+    list_eds = Edition.objects.select_related("parent_work").filter(list__user=user)
+    comment_eds = Edition.objects.select_related("parent_work").filter(
+        comment__user=user
+    )
+    quote_eds = Edition.objects.select_related("parent_work").filter(
+        quotation__user=user
+    )
+    editions = shelf_eds.union(rt_eds, review_eds, list_eds, comment_eds, quote_eds)
     return editions