The queries as they previously existed required joining together 12
different tables, which is extremely expensive. Splitting them into four
queries means that the individual queries can effectively use the
indexes we have, and should be very fast no matter how many statuses are
in the database.
Removing the .distinct() call is fine, since we're adding the results to
a set in Redis anyway, which will take care of the duplicates.
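To illustrate why dropping .distinct() is safe, here is a minimal sketch in which a plain Python set stands in for the Redis set. The helper name and the four ID lists are hypothetical and are not the actual queries from this change.

```python
from itertools import chain


def collect_status_ids(*query_results):
    """Merge the results of several independent queries into one set.

    A plain Python set stands in for the Redis set (an assumption made for
    illustration); both silently discard members that are already present.
    """
    merged = set()
    for status_id in chain.from_iterable(query_results):
        merged.add(status_id)
    return merged


# Hypothetical ID lists, one per query; some statuses match more than one.
results_a = [1, 2]
results_b = [2, 3, 4]
results_c = [4, 5]
results_d = [1, 5, 6]

timeline_ids = collect_status_ids(results_a, results_b, results_c, results_d)
```

Each query can return overlapping rows without any deduplication on the database side, because the set keeps only one copy of each ID.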
It's a bit ugly that we now make four separate calls to Redis (this
might result in things being slightly slower in cases where there are an
extremely small number of statuses), but doing things differently would
result in significantly more surgery to the existing code, so I've opted
to avoid that for the moment.
Fixes: #2725
This splits HomeStream.get_audience into two separate database queries,
in order to more effectively take advantage of the indexes we have.
Combining the user ID query and the user following query means that
Postgres isn't able to use the index we have on the userfollows table.
The query planner claims that the userfollows query should be about 20
times faster than it was previously, and the id query should take a
negligible amount of time, since it's selecting a single item by primary
key.
We don't need to worry about duplicates, since there is a constraint
preventing a user from following themself.
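The shape of the change can be sketched as follows. This is only an illustration using an in-memory SQLite database rather than Postgres; the table layout, column names, and the get_audience_ids helper are assumptions, not the real schema or the real HomeStream code.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE userfollows (
        user_subject_id INTEGER NOT NULL,  -- the follower (assumed name)
        user_object_id INTEGER NOT NULL,   -- the user being followed
        CHECK (user_subject_id <> user_object_id)  -- no self-follows
    );
    CREATE INDEX userfollows_object_idx ON userfollows (user_object_id);
    INSERT INTO users VALUES (1, 'a'), (2, 'b'), (3, 'c');
    INSERT INTO userfollows VALUES (2, 1), (3, 1), (1, 2);
    """
)


def get_audience_ids(conn, author_id):
    """Return the author's followers plus the author, using two lookups."""
    # Query 1: followers only, a simple filter on the indexed column.
    followers = [
        row[0]
        for row in conn.execute(
            "SELECT user_subject_id FROM userfollows WHERE user_object_id = ?",
            (author_id,),
        )
    ]
    # Query 2: the author, selected by primary key.
    author = [
        row[0]
        for row in conn.execute("SELECT id FROM users WHERE id = ?", (author_id,))
    ]
    # No deduplication needed: the CHECK constraint rules out self-follows,
    # so the author can never also appear in the followers list.
    return followers + author


audience = get_audience_ids(conn, 1)
```

Concatenating the two result lists is enough here, since the constraint guarantees they are disjoint.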
Fixes: #2720
Anywhere we have a user object, we can easily get the user ID in the
caller, and this will allow us more flexibility in the future to
implement optimizations that involve knowing a user ID without querying
the database for the user object.
Since we don't use the results of our Celery tasks (all of them return
None implicitly), it's prudent to set the ignore_result flag, for a
potential performance improvement. See the Celery docs for details [1].
We could do this with the global CELERY_IGNORE_RESULT setting, but
setting the flag on a per-task basis offers more flexibility if we want
to use task results in the future.
[1]: https://docs.celeryq.dev/en/stable/userguide/tasks.html#ignore-results-you-don-t-want
Previously, every time a status was saved, a task would start to add it
to people's timelines. This meant there were a ton of duplicate tasks
that were potentially heavy to run. Now, the Status model has a "ready"
field which indicates that it's worth updating the timelines. It
defaults to True, which prevents statuses from accidentally not being
added due to ready state.
The ready state is explicitly set to False in the view, which is the
source of most of the noise for that task.
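A minimal sketch of the idea, assuming a simplified stand-in for the Status model and a list in place of the task queue. The add_status_task helper and the save() hook here are hypothetical and only mirror the behaviour described above.

```python
from dataclasses import dataclass

# Stand-in for the task queue: records which status IDs would get a task.
enqueued = []


def add_status_task(status_id):
    """Hypothetical placeholder for starting the timeline task."""
    enqueued.append(status_id)


@dataclass
class Status:
    id: int
    content: str = ""
    ready: bool = True  # defaults to True so ordinary saves reach timelines

    def save(self):
        """Pretend to persist the status, then start the task only if ready."""
        if self.ready:
            add_status_task(self.id)


# A view that saves several times can hold the task back until the end.
status = Status(id=1, ready=False)
status.save()  # intermediate save: no task
status.content = "hello"
status.save()  # still not ready: no task
status.ready = True
status.save()  # final save: exactly one task
```

Code paths that never touch the flag keep the old behaviour, because the default of True still starts the task on save.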
Retains 'direct' messages at the top of the logic tree to make it easier to understand.
In practice, because direct messages are excluded from feeds anyway, this doesn't seem to make much difference, but it's easier to read.
This is in response to #1870
Users should not see links in their feed to posts they are not allowed to see. The main question is how to stop that from happening.
This commit hides all replies to posts if the original post was "followers only" and the user is not a follower of the original poster. The privacy of the reply is not considered relevant (except "direct").
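The rule can be sketched as a small predicate. This is an illustration only, not the actual feed code: the function name, its parameters, and the privacy strings "followers" and "direct" are assumptions, as is treating the original poster as always able to see replies to their own post.

```python
def reply_visible_in_feed(
    viewer_id,
    reply_privacy,
    original_privacy,
    original_author_id,
    original_author_follower_ids,
):
    """Decide whether a reply should appear in the viewer's feed."""
    # Direct messages stay at the top of the logic tree; they are excluded
    # from feeds regardless of anything else.
    if reply_privacy == "direct":
        return False
    # A followers-only original hides every reply from non-followers,
    # whatever the privacy of the reply itself.
    if original_privacy == "followers":
        return (
            viewer_id == original_author_id  # assumption: author sees replies
            or viewer_id in original_author_follower_ids
        )
    return True


followers_of_alice = {2, 3}  # hypothetical follower IDs for user 1
shown = reply_visible_in_feed(2, "public", "followers", 1, followers_of_alice)
hidden = reply_visible_in_feed(4, "public", "followers", 1, followers_of_alice)
```

Here user 2 follows the original poster and sees the reply, while user 4 does not follow them and never sees it, even though the reply itself is public.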
I believe this is the cleanest way to deal with the problem, as it avoids orphaned replies and confusing 404s, and a reply without access to the context of the original post is not particularly useful to anyone. This also feels like it respects the wishes of the original poster more accurately, as it does not draw attention from non-followers to the original followers-only post.
A less draconian approach might be to remove the link to the original status in the feed interface; however, that simply leads to confusion of another kind, since it would make the interface inconsistent.
This commit does not change any ActivityPub behaviour - it only affects the Bookwyrm user feeds. This means orphaned posts may be sent to external apps like Mastodon.