Before splitting the publish_one/1 function into two parts for testing purposes we had logic that checked the keys of params for :unreachable_since and if it was absent it did not set the instance as reachable. There is also a test to validate that when unreachable_since is nil, we set it as reachable.
However the default value of :unreachable_since when an instance is reachable is nil. The test appears to be testing a scenario that does not exist in the real world, and with this refactor we will always have an :unreachable_since key.
We were attempting to update the reachability upon every successful federation because we always include it when we generate the publish_one jobs.
Gun's connection pool also returns an error if duplicate workers are launched simultaneously. Snooze on this error as well, and lower the snooze to 3 seconds with the optimism that the connection will still be open by then and the delivery can be completed quickly.
The original setting of 30 seconds is pretty high and means there's an unnatural lag between deliveries of activities destined to the same server that were created at nearly the same time. This configuration should be more efficient.
Our test environment cheats by constructing a conn with a custom oauth_access/2 function. This assigns a :token to the conn but due to the way it is constructed it has the :user preloaded. When the OAuth Plug fetches a token it does not preload the user, so the check for user.disclose_client was always nil and assumed to be false.
Preloading the :user ensures the test environment matches reality.
It is angry we are making a fake %Plug.Conn{} to pass through Signature.validate_signature/1. We can work around it by making the code support a map, but then we lose the benefit of being able to use put_req_header/3
This logic only exists in the Plug, so attempting to validate the signature by calling the library function HTTPSignature.validate_conn/2 directly will never work because we do not attempt to construct the (request-target) and @request-target headers with both the commonly misinterpreted and correct implementation of this field. Therefore all attempts to validate a signature from an Oban Job will fail.
When signatures fail on incoming activities we put the job into Oban to be processed later instead of doing the user fetching and validation inline which is expensive and increases latency on the incoming POST request. Unfortunately we did not retain the :method, :request_path, and :query_string parameters from the conn so the signature validation and Oban Job would always fail.
This was most obvious when Mastodon sends Deletes for users your server has never seen before.