Write some more docs
commit 807d546b12 (parent c8ad22a704)
5 changed files with 130 additions and 61 deletions
docs/domains.rst (new file, 63 lines)

@@ -0,0 +1,63 @@
Domains
=======

One of Takahē's key design features is that we support multiple different
domains for ActivityPub users to be under.

As a server administrator, you do this by specifying one or more Domains on
your server that users can make Identities (posting accounts) under.

Domains can take two forms:

* **Takahē lives on and serves the domain**. In this case, you just set the
  domain to point to Takahē and ensure you have a matching domain record;
  ignore the "service domain" setting.

* **Takahē handles accounts under the domain but does not live on it**. For
  example, you want to serve the ``@andrew@aeracode.org`` handle, but there
  is already a site on ``aeracode.org``, and Takahē instead must live elsewhere
  (e.g. ``fedi.aeracode.org``).

In this second case, you need a *service domain* - a place where Takahē and
the Actor URIs for your users live, but which is different to the main domain
you'd like the account handles to contain.

To set this up, you need to:

* Choose a service domain and point it at Takahē. *You cannot change this
  domain later without breaking everything*, so choose very wisely.

* On your primary domain, forward the URLs ``/.well-known/webfinger``,
  ``/.well-known/nodeinfo`` and ``/.well-known/host-meta`` to Takahē
  (see the example after this list).

* Set up a domain with these separate primary and service domains in its
  record.
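
If the site already running on your primary domain happens to be a Django
project, the forwarding step could be done with plain redirects, as in the
sketch below. This is only an illustration of what any web server or framework
needs to achieve - the Takahē hostname is an assumption, and nothing about
this particular setup is required:

.. code-block:: python

    # Hypothetical urls.py on the *primary* domain's site (not part of Takahē).
    from django.urls import path
    from django.views.generic import RedirectView

    TAKAHE = "https://fedi.aeracode.org"  # wherever your Takahē install lives

    def forward(suffix):
        # Preserve the query string so ``?resource=acct:...`` reaches Takahē.
        return RedirectView.as_view(url=f"{TAKAHE}{suffix}", query_string=True)

    urlpatterns = [
        path(".well-known/webfinger", forward("/.well-known/webfinger")),
        path(".well-known/nodeinfo", forward("/.well-known/nodeinfo")),
        path(".well-known/host-meta", forward("/.well-known/host-meta")),
    ]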


Technical Details
-----------------

At its core, ActivityPub is a system built around URIs; the
``@username@domain.tld`` format is actually based on Webfinger, a different
standard, and is merely used to discover the Actor URI for someone.

Making a system that allows any Webfinger handle to be accepted is relatively
easy, but unfortunately this is only how users are discovered via mentions
and search; when an incoming Follow arrives, or a Post is boosted onto your
timeline, you have to discover the user's Webfinger handle
*from their Actor URI*, and this is where it gets tricky.

Mastodon, and from what we can tell most other implementations, do this by
taking the ``preferredUsername`` field from the Actor object and the domain
from the Actor URI, and webfingering that combination of username and domain.
This means that the domain you serve the Actor URI on must uniquely map to a
Webfinger handle domain - they don't need to match, but they do need to be
translatable into one another.
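
To make that concrete, here is a rough sketch of the discovery flow described
above; it is an illustration of the general approach, not Takahē's (or
Mastodon's) actual code:

.. code-block:: python

    # Illustrative only - the function name and flow are invented for this example.
    import json
    import urllib.parse
    import urllib.request

    def handle_for_actor(actor_uri):
        """Derive a ``user@domain`` Webfinger handle from an Actor URI."""
        # Fetch the Actor object, asking for its ActivityPub representation.
        request = urllib.request.Request(
            actor_uri, headers={"Accept": "application/activity+json"}
        )
        with urllib.request.urlopen(request) as response:
            actor = json.load(response)

        # Combine preferredUsername with the domain from the Actor URI...
        username = actor["preferredUsername"]
        domain = urllib.parse.urlparse(actor_uri).hostname
        handle = f"{username}@{domain}"

        # ...and webfinger that combination to confirm it resolves.
        query = urllib.parse.urlencode({"resource": f"acct:{handle}"})
        with urllib.request.urlopen(
            f"https://{domain}/.well-known/webfinger?{query}"
        ) as response:
            json.load(response)  # a real client would check the "self" link here
        return handle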

Takahē handles all this internally, however, with a concept of Domains. Each
domain has a primary (display) domain name, and an optional "service" domain;
the primary domain is what we will use for the user's Webfinger handle, and
the service domain is what their Actor URI is served on.

We look at ``HOST`` headers on incoming requests to match users to their
domains, though for Actor URIs we ensure the domain is in the URI anyway.
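
As a loose illustration of that arrangement, the Domain concept can be
pictured as below; the field and method names are invented for this sketch
and are not Takahē's actual model definition:

.. code-block:: python

    # Hypothetical sketch only - not Takahē's real Domain model.
    from django.db import models

    class Domain(models.Model):
        # Primary (display) domain, used in Webfinger handles: "aeracode.org"
        domain = models.CharField(max_length=250, unique=True)
        # Optional service domain the Actor URIs live on: "fedi.aeracode.org"
        service_domain = models.CharField(
            max_length=250, unique=True, blank=True, null=True
        )

        class Meta:
            app_label = "example"  # so the sketch can stand alone

        @property
        def uri_domain(self):
            """The domain that Actor URIs are served on."""
            return self.service_domain or self.domain

        @classmethod
        def for_host(cls, host):
            """Match an incoming request's HOST header to a domain record."""
            return cls.objects.filter(
                models.Q(domain=host) | models.Q(service_domain=host)
            ).first()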

docs/index.rst

@@ -15,4 +15,5 @@ in alpha. For more information about Takahē, see

   :caption: Contents:

   installation
   principles
   domains
   stator

docs/installation.rst

@@ -14,6 +14,7 @@ Prerequisites

* SSL support (Takahē *requires* HTTPS)
* Something that can run Docker/OCI images
* A PostgreSQL 14 (or above) database
* Hosting/reverse proxy that passes the ``HOST`` header down to Takahē
* One of these to store uploaded images and media:

  * Amazon S3

@@ -28,7 +29,7 @@ This means that a "serverless" platform like AWS Lambda or Google Cloud Run is

not enough by itself; while you can use these to serve the web pages if you
like, you will need to run the Stator runner somewhere else as well.

- The flagship Takahē instance, [takahe.social](https://takahe.social), runs
+ The flagship Takahē instance, `takahe.social <https://takahe.social>`_, runs
inside of Kubernetes, with one Deployment for the webserver and one for the
Stator runner.

docs/principles.rst

@@ -1,59 +0,0 @@

Design Principles
=================

Takahē is somewhat opinionated in its design goals, which are:

* Simplicity of maintenance and operation
* Multiple domain support
* Asynchronous Python core
* Low-JS user interface

These are explained more below, but it's important to stress the one thing we
are not aiming for - scalability.

If we wanted to build a system that could handle hundreds of thousands of
accounts on a single server, it would be built very differently - queues
everywhere as the primary communication mechanism, most likely - but we're
not aiming for that.

Our final design goal is for around 10,000 users to work well, provided you do
some PostgreSQL optimisation. It's likely the design will work beyond that,
but we're not going to put any specific effort towards it.

After all, if you want to scale in a federated system, you can always launch
more servers. We'd rather work towards the ability to share moderation and
administration workloads across servers rather than have one giant big one.


Simplicity Of Maintenance
-------------------------

It's important that, when running a social networking server, you have as much
time to focus on moderation and looking after your users as you can, rather
than trying to be an SRE.

To this end, we use our deliberate design aim of "small to medium size" to try
and keep the infrastructure simple - one set of web servers, one set of task
runners, and a PostgreSQL database.

The task system (which we call Stator) is not based on a task queue, but on
a state machine per type of object, which has retry logic built in. The
system continually examines every object to see if it can progress its state
by performing an action, which is not quite as *efficient* as using a queue,
but recovers much more easily and doesn't get out of sync.


Multiple Domain Support
-----------------------

TODO


Asynchronous Python
-------------------

TODO


Low-JS User Interface
---------------------

docs/stator.rst (new file, 63 lines)

@@ -0,0 +1,63 @@
Stator
======

Takahē's background task system is called Stator, and rather than being a
traditional task queue, it is instead a *reconciliation loop* system; the
workers look for objects that could have actions taken, try to take them, and
update them if successful.

As someone running Takahē, the most important aspects of this are:

* You have to run at least one Stator worker to make things like follows,
  posting, and timelines work.

* You can run as many workers as you want; there is a locking system to ensure
  they can coexist.

* You can get away without running any workers for a few minutes; the server
  will continue to accept posts and follows from other servers, and will
  process them when a worker comes back up.

* There is no separate queue to run, flush or replay; it is all stored in the
  main database.

* If all your workers die, just restart them; within a few minutes the
  existing locks will time out, and the system will recover itself and process
  everything that's pending.

You run a worker via the command ``manage.py runstator``. It will run forever
until it is killed; send SIGINT (Ctrl-C) to it once to have it enter graceful
shutdown, and a second time to force it to exit immediately.
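
If you are wrapping the worker in your own entrypoint script rather than
calling ``manage.py`` directly, the equivalent invocation through Django's
management API might look like this (the settings module path here is an
assumption for illustration):

.. code-block:: python

    # Hypothetical wrapper script; equivalent to ``python manage.py runstator``.
    import os

    import django
    from django.core.management import call_command

    os.environ.setdefault("DJANGO_SETTINGS_MODULE", "takahe.settings")  # assumed path
    django.setup()

    # Blocks until the process receives SIGINT once (graceful shutdown)
    # or twice (forced exit), exactly like the manage.py command.
    call_command("runstator")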


Technical Details
-----------------

Each object managed by Stator has a set of extra columns (sketched as a model
after this list):

* ``state``, the name of a state in a state machine
* ``state_ready``, a boolean saying if it's ready to have a transition tried
* ``state_changed``, when it entered its current state
* ``state_attempted``, when a transition was last attempted
* ``state_locked_until``, the time until which the entry is locked by a worker
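
Expressed as a purely illustrative abstract Django model, those columns might
look roughly like this (the real field definitions in Takahē may differ):

.. code-block:: python

    # Illustrative only - the real field definitions in Takahē may differ.
    from django.db import models

    class StatorModel(models.Model):
        # Name of the current state in this object's state machine.
        state = models.CharField(max_length=100)
        # Whether the object is ready to have a transition attempted.
        state_ready = models.BooleanField(default=True)
        # When the object entered its current state.
        state_changed = models.DateTimeField(auto_now_add=True)
        # When a transition out of this state was last attempted, if ever.
        state_attempted = models.DateTimeField(blank=True, null=True)
        # If set, a worker holds a lock on this object until this time.
        state_locked_until = models.DateTimeField(blank=True, null=True)

        class Meta:
            abstract = True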

They also have an associated state machine, a subclass of
``stator.graph.StateGraph``, which defines a series of states, the possible
transitions between them, and handlers that run for each state to see if a
transition is possible.

An object first becomes ready for execution in one of two ways:

* If it has just entered a new state, or has just been created, it is marked
  ready.
* If ``state_attempted`` is far enough in the past (based on the
  ``try_interval`` of the current state), a small scheduling loop marks it as
  ready.

Then, in the main fast loop, the worker (sketched in simplified form after
this list):

* Selects an item marked ``state_ready`` that is in a state it can handle
  (some states are "externally progressed" and will not have handlers run).
* Fires up a coroutine for that handler and lets it run.
* When that coroutine exits, sees if it returned a new state name and, if so,
  transitions the object to that state.
* If that coroutine errors or exits with ``None`` as a return value, marks
  down the attempt and leaves the object to be rescheduled after its
  ``try_interval``.
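
Putting those two lists together, the worker's behaviour can be sketched in a
simplified, synchronous form; every name below is invented for the
illustration, and the real Stator code is asynchronous and far more careful
about locking:

.. code-block:: python

    # Simplified sketch of the loop described above - not the real implementation.
    import datetime

    def schedule(model, graph):
        """Mark objects as ready once their state's try_interval has passed."""
        for obj in model.objects.filter(state_ready=False):
            state = graph.states[obj.state]
            last_try = obj.state_attempted or obj.state_changed
            if datetime.datetime.utcnow() - last_try > state.try_interval:
                obj.state_ready = True
                obj.save()

    def work_one(model, graph):
        """Pick one ready object in a handleable state and try a transition."""
        handleable = [
            name
            for name, state in graph.states.items()
            if not state.externally_progressed
        ]
        obj = model.objects.filter(state_ready=True, state__in=handleable).first()
        if obj is None:
            return
        try:
            # The real worker runs this handler as a coroutine.
            new_state = graph.states[obj.state].handler(obj)
        except Exception:
            new_state = None  # an error just counts as a failed attempt
        now = datetime.datetime.utcnow()
        if new_state:
            obj.state, obj.state_changed = new_state, now  # transition
        else:
            obj.state_attempted = now  # rescheduled after the state's try_interval
        obj.state_ready = False
        obj.save()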