Add documentation on durability and repair procedures (fix #219)

2024-11-22 16:11:01 +00:00 · 2023-06-14 11:54:21 +02:00 · 2023-06-14 11:54:21 +02:00 · 9233661967
commit 9233661967
parent 3aadba724d
3 changed files with 116 additions and 2 deletions
--- a/doc/book/cookbook/durability-repairs.md
+++ b/doc/book/cookbook/durability-repairs.md
@@ -0,0 +1,114 @@
 +++
 title = "Durability & Repairs"
 weight = 50
 +++
 To ensure the best durability of your data and to fix any inconsistencies that may
 pop up in a distributed system, Garage provides a series of repair operations.
 This guide will explain the meaning of each of them and when they should be applied.
 # General syntax of repair operations
 Repair operations described below are of the form `garage repair <repair_name>`.
 These repairs will not launch without the `--yes` flag, which should
 be added as follows: `garage repair --yes <repair_name>`.
 By default these repair procedures will only run on the Garage node your CLI is
 connecting to. To run on all nodes, add the `-a` flag as follows:
 `garage repair -a --yes <repair_name>`.
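The flag placement described above can be sketched as a small helper that assembles the command line. This is only an illustration: `repair_command` is a hypothetical name, and the helper builds the argument list without running anything.

```python
def repair_command(repair_name, all_nodes=False):
    """Build the argument list for a `garage repair` invocation."""
    argv = ["garage", "repair"]
    if all_nodes:
        argv.append("-a")   # run on all nodes, not only the one the CLI connects to
    argv.append("--yes")    # repairs will not launch without this flag
    argv.append(repair_name)
    return argv


print(" ".join(repair_command("tables")))
print(" ".join(repair_command("blocks", all_nodes=True)))
```

The resulting list could be passed to `subprocess.run` on a machine where the `garage` CLI is installed and configured.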
 # Data block operations
 ## Data store scrub
 Scrubbing the data store means examining each individual data block to check that
 its content is correct, by verifying its hash. Any block found to be corrupted
 (e.g. by bitrot or by an accidental manipulation of the datastore) will be
 restored from another node that holds a valid copy.
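As a rough illustration of what verifying a block by its hash means, the sketch below recomputes the hash of every stored block and reports the ones that no longer match. It is not Garage's implementation: the in-memory `store` dictionary is hypothetical, and BLAKE2b with a 32-byte digest is an assumption used here as a stand-in hash function.

```python
import hashlib


def block_hash(data):
    # Assumption: BLAKE2b (32-byte digest) stands in for the real block hash.
    return hashlib.blake2b(data, digest_size=32).hexdigest()


def scrub(store):
    """Return the expected hashes of blocks whose content no longer matches.

    `store` is a hypothetical dict mapping expected hash -> stored bytes.
    """
    return [h for h, data in store.items() if block_hash(data) != h]


good = b"hello garage"
store = {
    block_hash(good): good,                  # intact block
    block_hash(b"original"): b"bit-rotted",  # content changed on disk
}
corrupted = scrub(store)
print(corrupted)
```

Each hash returned by `scrub` identifies a block that would need to be restored from a node holding a valid copy.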
 A scrub is run automatically by Garage every 30 days. It can also be launched
 manually using `garage repair scrub start`.
 To view the status of an ongoing scrub, first find the task ID of the scrub worker
 using `garage worker list`. Then, run `garage worker info <scrub_task_id>` to
 view detailed runtime statistics of the scrub. To gather cluster-wide information,
 this command has to be run on each individual node.
 A scrub is a very disk-intensive operation that might slow down your cluster.
 You may pause an ongoing scrub using `garage repair scrub pause`, but note that
 the scrub will resume automatically 24 hours later as Garage will not let your
 cluster run without a regular scrub. If the scrub procedure is too intensive
 for your servers and is slowing down your workload, the recommended solution
 is to increase the "scrub tranquility" using `garage repair scrub set-tranquility`.
 A higher tranquility value will make Garage take longer pauses between two block
 verifications. Of course, scrubbing the entire data store will also take longer.
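The trade-off can be made concrete with a back-of-the-envelope model. It assumes (this is not stated above) that the pause after each block equals the time spent verifying it multiplied by the tranquility value; the function names are hypothetical.

```python
def scrub_duration(n_blocks, seconds_per_block, tranquility):
    """Estimated wall-clock time of a full scrub under the assumed pause model."""
    work = n_blocks * seconds_per_block
    pause = work * tranquility  # assumption: pause = tranquility x work time
    return work + pause


def disk_busy_fraction(tranquility):
    """Fraction of the time the scrub actually keeps the disk busy."""
    return 1 / (1 + tranquility)


for t in (0, 1, 4):
    print(t, scrub_duration(1000, 0.5, t), disk_busy_fraction(t))
```

Under this model, raising the tranquility from 0 to 4 makes the scrub take five times longer while keeping the disk busy only a fifth of the time.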
 ## Block check and resync
 In some cases, nodes hold a reference to a block but do not actually have the block
 stored on disk. Conversely, they may also have blocks on disk that are no longer
 referenced. To fix both cases, a block repair may be run with `garage repair blocks`.
 This will scan the entire block reference counter table to check that the blocks
 exist on disk, and will scan the entire disk store to check that stored blocks
 are referenced.
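Conceptually, the two scans amount to comparing two sets of block hashes. The sketch below uses plain Python sets as hypothetical stand-ins for the reference counter table and the on-disk store.

```python
def check_blocks(referenced, on_disk):
    """Compare referenced block hashes with the hashes actually stored on disk."""
    missing = referenced - on_disk       # referenced but not stored: to be fetched
    unreferenced = on_disk - referenced  # stored but not referenced any more
    return missing, unreferenced


referenced = {"aa11", "bb22", "cc33"}
on_disk = {"bb22", "cc33", "dd44"}
missing, unreferenced = check_blocks(referenced, on_disk)
print(sorted(missing), sorted(unreferenced))
```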
 It is recommended to run this procedure when changing your cluster layout,
 after the metadata tables have finished synchronizing between nodes
 (usually a few hours after `garage layout apply`).
 ## Inspecting lost blocks
 In extremely rare situations, data blocks may be unavailable from the entire cluster.
 This means that even using `garage repair blocks`, some nodes may be unable
 to fetch data blocks for which they hold a reference.
 These errors are stored on each node in a list of "block resync errors", i.e.
 blocks for which the last resync operation failed.
 This list can be inspected using `garage block list-errors`.
 These errors usually fall into one of the following categories:
 1. a block is still referenced but the object was deleted; this is a case
   of metadata reference inconsistency (see below for the fix)
 2. a block is referenced by a non-deleted object, but could not be fetched due
   to a transient error such as a network failure
 3. a block is referenced by a non-deleted object, but could not be fetched due
   to a permanent error such as there not being any valid copy of the block on the
   entire cluster
 To help distinguish case 1 from cases 2 and 3, you may use the
 `garage block info` command to see which objects hold a reference to each block.
 In the second case (transient errors), Garage will try to fetch the block again
 after a certain time, so the error should disappear naturally. If you have fixed
 the transient issue, you can also ask Garage to retry fetching the block
 immediately using `garage block retry-now`.
 If you are confident that you are in the third scenario and that your data block
 is definitely lost, then there is no other choice than to declare your S3 objects
 as unrecoverable, and to delete them properly from the data store. This can be done
 using the `garage block purge` command.
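The decision process for the three categories can be summarized as a small triage function. This is a sketch only: the two boolean inputs are judgments the operator has to make (for example with the help of `garage block info`), the function name is hypothetical, and command arguments are not shown.

```python
def triage(referenced_by_live_object, valid_copy_may_exist):
    """Map a block resync error to its category and the suggested next step."""
    if not referenced_by_live_object:
        # Case 1: metadata reference inconsistency.
        return 1, "garage repair versions / garage repair block_refs"
    if valid_copy_may_exist:
        # Case 2: transient error; Garage retries by itself, or retry immediately.
        return 2, "garage block retry-now"
    # Case 3: no valid copy anywhere; the objects are unrecoverable.
    return 3, "garage block purge"


print(triage(False, False))
print(triage(True, True))
print(triage(True, False))
```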
 # Metadata operations
 ## Metadata table resync
 Garage automatically resyncs all entries stored in the metadata tables every hour,
 to ensure that all nodes have the most up-to-date version of all the information
 they should be holding.
 The resync procedure is based on a Merkle tree, which makes it possible to
 efficiently find differences between nodes.
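To illustrate why a Merkle tree makes this comparison efficient, here is a minimal, generic sketch; it does not reflect Garage's actual data structures. Entries are spread over a fixed number of leaf buckets, each parent hashes its two children, and two nodes only descend into subtrees whose hashes differ. SHA-256, the bucket count and all function names are assumptions made for the example.

```python
import hashlib


def h(data):
    return hashlib.sha256(data).digest()


def build_tree(items, n_leaves=8):
    """Return the tree as a list of levels, from leaves (level 0) up to the root.

    `n_leaves` must be a power of two.
    """
    buckets = [[] for _ in range(n_leaves)]
    for key, value in sorted(items.items()):
        buckets[h(key.encode())[0] % n_leaves].append(f"{key}={value}")
    level = [h("\n".join(b).encode()) for b in buckets]
    levels = [level]
    while len(level) > 1:
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels


def differing_leaves(a, b, depth=None, index=0):
    """Walk both trees from the root, skipping subtrees whose hashes are equal."""
    if depth is None:
        depth = len(a) - 1  # start at the root level
    if a[depth][index] == b[depth][index]:
        return []
    if depth == 0:
        return [index]
    return (differing_leaves(a, b, depth - 1, 2 * index)
            + differing_leaves(a, b, depth - 1, 2 * index + 1))


node_a = {"obj1": "v1", "obj2": "v1", "obj3": "v1"}
node_b = dict(node_a, obj2="v2")  # node_b holds a newer version of obj2
diff = differing_leaves(build_tree(node_a), build_tree(node_b))
print(diff)
```

Only the entries in the reported buckets need to be exchanged; when the root hashes are equal, the comparison stops after a single check.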
 In some special cases, e.g. before an upgrade, you might want to run a table
 resync manually. This can be done using `garage repair tables`.
 ## Metadata table reference fixes
 In some very rare cases where nodes are unavailable, some references between objects
 may become broken. For instance, if an object is deleted, the underlying versions or data
 blocks may still be held by Garage. If you suspect that such corruption has occurred
 in your cluster, you can run one of the following repair procedures:
 - `garage repair versions`: checks that all versions belong to a non-deleted object, and purges any orphan version
 - `garage repair block_refs`: checks that all block references belong to a non-deleted object version, and purges any orphan block reference (this will then allow the blocks to be garbage-collected)
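The cascade behind these two repairs can be sketched as follows. The data model is hypothetical (a set of live object IDs, a mapping from version to object, and a list of block references); the point is only to show how purging orphan versions, then orphan block references, leaves some blocks without any reference.

```python
def purge_orphans(live_objects, versions, block_refs):
    """Drop versions of deleted objects, then block references of dropped versions.

    versions:   dict version_id -> object_id
    block_refs: list of (block_hash, version_id) pairs
    """
    kept_versions = {v: o for v, o in versions.items() if o in live_objects}
    kept_refs = [(b, v) for b, v in block_refs if v in kept_versions]
    # Blocks that lost their last reference can now be garbage-collected.
    collectable = {b for b, _ in block_refs} - {b for b, _ in kept_refs}
    return kept_versions, kept_refs, collectable


live = {"a"}
versions = {"v1": "a", "v2": "b"}  # object "b" was deleted
refs = [("h1", "v1"), ("h2", "v2"), ("h1", "v2")]
kept_versions, kept_refs, collectable = purge_orphans(live, versions, refs)
print(kept_versions, kept_refs, collectable)
```

Note that block `h1` survives because a version of a live object still references it.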
--- a/doc/book/cookbook/recovering.md
+++ b/doc/book/cookbook/recovering.md
@@ -1,6 +1,6 @@
 +++
 title = "Recovering from failures"
-weight = 50
+weight = 60
 +++
 Garage is meant to work on old, second-hand hardware.
--- a/doc/book/cookbook/upgrading.md
+++ b/doc/book/cookbook/upgrading.md
@@ -1,6 +1,6 @@
 +++
 title = "Upgrading Garage"
-weight = 60
+weight = 70
 +++
 Garage is a stateful clustered application, where all nodes are communicating together and share data structures.