[db-snapshot] documentation for metadata db snapshots

2024-11-25 09:31:00 +00:00 · 2024-03-15 13:16:41 +01:00 · 2024-03-15 13:16:41 +01:00 · 8cf3d24875
commit 8cf3d24875
parent a68c37555d
5 changed files with 114 additions and 7 deletions
--- a/doc/book/cookbook/real-world.md
+++ b/doc/book/cookbook/real-world.md
@@ -72,13 +72,14 @@ to store 2 TB of data in total.
  to RAID, see [our dedicated documentation page](@/documentation/operations/multi-hdd.md).
 - For the metadata storage, Garage does not do checksumming and integrity
-  verification on its own. Users have reported that when using the LMDB
+  verification on its own, so it is better to use a robust filesystem such as
-  database engine (the default), database files have a tendency of becoming
+  BTRFS or ZFS. Users have reported that when using the LMDB database engine
-  corrupted after an unclean shutdown (e.g. a power outage), so you should use
+  (the default), database files have a tendency of becoming corrupted after an
-  a robust filesystem such as BTRFS or ZFS for the metadata partition, and take
+  unclean shutdown (e.g. a power outage), so you should take regular snapshots
-  regular snapshots so that you can restore to a recent known-good state in
+  to be able to recover from such a situation.  This can be done using Garage's
-  case of an incident.  If you cannot do so, you might want to switch to Sqlite
+  built-in automatic snapshotting (since v0.9.4), or by using filesystem-level
-  which is more robust.
+  snapshots. If you cannot do so, you might want to switch to Sqlite which is
  more robust.
 - LMDB is the fastest and most tested database engine, but it has the following
  weaknesses: 1/ data files are not architecture-independent, you cannot simply
@@ -124,6 +125,7 @@ A valid `/etc/garage.toml` for our cluster would look as follows:
 metadata_dir = "/var/lib/garage/meta"
 data_dir = "/var/lib/garage/data"
 db_engine = "lmdb"
 metadata_auto_snapshot_interval = "6h"
 replication_mode = "3"
--- a/doc/book/operations/durability-repairs.md
+++ b/doc/book/operations/durability-repairs.md
@@ -104,6 +104,24 @@ operation will also move out all data from locations marked as read-only.
 # Metadata operations
 ## Metadata snapshotting
 It is good practice to set up automatic snapshotting of your metadata database
 file, to recover from situations where it becomes corrupted on disk. This can
 be done at the filesystem level if you are using ZFS or BTRFS.
 Since v0.9.4, Garage can take snapshots of the metadata database
 itself. This amounts to copying the database file, except that it can
 be done while Garage is running without risking corruption or
 inconsistencies.  This can be set up to run automatically on a schedule using
 [`metadata_auto_snapshot_interval`](@/documentation/reference-manual/configuration.md#metadata_auto_snapshot_interval).
 A snapshot can also be triggered manually using the `garage meta snapshot`
 command. Note that taking a snapshot using this method is resource-intensive, as it
 requires making a full copy of the database file, so you might prefer using
 filesystem-level snapshots if possible. To recover a corrupted node from such a
 snapshot, follow the instructions for
 [recovering from corrupted metadata](@/documentation/operations/recovering.md#corrupted_meta).
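To check that snapshots are actually being produced, it can help to look up the most recent one. The following is a minimal sketch, not part of Garage: `latest_snapshot` is a hypothetical helper, and it assumes that snapshots are stored under `<metadata_dir>/snapshots` and named by UTC timestamp, so that sorting by name is also sorting by time.

```shell
#!/usr/bin/env bash
# Sketch (hypothetical helper, not a Garage command):
# print the path of the most recent metadata snapshot.
# Assumption: snapshots live in <metadata_dir>/snapshots and are named by
# UTC timestamp (e.g. 2024-03-15T12:13:52Z), so name order is time order.
latest_snapshot() {
  local metadata_dir="$1"
  local snap_dir="$metadata_dir/snapshots"
  local newest="" entry

  [ -d "$snap_dir" ] || return 1
  for entry in "$snap_dir"/*; do
    [ -e "$entry" ] || continue   # empty directory: the glob did not match
    newest="$entry"               # globs expand in sorted order, last one wins
  done
  [ -n "$newest" ] || return 1
  printf '%s\n' "$newest"
}
```

For instance, `latest_snapshot /var/lib/garage/meta` would print the newest entry, or return a non-zero status if no snapshot exists yet.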
 ## Metadata table resync
 Garage automatically resyncs all entries stored in the metadata tables every hour,
--- a/doc/book/operations/recovering.md
+++ b/doc/book/operations/recovering.md
@@ -108,3 +108,57 @@ garage layout apply   # once satisfied, apply the changes
 Garage will then start synchronizing all required data on the new node.
 This process can be monitored using the `garage stats -a` command.
 ## Replacement scenario 3: corrupted metadata {#corrupted_meta}
 In some cases, your metadata DB file might become corrupted, for instance if
 your node suffered a power outage and did not shut down properly. In this case,
 you can recover without having to change the node ID or rebuild the cluster
 layout. This means that data blocks will not need to be shuffled around: you
 only need to repair the metadata file. The best way is generally
 to discard the corrupted file and recover it from another source.
 Start by locating the database file in your metadata directory,
 which [depends on your `db_engine`
 choice](@/documentation/reference-manual/configuration.md#db_engine).  Then,
 your recovery options are as follows:
 - **Option 1: resyncing from other nodes.** If your cluster is replicated
  with two or three copies, you can simply delete the database file, and Garage
  will resync from other nodes. To do so, stop Garage, delete the database file
  or directory, and restart Garage. Then, do a full table repair by calling
  `garage repair -a --yes tables`.  This will take some time to complete, as
  the node will need to receive copies of the metadata tables over the
  network.
 - **Option 2: restoring a snapshot taken by Garage.** Since v0.9.4, Garage can
  [automatically take regular
  snapshots](@/documentation/reference-manual/configuration.md#metadata_auto_snapshot_interval)
  of your metadata DB file. Each snapshot is a file or directory located under
  `<metadata_dir>/snapshots`, named according to the UTC time at which it
  was taken. Stop Garage, discard the database file/directory and replace it with the
  snapshot you want to use. For instance, in the case of LMDB:
  ```bash
  cd $METADATA_DIR
  mv db.lmdb db.lmdb.bak
  cp -r snapshots/2024-03-15T12:13:52Z db.lmdb
  ```
  And for Sqlite:
  ```bash
  cd $METADATA_DIR
  mv db.sqlite db.sqlite.bak
  cp snapshots/2024-03-15T12:13:52Z db.sqlite
  ```
  Then, restart Garage and run a full table repair by calling `garage repair -a
  --yes tables`.  This should run relatively fast as only the changes that
  occurred since the snapshot was taken will need to be resynchronized. Of
  course, if your cluster is not replicated, you will lose all changes that
  occurred since the snapshot was taken.
 - **Option 3: restoring a filesystem-level snapshot.** If you are using ZFS or
  BTRFS to snapshot your metadata partition, refer to their specific
  documentation on rolling back or copying files from an old snapshot.
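The file manipulation in Option 2 can be wrapped in a small function that works for both engines. This is only a sketch under the assumptions stated in the text above (database named `db.lmdb` or `db.sqlite`, snapshots under `<metadata_dir>/snapshots`); `restore_snapshot` is a hypothetical helper, not a Garage command, and Garage must be stopped before running it.

```shell
#!/usr/bin/env bash
# Sketch (hypothetical helper): restore a Garage-taken metadata snapshot.
# Usage: restore_snapshot <metadata_dir> <db_name> <snapshot_name>
#   db_name is db.lmdb (a directory) or db.sqlite (a file), as in the text.
# Garage must NOT be running while this executes.
restore_snapshot() {
  local metadata_dir="$1" db_name="$2" snapshot="$3"
  local src="$metadata_dir/snapshots/$snapshot"
  local dst="$metadata_dir/$db_name"

  if [ ! -e "$src" ]; then
    echo "no such snapshot: $src" >&2
    return 1
  fi
  if [ -e "$dst.bak" ]; then
    echo "refusing to overwrite existing backup: $dst.bak" >&2
    return 1
  fi

  # Keep the corrupted database around instead of deleting it outright.
  if [ -e "$dst" ]; then
    mv "$dst" "$dst.bak" || return 1
  fi
  # cp -r handles both a snapshot directory (LMDB) and a single file (Sqlite).
  cp -r "$src" "$dst"
}
```

After a successful restore, restart Garage and run `garage repair -a --yes tables` as described above.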
--- a/doc/book/operations/upgrading.md
+++ b/doc/book/operations/upgrading.md
@@ -73,6 +73,18 @@ The entire procedure would look something like this:
  You can do all of the nodes in a single zone at once as that won't impact global cluster availability.
  Do not try to make a backup of the metadata folder of a running node.
  **Since Garage v0.9.4,** you can use the `garage meta snapshot --all` command
  to take a simultaneous snapshot of the metadata database files of all your
  nodes.  This avoids the tedious process of having to take them down one by
  one before upgrading. Be aware that if automatic snapshotting is enabled,
  Garage only keeps the last two snapshots and deletes older ones, so you might
  want to disable automatic snapshotting in your upgraded configuration file
  until you have confirmed that the upgrade ran successfully.  In addition to
  snapshotting the metadata databases of your nodes, you should back up at
  least the `cluster_layout` file of one of your Garage instances (this file
  should be the same on all nodes and you can copy it safely while Garage is
  running).
 3. Prepare your binaries and configuration files for the new Garage version
 4. Restart all nodes simultaneously in the new version
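The backup advice in step 2 could be scripted roughly as follows. This is a sketch, not an official procedure: `backup_cluster_layout` is a hypothetical helper, the backup directory is an arbitrary example, and it assumes that the `cluster_layout` file sits directly in the metadata directory.

```shell
#!/usr/bin/env bash
# Sketch (hypothetical helper): copy the cluster_layout file to a backup
# directory, with a UTC timestamp appended to the file name.
# Assumption: cluster_layout is located directly in <metadata_dir>.
# Usage: backup_cluster_layout <metadata_dir> <backup_dir>
backup_cluster_layout() {
  local metadata_dir="$1" backup_dir="$2"
  local src="$metadata_dir/cluster_layout"

  if [ ! -f "$src" ]; then
    echo "cluster_layout not found in $metadata_dir" >&2
    return 1
  fi
  mkdir -p "$backup_dir" || return 1
  # -p preserves the modification time and permissions of the original file.
  cp -p "$src" "$backup_dir/cluster_layout.$(date -u +%Y%m%dT%H%M%SZ)"
}

# Possible pre-upgrade sequence (commented out: it needs a running cluster,
# and /root/garage-upgrade-backup is only an example path):
#   garage meta snapshot --all
#   backup_cluster_layout /var/lib/garage/meta /root/garage-upgrade-backup
```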
--- a/doc/book/reference-manual/configuration.md
+++ b/doc/book/reference-manual/configuration.md
@@ -15,6 +15,7 @@ data_dir = "/var/lib/garage/data"
 metadata_fsync = true
 data_fsync = false
 disable_scrub = false
 metadata_auto_snapshot_interval = "6h"
 db_engine = "lmdb"
@@ -90,6 +91,7 @@ Top-level configuration options:
 [`db_engine`](#db_engine),
 [`disable_scrub`](#disable_scrub),
 [`lmdb_map_size`](#lmdb_map_size),
 [`metadata_auto_snapshot_interval`](#metadata_auto_snapshot_interval),
 [`metadata_dir`](#metadata_dir),
 [`metadata_fsync`](#metadata_fsync),
 [`replication_mode`](#replication_mode),
@@ -346,6 +348,25 @@ at the cost of a moderate drop in write performance.
 Similarly to `metadata_fsync`, this is likely not necessary
 if geographical replication is used.
 #### `metadata_auto_snapshot_interval` (since Garage v0.9.4) {#metadata_auto_snapshot_interval}
 If this value is set, Garage will automatically take a snapshot of the metadata
 DB file at a regular interval and save it in the metadata directory.
 This makes it possible to recover from situations where the metadata DB file is corrupted,
 for instance after an unclean shutdown.
 See [this page](@/documentation/operations/recovering.md#corrupted_meta) for details.
 Garage keeps only the two most recent snapshots of the metadata DB and deletes
 older ones automatically.
 Note that taking a metadata snapshot is a relatively intensive operation as the
 entire data file is copied. Taking a snapshot might affect the performance
 of the Garage node while the snapshot is in progress. If the cluster is under heavy
 write load when a snapshot operation is running, this might also cause the
 database file to grow in size significantly as pages cannot be recycled easily.
 For this reason, it might be better to use filesystem-level snapshots instead
 if possible.
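Because each snapshot is a full copy of the database and two snapshots are retained, the metadata partition needs room for roughly two extra copies. The sketch below estimates that space; `snapshot_budget_kb` is a hypothetical helper, not part of Garage, and the figure ignores any growth of the live database file while a snapshot is being taken.

```shell
#!/usr/bin/env bash
# Sketch (hypothetical helper): estimate the disk space used by retained
# metadata snapshots, assuming each one is a full copy of the database.
# Usage: snapshot_budget_kb <db_path> [copies]   (copies defaults to 2)
snapshot_budget_kb() {
  local db_path="$1" copies="${2:-2}"
  local db_kb

  if [ ! -e "$db_path" ]; then
    echo "not found: $db_path" >&2
    return 1
  fi
  # du -sk reports the on-disk size in KiB, for a file or a directory alike.
  db_kb=$(du -sk "$db_path" | awk '{print $1}') || return 1
  echo $(( db_kb * copies ))
}
```

Comparing the result of `snapshot_budget_kb /var/lib/garage/meta/db.lmdb` with the free space reported by `df -k` for the metadata partition gives a rough idea of whether automatic snapshotting is affordable on a given node.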
 #### `disable_scrub` {#disable_scrub}
 By default, Garage runs a scrub of the data directory approximately once per