mirror of
https://git.deuxfleurs.fr/Deuxfleurs/garage.git
synced 2024-11-14 04:01:25 +00:00
Document multi-hdd support
This commit is contained in:
parent
bca347a1e8
commit
6595efd82f
3 changed files with 116 additions and 8 deletions
|
@ -75,16 +75,11 @@ to store 2 TB of data in total.
|
||||||
|
|
||||||
- For the metadata storage, Garage does not do checksumming and integrity
|
- For the metadata storage, Garage does not do checksumming and integrity
|
||||||
verification on its own. If you are afraid of bitrot/data corruption,
|
verification on its own. If you are afraid of bitrot/data corruption,
|
||||||
put your metadata directory on a BTRFS partition. Otherwise, just use regular
|
put your metadata directory on a ZFS or BTRFS partition. Otherwise, just use regular
|
||||||
EXT4 or XFS.
|
EXT4 or XFS.
|
||||||
|
|
||||||
- Having a single server with several storage drives is currently not very well
|
- Servers with multiple HDDs are supported natively by Garage without resorting
|
||||||
supported in Garage ([#218](https://git.deuxfleurs.fr/Deuxfleurs/garage/issues/218)).
|
to RAID, see [our dedicated documentation page](@/documentation/operations/multi-hdd.md).
|
||||||
For an easy setup, just put all your drives in a RAID0 or a ZFS RAIDZ array.
|
|
||||||
If you're adventurous, you can try to format each of your disk as
|
|
||||||
a separate XFS partition, and then run one `garage` daemon per disk drive,
|
|
||||||
or use something like [`mergerfs`](https://github.com/trapexit/mergerfs) to merge
|
|
||||||
all your disks in a single union filesystem that spreads load over them.
|
|
||||||
|
|
||||||
## Get a Docker image
|
## Get a Docker image
|
||||||
|
|
||||||
|
|
100
doc/book/operations/multi-hdd.md
Normal file
100
doc/book/operations/multi-hdd.md
Normal file
|
@ -0,0 +1,100 @@
|
||||||
|
+++
|
||||||
|
title = "Multi-HDD support"
|
||||||
|
weight = 15
|
||||||
|
+++
|
||||||
|
|
||||||
|
|
||||||
|
Since v0.9, Garage natively supports nodes that have several storage drives
|
||||||
|
for storing data blocks (not for metadata storage).
|
||||||
|
|
||||||
|
## Initial setup
|
||||||
|
|
||||||
|
To set up a new Garage storage node with multiple HDDs,
|
||||||
|
format and mount all your drives in different directories,
|
||||||
|
and use a Garage configuration as follows:
|
||||||
|
|
||||||
|
```toml
|
||||||
|
data_dir = [
|
||||||
|
{ path = "/path/to/hdd1", capacity = "2T" },
|
||||||
|
{ path = "/path/to/hdd2", capacity = "4T" },
|
||||||
|
]
|
||||||
|
```
|
||||||
|
|
||||||
|
Garage will automatically balance all blocks stored by the node
|
||||||
|
among the different specified directories, proportionnally to the
|
||||||
|
specified capacities.
|
||||||
|
|
||||||
|
## Updating the list of storage locations
|
||||||
|
|
||||||
|
If you add new storage locations to your `data_dir`,
|
||||||
|
Garage will not rebalance existing data between storage locations.
|
||||||
|
Newly written blocks will be balanced proportionnally to the specified capacities,
|
||||||
|
and existing data may be moved between drives to improve balancing,
|
||||||
|
but only opportunistically when a data block is re-written (e.g. an object
|
||||||
|
is re-uploaded, or an object with a duplicate block is uploaded).
|
||||||
|
|
||||||
|
To understand precisely what is happening, we need to dive in to how Garage
|
||||||
|
splits data among the different storage locations.
|
||||||
|
|
||||||
|
First of all, Garage divides the set of all possible block hashes
|
||||||
|
in a fixed number of slices (currently 1024), and assigns
|
||||||
|
to each slice a primary storage location among the specified data directories.
|
||||||
|
The number of slices having their primary location in each data directory
|
||||||
|
is proportionnal to the capacity specified in the config file.
|
||||||
|
|
||||||
|
When Garage receives a block to write, it will always write it in the primary
|
||||||
|
directory of the slice that contains its hash.
|
||||||
|
|
||||||
|
Now, to be able to not lose existing data blocks when storage locations
|
||||||
|
are added, Garage also keeps a list of secondary data directories
|
||||||
|
for all of the hash slices. Secondary data directories for a slice indicates
|
||||||
|
storage locations that once were primary directories for that slice, i.e. where
|
||||||
|
Garage knows that data blocks of that slice might be stored.
|
||||||
|
When Garage is requested to read a certain data block,
|
||||||
|
it will first look in the primary storage directory of its slice,
|
||||||
|
and if it doesn't find it there it goes through all of the secondary storage
|
||||||
|
locations until it finds it. This allows Garage to continue operating
|
||||||
|
normally when storage locations are added, without having to shuffle
|
||||||
|
files between drives to place them in the correct location.
|
||||||
|
|
||||||
|
This relatively simple strategy works well but does not ensure that data
|
||||||
|
is correctly balanced among drives according to their capacity.
|
||||||
|
To rebalance data, two strategies can be used:
|
||||||
|
|
||||||
|
- Lazy rebalancing: when a block is re-written (e.g. the object is re-uploaded),
|
||||||
|
Garage checks whether the existing copy is in the primary directory of the slice
|
||||||
|
or in a secondary directory. If the current copy is in a secondary directory,
|
||||||
|
Garage re-writes a copy in the primary directory and deletes the one from the
|
||||||
|
secondary directory.
|
||||||
|
|
||||||
|
- Active rebalancing: an operator of a Garage node can explicitly launch a repair
|
||||||
|
procedure that rebalances the data directories, moving all blocks to their
|
||||||
|
primary location. Once done, all secondary locations for all hash slices are
|
||||||
|
removed so that they won't be checked anymore when looking for a data block.
|
||||||
|
|
||||||
|
## Read-only storage locations
|
||||||
|
|
||||||
|
If you would like to move all data blocks from an existing data directory to one
|
||||||
|
or several new data directories, mark the old directory as read-only:
|
||||||
|
|
||||||
|
```toml
|
||||||
|
data_dir = [
|
||||||
|
{ path = "/path/to/old_data", read_only = true },
|
||||||
|
{ path = "/path/to/new_hdd1", capacity = "2T" },
|
||||||
|
{ path = "/path/to/new_hdd2", capacity = "4T" },
|
||||||
|
]
|
||||||
|
```
|
||||||
|
|
||||||
|
Garage will be able to read requested blocks from the read-only directory.
|
||||||
|
Garage will also move data out of the read-only directory either progressively
|
||||||
|
(lazy rebalancing) or if requested explicitly (active rebalancing).
|
||||||
|
|
||||||
|
Once an active rebalancing has finished, your read-only directory should be empty:
|
||||||
|
it might still contain subdirectories, but no data files. You can check that
|
||||||
|
it contains no files using:
|
||||||
|
|
||||||
|
```bash
|
||||||
|
find -type f /path/to/old_data
|
||||||
|
```
|
||||||
|
|
||||||
|
at which point it can be removed from the `data_dir` list in your config file.
|
|
@ -91,6 +91,19 @@ This folder can be placed on an HDD. The space available for `data_dir`
|
||||||
should be counted to determine a node's capacity
|
should be counted to determine a node's capacity
|
||||||
when [adding it to the cluster layout](@/documentation/cookbook/real-world.md).
|
when [adding it to the cluster layout](@/documentation/cookbook/real-world.md).
|
||||||
|
|
||||||
|
Since `v0.9.0`, Garage supports multiple data directories with the following syntax:
|
||||||
|
|
||||||
|
```toml
|
||||||
|
data_dir = [
|
||||||
|
{ path = "/path/to/old_data", read_only = true },
|
||||||
|
{ path = "/path/to/new_hdd1", capacity = "2T" },
|
||||||
|
{ path = "/path/to/new_hdd2", capacity = "4T" },
|
||||||
|
]
|
||||||
|
```
|
||||||
|
|
||||||
|
See [the dedicated documentation page](@/documentation/operations/multi-hdd.md)
|
||||||
|
on how to operate Garage in such a setup.
|
||||||
|
|
||||||
### `db_engine` (since `v0.8.0`)
|
### `db_engine` (since `v0.8.0`)
|
||||||
|
|
||||||
By default, Garage uses the Sled embedded database library
|
By default, Garage uses the Sled embedded database library
|
||||||
|
|
Loading…
Reference in a new issue