Write documentation on configuration file and other improvements

This commit is contained in:
Alex Auvolat 2021-05-28 18:00:59 +02:00
parent b9127dd6f8
commit ebd21b325e
GPG key ID: EDABF9711E244EB1
14 changed files with 516 additions and 263 deletions

View file

@ -3,12 +3,13 @@
[The Garage Data Store](./intro.md) [The Garage Data Store](./intro.md)
- [Getting Started](./getting_started/index.md) - [Getting Started](./getting_started/index.md)
- [Get a binary](./getting_started/binary.md) - [Get a binary](./getting_started/01_binary.md)
- [Configure the daemon](./getting_started/daemon.md) - [Configuring a test deployment](./getting_started/02_test_deployment.md)
- [Control the daemon](./getting_started/control.md) - [Configure a real-world deployment](./getting_started/03_real_world_deployment.md)
- [Configure a cluster](./getting_started/cluster.md) - [Control the daemon](./getting_started/04_control.md)
- [Create buckets and keys](./getting_started/bucket.md) - [Configure a cluster](./getting_started/05_cluster.md)
- [Handle files](./getting_started/files.md) - [Create buckets and keys](./getting_started/06_bucket.md)
- [Handle files](./getting_started/07_files.md)
- [Cookbook](./cookbook/index.md) - [Cookbook](./cookbook/index.md)
- [Host a website](./cookbook/website.md) - [Host a website](./cookbook/website.md)
@ -17,7 +18,8 @@
- [Recovering from failures](./cookbook/recovering.md) - [Recovering from failures](./cookbook/recovering.md)
- [Reference Manual](./reference_manual/index.md) - [Reference Manual](./reference_manual/index.md)
- [Garage CLI]() - [Garage configuration file](./reference_manual/configuration.md)
- [Garage CLI](./reference_manual/cli.md)
- [S3 API](./reference_manual/s3_compatibility.md) - [S3 API](./reference_manual/s3_compatibility.md)
- [Design](./design/index.md) - [Design](./design/index.md)

View file

@ -1 +1,3 @@
# Host a website # Host a website
TODO

View file

@ -7,14 +7,14 @@ We did not test other architecture/operating system but, as long as your archite
## From Docker ## From Docker
Our docker image is currently named `lxpz/garage_amd64` and is stored on the [Docker Hub](https://hub.docker.com/r/lxpz/garage_amd64/tags?page=1&ordering=last_updated). Our docker image is currently named `lxpz/garage_amd64` and is stored on the [Docker Hub](https://hub.docker.com/r/lxpz/garage_amd64/tags?page=1&ordering=last_updated).
We encourage you to use a fixed tag (eg. `v0.2.1`) and not the `latest` tag. We encourage you to use a fixed tag (eg. `v0.3.0`) and not the `latest` tag.
For this example, we will use the latest published version at the time of the writing which is `v0.2.1` but it's up to you For this example, we will use the latest published version at the time of the writing which is `v0.3.0` but it's up to you
to check [the most recent versions on the Docker Hub](https://hub.docker.com/r/lxpz/garage_amd64/tags?page=1&ordering=last_updated). to check [the most recent versions on the Docker Hub](https://hub.docker.com/r/lxpz/garage_amd64/tags?page=1&ordering=last_updated).
For example: For example:
``` ```
sudo docker pull lxpz/garage_amd64:v0.2.1 sudo docker pull lxpz/garage_amd64:v0.3.0
``` ```
## From source ## From source

View file

@ -0,0 +1,107 @@
# Configuring a test deployment
This section describes how to run a simple test Garage deployment with a single node.
Note that this kind of deployment should not be used in production, as it provides
no redundancy for your data!
We will also skip intra-cluster TLS configuration, meaning that if you add nodes
to your cluster, communication between them will not be secure.
First, make sure that you have Garage installed in your command line environment.
We will explain how to launch Garage in a Docker container, however we still
recommend that you install the `garage` CLI on your host system in order to control
the daemon.
## Writing a first configuration file
This first configuration file should allow you to get started easily with the simplest
possible Garage deployment:
```toml
metadata_dir = "/tmp/meta"
data_dir = "/tmp/data"
replication_mode = "none"
rpc_bind_addr = "[::]:3901"
bootstrap_peers = []
[s3_api]
s3_region = "garage"
api_bind_addr = "[::]:3900"
[s3_web]
bind_addr = "[::]:3902"
root_domain = ".web.garage"
index = "index.html"
```
Save your configuration file as `garage.toml`.
As you can see in the `metadata_dir` and `data_dir` parameters, we are saving Garage's data
in `/tmp` which gets erased when your system reboots. This means that data stored on this
Garage server will not be persistent. Change these to locations on your HDD if you want
your data to be persisted properly.
## Launching the Garage server
#### Option 1: directly (without Docker)
Use the following command to launch the Garage server with our configuration file:
```
garage server -c garage.toml
```
By default, Garage displays almost no output. You can tune Garage's verbosity as follows
(from less verbose to more verbose):
```
RUST_LOG=garage=info garage server -c garage.toml
RUST_LOG=garage=debug garage server -c garage.toml
RUST_LOG=garage=trace garage server -c garage.toml
```
Log level `info` is recommended for most use cases.
Log level `debug` can help you check why your S3 API calls are not working.
#### Option 2: in a Docker container
Use the following command to start Garage in a docker container:
```
docker run -d \
-p 3901:3901 -p 3902:3902 -p 3900:3900 \
-v "$(pwd)/garage.toml:/garage/config.toml" \
lxpz/garage_amd64:v0.3.0
```
To tune Garage's verbosity level, set the `RUST_LOG` environment variable
when launching the container. For instance:
```
docker run -d \
-p 3901:3901 -p 3902:3902 -p 3900:3900 \
-v "$(pwd)/garage.toml:/garage/config.toml" \
-e RUST_LOG=garage=info \
lxpz/garage_amd64:v0.3.0
```
## Checking that Garage runs correctly
The `garage` utility is also used as a CLI tool to configure your Garage deployment.
It tries to connect to a Garage server through the RPC protocol, by default looking
for a Garage server at `localhost:3901`.
Since our deployment already binds to port 3901, the following command should be sufficient
to show Garage's status, provided that you installed the `garage` binary on your host system:
```
garage status
```
Move on to [controlling the Garage daemon](04_control.md) to learn more about how to
use the Garage CLI to control your cluster.
Move on to [configuring your cluster](05_cluster.md) in order to configure
your single-node deployment for actual use!

View file

@ -0,0 +1,154 @@
# Configuring a real-world Garage deployment
To run Garage in cluster mode, we recommend having at least 3 nodes.
This will allow you to set up Garage for three-way replication of your data,
the safest and most available mode.
## Generating a TLS Certificate
You first need to generate TLS certificates to encrypt traffic between Garage nodes
(referred to as RPC traffic).
To generate your TLS certificates, run on your machine:
```
wget https://git.deuxfleurs.fr/Deuxfleurs/garage/raw/branch/master/genkeys.sh
chmod +x genkeys.sh
./genkeys.sh
```
It will create a folder named `pki/` containing the keys that you will use for the cluster.
## Real-world deployment
To run a real-world deployment, make sure the following conditions are met:
- You have at least three machines with sufficient storage space available
- Each machine has a public IP address which is reachable by other machines.
Running behind a NAT is possible, but having several Garage nodes behind a single NAT
is slightly more involved as each will have to have a different RPC port number
(the local port number of a node must be the same as the port number exposed publicly
by the NAT).
- Ideally, each machine should have an SSD available in addition to the HDD you are dedicating
to Garage. This will allow for faster access to metadata and has the potential
to drastically reduce Garage's response times.
Before deploying garage on your infrastructure, you must inventory your machines.
For our example, we will suppose the following infrastructure with IPv6 connectivity:
| Location | Name | IP Address | Disk Space |
|----------|---------|------------|------------|
| Paris | Mercury | fc00:1::1 | 1 To |
| Paris | Venus | fc00:1::2 | 2 To |
| London | Earth | fc00:B::1 | 2 To |
| Brussels | Mars | fc00:F::1 | 1.5 To |
On each machine, we will have a similar setup.
In particular, you must consider the following folders/files:
- `/etc/garage/config.toml`: Garage daemon's configuration (see below)
- `/etc/garage/pki/`: Folder containing Garage certificates, which must be generated on your computer and copied to the servers
- `/var/lib/garage/meta/`: Folder containing Garage's metadata; put this folder on an SSD if possible
- `/var/lib/garage/data/`: Folder containing Garage's data; this folder will grow and must be on large storage, possibly big HDDs.
- `/etc/systemd/system/garage.service`: Service file to start garage at boot automatically (defined below, not required if you use docker)
A valid `/etc/garage/config.toml` for our cluster would be:
```toml
metadata_dir = "/var/lib/garage/meta"
data_dir = "/var/lib/garage/data"
replication_mode = "3"
rpc_bind_addr = "[::]:3901"
bootstrap_peers = [
"[fc00:1::1]:3901",
"[fc00:1::2]:3901",
"[fc00:B::1]:3901",
"[fc00:F::1]:3901",
]
[rpc_tls]
ca_cert = "/etc/garage/pki/garage-ca.crt"
node_cert = "/etc/garage/pki/garage.crt"
node_key = "/etc/garage/pki/garage.key"
[s3_api]
s3_region = "garage"
api_bind_addr = "[::]:3900"
[s3_web]
bind_addr = "[::]:3902"
root_domain = ".web.garage"
index = "index.html"
```
Please make sure to change `bootstrap_peers` to **your** IP addresses!
Check the [configuration file reference documentation](../reference_manual/configuration.md)
to learn more about all available configuration options.
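As a small sketch of how the `bootstrap_peers` entries above are formed, the following helper wraps IPv6 addresses in brackets and appends the RPC port. The function name `peer_entry` is hypothetical and not part of Garage; the default port 3901 is simply the one used in this guide.

```bash
# Hypothetical helper (not part of Garage): format one bootstrap_peers entry
# from an address and an RPC port. IPv6 addresses are wrapped in brackets,
# as in the example configuration; IPv4 addresses are left as-is.
peer_entry() {
  local addr=$1
  local port=${2:-3901}   # 3901 is the RPC port used throughout this guide
  case "$addr" in
    *:*) printf '"[%s]:%s",\n' "$addr" "$port" ;;   # IPv6 literal
    *)   printf '"%s:%s",\n' "$addr" "$port" ;;     # IPv4 address
  esac
}

# Print the four entries for the example infrastructure.
for addr in fc00:1::1 fc00:1::2 fc00:B::1 fc00:F::1; do
  peer_entry "$addr"
done
```

The printed lines can be pasted between the brackets of `bootstrap_peers = [ ... ]` after replacing the addresses with your own.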
### For docker users
On each machine, you can run the daemon with:
```bash
docker run \
-d \
--name garaged \
--restart always \
--network host \
-v /etc/garage/pki:/etc/garage/pki \
-v /etc/garage/config.toml:/garage/config.toml \
-v /var/lib/garage/meta:/var/lib/garage/meta \
-v /var/lib/garage/data:/var/lib/garage/data \
lxpz/garage_amd64:v0.3.0
```
It should restart automatically at each reboot.
Please note that we use host networking as otherwise Docker containers
cannot communicate over IPv6.
Upgrading between Garage versions should be supported transparently,
but please check the release notes before doing so!
To upgrade, simply stop and remove this container and
run the command again with a new version of Garage.
### For systemd/raw binary users
Create a file named `/etc/systemd/system/garage.service`:
```ini
[Unit]
Description=Garage Data Store
After=network-online.target
Wants=network-online.target
[Service]
Environment='RUST_LOG=garage=info' 'RUST_BACKTRACE=1'
ExecStart=/usr/local/bin/garage server -c /etc/garage/config.toml
[Install]
WantedBy=multi-user.target
```
To start the service then automatically enable it at boot:
```bash
sudo systemctl start garage
sudo systemctl enable garage
```
To see if the service is running and to browse its logs:
```bash
sudo systemctl status garage
sudo journalctl -u garage
```
If you want to modify the service file, do not forget to run `systemctl daemon-reload`
to inform `systemd` of your modifications.

View file

@ -6,8 +6,9 @@ The `garage` binary has two purposes:
In this section, we will see how to use the `garage` binary as a control tool for the daemon we just started. In this section, we will see how to use the `garage` binary as a control tool for the daemon we just started.
You first need to get a shell having access to this binary, which depends of your configuration: You first need to get a shell having access to this binary, which depends of your configuration:
- with `docker-compose`, run `sudo docker-compose exec g1 bash` then `/garage/garage`
- with `docker`, run `sudo docker exec -ti garaged bash` then `/garage/garage` - with `docker`, run `sudo docker exec -ti garaged bash`, you will now have a shell
where the Garage binary is available as `/garage/garage`
- with `systemd`, simply run `/usr/local/bin/garage` if you followed previous instructions - with `systemd`, simply run `/usr/local/bin/garage` if you followed previous instructions
*You can also install the binary on your machine to remotely control the cluster.* *You can also install the binary on your machine to remotely control the cluster.*
@ -27,14 +28,12 @@ The 3 first ones are certificates and keys needed by TLS, the last one is simply
Because we configure garage directly from the server, we do not need to set `--rpc-host`. Because we configure garage directly from the server, we do not need to set `--rpc-host`.
To avoid typing the 3 first options each time we want to run a command, we will create an alias. To avoid typing the 3 first options each time we want to run a command, we will create an alias.
### `docker-compose` alias ### test deployment
If you have simply deployed Garage on your local machine, without TLS, you can invoke
`garage` directly without any of these parameters and without making a `garagectl` alias
(replace mentions of `garagectl` in the next sections by `garage`).
```bash
alias garagectl='/garage/garage \
--ca-cert /pki/garage-ca.crt \
--client-cert /pki/garage.crt \
--client-key /pki/garage.key'
```
### `docker` alias ### `docker` alias
@ -45,7 +44,6 @@ alias garagectl='/garage/garage \
--client-key /etc/garage/pki/garage.key' --client-key /etc/garage/pki/garage.key'
``` ```
### raw binary alias ### raw binary alias
```bash ```bash
@ -74,4 +72,4 @@ Healthy nodes:
8781c50c410a41b3… 758338dde686 [::ffff:172.20.0.102]:3901 UNCONFIGURED/REMOVED 8781c50c410a41b3… 758338dde686 [::ffff:172.20.0.102]:3901 UNCONFIGURED/REMOVED
``` ```
...which means that you are ready to configure your cluster! ...which means that you are ready to [configure your cluster](05_cluster.md)!

View file

@ -7,7 +7,7 @@ as well as the site (think datacenter) of each machine.
## Test cluster ## Test cluster
As this part is not relevant for a test cluster, you can use this one-liner to create a basic topology: As this part is not relevant for a test cluster, you can use this three-liner to create a basic topology:
```bash ```bash
garagectl status | grep UNCONFIGURED | grep -Po '^[0-9a-f]+' | while read id; do garagectl status | grep UNCONFIGURED | grep -Po '^[0-9a-f]+' | while read id; do
@ -19,7 +19,7 @@ done
For our example, we will suppose we have the following infrastructure (Capacity, Identifier and Datacenter are specific values to garage described in the following): For our example, we will suppose we have the following infrastructure (Capacity, Identifier and Datacenter are specific values to garage described in the following):
| Location | Name | Disk Space | `Capacity` | `Identifier` | `Datacenter` | | Location | Name | Disk Space | `Capacity` | `Identifier` | `Zone` |
|----------|---------|------------|------------|--------------|--------------| |----------|---------|------------|------------|--------------|--------------|
| Paris | Mercury | 1 To | `2` | `8781c5` | `par1` | | Paris | Mercury | 1 To | `2` | `8781c5` | `par1` |
| Paris | Venus | 2 To | `4` | `2a638e` | `par1` | | Paris | Venus | 2 To | `4` | `2a638e` | `par1` |
@ -45,6 +45,15 @@ garagectl status
It will display the IP address associated with each node; from the IP address you will be able to recognize the node. It will display the IP address associated with each node; from the IP address you will be able to recognize the node.
### Zones
Zones are simply user-chosen identifiers that identify a group of servers that are grouped together logically.
It is up to the system administrator deploying Garage to decide what "grouped together" means.
In most cases, a zone will correspond to a geographical location (e.g. a datacenter).
Behind the scenes, Garage will use the zone definitions to try to store copies of the same data in different zones,
in order to provide high availability despite the failure of a zone.
### Capacity ### Capacity
Garage reasons on an arbitrary metric about disk storage that is named the *capacity* of a node. Garage reasons on an arbitrary metric about disk storage that is named the *capacity* of a node.
@ -55,19 +64,19 @@ Additionaly, the capacity values used in Garage should be as small as possible,
Here we chose that 1 unit of capacity = 0.5 To, so that we can express servers of size Here we chose that 1 unit of capacity = 0.5 To, so that we can express servers of size
1 To and 2 To, as wel as the intermediate size 1.5 To. 1 To and 2 To, as wel as the intermediate size 1.5 To.
### Datacenter Note that the amount of data stored by Garage on each server may not be strictly proportional to
its capacity value, as Garage will prioritize having 3 copies of data in different zones,
Datacenter are simply a user-chosen identifier that identify a group of server that are located in the same place. even if this means that capacities will not be strictly respected. For example in our above examples,
It is up to the system administrator deploying garage to identify what does "the same place" means. nodes Earth and Mars will always store a copy of everything each, and the third copy will
Behind the scene, garage will try to store the same data on different sites to provide high availability despite a data center failure. have 66% chance of being stored by Venus and 33% chance of being stored by Mercury.
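The capacity arithmetic described in this section can be sketched in a few lines of shell. Both function names are hypothetical; the 0.5 To unit is the one chosen in this guide, and the share computation assumes that, within a zone, the copy is spread proportionally to capacity, which is what the 66% / 33% figures for Venus and Mercury suggest.

```bash
# Hypothetical helpers, for illustration only (not part of Garage).

# Convert a disk size in To to capacity units, assuming 1 unit = 0.5 To.
capacity_units() {
  awk -v size="$1" -v unit="${2:-0.5}" 'BEGIN { printf "%d\n", int(size / unit + 0.5) }'
}

# Percentage (rounded down) of a zone's copy stored on a node of capacity $1
# when the only other node in the zone has capacity $2.
share_percent() {
  echo $(( 100 * $1 / ($1 + $2) ))
}

capacity_units 1      # Mercury
capacity_units 2      # Venus and Earth
capacity_units 1.5    # Mars
share_percent 4 2     # Venus, sharing zone par1 with Mercury
share_percent 2 4     # Mercury
```

Keeping the unit as large as possible (here 0.5 To) keeps the capacity values small, as recommended above.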
### Inject the topology ### Inject the topology
Given the information above, we will configure our cluster as follow: Given the information above, we will configure our cluster as follow:
``` ```
garagectl node configure --datacenter par1 -c 2 -t mercury 8781c5 garagectl node configure -z par1 -c 2 -t mercury 8781c5
garagectl node configure --datacenter par1 -c 4 -t venus 2a638e garagectl node configure -z par1 -c 4 -t venus 2a638e
garagectl node configure --datacenter lon1 -c 4 -t earth 68143d garagectl node configure -z lon1 -c 4 -t earth 68143d
garagectl node configure --datacenter bru1 -c 3 -t mars 212f75 garagectl node configure -z bru1 -c 3 -t mars 212f75
``` ```
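To double-check a topology before injecting it, it can help to sum the capacity assigned to each zone. The sketch below does this with `awk`; the function name `zone_capacities` and its input format (one `zone capacity` pair per line) are assumptions made for this example and are not something the Garage CLI produces.

```bash
# Hypothetical helper (not part of Garage): read "zone capacity" pairs on
# standard input and print the total capacity per zone, sorted by zone name.
zone_capacities() {
  awk '{ total[$1] += $2 } END { for (z in total) print z, total[z] }' | sort
}

# Values taken from the example topology above.
zone_capacities <<'EOF'
par1 2
par1 4
lon1 4
bru1 3
EOF
```

With three zones and three-way replication, each zone ends up holding one copy of everything, regardless of how these totals compare.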

View file

@ -4,6 +4,9 @@ We recommend the use of MinIO Client to interact with Garage files (`mc`).
Instructions to install it and use it are provided on the [MinIO website](https://docs.min.io/docs/minio-client-quickstart-guide.html). Instructions to install it and use it are provided on the [MinIO website](https://docs.min.io/docs/minio-client-quickstart-guide.html).
Before reading the following, you need a working `mc` command on your path. Before reading the following, you need a working `mc` command on your path.
Note that on certain Linux distributions such as Arch Linux, the MinIO Client binary
is called `mcli` instead of `mc` (to avoid a name clash with Midnight Commander).
## Configure `mc` ## Configure `mc`
You need your access key and secret key created in the [previous section](bucket.md). You need your access key and secret key created in the [previous section](bucket.md).

View file

@ -1,222 +0,0 @@
# Configure the daemon
Garage is a software that can be run only in a cluster and requires at least 3 instances.
In our getting started guide, we document two deployment types:
- [Test deployment](#test-deployment) through `docker-compose`
- [Real-world deployment](#real-world-deployment) through `docker` or `systemd`
In any case, you first need to generate TLS certificates, as traffic is encrypted between Garage's nodes.
## Generating a TLS Certificate
To generate your TLS certificates, run on your machine:
```
wget https://git.deuxfleurs.fr/Deuxfleurs/garage/raw/branch/master/genkeys.sh
chmod +x genkeys.sh
./genkeys.sh
```
It will create a folder named `pki` containing the keys that you will use for the cluster.
## Test deployment
Single machine deployment is only described through `docker-compose`.
Before starting, we recommend you create a folder for our deployment:
```bash
mkdir garage-single
cd garage-single
```
We start by creating a file named `docker-compose.yml` describing our network and our containers:
```yml
version: '3.4'
networks: { virtnet: { ipam: { config: [ subnet: 172.20.0.0/24 ]}}}
services:
g1:
image: lxpz/garage_amd64:v0.1.1d
networks: { virtnet: { ipv4_address: 172.20.0.101 }}
volumes:
- "./pki:/pki"
- "./config.toml:/garage/config.toml"
g2:
image: lxpz/garage_amd64:v0.1.1d
networks: { virtnet: { ipv4_address: 172.20.0.102 }}
volumes:
- "./pki:/pki"
- "./config.toml:/garage/config.toml"
g3:
image: lxpz/garage_amd64:v0.1.1d
networks: { virtnet: { ipv4_address: 172.20.0.103 }}
volumes:
- "./pki:/pki"
- "./config.toml:/garage/config.toml"
```
*We define a static network here, which is not considered a best practice on Docker.
The rationale is that Garage only supports IP addresses and not domain names in its configuration, so we need to know the IP addresses in advance.*
Then create the `config.toml` file next to it as follows:
```toml
metadata_dir = "/garage/meta"
data_dir = "/garage/data"
rpc_bind_addr = "[::]:3901"
bootstrap_peers = [
"172.20.0.101:3901",
"172.20.0.102:3901",
"172.20.0.103:3901",
]
[rpc_tls]
ca_cert = "/pki/garage-ca.crt"
node_cert = "/pki/garage.crt"
node_key = "/pki/garage.key"
[s3_api]
s3_region = "garage"
api_bind_addr = "[::]:3900"
[s3_web]
bind_addr = "[::]:3902"
root_domain = ".web.garage"
index = "index.html"
```
*Please note that we have not mounted `/garage/meta` or `/garage/data` on the host: data will be lost when the container will be destroyed.*
And that's all, you are ready to launch your cluster!
```
sudo docker-compose up
```
While your daemons are up, your cluster is still not configured yet.
However, you can check that your services are still listening as expected by querying them from your host:
```bash
curl http://172.20.0.{101,102,103}:3902
```
which should give you:
```
Not found
Not found
Not found
```
That's all, you are ready to [configure your cluster!](./cluster.md).
## Real-world deployment
Before deploying garage on your infrastructure, you must inventory your machines.
For our example, we will suppose the following infrastructure:
| Location | Name | IP Address | Disk Space |
|----------|---------|------------|------------|
| Paris | Mercury | fc00:1::1 | 1 To |
| Paris | Venus | fc00:1::2 | 2 To |
| London | Earth | fc00:B::1 | 2 To |
| Brussels | Mars | fc00:F::1 | 1.5 To |
On each machine, we will have a similar setup, especially you must consider the following folders/files:
- `/etc/garage/pki`: Garage certificates, must be generated on your computer and copied on the servers
- `/etc/garage/config.toml`: Garage daemon's configuration (defined below)
- `/etc/systemd/system/garage.service`: Service file to start garage at boot automatically (defined below, not required if you use docker)
- `/var/lib/garage/meta`: Contains Garage's metadata, put this folder on a SSD if possible
- `/var/lib/garage/data`: Contains Garage's data; this folder will grow and must be on large storage, possibly big HDDs.
A valid `/etc/garage/config.toml` for our cluster would be:
```toml
metadata_dir = "/var/lib/garage/meta"
data_dir = "/var/lib/garage/data"
rpc_bind_addr = "[::]:3901"
bootstrap_peers = [
"[fc00:1::1]:3901",
"[fc00:1::2]:3901",
"[fc00:B::1]:3901",
"[fc00:F::1]:3901",
]
[rpc_tls]
ca_cert = "/etc/garage/pki/garage-ca.crt"
node_cert = "/etc/garage/pki/garage.crt"
node_key = "/etc/garage/pki/garage.key"
[s3_api]
s3_region = "garage"
api_bind_addr = "[::]:3900"
[s3_web]
bind_addr = "[::]:3902"
root_domain = ".web.garage"
index = "index.html"
```
Please make sure to change `bootstrap_peers` to **your** IP addresses!
### For docker users
On each machine, you can run the daemon with:
```bash
docker run \
-d \
--name garaged \
--restart always \
--network host \
-v /etc/garage/pki:/etc/garage/pki \
-v /etc/garage/config.toml:/garage/config.toml \
-v /var/lib/garage/meta:/var/lib/garage/meta \
-v /var/lib/garage/data:/var/lib/garage/data \
lxpz/garage_amd64:v0.1.1d
```
It should restart automatically at each reboot.
Please note that we use host networking as otherwise Docker containers cannot communicate over IPv6.
To upgrade, simply stop and remove this container and run the command again with a new version of Garage.
### For systemd/raw binary users
Create a file named `/etc/systemd/system/garage.service`:
```toml
[Unit]
Description=Garage Data Store
After=network-online.target
Wants=network-online.target
[Service]
Environment='RUST_LOG=garage=info' 'RUST_BACKTRACE=1'
ExecStart=/usr/local/bin/garage server -c /etc/garage/config.toml
[Install]
WantedBy=multi-user.target
```
To start the service then automatically enable it at boot:
```bash
sudo systemctl start garage
sudo systemctl enable garage
```
To see if the service is running and to browse its logs:
```bash
sudo systemctl status garage
sudo journalctl -u garage
```
If you want to modify the service file, do not forget to run `systemctl daemon-reload`
to inform `systemd` of your modifications.

View file

@ -0,0 +1,4 @@
# Garage CLI
The Garage CLI is mostly self-documented. Make use of the `help` subcommand
and the `--help` flag to discover all available options.

View file

@ -0,0 +1,196 @@
# Garage configuration file format reference
Here is an example `garage.toml` configuration file that illustrates all of the possible options:
```toml
metadata_dir = "/var/lib/garage/meta"
data_dir = "/var/lib/garage/data"
block_size = 1048576
replication_mode = "3"
rpc_bind_addr = "[::]:3901"
bootstrap_peers = [
"[fc00:1::1]:3901",
"[fc00:1::2]:3901",
"[fc00:B::1]:3901",
"[fc00:F::1]:3901",
]
consul_host = "consul.service"
consul_service_name = "garage-daemon"
max_concurrent_rpc_requests = 12
sled_cache_capacity = 134217728
sled_flush_every_ms = 2000
[rpc_tls]
ca_cert = "/etc/garage/pki/garage-ca.crt"
node_cert = "/etc/garage/pki/garage.crt"
node_key = "/etc/garage/pki/garage.key"
[s3_api]
s3_region = "garage"
api_bind_addr = "[::]:3900"
[s3_web]
bind_addr = "[::]:3902"
root_domain = ".web.garage"
index = "index.html"
```
The following gives details about each available configuration option.
## Available configuration options
#### `metadata_dir`
The directory in which Garage will store its metadata. This contains the node identifier,
the network configuration and the peer list, the list of buckets and keys as well
as the index of all objects, object version and object blocks.
Store this folder on a fast SSD drive if possible to maximize Garage's performance.
#### `data_dir`
The directory in which Garage will store the data blocks of objects.
This folder can be placed on an HDD. The space available for `data_dir`
should be counted to determine a node's capacity
when [configuring it](../getting_started/05_cluster.md).
#### `block_size`
Garage splits stored objects in consecutive chunks of size `block_size` (except the last
one which might be smaller). The default size is 1MB and should work in most cases.
If you are interested in tuning this, feel free to do so (and remember to report your
findings to us!)
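As a rough illustration of the splitting described above, the following sketch computes how many blocks an object of a given size would occupy and how large the last block would be. This is plain arithmetic under the assumption that blocks are filled in order; the function name `split_into_blocks` is hypothetical and this is not Garage's implementation.

```bash
# Illustrative arithmetic only (not Garage's code).
# split_into_blocks SIZE [BLOCK_SIZE] prints "<block count> <size of last block>".
split_into_blocks() {
  local size=$1
  local block_size=${2:-1048576}   # value used in the example configuration
  if [ "$size" -eq 0 ]; then
    echo "0 0"                     # an empty object has no data block
    return
  fi
  local count=$(( (size + block_size - 1) / block_size ))   # ceiling division
  local last=$(( size - (count - 1) * block_size ))         # remainder, or a full block
  echo "$count $last"
}

split_into_blocks 2621440   # a 2.5 MiB object: two full blocks plus one half block
```

Only the last block can be smaller than `block_size`; an object whose size is an exact multiple of `block_size` ends with a full block.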
#### `replication_mode`
Garage supports the following replication modes:
- `none` or `1`: data stored on Garage is stored on a single node. There is no redundancy,
and data will be unavailable as soon as one node fails or its network is disconnected.
Do not use this for anything other than test deployments.
- `2`: data stored on Garage will be stored on two different nodes, if possible in different
zones. Garage tolerates one node failure before losing data. Data should be available
read-only when one node is down, but write operations will fail.
Use this only if you really have to.
- `3`: data stored on Garage will be stored on three different nodes, if possible each in
a different zone.
Garage tolerates two node failures before losing data. Data should be available
read-only when two nodes are down, and writes should be possible if only a single node
is down.
Note that in modes `2` and `3`,
if at least the same number of zones are available, an arbitrary number of failures in
any given zone is tolerated as copies of data will be spread over several zones.
**Make sure `replication_mode` is the same in the configuration files of all nodes.
Never run a Garage cluster where that is not the case.**
Changing the `replication_mode` of a cluster might work (make sure to shut down all nodes
and change it everywhere at the same time), but is not officially supported.
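The failure tolerance of each replication mode listed above can be summarised as a small lookup. The helper below only restates the figures given in this section (number of copies, and how many node failures still allow reads and writes); its name and output format are made up for this example.

```bash
# Hypothetical helper (not part of Garage): restate, for a given
# replication_mode, how many node failures still allow reads and writes
# according to the description in this section.
replication_tolerance() {
  case "$1" in
    none|1) echo "copies=1 reads_survive=0 writes_survive=0" ;;
    2)      echo "copies=2 reads_survive=1 writes_survive=0" ;;
    3)      echo "copies=3 reads_survive=2 writes_survive=1" ;;
    *)      echo "unknown replication_mode: $1" >&2; return 1 ;;
  esac
}

replication_tolerance 3
```

Mode `3` is the only one in which a single node failure leaves the cluster fully writable, which is why it is the mode used in the real-world deployment guide.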
#### `rpc_bind_addr`
The address and port on which to bind for intra-cluster communications
(referred to as RPC for remote procedure calls).
The port specified here should be the same one that other nodes will use to contact
the node, even in the case of a NAT: the NAT should be configured to forward the external
port number to the same internal port number. This means that if you have several nodes running
behind a NAT, they should each use a different RPC port number.
#### `bootstrap_peers`
A list of IPs and ports on which to contact other Garage peers of this cluster.
This should correspond to the RPC ports set up with `rpc_bind_addr`.
#### `consul_host` and `consul_service_name`
Garage supports discovering other nodes of the cluster using Consul.
This works only when nodes are announced in Consul by an orchestrator such as Nomad,
as Garage is not able to announce itself.
The `consul_host` parameter should be set to the hostname of the Consul server,
and `consul_service_name` should be set to the service name under which Garage's
RPC ports are announced.
#### `max_concurrent_rpc_requests`
Garage implements rate limiting for RPC requests: no more than
`max_concurrent_rpc_requests` concurrent outbound RPC requests will be made
by a Garage node (additional requests will be put in a waiting queue).
#### `sled_cache_capacity`
This parameter can be used to tune the capacity of the cache used by
[sled](https://sled.rs), the database Garage uses internally to store metadata.
Tune this to fit the RAM you wish to make available to your Garage instance.
More cache means faster Garage, but the default value (128MB) should be plenty
for most use cases.
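The example configuration writes `sled_cache_capacity` as `134217728`, which corresponds to the 128MB default mentioned above, so the value appears to be expressed in bytes. A one-line helper (the name `mib_to_bytes` is hypothetical) makes it easier to compute such values for other sizes:

```bash
# Hypothetical helper: convert a size in MiB to bytes using shell arithmetic.
mib_to_bytes() {
  echo $(( $1 * 1024 * 1024 ))
}

mib_to_bytes 128   # the sled_cache_capacity value from the example configuration
mib_to_bytes 1     # the block_size value from the example configuration
```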
#### `sled_flush_every_ms`
This parameter can be used to tune the flushing interval of sled.
Increase this if sled is thrashing your SSD, at the risk of losing more data in case
of a power outage (though this should not matter much as data is replicated on other
nodes). The default value, 2000ms, should be appropriate for most use cases.
## The `[rpc_tls]` section
This section should be used to configure the TLS certificates used to encrypt
intra-cluster traffic (RPC traffic). The following parameters should be set:
- `ca_cert`: the certificate of the CA that is allowed to sign individual node certificates
- `node_cert`: the node certificate for the current node
- `node_key`: the key associated with the node certificate
Note that several nodes may use the same node certificate, as long as it is signed
by the CA.
If this section is absent, TLS is not used to encrypt intra-cluster traffic.
## The `[s3_api]` section
#### `api_bind_addr`
The IP and port on which to bind for accepting S3 API calls.
This endpoint does not support TLS: a reverse proxy should be used to provide it.
#### `s3_region`
Garage will accept S3 API calls that are targeted to the S3 region defined here.
API calls targeted to other regions will fail with an `AuthorizationHeaderMalformed` error
message that redirects the client to the correct region.
## The `[s3_web]` section
Garage can publish the contents of buckets as websites. This section configures the
behaviour of this module.
#### `bind_addr`
The IP and port on which to bind for accepting HTTP requests to buckets configured
for website access.
This endpoint does not support TLS: a reverse proxy should be used to provide it.
#### `root_domain`
The optional suffix appended to bucket names for the corresponding HTTP Host.
For instance, if `root_domain` is `web.garage.eu`, a bucket called `deuxfleurs.fr`
will be accessible either with hostname `deuxfleurs.fr.web.garage.eu`
or with hostname `deuxfleurs.fr`.
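The mapping between HTTP Host names and bucket names described above can be sketched as follows. This is a hypothetical illustration of the rule, not Garage's code: the function name `bucket_for_host` is made up, and accepting the root domain with or without a leading dot is an assumption made so that both spellings used in this documentation work.

```bash
# Hypothetical sketch (not Garage's code): derive the bucket name from the
# HTTP Host and the configured root_domain.
bucket_for_host() {
  local host=$1
  local root=${2#.}   # accept the root domain with or without a leading dot
  case "$host" in
    *".$root") printf '%s\n' "${host%".$root"}" ;;   # strip the root_domain suffix
    *)         printf '%s\n' "$host" ;;              # the host itself names the bucket
  esac
}

bucket_for_host deuxfleurs.fr.web.garage.eu web.garage.eu
bucket_for_host deuxfleurs.fr web.garage.eu
```

Both calls print `deuxfleurs.fr`, matching the two hostnames given in the example above.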
#### `index`
The name of the index file to return for requests ending with `/` (usually `index.html`).

View file

@ -1,6 +1,6 @@
## S3 Compatibility status # S3 Compatibility status
### Global S3 features ## Global S3 features
Implemented: Implemented:
@ -18,7 +18,7 @@ Not implemented:
- most `x-amz-` headers - most `x-amz-` headers
### Endpoint implementation ## Endpoint implementation
All APIs that are not mentionned are not implemented and will return a 400 bad request. All APIs that are not mentionned are not implemented and will return a 400 bad request.

View file

@ -1,8 +1,8 @@
## Load Balancing Data (planned for version 0.2) # Load Balancing Data (planned for version 0.2)
I have conducted a quick study of different methods to load-balance data over different Garage nodes using consistent hashing. I have conducted a quick study of different methods to load-balance data over different Garage nodes using consistent hashing.
### Requirements ## Requirements
- *good balancing*: two nodes that have the same announced capacity should receive close to the same number of items - *good balancing*: two nodes that have the same announced capacity should receive close to the same number of items
@ -15,9 +15,9 @@ I have conducted a quick study of different methods to load-balance data over di
replicas, independently of the order in which nodes were added/removed (this replicas, independently of the order in which nodes were added/removed (this
is to keep the implementation simple) is to keep the implementation simple)
### Methods ## Methods
#### Naive multi-DC ring walking strategy ### Naive multi-DC ring walking strategy
This strategy can be used with any ring-like algorithm to make it aware of the *multi-datacenter* requirement: This strategy can be used with any ring-like algorithm to make it aware of the *multi-datacenter* requirement:
@ -38,7 +38,7 @@ This method was implemented in the first version of Garage, with the basic
ring construction from Dynamo DB that consists in associating `n_token` random positions to ring construction from Dynamo DB that consists in associating `n_token` random positions to
each node (I know it's not optimal, the Dynamo paper already studies this). each node (I know it's not optimal, the Dynamo paper already studies this).
#### Better rings ### Better rings
The ring construction that selects `n_token` random positions for each nodes gives a ring of positions that The ring construction that selects `n_token` random positions for each nodes gives a ring of positions that
is not well-balanced: the space between the tokens varies a lot, and some partitions are thus bigger than others. is not well-balanced: the space between the tokens varies a lot, and some partitions are thus bigger than others.
@ -150,7 +150,7 @@ removing grisou gipsie : 49.22% 36.52% 12.79% 1.46%
on average: 62.94% 27.89% 8.61% 0.57% <-- WORSE THAN PREVIOUSLY on average: 62.94% 27.89% 8.61% 0.57% <-- WORSE THAN PREVIOUSLY
``` ```
#### The magical solution: multi-DC aware MagLev ### The magical solution: multi-DC aware MagLev
Suppose we want to select three replicas for each partition (this is what we do in our simulation and in most Garage deployments). Suppose we want to select three replicas for each partition (this is what we do in our simulation and in most Garage deployments).
We apply MagLev three times consecutively, one for each replica selection. We apply MagLev three times consecutively, one for each replica selection.