mirror of
https://git.deuxfleurs.fr/Deuxfleurs/garage.git
synced 2024-11-10 18:21:06 +00:00
Merge pull request 'updates to documentation for v0.8' (#385) from doc-0.8 into main
Reviewed-on: https://git.deuxfleurs.fr/Deuxfleurs/garage/pulls/385
Commit 4fba06d62e
17 changed files with 242 additions and 101 deletions
|
@ -5,12 +5,14 @@ weight = 25
|
|||
|
||||
## Configuring a bucket for website access
|
||||
|
||||
There are two methods to expose buckets as website:
|
||||
There are three methods to expose buckets as websites:
|
||||
|
||||
1. using the PutBucketWebsite S3 API call, which is allowed for access keys that have the owner permission bit set
|
||||
|
||||
2. from the Garage CLI, by an administrator of the cluster
|
||||
|
||||
3. using the Garage administration API
|
||||
|
||||
The `PutBucketWebsite` API endpoint [is documented](https://docs.aws.amazon.com/AmazonS3/latest/API/API_PutBucketWebsite.html) in the official AWS docs.
|
||||
This endpoint can also be called [using `aws s3api`](https://docs.aws.amazon.com/cli/latest/reference/s3api/put-bucket-website.html) on the command line.
|
||||
The website configuration supported by Garage is only a subset of the possibilities on Amazon S3: redirections are not supported, only the index document and error document can be specified.
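
As an illustration, a website configuration restricted to the two fields Garage honors can be built as follows. This is a minimal sketch: the helper name `website_configuration` and the document names `index.html` and `error.html` are assumptions made for the example, not part of Garage.

```python
import json


def website_configuration(index_suffix="index.html", error_key="error.html"):
    """Build a website configuration using only the index and error documents.

    Redirection fields (RedirectAllRequestsTo, RoutingRules) are deliberately
    left out, since the text above states Garage does not support them.
    """
    return {
        "IndexDocument": {"Suffix": index_suffix},
        "ErrorDocument": {"Key": error_key},
    }


config = website_configuration()
# Print the JSON document in the shape expected by the S3 API.
print(json.dumps(config))
```

The printed JSON can be passed to `aws s3api put-bucket-website` through its `--website-configuration` option, with `--endpoint-url` pointing at your Garage S3 endpoint.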
|
||||
|
|
|
@ -20,57 +20,76 @@ sudo apt-get update
|
|||
sudo apt-get install build-essential
|
||||
```
|
||||
|
||||
## Using source from the Gitea repository (recommended)
|
||||
## Building from source from the Gitea repository
|
||||
|
||||
The primary location for Garage's source code is the
|
||||
[Gitea repository](https://git.deuxfleurs.fr/Deuxfleurs/garage).
|
||||
[Gitea repository](https://git.deuxfleurs.fr/Deuxfleurs/garage),
|
||||
which contains all of the released versions as well as the code
|
||||
for the development of the next version.
|
||||
|
||||
Clone the repository and build Garage with the following commands:
|
||||
Clone the repository and enter it as follows:
|
||||
|
||||
```bash
|
||||
git clone https://git.deuxfleurs.fr/Deuxfleurs/garage.git
|
||||
cd garage
|
||||
cargo build
|
||||
```
|
||||
|
||||
Be careful, as this will make a debug build of Garage, which will be extremely slow!
|
||||
To make a release build, invoke `cargo build --release` (this takes much longer).
|
||||
|
||||
The binaries built this way are found in `target/{debug,release}/garage`.
|
||||
|
||||
## Using source from `crates.io`
|
||||
|
||||
Garage's source code is published on `crates.io`, Rust's official package repository.
|
||||
This means you can simply ask `cargo` to download and build this source code for you:
|
||||
If you wish to build a specific version of Garage, check out the corresponding tag. For instance:
|
||||
|
||||
```bash
|
||||
cargo install garage
|
||||
git tag # List available tags
|
||||
git checkout v0.8.0 # Change v0.8.0 with the version you wish to build
|
||||
```
|
||||
|
||||
That's all, `garage` should be in `$HOME/.cargo/bin`.
|
||||
Otherwise you will be building a development build from the `main` branch
|
||||
that includes all of the changes to be released in the next version.
|
||||
Be careful that such a build might be unstable or contain bugs,
|
||||
and could be incompatible with nodes that run stable versions of Garage.
|
||||
|
||||
You can add this folder to your `$PATH` or copy the binary somewhere else on your system.
|
||||
For instance:
|
||||
Finally, build Garage with the following command:
|
||||
|
||||
```bash
|
||||
sudo cp $HOME/.cargo/bin/garage /usr/local/bin/garage
|
||||
cargo build --release
|
||||
```
|
||||
|
||||
The binary built this way can now be found in `target/release/garage`.
|
||||
You may simply copy this binary to somewhere in your `$PATH` in order to
|
||||
have the `garage` command available in your shell, for instance:
|
||||
|
||||
## Selecting features to activate in your build
|
||||
```bash
|
||||
sudo cp target/release/garage /usr/local/bin/garage
|
||||
```
|
||||
|
||||
Garage supports a number of compilation options in the form of Cargo features,
|
||||
If you are planning to develop Garage,
|
||||
you might be interested in producing debug builds, which compile faster but run slower:
|
||||
this can be done by removing the `--release` flag, and the resulting build can then
|
||||
be found in `target/debug/garage`.
|
||||
|
||||
## List of available Cargo feature flags
|
||||
|
||||
Garage supports a number of compilation options in the form of Cargo feature flags,
|
||||
which can be used to provide builds adapted to your system and your use case.
|
||||
The following features are available:
|
||||
To produce a build with a given set of features, invoke the `cargo build` command
|
||||
as follows:
|
||||
|
||||
| Feature | Enabled | Description |
|
||||
| ------- | ------- | ----------- |
|
||||
| `bundled-libs` | BY DEFAULT | Use bundled version of sqlite3, zstd, lmdb and libsodium |
|
||||
| `system-libs` | optional | Use system version of sqlite3, zstd, lmdb and libsodium if available (exclusive with `bundled-libs`, build using `cargo build --no-default-features --features system-libs`) |
|
||||
| `k2v` | optional | Enable the experimental K2V API (if used, all nodes on your Garage cluster must have it enabled as well) |
|
||||
| `kubernetes-discovery` | optional | Enable automatic registration and discovery of cluster nodes through the Kubernetes API |
|
||||
| `metrics` | BY DEFAULT | Enable collection of metrics in Prometheus format on the admin API |
|
||||
```bash
|
||||
# This will build the default feature set plus feature1, feature2 and feature3
|
||||
cargo build --release --features feature1,feature2,feature3
|
||||
# This will build ONLY feature1, feature2 and feature3
|
||||
cargo build --release --no-default-features \
|
||||
--features feature1,feature2,feature3
|
||||
```
|
||||
|
||||
The following feature flags are available in v0.8.0:
|
||||
|
||||
| Feature flag | Enabled | Description |
|
||||
| ------------ | ------- | ----------- |
|
||||
| `bundled-libs` | *by default* | Use bundled version of sqlite3, zstd, lmdb and libsodium |
|
||||
| `system-libs` | optional | Use system version of sqlite3, zstd, lmdb and libsodium<br>if available (exclusive with `bundled-libs`, build using<br>`cargo build --no-default-features --features system-libs`) |
|
||||
| `k2v` | optional | Enable the experimental K2V API (if used, all nodes on your<br>Garage cluster must have it enabled as well) |
|
||||
| `kubernetes-discovery` | optional | Enable automatic registration and discovery<br>of cluster nodes through the Kubernetes API |
|
||||
| `metrics` | *by default* | Enable collection of metrics in Prometheus format on the admin API |
|
||||
| `telemetry-otlp` | optional | Enable collection of execution traces using OpenTelemetry |
|
||||
| `sled` | BY DEFAULT | Enable using Sled to store Garage's metadata |
|
||||
| `sled` | *by default* | Enable using Sled to store Garage's metadata |
|
||||
| `lmdb` | optional | Enable using LMDB to store Garage's metadata |
|
||||
| `sqlite` | optional | Enable using Sqlite3 to store Garage's metadata |
|
||||
|
|
|
@ -1,6 +1,6 @@
|
|||
+++
|
||||
title = "Benchmarks"
|
||||
weight = 10
|
||||
weight = 40
|
||||
+++
|
||||
|
||||
With Garage, we wanted to build a software defined storage service that follows the [KISS principle](https://en.wikipedia.org/wiki/KISS_principle),
|
||||
|
|
|
@ -1,13 +1,13 @@
|
|||
+++
|
||||
title = "Goals and use cases"
|
||||
weight = 5
|
||||
weight = 10
|
||||
+++
|
||||
|
||||
## Goals and non-goals
|
||||
|
||||
Garage is a lightweight geo-distributed data store that implements the
|
||||
[Amazon S3](https://docs.aws.amazon.com/AmazonS3/latest/API/Welcome.html)
|
||||
object storage protocole. It enables applications to store large blobs such
|
||||
object storage protocol. It enables applications to store large blobs such
|
||||
as pictures, video, images, documents, etc., in a redundant multi-node
|
||||
setting. S3 is versatile enough to also be used to publish a static
|
||||
website.
|
||||
|
|
|
@ -20,6 +20,49 @@ In the meantime, you can find some information at the following links:
|
|||
- [an old design draft](@/documentation/working-documents/design-draft.md)
|
||||
|
||||
|
||||
## Request routing logic
|
||||
|
||||
Data retrieval requests to Garage endpoints (S3 API and websites) are resolved
|
||||
to an individual object in a bucket. Since objects are replicated to multiple nodes,
|
||||
Garage must ensure consistency before answering the request.
|
||||
|
||||
### Using quorum to ensure consistency
|
||||
|
||||
Garage ensures consistency by attempting to establish a quorum with the
|
||||
data nodes responsible for the object. When a majority of the data nodes
|
||||
have provided metadata on an object, Garage can then answer the request.
|
||||
|
||||
When a request arrives Garage will, assuming the recommended 3 replicas, perform the following actions:
|
||||
|
||||
- Make a request to the two preferred nodes for object metadata
|
||||
- Try the third node if one of the two initial requests fails
|
||||
- Check that the metadata from at least 2 nodes match
|
||||
- Check that the object hasn't been marked deleted
|
||||
- Answer the request with inline data from metadata if object is small enough
|
||||
- Or get data blocks from the preferred nodes and answer using the assembled object
|
||||
|
||||
Garage dynamically determines which nodes to query based on health, preference, and
|
||||
which nodes actually host the requested data. Garage has no concept of "primary", so any
|
||||
healthy node with the data can be used as long as a quorum is reached for the metadata.
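
The metadata part of the read path described above can be sketched in Python. This is a simplified illustration under stated assumptions, not Garage's actual implementation (which is written in Rust): the `Metadata` record, the `fetch_metadata` callback and the `quorum_read` function are hypothetical names, nodes are contacted one after the other rather than in parallel, and contacting a further node when the first answers disagree is a choice made for this sketch.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass(frozen=True)
class Metadata:
    """Hypothetical metadata record for one object."""
    version: int
    deleted: bool = False


# Hypothetical callback: returns a node's metadata, or None if the node failed.
FetchFn = Callable[[str], Optional[Metadata]]


def quorum_read(nodes: Sequence[str], fetch_metadata: FetchFn,
                quorum: int = 2) -> Optional[Metadata]:
    """Return the metadata that `quorum` nodes agree on.

    `nodes` is ordered by preference. Returns None when the agreed-upon
    metadata marks the object as deleted, and raises RuntimeError when
    no quorum can be reached.
    """
    votes = Counter()
    for node in nodes:
        answer = fetch_metadata(node)
        if answer is None:
            continue  # failed request: fall through to the next node
        votes[answer] += 1
        if votes[answer] >= quorum:
            # Enough matching answers: no need to contact further nodes.
            return None if answer.deleted else answer
    raise RuntimeError("quorum not reached for object metadata")
```

With three replicas and `quorum=2`, the third node is only contacted when one of the first two fails or when their answers differ. Fetching the data blocks themselves is left out of the sketch.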
|
||||
|
||||
### Node health
|
||||
|
||||
Garage keeps a TCP session open to each node in the cluster and periodically pings them. If a connection
|
||||
cannot be established, or a node fails to answer a number of pings, the target node is marked as failed.
|
||||
Failed nodes are not used for quorum or other internal requests.
|
||||
|
||||
### Node preference
|
||||
|
||||
Garage prioritizes which nodes to query according to a few criteria:
|
||||
|
||||
- A node always prefers itself if it can answer the request
|
||||
- Then the node prioritizes nodes in the same zone
|
||||
- Finally the nodes with the lowest latency are prioritized
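
These criteria can be read as a sort order. The sketch below is one possible interpretation, assuming the three criteria act as strict tie-breakers in the order listed; the `Node` record and the `order_by_preference` function are hypothetical names used for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    """Hypothetical view of a peer, as seen from the node handling the request."""
    node_id: str
    zone: str
    latency_ms: float


def order_by_preference(candidates, self_id, self_zone):
    """Sort candidates: the local node first, then same-zone nodes, then by latency."""
    return sorted(
        candidates,
        # False sorts before True, so matching nodes come first at each level.
        key=lambda n: (n.node_id != self_id, n.zone != self_zone, n.latency_ms),
    )
```

For example, a node `n2` in zone `lyon` would place itself first, then other `lyon` nodes, then the remaining nodes by increasing latency.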
|
||||
|
||||
|
||||
For further reading on the cluster structure look at the [gateway](@/documentation/cookbook/gateways.md)
|
||||
and [cluster layout management](@/documentation/reference-manual/layout.md) pages.
|
||||
|
||||
## Garbage collection
|
||||
|
||||
A faulty garbage collection procedure has been the cause of
|
||||
|
|
|
@ -1,6 +1,6 @@
|
|||
+++
|
||||
title = "Related work"
|
||||
weight = 15
|
||||
weight = 50
|
||||
+++
|
||||
|
||||
## Context
|
||||
|
|
|
@ -9,6 +9,15 @@ Let's start your Garage journey!
|
|||
In this chapter, we explain how to deploy Garage as a single-node server
|
||||
and how to interact with it.
|
||||
|
||||
## What is Garage?
|
||||
|
||||
Before jumping in, you might be interested in reading the following pages:
|
||||
|
||||
- [Goals and use cases](@/documentation/design/goals.md)
|
||||
- [List of features](@/documentation/reference-manual/features.md)
|
||||
|
||||
## Scope of this tutorial
|
||||
|
||||
Our goal is to introduce you to Garage's workflows.
|
||||
Following this guide is recommended before moving on to
|
||||
[configuring a multi-node cluster](@/documentation/cookbook/real-world.md).
|
||||
|
|
|
@ -1,6 +1,6 @@
|
|||
+++
|
||||
title = "Administration API"
|
||||
weight = 16
|
||||
weight = 60
|
||||
+++
|
||||
|
||||
The Garage administration API is accessible through a dedicated server whose
|
||||
|
|
|
@ -1,6 +1,6 @@
|
|||
+++
|
||||
title = "Garage CLI"
|
||||
weight = 15
|
||||
weight = 30
|
||||
+++
|
||||
|
||||
The Garage CLI is mostly self-documented. Make use of the `help` subcommand
|
||||
|
|
|
@ -1,6 +1,6 @@
|
|||
+++
|
||||
title = "Configuration file format"
|
||||
weight = 5
|
||||
weight = 20
|
||||
+++
|
||||
|
||||
Here is an example `garage.toml` configuration file that illustrates all of the possible options:
|
||||
|
@ -10,7 +10,6 @@ metadata_dir = "/var/lib/garage/meta"
|
|||
data_dir = "/var/lib/garage/data"
|
||||
|
||||
block_size = 1048576
|
||||
block_manager_background_tranquility = 2
|
||||
|
||||
replication_mode = "3"
|
||||
|
||||
|
@ -87,17 +86,6 @@ files will remain available. This however means that chunks from existing files
|
|||
will not be deduplicated with chunks from newly uploaded files, meaning you
|
||||
might use more storage space than is optimally possible.
|
||||
|
||||
### `block_manager_background_tranquility`
|
||||
|
||||
This parameter tunes the activity of the background worker responsible for
|
||||
resyncing data blocks between nodes. The higher the tranquility value is set,
|
||||
the more the background worker will wait between iterations, meaning the load
|
||||
on the system (including network usage between nodes) will be reduced. The
|
||||
minimal value for this parameter is `0`, where the background worker will
|
||||
always work at maximal throughput to resynchronize blocks. The default value
|
||||
is `2`, where the background worker will try to spend at most 1/3 of its time
|
||||
working, and 2/3 sleeping in order to reduce system load.
|
||||
|
||||
### `replication_mode`
|
||||
|
||||
Garage supports the following replication modes:
|
||||
|
|
125
doc/book/reference-manual/features.md
Normal file
|
@ -0,0 +1,125 @@
|
|||
+++
|
||||
title = "List of Garage features"
|
||||
weight = 10
|
||||
+++
|
||||
|
||||
|
||||
### S3 API
|
||||
|
||||
The main goal of Garage is to provide an object storage service that is compatible with the
|
||||
[S3 API](https://docs.aws.amazon.com/AmazonS3/latest/API/Welcome.html) from Amazon Web Services.
|
||||
We try to adhere as strictly as possible to the semantics of the API as implemented by Amazon
|
||||
and other vendors such as Minio or CEPH.
|
||||
|
||||
Of course Garage does not implement the full span of API endpoints that AWS S3 does;
|
||||
the exact list of S3 features implemented by Garage can be found [on our S3 compatibility page](@/documentation/reference-manual/s3-compatibility.md).
|
||||
|
||||
### Geo-distribution
|
||||
|
||||
Garage allows you to store copies of your data in multiple geographical locations in order to maximize resilience
|
||||
to adverse events, such as network/power outages or hardware failures.
|
||||
This allows Garage to run very well even at home, using consumer-grade Internet connectivity
|
||||
(such as FTTH) and power, as long as cluster nodes can be spawned at several physical locations.
|
||||
Garage exploits knowledge of the capacity and physical location of each storage node to design
|
||||
a storage plan that best exploits the available storage capacity while satisfying the geo-distributed replication constraint.
|
||||
|
||||
To learn more about geo-distributed Garage clusters,
|
||||
read our documentation on [setting up a real-world deployment](@/documentation/cookbook/real-world.md).
|
||||
|
||||
### Standalone/self-contained
|
||||
|
||||
Garage is extremely simple to deploy, and does not depend on any external service to run.
|
||||
This makes setting up and administering storage clusters, we hope, as easy as it could be.
|
||||
|
||||
### Flexible topology
|
||||
|
||||
A Garage cluster can very easily evolve over time, as storage nodes are added or removed.
|
||||
Garage will automatically rebalance data between nodes as needed to ensure the desired number of copies.
|
||||
Read about cluster layout management [here](@/documentation/reference-manual/layout.md).
|
||||
|
||||
### No RAFT slowing you down
|
||||
|
||||
It might seem strange to tout the absence of something as a desirable feature,
|
||||
but this is in fact a very important point! Garage does not use RAFT or another
|
||||
consensus algorithm internally to order incoming requests: this means that all requests
|
||||
directed to a Garage cluster can be handled independently of one another instead
|
||||
of going through a central bottleneck (the leader node).
|
||||
As a consequence, requests can be handled much faster, even in cases where latency
|
||||
between cluster nodes is high (see our [benchmarks](@/documentation/design/benchmarks/index.md) for data on this).
|
||||
This is particularly useful when nodes are far from one another and talk to one another through standard Internet connections.
|
||||
|
||||
### Several replication modes
|
||||
|
||||
Garage supports a variety of replication modes, with 1 copy, 2 copies or 3 copies of your data,
|
||||
and with various levels of consistency, in order to adapt to a variety of usage scenarios.
|
||||
Read our reference page on [supported replication modes](@/documentation/reference-manual/configuration.md#replication-mode)
|
||||
to select the replication mode best suited to your use case (hint: in most cases, `replication_mode = "3"` is what you want).
|
||||
|
||||
### Web server for static websites
|
||||
|
||||
A storage bucket can easily be configured to be served directly by Garage as a static web site.
|
||||
Domain names for multiple websites directly map to bucket names, making it easy to build
|
||||
a platform for your users to autonomously build and host their websites over Garage.
|
||||
Surprisingly, none of the other alternative S3 implementations we surveyed (such as Minio
|
||||
or CEPH) support publishing static websites from S3 buckets, a feature that is however
|
||||
directly inherited from S3 on AWS.
|
||||
Read more on our [dedicated documentation page](@/documentation/cookbook/exposing-websites.md).
|
||||
|
||||
### Bucket names as aliases
|
||||
|
||||
In Garage, a bucket may have several names, known as aliases.
|
||||
Aliases can easily be added and removed on demand:
|
||||
this makes it easy to rename buckets if needed
|
||||
without having to copy all of their content, something that cannot be done on AWS.
|
||||
For buckets served as static websites, having multiple aliases for a bucket can allow
|
||||
exposing the same content under different domain names.
|
||||
|
||||
Garage also supports bucket aliases which are local to a single user:
|
||||
this allows different users to have different buckets with the same name, thus avoiding naming collisions.
|
||||
This can be helpful, for instance, if you want to write an application that creates per-user buckets that always have the same name.
|
||||
|
||||
This feature is totally invisible to S3 clients and does not break compatibility with AWS.
|
||||
|
||||
### Cluster administration API
|
||||
|
||||
Garage provides a fully-fledged REST API to administer your cluster programmatically.
|
||||
Functionality included in the admin API includes: setting up and monitoring
|
||||
cluster nodes, managing access credentials, and managing storage buckets and bucket aliases.
|
||||
A full reference of the administration API is available [here](@/documentation/reference-manual/admin-api.md).
|
||||
|
||||
### Metrics and traces
|
||||
|
||||
Garage makes some internal metrics available in the Prometheus data format,
|
||||
which allows you to build interactive dashboards to visualize the load and internal state of your storage cluster.
|
||||
|
||||
For developers and performance-savvy administrators,
|
||||
Garage also supports exporting traces of what it does internally in OpenTelemetry format.
|
||||
This makes it possible to monitor the time spent at various steps of the processing of requests,
|
||||
in order to detect potential performance bottlenecks.
|
||||
|
||||
### Kubernetes and Nomad integrations
|
||||
|
||||
Garage can automatically discover other nodes in the cluster thanks to integration
|
||||
with orchestrators such as Kubernetes and Nomad (when used with Consul).
|
||||
This eases the configuration of your cluster as it removes one step where nodes need
|
||||
to be manually connected to one another.
|
||||
|
||||
### Support for changing IP addresses
|
||||
|
||||
As long as all of your nodes don't change their IP address at the same time,
|
||||
Garage should be able to tolerate nodes with changing/dynamic IP addresses,
|
||||
as nodes will regularly exchange the IP addresses of their peers and try to
|
||||
reconnect using newer addresses when existing connections are broken.
|
||||
|
||||
### K2V API (experimental)
|
||||
|
||||
As part of an ongoing research project, Garage can expose an experimental key/value storage API called K2V.
|
||||
K2V is made for the storage and retrieval of many small key/value pairs that need to be processed in bulk.
|
||||
This complements the S3 API with an alternative that can be used to easily store and access metadata
|
||||
related to objects stored in an S3 bucket.
|
||||
|
||||
In the context of our research project, [Aérogramme](https://aerogramme.deuxfleurs.fr),
|
||||
K2V is used to provide metadata and log storage for operations on encrypted e-mail storage.
|
||||
|
||||
Learn more on the specification of K2V [here](https://git.deuxfleurs.fr/Deuxfleurs/garage/src/branch/k2v/doc/drafts/k2v-spec.md)
|
||||
and on how to enable it in Garage [here](@/documentation/reference-manual/k2v.md).
|
|
@ -1,6 +1,6 @@
|
|||
+++
|
||||
title = "K2V"
|
||||
weight = 30
|
||||
weight = 70
|
||||
+++
|
||||
|
||||
Starting with version 0.7.2, Garage introduces an optional feature, K2V,
|
||||
|
|
|
@ -1,6 +1,6 @@
|
|||
+++
|
||||
title = "Cluster layout management"
|
||||
weight = 10
|
||||
weight = 50
|
||||
+++
|
||||
|
||||
The cluster layout in Garage is a table that assigns to each node a role in
|
||||
|
|
|
@ -1,45 +0,0 @@
|
|||
+++
|
||||
title = "Request routing logic"
|
||||
weight = 10
|
||||
+++
|
||||
|
||||
Data retrieval requests to Garage endpoints (S3 API and websites) are resolved
|
||||
to an individual object in a bucket. Since objects are replicated to multiple nodes,
|
||||
Garage must ensure consistency before answering the request.
|
||||
|
||||
## Using quorum to ensure consistency
|
||||
|
||||
Garage ensures consistency by attempting to establish a quorum with the
|
||||
data nodes responsible for the object. When a majority of the data nodes
|
||||
have provided metadata on an object, Garage can then answer the request.
|
||||
|
||||
When a request arrives Garage will, assuming the recommended 3 replicas, perform the following actions:
|
||||
|
||||
- Make a request to the two preferred nodes for object metadata
|
||||
- Try the third node if one of the two initial requests fails
|
||||
- Check that the metadata from at least 2 nodes match
|
||||
- Check that the object hasn't been marked deleted
|
||||
- Answer the request with inline data from metadata if object is small enough
|
||||
- Or get data blocks from the preferred nodes and answer using the assembled object
|
||||
|
||||
Garage dynamically determines which nodes to query based on health, preference, and
|
||||
which nodes actually host the requested data. Garage has no concept of "primary", so any
|
||||
healthy node with the data can be used as long as a quorum is reached for the metadata.
|
||||
|
||||
## Node health
|
||||
|
||||
Garage keeps a TCP session open to each node in the cluster and periodically pings them. If a connection
|
||||
cannot be established, or a node fails to answer a number of pings, the target node is marked as failed.
|
||||
Failed nodes are not used for quorum or other internal requests.
|
||||
|
||||
## Node preference
|
||||
|
||||
Garage prioritizes which nodes to query according to a few criteria:
|
||||
|
||||
- A node always prefers itself if it can answer the request
|
||||
- Then the node prioritizes nodes in the same zone
|
||||
- Finally the nodes with the lowest latency are prioritized
|
||||
|
||||
|
||||
For further reading on the cluster structure look at the [gateway](@/documentation/cookbook/gateways.md)
|
||||
and [cluster layout management](@/documentation/reference-manual/layout.md) pages.
|
|
@ -1,6 +1,6 @@
|
|||
+++
|
||||
title = "S3 Compatibility status"
|
||||
weight = 20
|
||||
weight = 40
|
||||
+++
|
||||
|
||||
## DISCLAIMER
|
||||
|
|
|
@ -1,6 +1,6 @@
|
|||
+++
|
||||
title = "Design draft"
|
||||
weight = 25
|
||||
title = "Design draft (obsolete)"
|
||||
weight = 50
|
||||
+++
|
||||
|
||||
**WARNING: this documentation is a design draft which was written before Garage's actual implementation.
|
||||
|
|
|
@ -1,6 +1,6 @@
|
|||
+++
|
||||
title = "Load balancing data"
|
||||
weight = 10
|
||||
title = "Load balancing data (obsolete)"
|
||||
weight = 60
|
||||
+++
|
||||
|
||||
**This is still being improved in release 0.5. The working document has not been updated yet; it still only applies to Garage 0.2 through 0.4.**
|
||||
|
|