
Implement ListMultipartUploads (#171)

Implement ListMultipartUploads, also refactor ListObjects and ListObjectsV2.

It took me some time as I wanted to propose the following things:

- Using an iterator instead of the loop+goto pattern. I find it easier to read and it should enable some optimizations. For example, when consuming keys of a common prefix, we do many redundant checks while the only thing to do is to check if the following key is still part of the common prefix.
- Try to name things (see ExtractionResult and RangeBegin enums) and to separate concerns (see ListQuery and Accumulator).
- An IO closure to make unit tests possible.
- Unit tests, to track regressions and document how to interact with the code.
- Integration tests with `s3api`. In the future, I would like to move them to Rust with the AWS Rust SDK.

Merging the logic of ListMultipartUploads and ListObjects was not a goal but a consequence of the previous modifications.

Some points that we might want to discuss:

- ListObjectsV1, when using pagination and delimiters, has a weird behavior (it lists the same prefix multiple times) with `aws s3api` because it cannot use our optimization to skip the whole prefix. It is independent of my refactor and can be tested with the commented `s3api` tests in ``. It probably has the same weird behavior on the official AWS S3 implementation.
- Considering ListMultipartUploads, I had to "abuse" the upload id marker to support prefix skipping. I send an `upload-id-marker` with the hardcoded value `include` to emulate your "including" token.
- Some ways to test ListMultipartUploads with existing software (my tests are limited to s3api for now).

Co-authored-by: Quentin Dufour <>
Reviewed-on:
Co-authored-by: Quentin <>
Co-committed-by: Quentin <>
2022-01-12 18:04:55 +00:00
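The common-prefix idea from the commit message can be illustrated with a small bash sketch. This is not Garage's Rust implementation: `list_with_delimiter` is a hypothetical helper that reads already-sorted keys on stdin and, once it has emitted a common prefix, only checks whether each following key still belongs to that prefix.

```shell
#!/usr/bin/env bash
# Hypothetical helper (an assumption for illustration, not part of Garage).
# Reads sorted keys on stdin; prints "KEY <key>" or "PREFIX <common prefix>".
list_with_delimiter() {
  local prefix="$1" delim="$2" key rest current=""
  while IFS= read -r key; do
    # Ignore keys outside the requested prefix.
    [[ "$key" == "$prefix"* ]] || continue
    # Inside a common prefix, the only check needed is whether
    # the next key is still part of it.
    if [[ -n "$current" && "$key" == "$current"* ]]; then
      continue
    fi
    rest="${key#"$prefix"}"
    if [[ "$rest" == *"$delim"* ]]; then
      # Keep everything up to and including the first delimiter.
      current="$prefix${rest%%"$delim"*}$delim"
      echo "PREFIX $current"
    else
      current=""
      echo "KEY $key"
    fi
  done
}
```

For example, `printf 'a/1\na/2\nb\nc/x/y\n' | list_with_delimiter "" "/"` prints `PREFIX a/`, `KEY b`, and `PREFIX c/`, one per line.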
#!/usr/bin/env bash
set -ex
export LC_ALL=C.UTF-8
export LANG=C.UTF-8
SCRIPT_FOLDER="`dirname \"$0\"`"
# @FIXME Duck is not ready for testing, we have a bug
echo "⏳ Setup"
${SCRIPT_FOLDER}/ > /tmp/garage.log 2>&1 &
sleep 6
which garage
garage -c /tmp/config.1.toml status
garage -c /tmp/config.1.toml key list
garage -c /tmp/config.1.toml bucket list
dd if=/dev/urandom of=/tmp/garage.1.rnd bs=1k count=2 # No multipart, inline storage (< INLINE_THRESHOLD = 3072 bytes)
dd if=/dev/urandom of=/tmp/garage.2.rnd bs=1M count=5 # No multipart but file will be chunked
dd if=/dev/urandom of=/tmp/garage.3.rnd bs=1M count=10 # by default, AWS starts using multipart at 8MB
dd if=/dev/urandom of=/tmp/garage.part1.rnd bs=1M count=5
dd if=/dev/urandom of=/tmp/garage.part2.rnd bs=1M count=5
dd if=/dev/urandom of=/tmp/garage.part3.rnd bs=1M count=5
dd if=/dev/urandom of=/tmp/garage.part4.rnd bs=1M count=5
# data of lower entropy, to test compression
dd if=/dev/urandom bs=1k count=2 | base64 -w0 > /tmp/garage.1.b64
dd if=/dev/urandom bs=1M count=5 | base64 -w0 > /tmp/garage.2.b64
dd if=/dev/urandom bs=1M count=10 | base64 -w0 > /tmp/garage.3.b64
echo "🧪 S3 API testing..."
if [ -z "$SKIP_AWS" ]; then
  echo "🛠️ Testing with awscli (aws s3)"
  source ${SCRIPT_FOLDER}/
  aws s3 ls
  # Round-trip each test file: upload, list, download, compare, delete
  for idx in {1..3}.{rnd,b64}; do
    aws s3 cp "/tmp/garage.$idx" "s3://eprouvette/&+-é\"/garage.$"
    aws s3 ls s3://eprouvette
    aws s3 cp "s3://eprouvette/&+-é\"/garage.$" "/tmp/garage.$idx.dl"
    diff /tmp/garage.$idx /tmp/garage.$idx.dl
    rm /tmp/garage.$idx.dl
    aws s3 rm "s3://eprouvette/&+-é\"/garage.$"
  done
  echo "🛠️ Testing multipart uploads with awscli (aws s3api)"
  UPLOAD=$(aws s3api create-multipart-upload --bucket eprouvette --key 'upload' | jq -r ".UploadId")
  echo "Upload ID: $UPLOAD"
  # Part 3 is first uploaded with the part1 data, then uploaded again below
  # with the part3 data; the final md5 check expects the later upload to win.
  ETAG3=$(aws s3api upload-part --bucket eprouvette --key 'upload' \
    --part-number 3 --body "/tmp/garage.part1.rnd" --upload-id "$UPLOAD" \
    | jq -r ".ETag")
  ETAG2=$(aws s3api upload-part --bucket eprouvette --key 'upload' \
    --part-number 2 --body "/tmp/garage.part2.rnd" --upload-id "$UPLOAD" \
    | jq -r ".ETag")
  ETAG3=$(aws s3api upload-part --bucket eprouvette --key 'upload' \
    --part-number 3 --body "/tmp/garage.part3.rnd" --upload-id "$UPLOAD" \
    | jq -r ".ETag")
  ETAG6=$(aws s3api upload-part --bucket eprouvette --key 'upload' \
    --part-number 6 --body "/tmp/garage.part4.rnd" --upload-id "$UPLOAD" \
    | jq -r ".ETag")
  # The ETag values already carry their own double quotes, so they are embedded as-is
  MPU="{\"Parts\":[{\"PartNumber\":2,\"ETag\":$ETAG2}, {\"PartNumber\":3,\"ETag\":$ETAG3}, {\"PartNumber\":6,\"ETag\":$ETAG6}]}"
  echo "$MPU" > /tmp/garage.mpu.json
  aws s3api complete-multipart-upload --multipart-upload file:///tmp/garage.mpu.json \
    --bucket eprouvette --key 'upload' --upload-id "$UPLOAD"
  aws s3api get-object --bucket eprouvette --key upload /tmp/garage.mpu.get
  # The completed object must equal the concatenation of parts 2, 3 and 6
  if [ "$(md5sum /tmp/garage.mpu.get | cut -d ' ' -f 1)" != "$(cat /tmp/garage.part{2,3,4}.rnd | md5sum | cut -d ' ' -f 1)" ]; then
    echo "Invalid multipart upload"
    exit 1
  fi
  aws s3api delete-object --bucket eprouvette --key upload
  echo "🛠️ Test SSE-C with awscli (aws s3)"
  echo "$SSEC_KEY" | base64 -d > /tmp/garage.ssec-key
  for idx in {1,2}.rnd; do
    aws s3 cp --sse-c AES256 --sse-c-key fileb:///tmp/garage.ssec-key \
      "/tmp/garage.$idx" "s3://eprouvette/garage.$"
    aws s3 cp --sse-c AES256 --sse-c-key fileb:///tmp/garage.ssec-key \
      "s3://eprouvette/garage.$" "/tmp/garage.$idx.dl.sse-c"
    diff "/tmp/garage.$idx" "/tmp/garage.$idx.dl.sse-c"
    aws s3api delete-object --bucket eprouvette --key "garage.$"
  done
fi
if [ -z "$SKIP_S3CMD" ]; then
  echo "🛠️ Testing with s3cmd"
  source ${SCRIPT_FOLDER}/
  s3cmd ls
  for idx in {1..3}.{rnd,b64}; do
    s3cmd put "/tmp/garage.$idx" "s3://eprouvette/&+-é\"/garage.$idx.s3cmd"
    s3cmd ls s3://eprouvette
    s3cmd get "s3://eprouvette/&+-é\"/garage.$idx.s3cmd" "/tmp/garage.$idx.dl"
    diff /tmp/garage.$idx /tmp/garage.$idx.dl
    rm /tmp/garage.$idx.dl
    s3cmd rm "s3://eprouvette/&+-é\"/garage.$idx.s3cmd"
  done
fi
if [ -z "$SKIP_BOTO3" ]; then
  echo "🛠️ Testing with boto3 for STREAMING-UNSIGNED-PAYLOAD-TRAILER"
  source ${SCRIPT_FOLDER}/
  AWS_ENDPOINT_URL=https://localhost:4443 python <<EOF
import boto3
client = boto3.client('s3', verify=False)
client.put_object(Body=b'hello world', Bucket='eprouvette', Key='test.s3.txt')
client.delete_object(Bucket='eprouvette', Key='test.s3.txt')
EOF
fi
# Minio Client
if [ -z "$SKIP_MC" ]; then
  echo "🛠️ Testing with mc (minio client)"
  source ${SCRIPT_FOLDER}/
  mc ls garage/
  for idx in {1..3}.{rnd,b64}; do
    mc cp "/tmp/garage.$idx" "garage/eprouvette/&+-é\"/garage.$"
    mc ls garage/eprouvette
    mc cp "garage/eprouvette/&+-é\"/garage.$" "/tmp/garage.$idx.dl"
    diff /tmp/garage.$idx /tmp/garage.$idx.dl
    rm /tmp/garage.$idx.dl
    mc rm "garage/eprouvette/&+-é\"/garage.$"
  done
fi
# RClone
if [ -z "$SKIP_RCLONE" ]; then
  echo "🛠️ Testing with rclone"
  source ${SCRIPT_FOLDER}/
  rclone lsd garage:
  for idx in {1..3}.{rnd,b64}; do
    # rclone keeps the source file name, so upload a copy named *.dl,
    # delete it locally, then download it back for comparison
    cp /tmp/garage.$idx /tmp/garage.$idx.dl
    rclone copy "/tmp/garage.$idx.dl" "garage:eprouvette/&+-é\"/"
    rm /tmp/garage.$idx.dl
    rclone ls garage:eprouvette
    rclone copy "garage:eprouvette/&+-é\"/garage.$idx.dl" "/tmp/"
    diff /tmp/garage.$idx /tmp/garage.$idx.dl
    rm /tmp/garage.$idx.dl
    rclone delete "garage:eprouvette/&+-é\"/garage.$idx.dl"
  done
fi
# Duck (aka Cyberduck CLI)
if [ -z "$SKIP_DUCK" ]; then
  echo "🛠️ Testing with duck (aka cyberduck cli)"
  source ${SCRIPT_FOLDER}/
  duck --list garage:/
  duck --mkdir "garage:/eprouvette/duck"
  for idx in {1..3}.{rnd,b64}; do
    duck --verbose --upload "garage:/eprouvette/duck/" "/tmp/garage.$idx"
    duck --list garage:/eprouvette/duck/
    duck --download "garage:/eprouvette/duck/garage.$idx" "/tmp/garage.$idx.dl"
    diff /tmp/garage.$idx /tmp/garage.$idx.dl
    rm /tmp/garage.$idx.dl
    duck --delete "garage:/eprouvette/duck/garage.$"
  done
fi
if [ -z "$SKIP_WINSCP" ]; then
  echo "🛠️ Testing with winscp"
  source ${SCRIPT_FOLDER}/
  winscp <<EOF
mkdir eprouvette/winscp
EOF
  for idx in {1..3}.{rnd,b64}; do
    winscp <<EOF
put Z:\\tmp\\garage.$idx eprouvette/winscp/garage.$idx.winscp
ls eprouvette/winscp/
get eprouvette/winscp/garage.$idx.winscp Z:\\tmp\\garage.$idx.dl
rm eprouvette/winscp/garage.$idx.winscp
EOF
    diff /tmp/garage.$idx /tmp/garage.$idx.dl
    rm /tmp/garage.$idx.dl
  done
  winscp <<EOF
rm eprouvette/winscp
EOF
fi
rm /tmp/garage.part{1..4}.rnd
rm /tmp/garage.{1..3}.{rnd,b64}
echo "🏁 Teardown"
AWS_ACCESS_KEY_ID=$(cut -d' ' -f1 /tmp/garage.s3)
AWS_SECRET_ACCESS_KEY=$(cut -d' ' -f2 /tmp/garage.s3)
garage -c /tmp/config.1.toml bucket deny --read --write eprouvette --key "$AWS_ACCESS_KEY_ID"
garage -c /tmp/config.1.toml bucket delete --yes eprouvette
garage -c /tmp/config.1.toml key delete --yes "$AWS_ACCESS_KEY_ID"
exec 3>&-
echo "✅ Success"