pict-rs/releases/0.5.0.md
2023-12-08 23:57:20 -06:00

22 KiB

pict-rs 0.5.0

TODO: quick message about 0.5 release

Overview

Headline Features

Changes

Other Features

Fixes

Removals

Upgrade Notes

The upgrade from pict-rs 0.4 to 0.5 might take some time. Any uploads that have not had their metadata extracted will be processed during this migration. In order to reduce the time required the migration, pict-rs version 0.4.6 can be run first, and the /internal/prepare_upgrade endpoint can be hit. This will tell pict-rs to extract the required metadata for the 0.5 upgrade in the background while 0.4 is still running, enabling the upgrade to 0.5 to proceed much faster after it completes. More information about this endpoint can be found in the 0.4.6 release document

If you're running the provided docker container with default configuration values, the 0.5 container can be pulled and launched with no changes. There is a migration from the 0.4 database format to the 0.5 database format that will occur automatically on first launch. This is also the process that will extract metadata for any uploads that have not had their metadata extracted yet.

If you have any custom configuration, note that some media-related configuration values have been moved. See Media Configuration below.

pict-rs 0.5 now supports postgres for metadata storage. It is possible to upgrade directly from 0.4 to 0.5 with postgres, rather than upgrading to 0.5 with sled and then making the migration later. This process is documented in the pict-rs readme.

A notable addition in 0.5 is upgrade configuration. Since the migration accesses media from the configured store, increasing the concurrency can be beneficial. By default, pict-rs will attempt to migrate 32 records at a time, but this value can be increased.

In the configuration file

[upgrade]
concurrency = 32

With environment variables

PICTRS__UPGRADE__CONCURRENCY=32

On the commandline

pict-rs run --upgrade-concurrency 32

More information can be found in the pict-rs 0.4 to 0.5 migration guide.

Descriptions

Postgres Repo

One of the most anticipated features of pict-rs 0.5 is the support for postgres for metadata storage. This makes pict-rs (mostly) stateless, enabling it to be scaled horizontally across multiple physical or virtual hosts. It also simplifies administration and backup processes for folks who are already running a postgres server for other reasons.

Configuring postgres is simple. The repo section of the pict-rs configuration now supports more than just sled.

In the configuration file

[repo]
type = 'postgres'
url = 'postgres://user:password@host:5432/db'

With environment variables

PICTRS__REPO__TYPE=postgres
PICTRS__REPO__URL=postgres://user:password@host:5432/db

On the commandline

pict-rs run filesystem -p ./data postgres -u 'postgres://user:password@host:5432/db'

It is possible to update from 0.4 directly to 0.5 with postgres, this process is documented in the pict-rs readme.

It is possible to migrate to postgres after first upgrading to 0.5. This is also ducmented in the pict-rs readme.

Media Proxy

pict-rs 0.5 supports caching media, and purging cached media after it is no longer necessary to keep. This behavior is accessed with pict-rs' new proxy query parameter at the /image/original endpoint, and the /image/process.{ext} endpoint. By passing the URL to an image to the proxy parameter, you can instruct pict-rs to download the media from that URL and cache it locally before serving it, or a processed version of it. The proxied media is stored for a configurable time limit, which resets on each access to the proxied media.

This value can be configured as follows

In the configuration file

[media.retention]
proxy = "7d" # units available are "m" for minutes, "h" for hours, "d" for days, and "y" for years

With environment variables

PICTRS__MEDIA__RETENTION__PROXY=7d

On the commandline

pict-rs run --media-retention-proxy 7d

More documentation can be found in the pict-rs readme and in the example pict-rs.toml.

Improved Animation Support

pict-rs 0.4 only supported gif for animated images. This decision had been made due to the format's poor support for higher-resolution animations with higher framerates compared to silent videos for a given file size. In pict-rs 0.5, animated images are now supported to the same extent as still images. Available formats for animated images include apng, avif, gif, and webp (although not jxl yet).

With this support for more animated file types, pict-rs no longer transcodes larger animations into videos. Instead, animated images will be left in their original format unless an override is configured, and the default values for maximum animation dimensions, file size, and frame count have been increased.

Configuration related to animations has been moved from [media.gif] to [media.animation] to better reflect what it applies to.

For all configuration options regarding animations, see the example pict-rs.toml.

Media Configuration

pict-rs 0.5 splits media configuration further, allowing images, animations, and videos to each specify their own maximum file size, dimensions, frame counts, etc, in addition to a few global limits. Relevant sections of configuration are [media], [media.image], [media.animation], and [media.video].

A list of updated and moved configuration values can be found below.

Image Changes
Old Environment Variable New Environment Variable
PICTRS__MEDIA__FORMAT PICTRS__MEDIA__IMAGE__FORMAT
PICTRS__MEDIA__MAX_WIDTH PICTRS__MEDIA__IMAGE__MAX_WIDTH
PICTRS__MEDIA__MAX_HEIGHT PICTRS__MEDIA__IMAGE__MAX_HEIGHT
PICTRS__MEDIA__MAX_AREA PICTRS__MEDIA__IMAGE__MAX_AREA
PICTRS__MEDIA__IMAGE__MAX_FILE_SIZE
Old TOML Value New TOML Value
[media] format [media.image] format
[media] max_width [media.image] max_width
[media] max_height [media.image] max_height
[media] max_area [media.image] max_area
[media.image] max_file_size
Animation Changes
Old Environment Variable New Environment Variable
PICTRS__MEDIA__GIF__MAX_WIDTH PICTRS__MEDIA__ANIMATION__MAX_WIDTH
PICTRS__MEDIA__GIF__MAX_HEIGHT PICTRS__MEDIA__ANIMATION__MAX_HEIGHT
PICTRS__MEDIA__GIF__MAX_AREA PICTRS__MEDIA__ANIMATION__MAX_AREA
PICTRS__MEDIA__GIF__MAX_FILE_SIZE PICTRS__MEDIA__ANIMATION__MAX_FILE_SIZE
PICTRS__MEDIA__GIF__MAX_FRAME_COUNT PICTRS__MEDIA__ANIMATION__MAX_FRAME_COUNT
PICTRS__MEDIA__ANIMATION__FORMAT
PICTRS__MEDIA__ANIMATION__MAX_FILE_SIZE
Old TOML Value New TOML Value
[media.gif] max_width [media.animation] max_width
[media.gif] max_height [media.animation] max_height
[media.gif] max_area [media.animation] max_area
[media.gif] max_file_size [media.animation] max_file_size
[media.gif] max_frame_count [media.animation] max_frame_count
[media.animation] format
[media.animation] max_file_size
Video Changes
Old Environment Variable New Environment Variable
PICTRS__MEDIA__ENABLE_SILENT_VIDEO PICTRS__MEDIA__VIDEO__ENABLE
PICTRS__MEDIA__ENABLE_FULL_VIDEO PICTRS__MEDIA__VIDEO__ALLOW_AUDIO
PICTRS__MEDIA__VIDEO_CODEC PICTRS__MEDIA__VIDEO__VIDEO_CODEC
PICTRS__MEDIA__AUDIO_CODEC PICTRS__MEDIA__VIDEO__AUDIO_CODEC
PICTRS__MEDIA__MAX_FRAME_COUNT PICTRS__MEDIA__VIDEO__MAX_FRAME_COUNT
PICTRS__MEDIA__ENABLE_FULL_VIDEO PICTRS__MEDIA__VIDEO__ALLOW_AUDIO
PICTRS__MEDIA__VIDEO__MAX_WIDTH
PICTRS__MEDIA__VIDEO__MAX_HEIGHT
PICTRS__MEDIA__VIDEO__MAX_AREA
PICTRS__MEDIA__VIDEO__MAX_FILE_SIZE
Old TOML Value New TOML Value
[media] enable_silent_video [media.video] enable
[media] enable_full_video [media.video] allow_audio
[media] video_codec [media.video] video_codec
[media] audio_codec [media.video] audio_codec
[media] max_frame_count [media.video] max_frame_count
[media] enable_full_video [media.video] allow_audio
[media.video] max_width
[media.video] max_height
[media.video] max_area
[media.video] max_file_size

Note that although each media type now includes its own MAX_FILE_SIZE configuration, the PICTRS__MEDIA__MAX_FILE_SIZE value still exists as a global limit for any file type.

In addition to all the configuration options mentioned above, there are now individual quality settings that can be configured for each image and animation type, as well as for video files. Please see the pict-rs.toml file for more information.

For all configuration options regarding media, see the example pict-rs.toml.

Improved Collision Resistance

Previously, pict-rs relied on a sha256 hash of uploaded media's bytes in order to detect which files where the same, and which were unique. I have not heard of any existing problem with this behavior, however, as pict-rs becomes more widely deployed I want to ensure that deduplication does not happen erroneously. In the event of a hash collision between two different pieces of media, pict-rs would consider them duplicates and discard the newly uploaded file in favor of the one it already stored. This would lead to the wrong media being served for a user. In order to reduce the chances of a collission, pict-rs 0.5 now includes the media's file length and a marker indicating it's content type as part of it's unique identifier. This maintains the behavior of identical media being deduplicated, while making collisions far less likely to occur.

Optional Video Transcoding

In pict-rs 0.4, all videos that were uploaded were transcoded from their original format to either pict-rs' default format, or a format specified with configuration. pict-rs 0.5 removes the default "vp9" video codec, and allows uploaded videos to maintain their original codec and format, provided pict-rs supports it. This means uploaded h264, h265, vp8, vp9, and av1 videos no longer need to be decoded & encoded during the upload process aside from detecting their metadata.

This behavior is enabled by default, but for administrators who have configured a non-default video codec, this can be enabled by removing the video_codec and audio_codec configurations.

Library API Changes

The library API for pict-rs has changed from 0.4. It now follows a "fluent programming" pattern of chaining methods to configure and launch the service from within another application. Documentation for the new library API can be found on docs.rs

For most cases, using pict-rs' ConfigSource as the entrypoint is recommended.

As a quick example,

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    pict_rs::ConfigSource::memory(serde_json::json!({
        "server": {
            "address": "127.0.0.1:8080"
        },
        "repo": {
            "type": "sled",
            "path": "./sled-repo"
        },
        "store": {
            "type": "filesystem",
            "path": "./files"
        }
    }))
    .init::<&str>(None)?
    .run_on_localset()
    .await?;

    Ok(())
}

Reject Malformed Deadlines

pict-rs supports setting a request deadline with the X-Request-Deadline header, where the value is a unix timestamp in nanoseconds. In pict-rs 0.4, requests with deadlines that could not be parsed were treated as requests without deadlines. in pict-rs 0.5 a malformed deadline will cause the request to be rejected with a 400 Bad Request error.

Logging Verbosity

pict-rs 0.5 no longer logs the opening and closing of tracing spans. This means the logs will now be mostly empty aside from the occasional warning or error. Errors should still provide ample debug information, so the decrease in verbosity should not negatively affect debugging of the program.

Quality Configuration

pict-rs 0.5 now supports configuring the quality and compression of transcoded media. New configuration sections have been introduced to allow setting values for each image format individually, as well as individual quality settings for different video resolutions.

For the specifics of quality configuration, see the example pict-rs.toml.

Read-Only Mode

pict-rs 0.5 has gained the ability to run in a "read-only" mode, which will forbid the application from accepting new uploaded media, creating new media variants, or updating any metadata. This is useful for complex environments that are capable of running a copy of the pict-rs application while performing maintenance on the "real" deployment.

For configuration information, see the example pict-rs.toml.

Danger-Dummy Mode

pict-rs 0.5 can now run without any external dependencies in "Danger-Dummy Mode." This allows uploading and serving media, but all of the metadata for that media will be fake default values, and the media cannot be processed in any way. This is not intended for production uses, since generic file stores (like directly using object storage) are more tailored for this use case. Instead, this is provided to enable applications to perform integration tests with pict-rs without needing to reconstruct a valid environment to run the application.

For configuration information, see the example pict-rs.toml.

Media Variant Retention

Similar to the new proxy feature, pict-rs 0.5 now sets a time limit on media variants, and discards them if they are not accessed within that time limit. Since variants can be regenerated on request, there's no harm in removing them to save on storage space.

For configuration information, see the example pict-rs.toml.

Configurable Temporary Directory

pict-rs now supports using a non-default temporary directory. Previous versions of pict-rs have resorted to the operating system's defined temporary directory, and that is still the default behavior, but this can now be overridden from pict-rs' configuration.

For configuration information, see the example pict-rs.toml.

Serve From Object Storage

pict-rs 0.5 now supports serving images directly from object storage. This is achieved by telling pict-rs the public URL at which your object storage is available in the object storage configuration. When this is set up, any existing image requested from pict-rs will be served with a redirect to the object storage URL. Since pict-rs is not intended to be exposed directly to the internet, any proxy between the internet and pict-rs must be configured not to follow redirects, or else that proxy will end up fetching the image from object-storage itself. In some cases, this might be reasonable (e.g. if Web Access is cheaper than S3API access for fetching files).

For configuration information, see the example pict-rs.toml.

Prometheus Metrics

pict-rs 0.5 now optionally exposes a scrape endpoint for Prometheus metrics. This includes various timings for requests, processing, and counts for pict-rs workings. It could be useful for admins to help monitor the health of pict-rs as it runs.

For configuration information, see the example pict-rs.toml.

Internal Hashes Endpoint

pict-rs 0.5 offers a new internal endpoint at /internal/hashes which returns a paginated list of all uploaded files, sorted from most recently uploaded to least recently uploaded. This endpoint supports querying by timestamp in rfc3339 format in order to quickly jump deep into the list. The number of returned results per page is also configurable.

For more information about this API, see the pict-rs readme.

Internal Delete Endpoint

pict-rs 0.5 offers a new internal endpoint for deleting aliases as /internal/delete. This is distinct from the purge endpoint in that it doesn't purge a whole image with all it's aliases, instead it simply deletes the provided alias and nothing more. It can be seen as an admin tool for deleting an upload on behalf of a user.

For more information about this API, see the pict-rs readme.

Error Codes

In pict-rs 0.5, all errors returned by the application have an associated error code. While pict-rs has always provided an error description in the "msg" field of the response, the new "code" field represents a general idea of the error that can be used for choosing a translation in downstream software.

While there's no documentation-level enumeration of all error codes, the file defining them is pretty easy to understand. You can find it here.

Clean Stray Magick Files

pict-rs depends on other applications in order to identify and process media. If these applications crash for any reason, they can potentially leave behind temporary files, which slowly aggregate over time, using disk or RAM. pict-rs 0.5 attempts to solve this by generating a new temporary directory for each invocation of imagemagick, and cleaning up that directory after the program terminates, regardless of success. This means that any stray magick- files should now be properly removed when they are no longer in use.

Always Drain Payloads

Actix Web, the framework used by pict-rs, has historically had problems with connections remaining open when request bodies are not read to completion. In order to avoid this, pict-rs has a dedicated task on each request thread that takes ownership of dropped request payloads and attempts to drain them, allowing the connections to close.

Constant-Time Equality for Deletions

pict-rs uses unique tokens generated per-image in order to authorize the deletion of those images. In pict-rs 0.5 the checking of these tokens has been made constant-time, preventing the use of timing attacks that could lead to images being deleted by an attacker.

ini and json5

These configuration formats have been removed from pict-rs 0.5 in order to improve compile times. If this upsets anyone let me know. I only provide examples in toml anyway.

Client Pool Size

The client_pool_size configuration value hasn't meant anything since the switch from awc to reqwest as pict-rs' http client. It has been removed in 0.5.

0.3 Migration Path

pict-rs 0.5 is not capable of migrating directly from 0.3. Instead, the upgrade path is to upgrade to 0.4 and then to 0.5.

Prepare Upgrade Endpoint

The internal prepare_upgrade endpoint was only useful in 0.4 to prepare for the 0.5 upgrade, so 0.5 removes this endpoint.