When the woodpecker server is not reachable (eg. for update,
maintenance, agent connection issue, ...) for a long period of time, the
agent tries continuously to reconnect, without any delay. This creates
**several GB** of logs in a short period of time.
Here is a sample line, repeated indefinitely:
```
{"level":"error","error":"rpc error: code = Unavailable desc = connection error: desc = \"transport: Error while dialing: dial tcp x.x.x.x:xxx: connect: connection refused\"","time":"2024-04-07T17:29:59Z","message":"grpc error: done(): code: Unavailable"}
```
It appears that the [backoff
package](https://pkg.go.dev/github.com/cenkalti/backoff/v4#BackOff),
after a certain amount of time, returns `backoff.Stop` (-1) instead of a
valid delay to wait. It means that no more retry should be made, [as
shown in the
example](https://pkg.go.dev/github.com/cenkalti/backoff/v4#BackOff). But
the code doesn't handle that case and takes -1 as the next delay.
This led to continuous retry with no delay between them and creates a
huge amount of logs.
[`MaxElapsedTime` default is 15
minutes](https://pkg.go.dev/github.com/cenkalti/backoff/v4#pkg-constants),
passed this time, `NextBackOff` returns `backoff.Stop` (-1) instead of
`MaxInterval`.
This commit sets `MaxElapsedTime` to 0, [to avoid `Stop`
return](https://pkg.go.dev/github.com/cenkalti/backoff/v4#ExponentialBackOff).
close#3109
~~also fix start time of steps to be set correctly~~ edgecase do not hit
anymore as we have a clear sepperation between workflows and steps now
:)
---------
Co-authored-by: Anbraten <anton@ju60.de>
https://go.dev/doc/modules/release-workflow#breaking
Fixes https://github.com/woodpecker-ci/woodpecker/issues/2913 fixes
#2654
```
runephilosof@fedora:~/code/platform-woodpecker/woodpecker-repo-configurator (master)$ go get go.woodpecker-ci.org/woodpecker@v2.0.0
go: go.woodpecker-ci.org/woodpecker@v2.0.0: invalid version: module contains a go.mod file, so module path must match major version ("go.woodpecker-ci.org/woodpecker/v2")
```
---------
Co-authored-by: qwerty287 <80460567+qwerty287@users.noreply.github.com>
Existing retry logic was a simple second delay, replacing it with a
exponential backoff.
Initial delay is 10ms up to 10s for the max delay. In the future this
should be made configurable.
With an extended max delay it becomes important to notice context
cancelation, so this now also selects on both the delay and context
done.
closes#1801closes#1815closes#1144
closes #983
closes #557closes#1827
regression of #1791
# TODO
- [x] adjust log model
- [x] add migration for logs
- [x] send log line via grpc using step-id
- [x] save log-line to db
- [x] stream log-lines to UI
- [x] use less structs for log-data
- [x] make web UI work
- [x] display logs loaded from db
- [x] display streaming logs
- [ ] ~~make migration work~~ -> dedicated pull (#1828)
# TESTED
- [x] new logs are stored in database
- [x] log retrieval via cli (of new logs) works
- [x] log streaming works (tested via curl & webui)
- [x] log retrieval via web (of new logs) works
---------
Co-authored-by: 6543 <6543@obermui.de>
close#1114
As long as the `VersionResponse` type is not changed the check will
fail/pass gracefully
example output:
```
{"level":"error","error":"GRPC version mismatch","time":"2023-03-19T19:49:09+01:00","message":"Server version next-6923e7ab does report grpc version 2 but we only understand 1"}
GRPC version mismatch
```