Commit graph

331 commits

Author SHA1 Message Date
Fernando Barbosa 12b3f9745f
Merge branch 'main' into woodpecker-fix-log-tail-cpu-lock 2024-04-15 10:16:11 -03:00
qwerty287 00f0fcd416
Rework addons (use rpc) (#3268)
Co-authored-by: Anbraten <6918444+anbraten@users.noreply.github.com>
2024-04-15 10:04:21 +02:00
Elias f211a780f3
Handle ImagePullBackOff pod status (#3580)
close: https://github.com/woodpecker-ci/woodpecker/issues/3555

Put the same logic from `waitStep` and call the function
`isImagePullBackOffState` in the `tailStep` function.

---------

Co-authored-by: elias.souza <elias.souza@quintoandar.com.br>
Co-authored-by: Anbraten <6918444+anbraten@users.noreply.github.com>
2024-04-15 09:08:13 +02:00
Fernando Barbosa 6a063b6e7b Add //nolint: gomnd to peer.Log context 2024-04-12 11:32:45 -03:00
Fernando Barbosa 85d03a63b0
Merge branch 'main' into woodpecker-fix-log-tail-cpu-lock 2024-04-09 19:14:21 -03:00
qwerty287 c9a3bfb321
Fix spellcheck and enable more dirs (#3603) 2024-04-09 09:04:53 +02:00
Fernando Barbosa 01699eaaab fix: k8s agent fails to tail logs starving the cpu
Proposal to fix https://github.com/woodpecker-ci/woodpecker/issues/2253

We have observed several possibly-related issues on a Kubernetes
backend:

1. Agents behave erratically when dealing with certain log payloads. A common
   observation here is that steps that produce a large volume of logs will cause
   some steps to be stuck "pending" forever.

2. Agents use far more CPU than expected: we often see 200-300
   millicores of CPU per Workflow per agent (as reported on #2253).

3. We commonly see Agents displaying thousands of error lines about
   parsing logs, often with very close timestamps, which may explain issues 1
   and 2 (as reported on #2253).

```
{"level":"error","error":"rpc error: code = Internal desc = grpc: error while marshaling: string field contains invalid UTF-8","time":"2024-04-05T21:32:25Z","caller":"/src/agent/rpc/client_grpc.go:335","message":"grpc error: log(): code: Internal"}
{"level":"error","error":"rpc error: code = Internal desc = grpc: error while marshaling: string field contains invalid UTF-8","time":"2024-04-05T21:32:25Z","caller":"/src/agent/rpc/client_grpc.go:335","message":"grpc error: log(): code: Internal"}
{"level":"error","error":"rpc error: code = Internal desc = grpc: error while marshaling: string field contains invalid UTF-8","time":"2024-04-05T21:32:25Z","caller":"/src/agent/rpc/client_grpc.go:335","message":"grpc error: log(): code: Internal"}
```

4. We've also observed that agents will sometimes drop out of the worker queue,
also as reported on #2253.

Seeing as the logs point to `client_grpc.go:335`, this pull request
fixes the issue by:

1. Removing codes.Internal from being a retryable GRPC status. Now agent GRPC
calls that fail with codes.Internal will not be retried. There's no
agreement on which GRPC codes should be retried, but Internal does not seem to be
a common one to retry, if ever.

2. Adding a timeout of 30 seconds to any retries. Currently, the exponential
retries have a maximum timeout of _15 minutes_. I assume this might be
required by some other functions so Agents resume their operation in
case the webserver restarts. Still, this is likely the cause behind the
large CPU increase, as agents can be stuck retrying thousands of requests for
a large window of time. The previous change alone should be enough to
solve this issue, but I think this might be a good idea to prevent
similar problems from arising in the future.
2024-04-08 16:32:29 -03:00
YR Chen e1b574a4bc
Add runtimeClassName in Kubernetes backend options (#3474)
Resolves #3473

---------

Co-authored-by: Thomas Anderson <127358482+zc-devs@users.noreply.github.com>
2024-03-29 10:29:07 +01:00
qwerty287 2029813fc2
Remove unused cache properties (#3567) 2024-03-29 09:48:28 +01:00
qwerty287 75803dba41
Fix uppercased env (#3516)
closes #3515 

I think after this is fixed, we should publish a new release as this can
be quite important.

Co-authored-by: Robert Kaussow <mail@thegeeklab.de>
2024-03-20 16:53:33 +02:00
qwerty287 f23d42b49e
Fix env schema (#3514)
closes #3510
2024-03-20 09:28:02 +01:00
Robert Kaussow a779eed3df
Enable golangci linter gomnd (#3171) 2024-03-15 18:00:25 +01:00
zowhoey ad507d8ee4
Move generic agent flags to cmd/agent/core (#3484) 2024-03-15 11:31:35 +01:00
Anbraten 9db9c7116f
Improve security context handling (#3482) 2024-03-13 22:41:13 +01:00
Anbraten c3e4c14c23
Set pull-request id and labels on pr-closed event (#3442) 2024-02-26 14:07:33 +01:00
qwerty287 9b0c4e4e3c
Fix env var naming (#3438)
closes #3436
2024-02-25 10:12:40 +01:00
6543 6eafb37aba
nit: compiler.Compile explicitly init Environment map 2024-02-23 17:40:52 +01:00
qwerty287 d59bc64823
Fix server panic (#3426)
Closes #3424
2024-02-23 16:32:06 +01:00
qwerty287 de5c65939a
Deprecate alternative names on secrets (#3406)
Closes https://github.com/woodpecker-ci/woodpecker/discussions/2274

# deprecation of alternative names

Instead of
```yaml
secrets:
  - source: some_secret
    target: some_env
```
you now write:
```yaml
environment:
  some_env:
    from_secret: some_secret
```

Also, it's possible to use complex YAML objects in `environment`;
they're turned into JSON (just like `settings`).
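
As a rough illustration of what turning a complex value into JSON could look like, here is a minimal sketch. `envValue` is a hypothetical helper written for this example, not Woodpecker's actual function; it assumes plain strings pass through unchanged and everything else is JSON-encoded.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// envValue is a hypothetical helper: strings are used as-is, while any
// other value (list, map, number) is serialized to its JSON form.
func envValue(v any) (string, error) {
	if s, ok := v.(string); ok {
		return s, nil
	}
	b, err := json.Marshal(v)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	// A list value, as it might appear under `environment` in a pipeline file.
	s, err := envValue([]any{"a", "b"})
	if err != nil {
		panic(err)
	}
	fmt.Println(s)
}
```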
2024-02-22 18:25:57 +01:00
qwerty287 0c9bbf91a3
Do not alter secret key upper-/lowercase (#3375) 2024-02-20 14:20:25 +01:00
Elias bffc9c8ff8
fix: can't run multiple services on k8s (#3395)
Fix Issue: https://github.com/woodpecker-ci/woodpecker/issues/3288

The way the pod service starts up makes it impossible to run two or more
pipelines at the same time when we have a service section.

The idea is to set the name of the service in the same way we did for
the pod name.

Pipeline: 

```yaml

services:
  mydb:
    image: mysql
    environment:
      - MYSQL_DATABASE=test
      - MYSQL_ROOT_PASSWORD=example
    ports:
      - 3306/tcp
steps:
  get-version:
    image: ubuntu
    commands:
      - ( apt update && apt dist-upgrade -y && apt install -y mysql-client 2>&1 )> /dev/null
      - sleep 30s # need to wait for mysql-server init
      - echo 'SHOW VARIABLES LIKE "version"' | mysql -uroot -hmydb test -pexample
```

Result of running more than one pipeline:


![image](https://github.com/woodpecker-ci/woodpecker/assets/22245125/e512309f-0d1e-4125-bab9-2357a710fedd)

---------

Co-authored-by: elias.souza <elias.souza@quintoandar.com.br>
2024-02-17 12:30:06 +01:00
qwerty287 5d3a503f98
Add link checking (#3371)
Closes https://github.com/woodpecker-ci/woodpecker/issues/3332
2024-02-12 15:00:33 +01:00
qwerty287 894ab51215
Fix schema links (#3369)
Closes https://github.com/woodpecker-ci/woodpecker/issues/2063
2024-02-11 09:53:02 +01:00
qwerty287 f369d2c543
Lint for event filter and deprecate exclude (#3222)
Closes https://github.com/woodpecker-ci/woodpecker/discussions/2174

- Return a bad habit error if no event filter is set.
- Once this is applied, it's useless to allow `exclude`s on events.
Therefore, deprecate it together with `include`s, which should be
replaced by `base.StringOrSlice` later.
2024-02-10 17:33:05 +01:00
sinlov 134fb7900c
fix: update schema event_enum to remove error warning when.event (#3357)
change test case to check

fix #3356
2024-02-09 08:05:21 +01:00
Anbraten 6785806873
Fix backend detection (#3353)
closes #3352
2024-02-09 00:04:43 +01:00
Anbraten 0b91317cde
Fix linter (#3354) 2024-02-08 22:49:07 +01:00
qwerty287 6892a9ca57
Parse backend options in backend (#3227)
Currently, backend options are parsed in the yaml parser.
This has some issues:
- backend specific code should be in the backend folders
- it is not possible to add backend options for backends added via
addons
2024-02-08 18:39:32 +01:00
qwerty287 f92f8b17a3
Make agent usable for external backends (#3270) 2024-02-08 16:33:22 +01:00
Fernando Barbosa c7467b9828
fix: agent panic when node is terminated during step execution (#3331)
Fixes https://github.com/woodpecker-ci/woodpecker/issues/3330

This adds error handling to the agent's WaitStep function in two
places where it could hit a `panic: runtime error: invalid
memory address or nil pointer dereference` if it could no longer
access complete information about a specific pod.

This error was found to happen if the node in which the pod was running
was terminated during the step's execution,
despite active pipelines being executed on the node.

Now, instead of a panic in the agent's logs and undefined behavior in the
UI, it will display a more helpful error message in the UI.
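
The kind of guard described here might look like the sketch below. The types `pod`, `containerStatus` and `terminated` are simplified stand-ins invented for this example (the real backend works with Kubernetes API objects), and `exitCode` is a hypothetical function, not the actual WaitStep code.

```go
package main

import (
	"errors"
	"fmt"
)

// Simplified, hypothetical stand-ins for the pod status objects.
type terminated struct{ ExitCode int }
type containerStatus struct{ Terminated *terminated }
type pod struct{ Statuses []containerStatus }

// errNoPodInfo is returned when the pod state is incomplete, for example
// because the node it ran on has gone away.
var errNoPodInfo = errors.New("could not retrieve complete pod state; the node may have been terminated")

// exitCode checks every pointer and slice access before dereferencing,
// so incomplete pod information yields an error instead of a panic.
func exitCode(p *pod) (int, error) {
	if p == nil || len(p.Statuses) == 0 {
		return 0, errNoPodInfo
	}
	t := p.Statuses[0].Terminated
	if t == nil {
		return 0, errNoPodInfo
	}
	return t.ExitCode, nil
}

func main() {
	// A pod with no container statuses produces an error, not a crash.
	_, err := exitCode(&pod{})
	fmt.Println(err)
}
```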

### Additional context

We observed the bug first on v2.1.1, but tested the fix internally on
top of 2.3.0.


![image](https://github.com/woodpecker-ci/woodpecker/assets/7269710/dfbcf089-85f7-4b5d-8102-f21af95c5cda)
2024-02-05 22:46:14 +01:00
qwerty287 9df572ef31
Add release event trigger (#3226)
Supersedes #764 

Bitbucket does not support release webhooks.

---------

Co-authored-by: Patrick Schratz <patrick.schratz@gmail.com>
2024-01-30 17:39:00 +01:00
Lukas 94b882fb95
Add spellcheck config (#3018)
Part of #738 

```
pnpx cspell lint --gitignore '{**,.*}/{*,.*}'
```

---------

Co-authored-by: Anbraten <anton@ju60.de>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: 6543 <6543@obermui.de>
2024-01-27 21:15:10 +01:00
Anbraten 0b5eef7d1e
Improve secret availability checks (#3271) 2024-01-27 20:59:44 +01:00
Thomas Anderson e5c83190c7
Sanitize pod's step label (#3275)
Closes #3272
2024-01-26 13:42:21 +01:00
Elias 1c3159ebb7
fix: bug pod service without label service (#3256) 2024-01-23 07:42:47 +01:00
qwerty287 6925afd83b
Pin prettier version (#3260) 2024-01-22 21:38:47 +02:00
Elias 32a1199519
fix: bug annotations (#3255)
Fix Issue: https://github.com/woodpecker-ci/woodpecker/issues/3254

Co-authored-by: elias.souza <elias.souza@quintoandar.com.br>
2024-01-22 13:39:49 +01:00
qwerty287 5e2f7d81b3
Clean up models (#3228) 2024-01-22 07:56:18 +01:00
Thomas Anderson 072fa29f4a
Fixed Pods creation of WP services (#3236)
Closes #3178
2024-01-21 03:56:37 +01:00
qwerty287 d1d2e9723d
Support custom steps entrypoint (#2985)
Closes https://github.com/woodpecker-ci/woodpecker/issues/278

---------

Co-authored-by: Anbraten <anton@ju60.de>
Co-authored-by: 6543 <6543@obermui.de>
2024-01-19 05:34:02 +01:00
6543 6a6cb094fb
Add schema test for depends_on (#3205) 2024-01-15 08:54:27 +01:00
Thomas Anderson 10f2e209d6
Secured kubernetes backend configuration (#3204)
Follow up of #3165
2024-01-15 03:59:08 +01:00
qwerty287 001b5639a6
Use assert for test (#3201)
instead of `if`s
2024-01-14 19:33:58 +01:00
qwerty287 b9f6f3f9fb
Replace goimports with gci (#3202)
`gci` seems to be much more strict.
2024-01-14 18:22:06 +01:00
qwerty287 45bf8600ef
Remove multipart logger (#3200) 2024-01-14 10:54:02 +01:00
Robert Kaussow 685907ddf6
Fix spelling in test description (#3198) 2024-01-13 15:24:13 +01:00
Thomas Anderson 0611fa9b32
Added protocol in port configuration (#2993)
Closes #2727
2024-01-12 23:57:24 +01:00
Thomas Anderson 9bbc446009
Kubernetes AppArmor and seccomp (#3123)
Closes #2545

seccomp
https://kubernetes.io/docs/tutorials/security/seccomp/

https://github.com/kubernetes/enhancements/blob/master/keps/sig-node/135-seccomp/README.md

AppArmor
https://kubernetes.io/docs/tutorials/security/apparmor/

fddcbb9cbf/keps/sig-node/24-apparmor/README.md
Went ahead and implemented API from KEP-24 above.
2024-01-12 23:32:24 +01:00
Robert Kaussow 9bbba4441d
Enable golangci linter forcetypeassert (#3168)
Split out from https://github.com/woodpecker-ci/woodpecker/pull/2960
2024-01-12 02:01:02 +01:00
Robert Kaussow f813badcf9
Enable golangci linter contextcheck (#3170)
Split out from https://github.com/woodpecker-ci/woodpecker/pull/2960
2024-01-11 22:15:15 +01:00