Monitors
Monitors run custom validations against releases after they are deployed and can roll back a release when its validation fails.
Monitors flow
flowchart LR
helmwave_up[helmwave up]
exit0[exit 0]
exit1[exit 1]
helmwave_up --> release1[upgrade release 1]
helmwave_up --> release2[upgrade release 2]
helmwave_up --> release3[upgrade release 3]
release1 -- succeeded --> monitor1_start
release2 -- succeeded --> monitor1_start
release2 -- succeeded --> monitor2_start
release3 -- succeeded --> monitor2_start
monitor1_failed -.rollback release.->release_rollback1[rollback release 1]
monitor1_failed -.rollback release.->release_rollback2[rollback release 2]
monitor2_failed -.rollback release.->release_rollback2[rollback release 2]
monitor2_failed -.rollback release.->release_rollback3[rollback release 3]
release_rollback1 -.-> exit1
release_rollback2 -.-> exit1
release_rollback3 -.-> exit1
monitor1_succeeded -.-> exit0
monitor2_succeeded -.-> exit0
subgraph monitor1[Monitor 1]
monitor1_start[Monitor start]
monitor1_iteration[Monitor iteration]
monitor1_failed[Monitor failed]
monitor1_succeeded[Monitor succeeded]
monitor1_start --> monitor1_iteration
monitor1_iteration --next iteration--> monitor1_iteration
monitor1_iteration --failure threshold or total timeout-->monitor1_failed
monitor1_iteration --success threshold-->monitor1_succeeded
end
subgraph monitor2[Monitor 2]
monitor2_start[Monitor start]
monitor2_iteration[Monitor iteration]
monitor2_failed[Monitor failed]
monitor2_succeeded[Monitor succeeded]
monitor2_start --> monitor2_iteration
monitor2_iteration --next iteration--> monitor2_iteration
monitor2_iteration --failure threshold or total timeout-->monitor2_failed
monitor2_iteration --success threshold-->monitor2_succeeded
end
- Each monitor starts once all of its dependent releases have succeeded.
- Each monitor runs an iteration every interval, and each iteration is limited by iteration_timeout.
- Consecutive successful iterations count towards success_threshold.
- Consecutive failed iterations count towards failure_threshold.
- After all monitors have exited, the dependent releases perform the configured action (for example, rollback) for their failed monitors.
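The rules above can be sketched as a small loop. This is only an illustration of the streak counting, not helmwave's actual implementation: the function name run_monitor, its parameters, and the injectable sleep and clock helpers are hypothetical. It assumes that a success resets the failure streak (and vice versa), and that the check callable enforces iteration_timeout on its own.

```python
import time
from typing import Callable


def run_monitor(
    check: Callable[[], bool],
    interval: float,
    total_timeout: float,
    success_threshold: int,
    failure_threshold: int,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Return True if the monitor succeeded, False if it failed.

    Hypothetical sketch: `check` runs one iteration and is assumed to
    enforce iteration_timeout itself (raising or returning False).
    """
    deadline = clock() + total_timeout
    successes = 0
    failures = 0
    while clock() < deadline:
        try:
            ok = check()
        except Exception:
            ok = False  # an errored iteration counts as a failed one
        if ok:
            # Assumption: a success breaks the failure streak.
            successes, failures = successes + 1, 0
            if successes >= success_threshold:
                return True
        else:
            # Assumption: a failure breaks the success streak.
            successes, failures = 0, failures + 1
            if failures >= failure_threshold:
                return False
        sleep(interval)
    # Neither threshold was reached before total_timeout expired.
    return False


if __name__ == "__main__":
    # Scripted iteration results; sleeping is disabled for the demo.
    results = iter([True, False, True, True, True])
    outcome = run_monitor(
        check=lambda: next(results),
        interval=2,
        total_timeout=60,
        success_threshold=3,
        failure_threshold=2,
        sleep=lambda seconds: None,
    )
    print(outcome)  # prints True: the last three results form a streak of 3
```

A monitor that keeps flapping between success and failure never reaches either threshold, which is why total_timeout is needed as a third exit condition.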
Demo
helmwave.yml
registries:
  - host: registry-1.docker.io

monitors:
  - name: nats-up-metric
    type: prometheus
    total_timeout: 1m # fail if it flaps between success and failure for so long
    iteration_timeout: 1s
    interval: 2s
    success_threshold: 5
    failure_threshold: 5
    prometheus:
      url: http://localhost:9090
      expr: |
        up == 1
  - name: nats-delivered-metric
    type: prometheus
    total_timeout: 1m # fail if it flaps between success and failure for so long
    iteration_timeout: 5s
    interval: 10s
    success_threshold: 5
    failure_threshold: 5
    prometheus:
      url: http://localhost:9090
      expr: |
        sum(rate(nats_consumer_delivered_consumer_seq[15s])) > 0

.options: &options
  namespace: nats
  create_namespace: true
  wait: true
  timeout: 1m
  max_history: 3 # best practice
  chart:
    # For example, we will use bitnami/nats chart, because it's small and fast
    name: oci://registry-1.docker.io/bitnamicharts/nats
    version: 7.8.3 # best practice

releases:
  - name: nats
    <<: *options
    monitors:
      - name: nats-up-metric
      - name: nats-delivered-metric
$ helmwave build --diff-mode none
[INFO]: Building releases...
[INFO]: Building values...
[INFO]: no values provided
release: nats@nats
[INFO]: Building repositories...
[INFO]: Building registries...
[INFO]: registry has been added to the plan
registry: registry-1.docker.io
[INFO]: Building charts...
[INFO]: Pulled: registry-1.docker.io/bitnamicharts/nats:7.8.3
[INFO]: Digest: sha256:5f80350b8a85177e4a9c7ed968f77c47bedcc461418172fb66594bc61fa1ffac
[INFO]: Building manifests...
[INFO]: skipping updating dependencies for remote chart
release: nats@nats
[INFO]: Pulled: registry-1.docker.io/bitnamicharts/nats:7.8.3
[INFO]: Digest: sha256:5f80350b8a85177e4a9c7ed968f77c47bedcc461418172fb66594bc61fa1ffac
[INFO]: ✅ manifest done
release: nats@nats
[INFO]: Building graphs...
[INFO]: show graph:
┌───────────┐
│ nats@nats │
└───────────┘
[INFO]: Plan
registries:
- registry-1.docker.io
releases:
- nats@nats
repositories:
-
[INFO]: Skip diffing
[INFO]: Planfile is ready!
$ helmwave up
[INFO]: Plan
releases:
- nats@nats
repositories:
-
registries:
- registry-1.docker.io
[INFO]: sync repositories...
[INFO]: sync registries...
[INFO]: sync releases...
[INFO]: deploying...
release: nats@nats
[INFO]: ✅
release: nats@nats
[INFO]: monitor succeeded
monitor: nats-up-metric
streak: 1/5
[INFO]: monitor succeeded
monitor: nats-up-metric
streak: 2/5
[INFO]: monitor succeeded
streak: 3/5
monitor: nats-up-metric
[INFO]: monitor succeeded
streak: 4/5
monitor: nats-up-metric
[INFO]: monitor did not succeed
monitor: nats-delivered-metric
streak: 1/5
error: result is empty
[INFO]: monitor succeeded
monitor: nats-up-metric
streak: 5/5
[INFO]: ✅
monitor: nats-up-metric
[INFO]: monitor did not succeed
monitor: nats-delivered-metric
streak: 2/5
error: result is empty
[INFO]: monitor did not succeed
error: result is empty
monitor: nats-delivered-metric
streak: 3/5
[INFO]: monitor did not succeed
error: result is empty
streak: 4/5
monitor: nats-delivered-metric
[INFO]: monitor did not succeed
monitor: nats-delivered-metric
streak: 5/5
error: result is empty
[ERROR]: monitor failed
monitor: nats-delivered-metric
error: monitor triggered failure threshold
[ERROR]: monitors failed, need to take actions
error: one of goroutines in waitgroup sent error: 1 error occurred:
* monitor triggered failure threshold
[INFO]: chose action to perform for failed monitors
action: rollback
release: nats@nats
[INFO]: Releases Success 1 / 1
[INFO]: Monitors Success 1 / 2
  NAME                  | ERROR
------------------------+----------------------------
  nats-delivered-metric | monitor triggered failure
                        | threshold
[FATAL]: deploy failed
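The "result is empty" errors in the log suggest that a prometheus monitor iteration fails when the query returns no series. A filter expression such as sum(rate(...)) > 0 drops every series that does not satisfy the comparison, so an idle consumer produces an empty result. The sketch below shows one way to interpret a response under that assumption. It relies only on the standard JSON shape of the Prometheus HTTP API instant query endpoint; the function name iteration_succeeded is hypothetical and this is not helmwave's code.

```python
import json


def iteration_succeeded(response_body: str) -> bool:
    """Decide whether one monitor iteration passed.

    Assumption for illustration: an iteration passes only when the
    Prometheus instant query succeeds and returns a non-empty result.
    """
    payload = json.loads(response_body)
    if payload.get("status") != "success":
        return False  # query error reported by Prometheus
    result = payload.get("data", {}).get("result", [])
    return len(result) > 0  # an empty result means the expression matched nothing


if __name__ == "__main__":
    empty = '{"status":"success","data":{"resultType":"vector","result":[]}}'
    matched = (
        '{"status":"success","data":{"resultType":"vector",'
        '"result":[{"metric":{"job":"nats"},"value":[1700000000,"1"]}]}}'
    )
    print(iteration_succeeded(empty))    # prints False
    print(iteration_succeeded(matched))  # prints True
```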