This repository has been archived by the owner on May 18, 2020. It is now read-only.

jaeger-cassandra-schema-job: unset activeDeadlineSeconds (#125)
* jaeger-cassandra-schema-job: unset activeDeadlineSeconds

The current strategy is to abort the job after two
minutes (across all corresponding pod invocations).
In that case the job shows as 'failed' and a human
is required to intervene (there is no concept of an
automatic restart of a job in Kubernetes).

With this patch the job tries forever to create the
schema until one of the pods it starts succeeds in
doing so. That is, with this change the job never
goes into the permanent 'failed' state.

This change is expected to smooth deployment in
environments where the time until Cassandra becomes
available is less predictable. So far, in those
environments the deadline can be hit, and then a
human needs to re-schedule the job to address this
*transient* problem. With this patch the system
heals itself instead.

Notes:

- activeDeadlineSeconds is a mechanism for aborting
  a retry-loop in case of a *permanent* error such as
  misconfiguration.

- Here, `activeDeadlineSeconds: 120` was introduced
  three years ago in the first major commit of this
  template. It stands to reason that it was simply
  copy/pasted and did not have a deep rationale.
  Also, since then the job execution semantics around
  failure handling have changed a bit:
  https://github.com/kubernetes/community/pull/583/files

Signed-off-by: Dr. Jan-Philip Gehrcke <[email protected]>
jgehrcke authored and jpkrohling committed Jan 15, 2020
1 parent bcca10c commit 0fd8437
Showing 1 changed file with 3 additions and 2 deletions.
5 changes: 3 additions & 2 deletions production/cassandra.yml
@@ -1,5 +1,5 @@
 #
-# Copyright 2017-2019 The Jaeger Authors
+# Copyright 2017-2020 The Jaeger Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except
 # in compliance with the License. You may obtain a copy of the License at
@@ -122,11 +122,12 @@ items:
 app.kubernetes.io/component: storage-backend
 app.kubernetes.io/part-of: jaeger
 spec:
-activeDeadlineSeconds: 120
+activeDeadlineSeconds: 86400
 template:
 metadata:
 name: cassandra-schema
 spec:
+activeDeadlineSeconds: 320
 containers:
 - name: jaeger-cassandra-schema
 image: jaegertracing/jaeger-cassandra-schema:1.6.0
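
For orientation, the two `activeDeadlineSeconds` settings that appear in this diff sit at different levels of a Job manifest. The following is a minimal sketch under stated assumptions, not the actual production/cassandra.yml: the Job name and the `restartPolicy` value are assumptions, indentation is reconstructed, and fields unrelated to the deadlines are omitted.

```yaml
# Minimal sketch of a Kubernetes Job; not the full production/cassandra.yml.
apiVersion: batch/v1
kind: Job
metadata:
  name: jaeger-cassandra-schema-job   # assumed name, taken from the commit title
spec:
  # Job-level deadline: bounds the total runtime of the Job across all
  # pods it starts. When it is exceeded, Kubernetes terminates the
  # running pods and marks the Job as failed (reason: DeadlineExceeded).
  activeDeadlineSeconds: 86400        # value from the diff (24 hours)
  template:
    metadata:
      name: cassandra-schema
    spec:
      # Pod-level deadline: bounds how long a single pod may stay active
      # before it is marked failed and its containers are killed.
      activeDeadlineSeconds: 320      # value from the diff
      restartPolicy: OnFailure        # assumption; a Job requires Never or OnFailure
      containers:
      - name: jaeger-cassandra-schema
        image: jaegertracing/jaeger-cassandra-schema:1.6.0
```

Leaving out the job-level field entirely would give the Job no overall time limit, which is the behavior the commit message describes.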
