jaeger-cassandra-schema-job: unset activeDeadlineSeconds (#125)

* jaeger-cassandra-schema-job: unset activeDeadlineSeconds

The current strategy is to abort the job after two minutes (across all corresponding pod invocations). In that case the job shows as 'failed' and a human is required to intervene (there is no concept of an automatic restart of a job in Kubernetes).

With this patch the job tries forever to create the schema until one of the pods it starts succeeds in doing so. That is, with this change the job never goes into the permanent 'failed' state.

This change is expected to smooth deployment in environments where Cassandra takes a less predictable amount of time to become available. So far, in those environments the deadline can be hit, and then a human needs to re-schedule the job to address this *transient* problem. With this patch the system heals itself instead.

Notes:

- activeDeadlineSeconds is a mechanism for aborting a retry loop in case of a *permanent* error such as misconfiguration.
- Here, `activeDeadlineSeconds: 120` was introduced three years ago in the first major commit of this template. It stands to reason that it was simply copy/pasted and did not have a deep rationale. Also, since then the job execution semantics around failure handling have changed a bit: https://github.com/kubernetes/community/pull/583/files

Signed-off-by: Dr. Jan-Philip Gehrcke <[email protected]>
jaegertracing · Jan 15, 2020 · 0fd8437
1 parent bcca10c
Showing 1 changed file with 3 additions and 2 deletions.
diff --git a/production/cassandra.yml b/production/cassandra.yml
@@ -1,5 +1,5 @@
 #
-# Copyright 2017-2019 The Jaeger Authors
+# Copyright 2017-2020 The Jaeger Authors
 #
 # Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except
 # in compliance with the License. You may obtain a copy of the License at
@@ -122,11 +122,12 @@ items:
       app.kubernetes.io/component: storage-backend
       app.kubernetes.io/part-of: jaeger
   spec:
-    activeDeadlineSeconds: 120
+    activeDeadlineSeconds: 86400
     template:
       metadata:
         name: cassandra-schema
       spec:
+        activeDeadlineSeconds: 320
         containers:
         - name: jaeger-cassandra-schema
           image: jaegertracing/jaeger-cassandra-schema:1.6.0
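
For orientation, the diff as shown raises the job-level deadline to 86400 seconds rather than removing the field, and adds a second, pod-level deadline of 320 seconds. The two fields live at different levels of the manifest and bound different things. Below is a minimal sketch of a standalone Job using both. The `metadata.name` and `restartPolicy` values are assumptions for illustration and are not taken from the diff; the two deadline values, the container name and the image are.

```yaml
# Minimal sketch, not the full production/cassandra.yml entry.
apiVersion: batch/v1
kind: Job
metadata:
  name: cassandra-schema-example    # hypothetical name, for illustration only
spec:
  # Job-level deadline: bounds the whole Job across all pods it creates.
  # Once exceeded, running pods are terminated and the Job is marked failed.
  activeDeadlineSeconds: 86400      # 24 hours, as in the diff
  template:
    metadata:
      name: cassandra-schema
    spec:
      # Pod-level deadline: bounds each individual pod, counted from its start time.
      activeDeadlineSeconds: 320    # as in the diff
      restartPolicy: OnFailure      # assumption; Jobs allow only Never or OnFailure
      containers:
      - name: jaeger-cassandra-schema
        image: jaegertracing/jaeger-cassandra-schema:1.6.0
```

Read this way, a single schema attempt that hangs is cut off after 320 seconds, while the Job as a whole may keep retrying for up to 24 hours before it enters the permanent 'failed' state.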