A hook is an executable file that Shell-operator runs when some event occurs. It can be a script or a compiled program written in any programming language. For illustrative purposes, we will use bash scripts. An example with a hook in the form of a Python script is available here: 002-startup-python.
The hook receives the data and returns the result via files. Paths to files are passed to the hook via environment variables.
At startup Shell-operator initializes the hooks:
- Shell-operator searches the hooks directory recursively for hook files. You can specify the directory with the --hooks-dir command-line argument or with the SHELL_OPERATOR_HOOKS_DIR environment variable (the default path is /hooks).
  - Every executable file found in the path is considered a hook.
- Found hooks are sorted alphabetically by directory and hook names. Then they are executed with the --config flag to get bindings to events in YAML or JSON format.
- If a hook's configuration is successful, the working queue named "main" is filled with onStartup hooks.
- Then, the "main" queue is filled with kubernetes hooks with the Synchronization binding context type, so that each hook receives all existing objects described in the hook's configuration.
- After executing a kubernetes hook with the Synchronization binding context, Shell-operator starts a monitor of Kubernetes events according to the configured kubernetes binding.
  - Each monitor stores a snapshot: a refreshable list of all Kubernetes objects that match a binding definition.
Next, the main cycle is started:
- The event handler adds hooks to the named queues on events:
  - kubernetes hooks are added to the queue when the desired WatchEvent is received from Kubernetes,
  - schedule hooks are added according to the schedule,
  - kubernetes and schedule hooks are added to the "main" queue, or to the named queue if the queue field was specified.
- Each named queue has its own queue handler, which executes hooks strictly sequentially. If a hook fails with an error (non-zero exit code), Shell-operator restarts it (every 5 seconds) until it succeeds. While a hook is failing, new events still add tasks to the queue, but their execution is blocked until the failing hook succeeds.
  - You can change this behavior for a specific hook by adding allowFailure: true to the binding configuration (not available for onStartup hooks).
- Each hook is executed with a binding context that describes an event that has already occurred:
  - a kubernetes hook receives an Event binding context with the object related to the event,
  - a schedule hook receives the name of the triggered schedule binding.
- If there is a sequence of executions of a hook in a queue, the hook is executed once with an array of binding contexts.
  - If a binding contains the group key, a sequence of binding contexts with the same group key is compacted into one binding context.
- Several metrics are available for monitoring the activity of the queues and hooks: queue sizes, number of execution errors for specific hooks, etc. See METRICS for more details.
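The retry behavior described above can be illustrated with a small simulation. This is only a sketch of the described semantics, not Shell-operator's implementation: run_queue, the task dictionaries and the injectable sleep function are hypothetical names introduced for this example.

```python
from collections import deque


def run_queue(tasks, run_hook, retry_delay=5.0, sleep=lambda seconds: None):
    """Process tasks strictly in order, retrying a failing task until it succeeds.

    tasks: iterable of dicts such as {"hook": "a.sh", "allowFailure": False}
           (a hypothetical shape used only in this sketch).
    run_hook: callable that returns the hook's exit code (0 means success).
    sleep: no-op by default so the sketch runs instantly; pass time.sleep
           to actually wait between retries.
    Returns the list of (hook, exit_code) attempts in execution order.
    """
    queue = deque(tasks)
    attempts = []
    while queue:
        task = queue[0]  # the head of the queue blocks everything behind it
        code = run_hook(task)
        attempts.append((task["hook"], code))
        if code == 0 or task.get("allowFailure", False):
            queue.popleft()  # finished, or the failure is tolerated
        else:
            sleep(retry_delay)  # wait, then retry the same task
    return attempts
```

With a hook that fails twice before succeeding, the attempts list shows the failing hook three times before any later task runs, while a task with allowFailure set to true is attempted once and then dropped.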
Shell-operator runs the hook with the --config flag. In response, the hook should print its event binding configuration to stdout. The response can be in YAML format:
configVersion: v1
onStartup: ORDER
schedule:
- {SCHEDULE_PARAMETERS}
- {SCHEDULE_PARAMETERS}
kubernetes:
- {KUBERNETES_PARAMETERS}
- {KUBERNETES_PARAMETERS}
or in JSON format:
{
"configVersion": "v1",
"onStartup": STARTUP_ORDER,
"schedule": [
{SCHEDULE_PARAMETERS},
{SCHEDULE_PARAMETERS}
],
"kubernetes": [
{KUBERNETES_PARAMETERS},
{KUBERNETES_PARAMETERS}
]
}
The configVersion field specifies the version of the configuration schema. The latest schema version is v1, and it is described below.
Event binding is an event type ("onStartup", "schedule" or "kubernetes") plus parameters required for a subscription.
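As an illustration of the --config handshake, here is a minimal sketch of a hook written in Python. The binding values in CONFIG (the order 10 and the name "monitor-pods") are hypothetical and chosen only for this example. The sketch prints the JSON form of the configuration when called with --config.

```python
#!/usr/bin/env python3
import json
import sys

# Hypothetical binding configuration for this example hook.
CONFIG = {
    "configVersion": "v1",
    "onStartup": 10,
    "kubernetes": [{"name": "monitor-pods", "kind": "Pod"}],
}


def main(argv):
    """Return the text the hook prints for the given command-line arguments."""
    if len(argv) > 1 and argv[1] == "--config":
        # Shell-operator asks for the event binding configuration.
        return json.dumps(CONFIG)
    # Any other invocation is a regular hook run; the real work would go here.
    return "hook executed"


if __name__ == "__main__":
    print(main(sys.argv))
```

Keeping the configuration in one dictionary and serializing it with json.dumps avoids hand-written JSON and the quoting mistakes that come with it.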
Use the onStartup binding type to execute a hook at Shell-operator's startup.
Syntax:
configVersion: v1
onStartup: ORDER
Parameters:
ORDER: an integer value that specifies the execution order. When added to the "main" queue, the hooks are sorted by this value and then alphabetically by file name.
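The ordering rule can be sketched as a two-part sort key. The (file_name, order) tuples are a hypothetical representation used only here, and plain Python string comparison stands in for "alphabetically".

```python
def startup_order(hooks):
    """Sort (file_name, order) pairs by ORDER first, then by file name."""
    return sorted(hooks, key=lambda hook: (hook[1], hook[0]))
```

For example, hooks z.sh and b.sh with ORDER 1 both run before a.sh with ORDER 2, and b.sh runs before z.sh.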
Use the schedule binding type for scheduled execution. You can bind a hook to any number of schedules.
Syntax:
configVersion: v1
schedule:
- crontab: "*/5 * * * *"
  allowFailure: true|false
- name: "Every 20 minutes"
  crontab: "*/20 * * * *"
  allowFailure: true|false
- name: "every 10 seconds"
  crontab: "*/10 * * * * *"
  allowFailure: true|false
  queue: "every-ten"
  includeSnapshotsFrom: ["monitor-pods"]
- name: "every minute"
  crontab: "* * * * *"
  allowFailure: true|false
  group: "pods"
  ...
Parameters:
- name: an optional identifier. It is used to distinguish between multiple schedules at runtime. For more information, see binding context.
- crontab: a mandatory schedule in regular crontab syntax with 5 fields. The 6-field crontab style is also supported; for more information, see the documentation of the robfig/cron.v2 library.
- allowFailure: if true, Shell-operator skips the hook execution errors. If false or not set, the hook is restarted after a 5-second delay in case of an error.
- queue: the name of a separate queue. It can be used to execute long-running hooks in parallel with other hooks.
- includeSnapshotsFrom: a list of names of kubernetes bindings. When specified, all monitored objects are added to the binding context in a snapshots field.
- group: a key that defines a group of schedule and kubernetes bindings. See grouping.
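To make the difference between the 5-field and 6-field crontab styles concrete, the sketch below labels each field of a schedule string. It assumes that the extra field in the 6-field style is a leading seconds field, as described in the robfig/cron documentation. It only splits and labels the fields and does not validate their values. The function and field names are hypothetical.

```python
FIVE_FIELDS = ["minute", "hour", "day_of_month", "month", "day_of_week"]
# Assumption: the 6-field style adds a leading seconds field.
SIX_FIELDS = ["second"] + FIVE_FIELDS


def label_crontab(spec):
    """Map each whitespace-separated crontab field to its (assumed) meaning."""
    fields = spec.split()
    if len(fields) == 5:
        names = FIVE_FIELDS
    elif len(fields) == 6:
        names = SIX_FIELDS
    else:
        raise ValueError(f"expected 5 or 6 fields, got {len(fields)}")
    return dict(zip(names, fields))
```

Under that assumption, "*/10 * * * * *" from the syntax above puts */10 in the seconds position, which is why it is named "every 10 seconds".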
Use the kubernetes binding type to run a hook when Kubernetes objects change.
Syntax:
configVersion: v1
kubernetes:
- name: "Monitor pods in cache tier"
  apiVersion: v1
  kind: Pod # required
  executeHookOnEvent: [ "Added", "Modified", "Deleted" ]
  executeHookOnSynchronization: true|false # default is true
  fullObjectInSnapshot: true|false # default is true
  nameSelector:
    matchNames:
    - pod-0
    - pod-1
  labelSelector:
    matchLabels:
      myLabel: myLabelValue
      someKey: someValue
    matchExpressions:
    - key: "tier"
      operator: "In"
      values: ["cache"]
    # - ...
  fieldSelector:
    matchExpressions:
    - field: "status.phase"
      operator: "Equals"
      value: "Pending"
    # - ...
  namespace:
    nameSelector:
      matchNames: ["somenamespace", "proj-production", "proj-stage"]
    labelSelector:
      matchLabels:
        myLabel: "myLabelValue"
        someKey: "someValue"
      matchExpressions:
      - key: "env"
        operator: "In"
        values: ["production"]
      # - ...
  jqFilter: ".metadata.labels"
  includeSnapshotsFrom:
  - "Monitor pods in cache tier"
  - "monitor Pods"
  - ...
  allowFailure: true|false # default is false
  queue: "cache-pods"
  group: "pods"
- name: "monitor Pods"
  kind: "pod"
  # ...
Parameters:
- name: an optional identifier. It is used to distinguish different bindings at runtime. See also binding context.
- apiVersion: an optional group and version of the object API. For example, it is v1 for core objects (Pod, etc.), rbac.authorization.k8s.io/v1beta1 for ClusterRole and monitoring.coreos.com/v1 for prometheus-operator.
- kind: the type of the monitored Kubernetes resource. This field is required. CRDs are supported, but the resource should be registered in the cluster before Shell-operator starts. This can be checked with the kubectl api-resources command. You can specify a case-insensitive name, kind or short name in this field. For example, to monitor a DaemonSet these forms are valid:
  "kind": "DaemonSet"
  "kind": "Daemonset"
  "kind": "daemonsets"
  "kind": "DaemonSets"
  "kind": "ds"
- executeHookOnEvent: the list of events that lead to a hook's execution. By default, all events are used to execute a hook: "Added", "Modified" and "Deleted". Docs: Using API WatchEvent. An empty array can be used to prevent hook execution; this is useful when the binding is used only to define a snapshot.
- executeHookOnSynchronization: if false, Shell-operator skips the hook execution with the Synchronization binding context. See binding context.
- nameSelector: a selector of objects by their name. If this selector is not set, then all objects of the specified kind are monitored.
- labelSelector: a standard selector of objects by labels (examples of use). If the selector is not set, then all objects of the specified kind are monitored.
- fieldSelector: a selector of objects by their fields; it works like the --field-selector='' flag of kubectl. Supported operators are Equals (or =, ==) and NotEquals (or !=), and all expressions are combined with AND. Also, note that a fieldSelector with the metadata.name field is mutually exclusive with nameSelector. There are limits on fields, see Note.
- namespace: filters to choose namespaces. If omitted, events from all namespaces are monitored.
- namespace.nameSelector: this filter can be used to monitor events from objects in a particular list of namespaces.
- namespace.labelSelector: this filter works like labelSelector but for namespaces, and Shell-operator dynamically subscribes to events from matched namespaces.
- jqFilter: an optional parameter that specifies event filtering using jq syntax. The hook is triggered on the "Modified" event only if the filter result has changed since the last event. See example 102-monitor-namespaces.
- allowFailure: if true, Shell-operator skips the hook execution errors. If false or not set, the hook is restarted after a 5-second delay in case of an error.
- queue: the name of a separate queue. It can be used to execute long-running hooks in parallel with hooks in the "main" queue.
- includeSnapshotsFrom: an array of names of kubernetes bindings in a hook. When specified, a list of monitored objects from those bindings is added to the binding context in a snapshots field. Self-include is also possible.
- fullObjectInSnapshot: if not set or true, dumps of Kubernetes resources are cached for this binding and the snapshot includes them as object fields. Set to false if the hook does not rely on full objects, to reduce the memory footprint.
- group: a key that defines a group of schedule and kubernetes bindings. See grouping.
Example:
configVersion: v1
kubernetes:
# Trigger on label changes of Pods with myLabel:myLabelValue in the "default" namespace
- name: "label-changes-of-mylabel-pods"
  kind: pod
  executeHookOnEvent: ["Modified"]
  labelSelector:
    matchLabels:
      myLabel: "myLabelValue"
  namespace:
    nameSelector:
      matchNames: ["default"]
  jqFilter: .metadata.labels
  allowFailure: true
  includeSnapshotsFrom: ["label-changes-of-mylabel-pods"]
This hook configuration executes the hook on each change in the labels of pods labeled with myLabel=myLabelValue in the "default" namespace. The binding context will contain all pods with myLabel=myLabelValue from the "default" namespace.
Unlike kubectl, you should explicitly define namespace.nameSelector to monitor events from the default namespace.
namespace:
  nameSelector:
    matchNames: ["default"]
Shell-operator requires a ServiceAccount with the appropriate RBAC permissions. See examples with RBAC: monitor-pods and monitor-namespaces.
The jqFilter parameter is used to ignore superfluous "Modified" events and to exclude objects from event subscription. For example, if the hook should track changes of an object's labels, jqFilter: ".metadata.labels" can be used to ignore changes in other properties (.status, .metadata.annotations, etc.).
The result of applying the filter to the event's object is passed to the hook in a filterResult field of a binding context.
You can use the JQ_LIBRARY_PATH environment variable to set a path with jq modules. Also, Shell-operator uses jq release 1.6, so you can check your filters with a binary of that version.
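The way a filter suppresses superfluous "Modified" events can be sketched as follows. This is not how Shell-operator is implemented: labels_filter is a plain Python stand-in for the jq expression .metadata.labels, and ModifiedFilter with its should_run method is a hypothetical helper. The sketch also assumes that the first observation of an object counts as a change.

```python
import json


def labels_filter(obj):
    """Python stand-in for jqFilter: ".metadata.labels"."""
    return obj.get("metadata", {}).get("labels")


class ModifiedFilter:
    """Remember the last filter result per object and report real changes."""

    def __init__(self, filter_fn):
        self.filter_fn = filter_fn
        self.last = {}  # object key -> serialized last filter result

    def should_run(self, key, obj):
        # Serialize with sorted keys so that key order does not look like a change.
        result = json.dumps(self.filter_fn(obj), sort_keys=True)
        changed = self.last.get(key) != result
        self.last[key] = result
        return changed
```

With this filter, an update that touches only .status leaves the filter result unchanged, so should_run returns False, while a label change returns True.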
Consider that the "Added" event is not always equal to "Object created" if labelSelector, fieldSelector or namespace.labelSelector is specified in the binding. If objects and/or namespaces are updated in Kubernetes, the binding may suddenly start matching them, with the "Added" event. The same applies to the "Deleted" event: "Deleted" is not always equal to "Object removed", because the object can just move out of the scope of the selectors.
Filtering by an arbitrary field is supported neither for core resources nor for custom resources (see issue #53459). Only the metadata.name and metadata.namespace fields are commonly supported.
However, fieldSelector can be useful for some resources with an extended set of supported fields:
kind | fieldSelector | src url |
---|---|---|
Pod | spec.nodeName, spec.restartPolicy, spec.schedulerName, spec.serviceAccountName, status.phase, status.podIP, status.nominatedNodeName | 1.16 |
Event | involvedObject.kind, involvedObject.namespace, involvedObject.name, involvedObject.uid, involvedObject.apiVersion, involvedObject.resourceVersion, involvedObject.fieldPath, reason, source, type | 1.16 |
Secret | type | 1.16 |
Namespace | status.phase | 1.16 |
ReplicaSet | status.replicas | 1.16 |
Job | status.successful | 1.16 |
Node | spec.unschedulable | 1.16 |
Example of selecting Pods by 'Running' phase:
kind: Pod
fieldSelector:
  matchExpressions:
  - field: "status.phase"
    operator: Equals
    value: Running
Objects should match all expressions defined in fieldSelector and labelSelector, so, for example, multiple fieldSelector expressions with the metadata.name field and different values will not match any object.
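The AND semantics can be sketched with a small client-side matcher. This is only an illustration: the actual field selection happens on the Kubernetes side, and get_field and matches are hypothetical helpers that do not appear in Shell-operator.

```python
def get_field(obj, path):
    """Follow a dotted path such as "status.phase"; return None if it is absent."""
    current = obj
    for part in path.split("."):
        if not isinstance(current, dict) or part not in current:
            return None
        current = current[part]
    return current


def matches(obj, expressions):
    """Return True only if the object satisfies every expression (logical AND)."""
    for expr in expressions:
        actual = get_field(obj, expr["field"])
        operator = expr["operator"]
        if operator in ("Equals", "=", "=="):
            ok = actual == expr["value"]
        elif operator in ("NotEquals", "!="):
            ok = actual != expr["value"]
        else:
            raise ValueError(f"unsupported operator: {operator}")
        if not ok:
            return False
    return True
```

Two Equals expressions on metadata.name with different values can never both hold for one object, so matches returns False for every object, as stated above.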
When an event associated with a hook is triggered, Shell-operator executes the hook without arguments. The information about the event that led to the hook execution is called the binding context and is written in JSON format to a temporary file. The path to this file is available to the hook via the BINDING_CONTEXT_PATH environment variable.
Temporary files have unique names to prevent collisions between queues and are deleted after the hook run.
Binding context is a JSON array of structures with the following fields:
- binding: a string from the name or group parameters. If these parameters have not been set in the binding configuration, then the strings "schedule" or "kubernetes" are used. For a hook executed at startup, this value is always "onStartup".
- type: "Schedule" for schedule bindings. "Synchronization" or "Event" for kubernetes bindings. "Synchronization" or "Group" if group is defined.
The hook receives an "Event"-type binding context on a Kubernetes event, and it contains more fields:
- watchEvent: one of the values you can use with the executeHookOnEvent parameter: "Added", "Modified" or "Deleted".
- object: a JSON dump of the full object related to the event. It contains an exact copy of the corresponding field in the WatchEvent response, so it is the object state at the moment of the event (not at the moment of the hook execution).
- filterResult: the result of jq execution with the specified jqFilter on the above-mentioned object. If jqFilter is not specified, then filterResult is omitted.
On startup, the hook receives existing objects for each binding in a "Synchronization"-type binding context:
- objects: a list of existing objects that match the selectors in the binding configuration. Each item of this list contains object and filterResult fields. If the list is empty, the value of objects is an empty array.
If group or includeSnapshotsFrom is defined, the hook receives a binding context with an additional field:
- snapshots: a map that contains a list of objects for each binding name from includeSnapshotsFrom or for each kubernetes binding in a group. If the includeSnapshotsFrom list is empty, the field is omitted.
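Reading the binding context from a hook can be sketched as follows. BINDING_CONTEXT_PATH and the binding and type fields come from the description above; the function names load_binding_contexts and summarize are hypothetical.

```python
import json
import os


def load_binding_contexts(env=os.environ):
    """Read the JSON array of binding contexts from the file named in the environment."""
    with open(env["BINDING_CONTEXT_PATH"], encoding="utf-8") as context_file:
        return json.load(context_file)


def summarize(contexts):
    """Return (binding, type) pairs; type is None where it is absent."""
    return [(ctx.get("binding"), ctx.get("type")) for ctx in contexts]
```

Because the hook always receives an array, a hook should loop over all items even when it expects a single event.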
Hook with this configuration:
configVersion: v1
onStartup: 1
will be executed with this binding context at startup:
[{"binding": "onStartup"}]
For example, if you have the following configuration in a hook:
configVersion: v1
schedule:
- name: incremental
  crontab: "0 2 */3 * * *"
  allowFailure: true
then at 12:02, it will be executed with the following binding context:
[{ "binding": "incremental", "type":"Schedule"}]
A hook can monitor Pods in all namespaces with this simple configuration:
configVersion: v1
kubernetes:
- kind: Pod
During startup, the hook receives all existing objects with "Synchronization"-type binding context:
[
{
"binding": "kubernetes",
"type": "Synchronization",
"objects": [
{
"object": {
"kind": "Pod",
"metadata":{
"name":"etcd-...",
"namespace":"kube-system",
...
},
}
},
{
"object": {
"kind": "Pod",
"metadata":{
"name":"kube-proxy-...",
"namespace":"kube-system",
...
},
}
},
...
]
}
]
If pod pod-321d12 is then added into the "default" namespace, the hook will be executed with the "Event"-type binding context:
[
{
"binding": "kubernetes",
"type": "Event",
"watchEvent": "Added",
"object": {
"apiVersion": "v1",
"kind": "Pod",
"metadata": {
"name": "pod-321d12",
"namespace": "default",
...
},
"spec": {
...
},
...
}
}
]
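A hook that handles both context types shown above could be sketched like this. The describe function and the "Existing" label are hypothetical; the field names (type, objects, object, watchEvent, metadata) are taken from the examples.

```python
def object_key(obj):
    """Build a "namespace/name" key from an object's metadata."""
    meta = obj.get("metadata", {})
    return f'{meta.get("namespace", "")}/{meta.get("name", "")}'


def describe(contexts):
    """Return (label, key) pairs for Synchronization and Event binding contexts."""
    seen = []
    for ctx in contexts:
        if ctx.get("type") == "Synchronization":
            # One context carries every object that existed at startup.
            for item in ctx.get("objects", []):
                seen.append(("Existing", object_key(item.get("object", {}))))
        elif ctx.get("type") == "Event":
            # One context per event, with the object state at event time.
            seen.append((ctx.get("watchEvent"), object_key(ctx.get("object", {}))))
    return seen
```

The same function handles an array that mixes both types, which can happen when several executions are combined into one run.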
Shell-operator caches a list of resources for each kubernetes binding. Other bindings can access this list via the includeSnapshotsFrom parameter. Also, there is a group parameter to automatically get all snapshots from multiple bindings and deduplicate executions.
A snapshot is a list of cached Kubernetes objects and the corresponding jqFilter results. To access the snapshot of a particular binding, there is a snapshots map in the binding context, where the key is a binding name and the value is the snapshot.
The snapshots format:
"snapshots": {
"binding-name-1": [
{
"object": {
"kind": "Pod",
"metadata":{
"name":"etcd-...",
"namespace":"kube-system",
...
},
},
"filterResult": { ... },
},
...
]
}
- object: a JSON dump of the Kubernetes object.
- filterResult: a JSON result of applying jqFilter to the Kubernetes object.
Keeping dumps for object fields can take a lot of memory. There is a parameter keepFullObjectsInMemory: false to disable full dumps.
Note that disabling full objects makes sense only if jqFilter is defined, as it disables full objects in the snapshots field, the objects field of the "Synchronization" binding context and the object field of the "Event" binding context.
For example, this binding configuration will execute the hook with empty items in the objects field of the "Synchronization" binding context:
kubernetes:
- name: pods
  kind: Pod
  keepFullObjectsInMemory: false
To illustrate the includeSnapshotsFrom parameter, consider a hook that monitors label changes of all Pods and does something interesting on a schedule:
configVersion: v1
schedule:
- name: incremental
  crontab: "0 2 */3 * * *"
  includeSnapshotsFrom: ["monitor-pods"]
kubernetes:
- name: monitor-pods
  kind: Pod
  jqFilter: '.metadata.labels'
  includeSnapshotsFrom: ["monitor-pods"]
During startup, the hook will be executed with the "Synchronization" binding context with a snapshots JSON object:
[
{
"binding": "monitor-pods",
"type": "Synchronization",
"objects": [
{
"object": {
"kind": "Pod",
"metadata":{
"name":"etcd-...",
"namespace":"kube-system",
"labels": { ... },
...
},
},
"filterResult": {
"label1": "value",
...
}
},
{
"object": {
"kind": "Pod",
"metadata":{
"name":"kube-proxy-...",
"namespace":"kube-system",
...
},
},
"filterResult": {
"label1": "value",
...
}
},
...
],
"snapshots": {
"monitor-pods": [
{
"object": {
"kind": "Pod",
"metadata":{
"name":"etcd-...",
"namespace":"kube-system",
...
},
},
"filterResult": { ... },
},
...
]
}
}
]
If pod pod-321d12 is then added into the "default" namespace, the hook will be executed with the "Event" binding context with object and filterResult fields:
[
{
"binding": "monitor-pods",
"type": "Event",
"watchEvent": "Added",
"object": {
"apiVersion": "v1",
"kind": "Pod",
"metadata": {
"name": "pod-321d12",
"namespace": "default",
...
},
"spec": {
...
},
...
},
"filterResult": { ... },
"snapshots": {
"monitor-pods": [
{
"object": {
"kind": "Pod",
"metadata":{
"name":"etcd-...",
"namespace":"kube-system",
...
},
},
"filterResult": { ... },
},
...
]
}
}
]
At 12:02, the hook will be executed with the following binding context:
[
{
"binding": "incremental",
"type": "Schedule",
"snapshots": {
"monitor-pods": [
{
"object": {
"kind": "Pod",
"metadata":{
"name":"etcd-...",
"namespace":"kube-system",
...
},
},
"filterResult": { ... },
},
...
]
}
}
]
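Working with a snapshot from a binding context such as the one above can be sketched as follows. The function name count_by_namespace is hypothetical; the snapshots, object and metadata field names follow the examples.

```python
from collections import Counter


def count_by_namespace(context, binding_name):
    """Count snapshot objects per namespace for one binding name."""
    snapshot = context.get("snapshots", {}).get(binding_name, [])
    return Counter(
        item.get("object", {}).get("metadata", {}).get("namespace", "")
        for item in snapshot
    )
```

A missing snapshots field or an unknown binding name yields an empty counter rather than an error, which keeps the hook from failing and being retried for this reason alone.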
The group parameter defines a named group of bindings. A group is used when the source of the event is not important and the data in snapshots is enough for the hook. When a binding with group is triggered by an event, the hook receives snapshots from all bindings with the same group name. Also, adjacent tasks with the same group in the same queue are "compacted" and the hook is executed only once. So it is wise to use the same queue for all hooks in a group.
executeHookOnSynchronization, executeHookOnEvent and keepFullObjectsInMemory can be used with group.
The group parameter is compatible with the includeSnapshotsFrom parameter. includeSnapshotsFrom can be used to include additional snapshots into the binding context.
Binding context for group contains:
- binding field with the group name.
- type field with the "Synchronization" or "Group" string.
- snapshots field if there is at least one kubernetes binding in the group and in includeSnapshotsFrom.
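The compaction of adjacent tasks with the same group can be illustrated with a short sketch. It assumes that all tasks belong to the same hook and the same queue. The text does not say which of the adjacent tasks survives, so keeping the first one is a choice made only for this example. The task dictionaries and the compact function are hypothetical.

```python
def compact(tasks):
    """Collapse runs of adjacent tasks that share the same group into one task."""
    compacted = []
    for task in tasks:
        group = task.get("group")
        same_as_previous = (
            group is not None
            and bool(compacted)
            and compacted[-1].get("group") == group
        )
        if same_as_previous:
            continue  # adjacent task of the same group: one execution is enough
        compacted.append(task)
    return compacted
```

Tasks of the same group that are separated by another task are not adjacent and therefore stay separate.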
Consider the hook that is executed on changes of labels of all Pods, changes in ConfigMap and also on schedule:
configVersion: v1
schedule:
- name: incremental
  crontab: "* * * * *"
  group: "pods"
kubernetes:
- name: monitor_pods
  apiVersion: v1
  kind: Pod
  jqFilter: '.metadata.labels'
  group: "pods"
- name: monitor_configmap
  apiVersion: v1
  kind: ConfigMap
  jqFilter: '.data'
  group: "pods"
During startup, the hook will be executed with the "Synchronization" binding context with a snapshots JSON object:
[
{
"binding": "pods",
"type": "Synchronization",
"snapshots": {
"monitor_pods": [
{
"object": {
"kind": "Pod",
"metadata":{
"name":"etcd-...",
"namespace":"kube-system",
...
},
},
"filterResult": { ... },
},
...
],
"monitor_configmap": [
{
"object": {
"kind": "ConfigMap",
"metadata":{
"name":"etcd-...",
"namespace":"kube-system",
...
},
},
"filterResult": { ... },
},
...
]
}
}
]
If pod pod-dfbd12 is then added into the "default" namespace, the hook will be executed with the "Group" binding context:
[
{
"binding": "pods",
"type": "Group",
"snapshots": {
"monitor_pods": [
{
"object": {
"kind": "Pod",
"metadata":{
"name":"etcd-...",
"namespace":"kube-system",
...
},
},
"filterResult": { ... },
},
...
],
"monitor_configmap": [
{
"object": {
"kind": "ConfigMap",
"metadata":{
"name":"etcd-...",
"namespace":"kube-system",
...
},
},
"filterResult": { ... },
},
...
]
}
}
]
Every minute it will be executed with the same binding context with fresh snapshots:
[
{
"binding": "pods",
"type": "Group",
"snapshots": {
"monitor_pods": [
...
],
"monitor_configmap": [
...
]
}
}
]