Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rework Config Watcher and speed up configuration reload #774

Merged
merged 13 commits into from
Sep 29, 2022

Conversation

pkosiec
Copy link
Collaborator

@pkosiec pkosiec commented Sep 28, 2022

Description

Changes proposed in this pull request:

  • Rework Config Watcher to have it a sidecar that calls BotKube endpoint when config should be reloaded
  • Implement /reload endpoint that restarts BotKube - it is called by the new Config Watcher
  • Improve messages when editing Source Bindings: if Config Watcher is disabled, it asks user to restart the app manually
  • Fix rendering emojis for amessages on MS Teams
  • Fix make container-image-single target
  • Fix MS teams formatting issues

Limitations

The initial Config Watcher sync sometimes takes ~7-8 seconds on my machine, depending on the load. To prevent posting "Welcome" message by BotKube, I implemented wait for the synchronization in our app (with a timeout of 10s, this is a non-fatal error). That means BotKube start is slowed down now by a few secs.
Why? This avoids an issue when a user edits Source Bindings right after the BotKube "welcome" message and Config Watcher doesn't pick the change. See the video:

Screen.Recording.2022-09-28.at.16.05.01.mov

But, for a price of a slower start. Watch this:

Screen.Recording.2022-09-28.at.15.16.43.mov

That's why I made it configurable. Not sure what about the default values though. Does a user really edits Source Bindings within first 10 seconds after BotKube startup? I think that's an edge case we shouldn't spend too much time. I think the faster startup is more important. If you agree, that means maybe we should set the initialSyncTimeout to 0s. WDYT?

Breaking changes

  • The settings.configWatcher option has been replaced with configWatcher.enabled: true. Config Watcher now runs as a sidecar container. See possible configuration options on the Helm chart parameters page.

To do

  • Documentation - as a follow-up PR after merging this one

Testing

Check out the PR.
Export envs:

export WS_SLACK_BOT_TOKEN=""
export WS_SLACK_APP_TOKEN=""

Install BotKube:

cat > /tmp/values.yaml << ENDOFFILE
communications:
  'default-group':
    # Settings for Slack
    socketSlack:
      enabled: true
      botToken: "${WS_SLACK_BOT_TOKEN}"
      appToken: "${WS_SLACK_APP_TOKEN}"
      channels:
        'default':
          name: backend
          bindings:
            executors:
              - kubectl-read-only
            sources:
              - k8s-backend
        'frontend':
          name: frontend
          bindings:
            executors:
              - kubectl-read-only
            sources:
              - k8s-frontend

settings:
  clusterName: gke-playgrounda
  upgradeNotifier: false

configWatcher:
  initialSyncTimeout: 0 # you can change it to 10s 

sources:
  'k8s-backend':
    kubernetes:
      recommendations:
        pod:
          noLatestImageTag: false
          labelsSet: false
        ingress:
          backendServiceValid: true
          tlsSecretValid: true

      namespaces:
        include:
          - "backend"

      resources:
        - name: apps/v1/deployments
          events:
            - create
            - error
        - name: v1/pods
          events:
            - create
            - error
        - name: v1/services
          events:
            - create
            - error
        - name: networking.k8s.io/v1/ingresses
          events:
            - create
  'k8s-frontend':
    kubernetes:
      recommendations:
        pod:
          noLatestImageTag: false
          labelsSet: false
        ingress:
          backendServiceValid: false
          tlsSecretValid: false

      resources:
        - name: v1/pods
          namespaces:
            include:
              - "frontend"
          events:
            - create
            - error

executors:
  'kubectl-read-only':
    enabled: true
rbac:
  create: true
  rules:
    - apiGroups: ["*"]
      resources: ["*"]
      verbs: ["get", "watch", "list"]
    - apiGroups: [""]
      resources: ["pods"]
      verbs: ["delete", "create"]

analytics:
  disable: true

image:
  repository: kubeshop/pr/botkube
  tag: 774-PR
#  registry: docker.io
#  repository: pkosiec/botkube
#  tag: 'config-reload-fix'
ENDOFFILE
helm install botkube ./helm/botkube -n botkube --create-namespace  -f /tmp/values.yaml

Edit Source Bindings on frontend or backend channel. You can use the UI.
Observe BotKube behavior.

You can reinstall BotKube, this time with configWatcher.initialSyncTimeout: 10s and see how it works.

Related issue(s)

Resolves #765

@pkosiec pkosiec added the enhancement New feature or request label Sep 28, 2022
@pkosiec pkosiec added this to the v0.14.0 milestone Sep 28, 2022
@pkosiec pkosiec changed the title Rework Config Watcher and configuration reload Rework Config Watcher and speed up configuration reload Sep 28, 2022
@pkosiec pkosiec added the bug Something isn't working label Sep 28, 2022
@pkosiec pkosiec force-pushed the config-reload-improvements branch from 4acfd12 to 9100d3e Compare September 28, 2022 13:27
@pkosiec pkosiec marked this pull request as ready for review September 28, 2022 14:09
@pkosiec pkosiec requested a review from a team September 28, 2022 14:09
@pkosiec pkosiec requested a review from PrasadG193 as a code owner September 28, 2022 14:09
@huseyinbabal
Copy link
Contributor

@pkosiec I was testing locally, I suggest to use following notation for botkube image to prevent apple m1 problem, plus your pr already generates docker image manifest for multiple archs.

image:
  repository: kubeshop/pr/botkube
  tag: 774-PR

I have apple m1 problem with config watcher, will fix and rerun the tests
Screen Shot 2022-09-29 at 10 44 35

@pkosiec
Copy link
Collaborator Author

pkosiec commented Sep 29, 2022

Oh sorry @huseyinbabal - I pushed my own image as there were problem with the CI build. But yeah, it was amd64 only.

Now as I rebased the PR, please use the values you posted 🙂 Will update PR desc as well. Thanks!

@huseyinbabal
Copy link
Contributor

Oh sorry @huseyinbabal - I pushed my own image as there were problem with the CI build. But yeah, it was amd64 only.

Now as I rebased the PR, please use the values you posted 🙂 Will update PR desc as well. Thanks!

sure, I am also building k8s-sidecar for arm64, will push to ghcr
Screen Shot 2022-09-29 at 10 52 44

@pkosiec
Copy link
Collaborator Author

pkosiec commented Sep 29, 2022

Ah, yes, I forgot about making k8s-sidecar multiarch build 😞 I'm sorry. Original k8s-sidecar is multiarch indeed, but again, I built amd64 version only... Time to switch from Intel to M1/M2 as well 😄 Thank you Huseyin for help and sorry for troubles!

@huseyinbabal
Copy link
Contributor

Ah, yes, I forgot about making k8s-sidecar multiarch build 😞 I'm sorry. Original k8s-sidecar is multiarch indeed, but again, I built amd64 version only... Time to switch from Intel to M1/M2 as well 😄 Thank you Huseyin for help and sorry for troubles!

No worries, now I managed to run everything with following additional config

...
configWatcher:
  image:
    tag: 'initial-events-arm64'
...

@huseyinbabal
Copy link
Contributor

Tested with Socket and Socket Slack mode, it works well 👍
Screen Shot 2022-09-29 at 11 22 45
Screen Shot 2022-09-29 at 11 23 00

@pkosiec pkosiec force-pushed the config-reload-improvements branch from 99bd35c to e3c2dee Compare September 29, 2022 10:40
@pkosiec pkosiec force-pushed the config-reload-improvements branch from 334533f to 7501aa3 Compare September 29, 2022 10:42
@pkosiec
Copy link
Collaborator Author

pkosiec commented Sep 29, 2022

Hey @huseyinbabal

Hope we'll be ready to merge this soon 🚀

@pkosiec pkosiec added the breaking Contains breaking change label Sep 29, 2022
@huseyinbabal
Copy link
Contributor

I retested with final changes, and it works 🚀

@pkosiec pkosiec merged commit 74b5f52 into kubeshop:main Sep 29, 2022
@pkosiec pkosiec deleted the config-reload-improvements branch September 29, 2022 11:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
breaking Contains breaking change bug Something isn't working enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Config update detection for runtime config is too slow
3 participants