[RFC] Use Github Actions for StackStorm-Exchange CI and Maintenance #63
Comments
That's a very detailed proposal, thanks for putting it together! 👍 My concern is that the occurrence and severity of the problem highlighted are disproportionately minor compared to the proposed major change, i.e. switching the entire CI platform. On the other hand, we all see how GH Actions are gaining popularity; people are getting more fluent with them, and they might become a good de-facto standard CI for the Exchange in the future. One disadvantage @nmaludy once mentioned is that GH Actions don't have the SSH debugging that CircleCI has, which can be very helpful sometimes. I'm thinking that we may find more advantages to moving to GH Actions with time. |
Before we change over from CircleCI to GHA, let's take a minute and get some data on how many CircleCI configs have been customized, and how invasive those customizations have been. I would guess that not many packs have customized a great deal.
I think a larger issue is that we've had updates to the StackStorm-Exchange/ci sample CircleCI config that haven't been pushed out to all packs, so there are probably a few previous versions of StackStorm-Exchange/ci's CircleCI config in various Exchange packs. It would be nice if we had an automated process to push out changes to StackStorm-Exchange/ci's CircleCI config to all packs. And we can either do that with StackStorm itself, or we can do it with GitHub Actions in the StackStorm-Exchange/ci repository itself (eg: a deploy workflow that runs for each PR that touches
I don't see the weekly test scheduling as a big deal, except for massive failures (like Python 2.7 tests failing across the board). And even then, it's just a matter of deleting a bunch of email. I don't really know what benefit we get from keeping/controlling that information in a single place, since in the years that I've been working with StackStorm, we haven't ever adjusted that schedule, and I don't really see a need to. The weekend pack CI runs are the exact same workflows that run for every opened PR and every PR merge. And the ideal is that the weekend tests are only an early warning system for StackStorm pack devs (eg: transitive dependency causing issues). In practice, 99% of the weekend pack CI failures are due to GitHub Personal Access Tokens expiring due to disuse.
There's a balance to be struck here. Some packs have a legitimate need to be able to customize the test environment, and I absolutely think that we should support whatever is needed. However, for the sake of keeping our cognitive and maintenance burdens as low as possible, we should strive to keep the test configs as consistent as we reasonably can.
Letting pack authors go crazy with their test configs is going to end in frustration and tears for everybody. To summarize my post here:
Sorry to pour cold water on this idea. Having maintained StackStorm Exchange for a few years now, I'd like to think I have a pretty good perception of what our priorities should be. |
Some of the StackStorm-Exchange problems are highlighted in https://github.com/orgs/StackStorm-Exchange/projects/1 |
I looked into what it would take to switch just the
What the deploy step does
The deploy step does two related things. First it pushes new tags into the pack repo when the version changes in pack.yaml. Then it updates the index. The first part would be very easy in GHA as select workflows can have automatic write access to the repo. By default PR workflows do not have write access or access to secrets. Scheduled and other event triggers can, however. The second part pushes updated metadata to the
How to move
|
Re customized CircleCI config: Before last week's exchange-wide pack CI updates, some packs had reformatted the CircleCI config slightly. Now, the CI config has been standardized across all packs with these exceptions:
|
Looking back, following the Security discussions https://github.com/StackStorm/private-discussions/issues/5 SSH access to CI system for Exchange Packs looks like a downside/risk these days. +1 for migration from CircleCI to GH Actions for a native & seamless integration to fix the current pain points of StackStorm Exchange. |
I no longer believe a shared CircleCI + GHA is a possibility because so much of the exchange infra is broken with CircleCI changes that force using an ssh deploy key. That in turn breaks our deploy workflow because it conflicts with how we're using the PAT to clone and modify repos in CircleCI. This is an overview of what I think we need to do to move the current CI to GHA. Some of my other comments above explain additional workflows and future improvements, but ignore the bits about combining CircleCI with GHA. We have a common set of CI workflows for all StackStorm-Exchange workflows:
I would start with converting the build_and_test_python36 job to GHA before worrying about the deploy job. The deploy job will be more involved since it requires changes across repositories. For CircleCI, we had to copy the .circleci/config.yml from a master copy to each pack repo. The master copy of the exchange CircleCI config is here: That ci repo would probably be a good place to put the composite actions, assuming we can put more than one composite action in the same repo (edit: we can). Then, in each pack repo we would have a much lighter-weight GHA workflow that specifies the cron schedule for weekly tests and uses those composite actions. One gotcha in all of this is that a couple of repos had to modify the main workflow: the vault and zabbix packs:
|
Good stuff. Nice idea with the GH composite actions, similar to CircleCI orbs 👍 Perhaps having a new PoC stackstorm-exchange pack would be a good way to experiment with all the machinery & show it. |
OK I stubbed together some composite actions. They will not work yet, but hopefully they're a good starting point. To use these, I imagine a workflow that looks something like this:

    name: CI
    on:
      # ...
    jobs:
      build_and_test:
        runs-on: ubuntu-latest
        name: 'Build and Test - Python ${{ matrix.python-version-short }}'
        strategy:
          matrix:
            include:
              - python-version-short: 3.6
                python-version: 3.6.13
        steps:
          # eventually replace @gha with @master
          - name: Checkout Pack Repo and CI Repos
            uses: StackStorm-Exchange/ci/gha/checkout@gha
          - name: Install APT Dependencies
            uses: StackStorm-Exchange/ci/gha/apt-dependencies@gha
            with:
              cache-version: v0
          - name: Install Python Dependencies
            uses: StackStorm-Exchange/ci/gha/py-dependencies@gha
            with:
              cache-version: v0
              python-version: ${{ matrix.python-version }}
          # vault pack would add one or more custom test setup steps here
          - name: Run pack tests
            uses: StackStorm-Exchange/ci/gha/test@gha
            with:
              # This makes the tests use an alternate config that enables shared libs
              enable-common-libs: true

We will still need to manage the cron schedule in each pack repo. |
For the deploy workflow, we have a variety of problems:
Serializing index updates
A persistent service would make serializing updates more natural, but then we have to deal with a persistent service (If we do go with a persistent service, GCP's free-tier offers 1 free e2-micro VM instance per month). But, maybe we can get away with creating a semi-persistent "service" using a github actions workflow. Based on the Github Usage Limits, workflows are limited as follows:
If we had a workflow running for the max of 72 hours, splitting that into 6-hour jobs would mean a workflow with 12 jobs where the job concurrency is limited to 1. But I'm not sure how to start a workflow every 72 hours. Using a cron scheduled workflow, we could easily do a workflow that runs every day (eg
I don't think there's a good way to receive webhook events from github within a github action workflow. But we can poll Github's events API for StackStorm-Exchange org events. The index is meant to be eventually consistent since we have to serialize index updates, so this Events API caveat should not be a problem:
Index update workflow
So, the index update workflow/job would do something like this:
Pack Deploy workflow
Each pack's deploy workflow, then, would only have to:
Sane Github credentials management
Doing index updates this way fits within how Github currently manages tokens for workflows (read/write for the current repo; read-only for everything else), so we would not need any PATs or the persistent credentials that come with a Github app or an oauth app. |
OK. I've spent a lot of time figuring this out. I don't know when I'll have time to pick it up again. If someone else can please pick this up and work on any of these pieces, I would appreciate the help. |
With what you propose, we can avoid a long-running workflow and run the Index Update Workflow once every 5 mins by cron, which will store in git some state/checksum or similar to continue where we left off. Worst case, if events API won't work, can use https://docs.github.com/en/rest/reference/repos#list-organization-repositories list-org-repos API checking the Eventual consistency is fine, |
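The "store some state/checksum in git and continue where we left off" idea could look roughly like the Python sketch below. It is only an illustration under stated assumptions: the function names, the repo names, and the commit SHAs are all hypothetical, and how the head SHAs are collected (from the clones or from an API listing) is left out.

```python
import hashlib
import json


def heads_checksum(pack_heads):
    """Return a stable SHA-256 digest of a {repo_name: head_commit_sha} mapping."""
    # Sort keys so the digest does not depend on dict ordering.
    payload = json.dumps(pack_heads, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def index_needs_update(pack_heads, stored_checksum):
    """True when the pack heads differ from the state saved by the previous run."""
    return heads_checksum(pack_heads) != stored_checksum


# Hypothetical data: a real workflow would read the stored checksum from a
# file committed to the index repo and gather the current heads itself.
previous = heads_checksum({"stackstorm-vault": "abc123", "stackstorm-zabbix": "def456"})
current = {"stackstorm-vault": "abc123", "stackstorm-zabbix": "0a1b2c"}
print(index_needs_update(current, previous))  # one pack head moved, so this prints True
```

A run that finds no change can exit early without rebuilding anything, which keeps a frequent cron schedule cheap.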
Thank you @lm-ydubler for helping to test/fix the build_and_test workflow. The
For most packs, the workflow will consist of this (see https://github.com/StackStorm-Exchange/stackstorm-test/blob/gha/.github/workflows/build_and_test.yaml):

    name: CI
    on:
      push:
      pull_request:
      schedule:
        # NOTE: We run this weekly at 1 am UTC on every Saturday
        - cron: '0 1 * * 6'
    jobs:
      build_and_test:
        name: 'Build and Test'
        uses: StackStorm-Exchange/ci/.github/workflows/pack-build_and_test.yaml@gha
        with:
          enable-common-libs: true
          #apt-cache-version: v0
          #py-cache-version: v0

This uses github's newly GA reusable workflows to use this workflow. Any packs (like vault) that need to inject some logic would copy this workflow and make their modifications instead of directly reusing it like this (differences include

    name: CI - Build and Test
    on:
      push:
      pull_request:
      schedule:
        # NOTE: We run this weekly at 1 am UTC on every Saturday
        - cron: '0 1 * * 6'
    jobs:
      build_and_test:
        runs-on: ubuntu-latest
        name: 'Build and Test - Python ${{ matrix.python-version-short }}'
        strategy:
          matrix:
            include:
              - python-version-short: 3.6
                python-version: 3.6.13
        steps:
          # eventually replace @gha with @master
          - name: Checkout Pack Repo and CI Repos
            uses: StackStorm-Exchange/ci/.github/actions/checkout@gha
          - name: Install APT Dependencies
            uses: StackStorm-Exchange/ci/.github/actions/apt-dependencies@gha
            with:
              cache-version: v0
          - name: Install Python Dependencies
            uses: StackStorm-Exchange/ci/.github/actions/py-dependencies@gha
            with:
              cache-version: v0
              python-version: ${{ matrix.python-version }}
          # The vault pack would add its custom test setup steps here
          - name: Run pack tests
            uses: StackStorm-Exchange/ci/.github/actions/test@gha
            with:
              # This makes the tests use an alternate config that enables shared libs
              enable-common-libs: true
        services:
          mongo:
            image: mongo:3.4
            ports:
              - 27017:27017
          rabbitmq:
            image: rabbitmq:3
            ports:
              - 5672:5672

Next step is to figure out the deploy stuff. |
You can see a successful test run here: https://github.com/StackStorm-Exchange/stackstorm-test/actions/runs/1509040299 |
If we're running a frequent cron job to rebuild the index, every 5 minutes is probably too often as this involves cloning all pack repos. In my tests, it took 2-2.5 min just to clone all pack repos. To test, I forked the index and made the gha branch the default branch. In the gha branch there is a simple workflow (triggered manually with workflow_dispatch) that reuses a workflow in the gha branch of the ci repo. After editing the workflow in the ci repo, I re-run or re-trigger the workflow in my index repo fork. So far, the workflow clones ci/tooling repos and all the pack repos. Then it has a sample step to show how to loop through all the pack checkouts to do something simple ( Here's my latest test run: |
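For anyone picking this up, looping through the pack checkouts might be sketched in Python as below. This is not the step used in the test workflow; it assumes every pack was cloned into its own subdirectory of one parent directory and that a checkout counts as a pack when it has a pack.yaml at its root. The function names and the directory layout are hypothetical.

```python
import tempfile
from pathlib import Path


def find_pack_checkouts(packs_dir):
    """Yield each immediate subdirectory of packs_dir that contains a pack.yaml."""
    for child in sorted(Path(packs_dir).iterdir()):
        if child.is_dir() and (child / "pack.yaml").is_file():
            yield child


def demo():
    """Build a throwaway directory tree and list the packs found in it."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        for name in ("stackstorm-vault", "stackstorm-zabbix"):
            (root / name).mkdir()
            (root / name / "pack.yaml").write_text("name: example\n")
        (root / "not-a-pack").mkdir()  # no pack.yaml, so it is skipped
        return [pack.name for pack in find_pack_checkouts(root)]


print(demo())  # ['stackstorm-vault', 'stackstorm-zabbix']
```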
To clarify why we would need to set the cron job to run less often than every 5 minutes: we need to ensure that pack updates are serialized (no parallel or concurrent edits). |
OK. I think using cron is more of a possibility than I thought because github has Hopefully github won't be upset with re-cloning all repos in the StackStorm-Exchange org every 5 minutes. There are no rate limits on cloning repos, but they can add them on a case-by-case basis if they don't like the traffic pattern: https://github.community/t/git-clone-limits-using-git-commands-vs-the-api-what-are-they/14357/2 So, we don't have to muck with the events right now. |
I'm satisfied that the index update workflow does what it needs to. We'll just need to define a cron schedule before we merge it to master on the index repo. Now, the final step: We need a process that creates tags on the pack repos. By process I mean the steps someone needs to take when they want to cut a new release of a pack PLUS the Github Actions workflow(s) required to support those steps.
Current state: CircleCI
We should not blindly copy what the CircleCI workflow does, because that process is subtly broken. Basically, the CircleCI deploy step would:
But, many people (myself included) updated pack.yaml in a PR because we know it will need a new version. But there are almost always multiple bug or typo fixes (eg to docs) after we've adjusted the pack.yaml. So, all of those commits after the pack.yaml update are not included until the next time a PR updates pack.yaml. Also, if that commit happens to be on a branch that is slightly behind master (but it will merge cleanly on master), then the merge commit produced when merging the PR will also not be included, which means the tag won't include the newer commits on master either.
The future: on Github Actions
So, we need a different process that pack maintainers need to use to release new versions of a pack. The question is, what should that process look like? And what github workflow(s) do we need to support that process? |
Here are 2 possible workflows we could use:
a release workflow inspired by OpsDroid's process
I recently released a new version of OpsDroid, and I really liked their release workflow. Maybe we could do something similar for packs.
alternative tag-only process
But, that might make exchange-wide updates more difficult. So, we could also do something like:
One thing this doesn't do is attempt to add tags for older versions. That is something that the CircleCI workflow tried to do, but it did not do it well (as detailed above). I don't think we need to do that. |
For Exchange, the fact that the contributor just bumps the version in pack meta and a new git tag is automatically created helped us a lot in the maintenance. So yeah, automation with auto-tagging would be ideal, as before. |
Doing that tagging on merge to master would be really good, and better than what we had before. I think it wasn't well known that you essentially shouldn't bump the pack version until you'd had the all clear on the review, and then needed to make one more change to update the pack version, to prevent the tag being created on the wrong commit. |
OK. This is ready for more eyes. Thanks go to @lm-ydubler for helping to test, invalidate a bunch of my assumptions, and push this forward! And thanks to LogicMonitor for dedicating resources to this issue! We need to merge changes across multiple repos in this order:
I already merged the
These will need to be updated to replace After that I can start working on pushing these workflows to all the packs. I will need a senior maintainer to help disable CircleCI as we switch each pack over to GHA. |
So, just to clarify: the tag release workflow adds the tag on push to master (or whatever is the default branch) if the latest tag doesn't match the current version in pack.yaml. |
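That rule can be sketched in a few lines of Python. This is a minimal illustration, not the actual workflow: it assumes tags are named "v" plus the pack.yaml version, and it reads the top-level version key with a regular expression instead of a YAML parser so the example stays stdlib-only. The function names are hypothetical.

```python
import re

# Matches a top-level "version:" key, with or without quotes around the value.
VERSION_LINE = re.compile(r"^version\s*:\s*['\"]?([^'\"\s#]+)['\"]?", re.MULTILINE)


def pack_version(pack_yaml_text):
    """Extract the top-level version string from the text of a pack.yaml."""
    match = VERSION_LINE.search(pack_yaml_text)
    if match is None:
        raise ValueError("no top-level version key found in pack.yaml")
    return match.group(1)


def tag_to_create(pack_yaml_text, latest_tag):
    """Return the tag to add on push to the default branch, or None if up to date.

    latest_tag is the newest existing tag, or None when the repo has no tags.
    """
    tag = "v" + pack_version(pack_yaml_text)  # assumed tag naming convention
    return None if tag == latest_tag else tag


print(tag_to_create("name: test\nversion: 1.2.3\n", "v1.2.2"))  # v1.2.3
print(tag_to_create("name: test\nversion: 1.2.3\n", "v1.2.3"))  # None
```

Because the check runs on the commit that lands on the default branch, the tag points at the merged result instead of at whichever PR commit first touched pack.yaml.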
Happy to have helped and glad to have worked with you for a solution. |
@winem @lm-ydubler and I just had a meeting, we talked about doing this to roll the GHA updates out:
|
Should we add the gha workflows to https://github.com/StackStorm-Exchange/exchange-template ? |
It doesn't contain any CI workflows, so probably good as is without adding another dependency to the repo. |
OK. afaict, we have excised CircleCI from packs on the exchange. Can a senior maintainer (an org admin) please:
|
I created some skeleton workflows to show the outline of GHA workflows to bootstrap a pack repo and add maintainers to it. StackStorm-Exchange/exchange-incubator#172 If you have some time, please pick one or more of the tasks in those workflows and implement them. It's on the |
I would like to use that repo as a template repo to bootstrap new pack repos. I think https://github.com/StackStorm-Exchange/exchange-template is designed to be used by pack authors. But does that repo have much utility for pack authors? I suspect something like https://github.com/EncoreTechnologies/cookiecutter-stackstorm would be more useful to pack authors. So, would it be OK to repurpose it for setting up new packs? |
Progress report on bootstrapping packs via GHA:
So, once that last PR is merged, we can bootstrap packs from incubator PRs with a
Possible future workflows:
|
Current State
StackStorm-Exchange is using CircleCI for (1) pack testing and (2) updating the exchange index. Most of the CI logic is centralized in StackStorm-Exchange/ci to simplify exchange-wide CI updates. The CircleCI config (.circleci/config.yml) is a key piece of the CI infrastructure that cannot be centralized; a copy is kept in every pack on the exchange. CircleCI config can be edited by pack authors to add jobs, add docker images or steps to the standard jobs, etc. Also, the schedule for when weekly pack tests occur (Saturday/Sunday) is in the pack-local circle config.
All other administrative tasks across the exchange require manual intervention by one or more of the project maintainers. There are several versions of shared scripts that the maintainers pass around (in StackStorm-Exchange/ci, StackStorm-Exchange/exchange-tools, and StackStorm-Exchange/exchange-misc) to handle these administrative tasks. Other updates might be handled by ad-hoc scripting.
aside: this proposal has nothing to do with recent discussion around dropping python2.7 testing on the exchange. It is about optimizing our CI infrastructure for the Exchange.
Goals
Proposal
I recommend we move the Exchange's CI workflows from Circle CI to Github Actions. This includes writing a variety of new workflows to automate as many administrative tasks as possible.
For at least one of those workflows, use a tool that allows us to merge yaml files so that the standard CI jobs can be extended in ways that do not prevent automated updates of the file. Some form of centralized "variables" could be used to define, for example, the test schedule for various packs.
Other CI provider options
Many projects (including StackStorm) have been abandoning Travis CI for many reasons, so that is not a viable option for the exchange.
Circle CI has served us well. Circle CI uses a single config file for all of the CI workflows. However, a single config file for all CI makes it more difficult to both allow customization, and simplify exchange-wide CI updates.
The first bullet under Goal 4 provides another reason not to use CircleCI. The schedule is another potential merge conflict that makes doing exchange-wide pack updates difficult.
Why Github Actions (GHA)?
Looking at the goals listed above, writing workflows for 1 could probably be accomplished with any CI provider. But GHA might have a slight edge as it is much closer to the Github API, and many community-written actions already implement some of the workflows, or steps in those workflows, that we'll need.
Goal 2 is the clearest winner with moving to GHA. This comes free with GHA, as each workflow is a separate file in the repo. So, any custom workflow files will be safe from exchange-wide updates to the standard workflows.
The tension between goals 3 and 4 requires additional effort. But, with additional tooling we can define extension points in our standard CI pack test workflow(s) so that it can be updated by both pack authors and by exchange-wide CI updates. There are a few yaml templating/merging options available. At least one of them should be runnable within github workflows.
Proposed new workflows
For the first bullet under goal 4, adjust weekly test schedule, we could have a centralized file that assigns packs to CI slots (currently Saturday or Sunday). Then, when that file gets updated (eg w/ a new pack), that can trigger a workflow that pushes out the assigned schedule to packs that need an update.
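As a rough Python sketch of that idea, the central file could be a simple mapping from pack to slot, turned into per-pack cron expressions. Everything here is hypothetical apart from the cron format: the mapping layout, the pack names, and the function names are placeholders, and the 1 am UTC default simply mirrors the existing Saturday schedule.

```python
# Day-of-week numbers as used in cron expressions (Sunday is 0, Saturday is 6).
CRON_DAY = {"saturday": 6, "sunday": 0}


def cron_for_slot(slot, hour_utc=1):
    """Return the weekly cron expression for a CI slot name."""
    day = CRON_DAY[slot.lower()]
    return f"0 {hour_utc} * * {day}"


def schedules(assignments):
    """Map each pack to the cron expression for its assigned slot."""
    return {pack: cron_for_slot(slot) for pack, slot in assignments.items()}


# Hypothetical contents of the centralized assignment file.
assignments = {"stackstorm-vault": "Saturday", "stackstorm-zabbix": "Sunday"}
print(schedules(assignments))
# {'stackstorm-vault': '0 1 * * 6', 'stackstorm-zabbix': '0 1 * * 0'}
```

A workflow triggered by changes to the assignment file would then only need to rewrite the cron line in packs whose computed expression differs from what they currently have.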
For the second bullet under goal 4, exchange-wide pack CI updates, we could have a central template file for the standard CI workflow config. When a PR that updates that template is merged, that would trigger a CI workflow that goes through all the exchange's packs and regenerates the affected standard workflows. That regeneration would take into account the schedule, and it would merge any customizations into the standard workflow.
Also, when the files that provide the customizations are updated in one of the packs, that should trigger regenerating the affected workflow file as well.
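To make "merge any customizations into the standard workflow" concrete, here is a minimal Python sketch of a recursive merge over already-parsed workflow data. It is not how gflows, jsonnet, ytt, or modulesync work; it only illustrates the idea. The merge policy (dicts merge key by key, lists are concatenated, scalars are overridden) is an assumption, and the vault service image name is a placeholder.

```python
import copy


def merge_workflow(standard, custom):
    """Return a new dict with the pack's customizations layered over the standard workflow."""
    merged = copy.deepcopy(standard)
    for key, value in custom.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Nested mappings (jobs, services, ...) are merged key by key.
            merged[key] = merge_workflow(merged[key], value)
        elif isinstance(value, list) and isinstance(merged.get(key), list):
            # Simplification: custom list items are appended after the standard ones.
            merged[key] = merged[key] + copy.deepcopy(value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


standard = {
    "jobs": {
        "build_and_test": {
            "services": {"mongo": {"image": "mongo:3.4"}},
            "steps": [{"name": "Run pack tests"}],
        }
    }
}
# Hypothetical customization file kept in a pack repo.
custom = {"jobs": {"build_and_test": {"services": {"vault": {"image": "vault:latest"}}}}}

merged = merge_workflow(standard, custom)
print(sorted(merged["jobs"]["build_and_test"]["services"]))  # ['mongo', 'vault']
```

Appending list items is the weakest part of this policy, since custom setup steps usually need to run before the test step; a real tool would need named extension points for that.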
YAML merge tool options
Here are two options: modulesync/pdk and gflows.
I would prefer gflows, assuming it works well for our use cases.
modulesync and/or pdk
Written in ruby.
Quoting @nmaludy: https://stackstorm-community.slack.com/archives/CSBELJ78A/p1612183089198000
https://github.com/voxpupuli/modulesync
https://puppet.com/docs/pdk/1.x/pdk_reference.html
gflows
Written in go.
gflows is a tool (and a github action) that uses jsonnet or ytt to template github workflow files.
Being a golang static binary, it can be simply downloaded/run both in CI, and locally on pack authors' machines. Thus, the pack author will not have to set up any special environment (ruby, python, or otherwise) to regenerate any customized standard workflow file locally when authoring the pack.
https://github.com/jbrunton/gflows
Alternatives
One approach to adding customization within the CircleCI config is to run one or more bash scripts throughout the standard workflows so pack authors can add their customization there. This is the approach taken by: StackStorm-Exchange/ci#101
This satisfies only part of goal 3. Other changes, like adding docker images or adding workflows, would also need a yaml merge solution.
Request for Comment
Does anyone have any additional pros/cons or opinions to add about:
gflows vs modulesync
gflows to allow explicit customization of the pack CI