Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Multus should look for CNI configuration in a different directory than kubelet #218

Closed
wants to merge 1 commit into from
Closed

Conversation

squeed
Copy link
Contributor

@squeed squeed commented Jul 1, 2019

This changes Multus to look for "default" CNI configurations in a different directory than where it writes it's file for CRIO. It also updates all plugins to write to that directory (when multus is enabled).

@squeed squeed requested review from dougbtv and dcbw July 1, 2019 16:08
@openshift-ci-robot openshift-ci-robot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Jul 1, 2019
@openshift-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: squeed

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci-robot openshift-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jul 1, 2019
@squeed squeed removed the request for review from danwinship July 1, 2019 16:08
@squeed
Copy link
Contributor Author

squeed commented Jul 2, 2019

Hmm, all the nodes came up Ready, so I don't think this caused any issues.
/retest

@squeed
Copy link
Contributor Author

squeed commented Jul 2, 2019

Found the problem:
Multus: Err in loading K8s Delegates k8s args: GetK8sClient: failed to get context for the kubeconfig /etc/cni/net.d/multus.d/multus.kubeconfig, refer Multus README.md for the usage guide: stat /etc/cni/net.d/multus.d/multus.kubeconfig: no such file or directory

…n kubelet.

This changes Multus to look for "default" CNI configurations in a
different directory than where it writes it's file for CRIO. It also
updates all plugins to write to that directory (when multus is enabled).
@squeed
Copy link
Contributor Author

squeed commented Jul 3, 2019

/retest

@danwinship
Copy link
Contributor

So can you clarify what this is changing? I thought we basically already did this...

@squeed
Copy link
Contributor Author

squeed commented Jul 3, 2019

Multus implemented this option, but now we actually use it.

@danwinship
Copy link
Contributor

But I could have sworn we already did this... or was that just for the CNI binaries and now this includes the conf files as well?

@squeed
Copy link
Contributor Author

squeed commented Jul 3, 2019

But I could have sworn we already did this... or was that just for the CNI binaries and now this includes the conf files as well?

CNI binaries have always been the same directory - and that doesn't change. All I recall is talking about implementing this in multus - and now we're using that feature.

@squeed
Copy link
Contributor Author

squeed commented Jul 4, 2019

Interesting, looks like it's actually failing upgrade from 4.1 - taking a look.

@squeed
Copy link
Contributor Author

squeed commented Jul 4, 2019

/test e2e-aws-upgrade

@squeed
Copy link
Contributor Author

squeed commented Jul 4, 2019

This continuously fails upgrade. It seems that crio is somehow marking the node as NotReady, even though the CNI configuration file (from multus) is definitely on disk. Of course, restarting crio fixes it.

/hold

@openshift-ci-robot openshift-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jul 4, 2019
@squeed
Copy link
Contributor Author

squeed commented Jul 4, 2019

Yeah, confirmed - there's a weird crio situation where if /etc/kubernetes/cni/net.d/80-openshift.conf is deleted, crio reports the node as NotReady, even though 00-multus.conf exists. Restarting crio fixes this, but that's not acceptable.

Of course, we're in to this situation exactly because we didn't do the right thing and relied on usually winning a race in the first place.

@openshift-ci-robot
Copy link
Contributor

@squeed: The following test failed, say /retest to rerun them all:

Test name Commit Details Rerun command
ci/prow/e2e-aws-upgrade 0080e57 link /test e2e-aws-upgrade

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@squeed
Copy link
Contributor Author

squeed commented Jul 4, 2019

Filed cri-o/ocicni#46 - let's see how that goes.

@openshift-ci-robot
Copy link
Contributor

@squeed: PR needs rebase.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@openshift-ci-robot openshift-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jul 16, 2019
@squeed
Copy link
Contributor Author

squeed commented Jul 16, 2019

closing for now (so it doesn't get accidentally merged).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants