kubectl 1.25 describe has no --timeout flag, breaking kubeflow install #810

Open
PenumbralFromage opened this issue Jan 8, 2024 · 2 comments · May be fixed by #821
Labels
bug Something isn't working

Comments

@PenumbralFromage

Describe the bug
Installing Kubeflow v1.7.0 (using the manifest install pattern, not Terraform) breaks while installing the aws-secrets-sync deployment because of invalid kubectl syntax. kubectl describe has no --timeout flag, but utils/utils.py (line 273) passes one to it, which fails the installation task(s).

Steps To Reproduce

  1. Install all prerequisites for cognito-rds-s3
  2. Run make deploy-kubeflow INSTALLATION_OPTION=kustomize DEPLOYMENT_OPTION=cognito-rds-s3 PIPELINE_S3_CREDENTIAL_OPTION=irsa
  3. See the error (listed below)

Expected behavior
Kubeflow installs and all commands run without error.

Environment

  • Kubernetes version
    v1.25.16-eks-8cb36c9

  • Using EKS (yes/no), if so version?
    Yes - v1.25.16-eks-8cb36c9

  • Kubeflow version
    v1.7.0

  • kubectl version
    Client Version: v1.25.0
    Kustomize Version: v4.5.7
    Server Version: v1.25.16-eks-8cb36c9

  • AWS build number
    AWS_RELEASE_VERSION="v1.7.0-aws-b1.0.3"

  • AWS service targeted (S3, RDS, etc.)
    S3, Cognito, RDS, EKS

Screenshots
No screenshots; here are the logs:

...
==========Installing aws-secrets-manager==========
# Warning: 'bases' is deprecated. Please use 'resources' instead. Run 'kustomize edit fix' to update your Kustomization automatically.
# Warning: 'patchesStrategicMerge' is deprecated. Please use 'patches' instead. Run 'kustomize edit fix' to update your Kustomization automatically.
deployment.apps/aws-secrets-sync unchanged
Warning: secrets-store.csi.x-k8s.io/v1alpha1 is deprecated. Use secrets-store.csi.x-k8s.io/v1 instead.
secretproviderclass.secrets-store.csi.x-k8s.io/rds-secret unchanged
Waiting for aws-secrets-manager pods to be ready ...
running command: kubectl wait --for=condition=ready pod -l 'app in (aws-secrets-sync)' --timeout=240s -n kubeflow
error: no matching resources found
error: unknown flag: --timeout
See 'kubectl describe --help' for usage.
Waiting for aws-secrets-manager pods to be ready ...
running command: kubectl wait --for=condition=ready pod -l 'app in (aws-secrets-sync)' --timeout=240s -n kubeflow
error: no matching resources found
error: unknown flag: --timeout
See 'kubectl describe --help' for usage.
Waiting for aws-secrets-manager pods to be ready ...
running command: kubectl wait --for=condition=ready pod -l 'app in (aws-secrets-sync)' --timeout=240s -n kubeflow
error: no matching resources found
error: unknown flag: --timeout
See 'kubectl describe --help' for usage.
Traceback (most recent call last):
  File "utils/kubeflow_installation.py", line 324, in <module>
    install_kubeflow(
  File "utils/kubeflow_installation.py", line 101, in install_kubeflow
    install_component(
  File "utils/kubeflow_installation.py", line 180, in install_component
    validate_component_installation(installation_config, component_name)
  File "/usr/local/lib/python3.8/dist-packages/retrying.py", line 56, in wrapped_f
    return Retrying(*dargs, **dkw).call(f, *args, **kw)
  File "/usr/local/lib/python3.8/dist-packages/retrying.py", line 266, in call
    raise attempt.get()
  File "/usr/local/lib/python3.8/dist-packages/retrying.py", line 301, in get
    six.reraise(self.value[0], self.value[1], self.value[2])
  File "/usr/local/lib/python3.8/dist-packages/six.py", line 719, in reraise
    raise value
  File "/usr/local/lib/python3.8/dist-packages/retrying.py", line 251, in call
    attempt = Attempt(fn(*args, **kwargs), attempt_number, False)
  File "utils/kubeflow_installation.py", line 192, in validate_component_installation
    kubectl_wait_pods(value, namespace, key)
  File "/kube/tests/e2e/utils/utils.py", line 275, in kubectl_wait_pods
    raise Exception("Timeout/error waiting for pod condition")
Exception: Timeout/error waiting for pod condition
...
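
To illustrate the mismatch in the log above, here is a minimal sketch of how the two kubectl invocations could be built so that only kubectl wait receives --timeout. The actual contents of utils/utils.py are not shown in this issue, so the function names build_wait_command and build_describe_command are hypothetical and are not the repository's API. The only assumption taken from the log is that kubectl describe rejects --timeout.

```python
import shlex
from typing import List


def build_wait_command(label_selector: str, namespace: str, timeout_s: int = 240) -> List[str]:
    """Build the 'kubectl wait' call; 'wait' accepts --timeout (hypothetical helper)."""
    return [
        "kubectl", "wait", "--for=condition=ready", "pod",
        "-l", label_selector,
        f"--timeout={timeout_s}s",
        "-n", namespace,
    ]


def build_describe_command(label_selector: str, namespace: str) -> List[str]:
    """Build the 'kubectl describe' call without --timeout, which the log shows
    'describe' rejects as an unknown flag (hypothetical helper)."""
    return ["kubectl", "describe", "pod", "-l", label_selector, "-n", namespace]


if __name__ == "__main__":
    selector = "app in (aws-secrets-sync)"
    # Print shell-quoted forms for inspection; nothing is executed here.
    print(shlex.join(build_wait_command(selector, "kubeflow")))
    print(shlex.join(build_describe_command(selector, "kubeflow")))
```

Keeping the two argument lists separate means the timeout can never leak into the describe call. Either list can be handed to subprocess.run as-is.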

Additional context
AWS, EKS, Kubeflow 1.7.0

PenumbralFromage added the bug label Jan 8, 2024

@pythonking6

I can confirm this is the case!

@panasenco

panasenco commented Jun 12, 2024

I have also run into this issue. I've just put in PR #821 to resolve this.
