Pulumi operator process gets killed after running pip install #440
Comments
@JonCholas Thanks for reporting this issue. Would you also be able to provide the configuration and specs of your GKE cluster and nodes? Could you also provide a sample of your Python stack, or describe which resources you are using? It appears that the
I did try giving the pod more resources and it worked. The odd thing is that the graphs of consumed resources weren't showing any spike, so I'm guessing the memory consumption ramps up too fast to show anything. Is there a way to improve the operator to show more details on why the process was killed?
@rquitales Could you please adjust the labels and respond to Jon again? Thank you.
@JonCholas Thanks for the update, and apologies for the delayed response on this. There currently isn't a way to show more details on why this process is killed. The kill signal is coming from pip itself (see https://stackoverflow.com/questions/43245196/pip-install-killed). We shell out to pip to install the required Python dependencies, and return any errors that may occur. Unfortunately, the error message returned from pip isn't descriptive, so we can only deduce that the issue is due to resource starvation based on context. We can try to improve logging within our operator with more info messages, but I'm not sure if we can provide more descriptive errors for pip being killed.
As next steps, I'll look into possible ways to improve the current logging we have to better surface this issue instead of having a simple "signal killed" message.
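For anyone trying to confirm the resource-starvation theory from inside the container, one possible check is the kernel's OOM-kill counter for the container's cgroup. The sketch below is only an illustration and is not part of the operator: it assumes cgroup v2 is mounted at /sys/fs/cgroup, and the function names are hypothetical. A counter above zero would be consistent with pip having been killed by the OOM killer, though it does not prove which process was the victim.

```python
from pathlib import Path
from typing import Optional

# Assumption: cgroup v2 is mounted at the usual location inside the container.
MEMORY_EVENTS_PATH = Path("/sys/fs/cgroup/memory.events")


def parse_oom_kill_count(text: str) -> int:
    """Return the oom_kill counter from memory.events content (0 if absent)."""
    for line in text.splitlines():
        parts = line.split()
        # Each line has the form "<key> <value>", e.g. "oom_kill 1".
        if len(parts) == 2 and parts[0] == "oom_kill":
            return int(parts[1])
    return 0


def read_oom_kill_count(path: Path = MEMORY_EVENTS_PATH) -> Optional[int]:
    """Read the counter from disk, or return None if the file is unavailable."""
    try:
        return parse_oom_kill_count(path.read_text())
    except OSError:
        # Not cgroup v2, a different mount point, or no permission to read.
        return None


if __name__ == "__main__":
    count = read_oom_kill_count()
    if count is None:
        print("memory.events not readable; cannot check for OOM kills")
    else:
        print(f"oom_kill events recorded for this cgroup: {count}")
```

Reading the counter before and after a failing run would show whether it increased during the pip install.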
@rquitales sounds good. I think even a FAQ or troubleshooting doc about common problems with the operator would fix it if there are no more details available from pip.
How about logging stdout + stderr of the pip install command to the log? I currently have the same issue. I have been able to exec into the operator container. Running pip install from the command line works and exits with exit code 0. In my case, the operator's container hasn't been restarted. There are also no memory limits set on the container.
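The suggestion above, capturing stdout and stderr of the child process and reporting more than a bare "signal: killed", could look roughly like the following. This is a minimal sketch in Python rather than the operator's actual code; `describe_exit` and `run_and_log` are hypothetical names. It relies on the documented behavior of `subprocess` on POSIX, where a negative return code `-N` means the child was terminated by signal `N`.

```python
import logging
import signal
import subprocess
from typing import List

logger = logging.getLogger("pip-install-sketch")


def describe_exit(returncode: int) -> str:
    """Turn a subprocess return code into a human-readable explanation."""
    if returncode == 0:
        return "exited successfully"
    if returncode < 0:
        # On POSIX, a negative value -N means the child died from signal N.
        try:
            name = signal.Signals(-returncode).name
        except ValueError:
            name = f"signal {-returncode}"
        hint = ""
        if name == "SIGKILL":
            hint = " (possibly the kernel OOM killer; check memory limits and node pressure)"
        return f"terminated by {name}{hint}"
    return f"exited with code {returncode}"


def run_and_log(cmd: List[str]) -> subprocess.CompletedProcess:
    """Run cmd, log its captured stdout/stderr, and log why it ended."""
    result = subprocess.run(cmd, capture_output=True, text=True)
    if result.stdout:
        logger.info("stdout:\n%s", result.stdout)
    if result.stderr:
        logger.info("stderr:\n%s", result.stderr)
    logger.info("%s: %s", " ".join(cmd), describe_exit(result.returncode))
    return result
```

For example, `run_and_log([sys.executable, "-m", "pip", "install", "-r", "requirements.txt"])` would log pip's own output alongside the exit explanation, which may help distinguish a dependency error from a killed process.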
Update: I was able to troubleshoot the issue by redeploying the operator with
Good news everyone, we just released a preview of Pulumi Kubernetes Operator v2. This new release has a whole-new architecture that uses pods as the execution environment. The installation process is now based on Please read the announcement blog post for more information: Would love to hear your feedback! Feel free to engage with us on the #kubernetes channel of the Pulumi Slack workspace.
What happened?
I'm pushing a stack to the Pulumi operator, but it always fails after doing a pip install with a
signal: killed
although the dependencies do get installed: when it runs again and again, pip shows that the dependencies are cached.
Pip install command
Next log message
Expected Behavior
The workdir is set up correctly and pulumi up runs without problems.
Steps to reproduce
Push a Python stack that installs its dependencies with pip.
Output of pulumi about
CLI
Version 3.64.0
Go Version go1.20.3
Go Compiler gc
Host
OS debian
Version 11.6
Arch x86_64
Pulumi locates its logs in /tmp by default
warning: Failed to read project: no Pulumi.yaml project file found (searching upwards from /). If you have not created a project yet, use pulumi new to do so: no project file found
warning: A new version of Pulumi is available. To upgrade from version '3.64.0' to '3.65.1', visit https://pulumi.com/docs/reference/install/ for manual instructions and release notes.
Additional context
I have seen this before but it was retrying and then succeeding. Not sure what has changed that now it always fails.
I'm running this in a GKE cluster.
Contributing
Vote on this issue by adding a 👍 reaction.
To contribute a fix for this issue, leave a comment (and link to your pull request, if you've opened one already).