Healthcheck script exits when nginx is not available resulting deadlock #364

Closed
1 task done
a-b opened this issue Dec 27, 2023 · 1 comment

Comments

@a-b
Member

a-b commented Dec 27, 2023

Issue

When the health check script fails because Nginx is unavailable, it sets off a chain of events that causes Nginx to exit again. The result is a deadlock; the script should retry instead.

Context

  1. The ccng_monit_http_healthcheck script exits when the curl exit code is non-zero.
  2. This triggers monit to restart cloud_controller_ng.
  3. The restart causes nginx to exit because nothing is listening on the socket.

Steps to Reproduce

Delay the start of cloud_controller_ng so that it is not yet listening on the socket.

Expected result

The health check script retries when Nginx is not listening instead of exiting.

Current result

The current logic of the health check script triggers monit to restart the cloud_controller_ng web server, causing a deadlock.

Possible Fix

  1. Make the health check script stay in the retry loop when the curl exit code is 7 (failed to connect to host), so that it tolerates temporary Nginx unavailability.
  2. Ensure that the health check process starts only after both the web server and Nginx are online and ready, by investigating and correcting the monit dependencies.
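
The first fix could be sketched roughly as follows. This is not the actual ccng_monit_http_healthcheck script; the function name check_with_retry, its parameters, and the bounded attempt count are assumptions made for illustration. The only facts relied on are that curl exits with code 7 when it cannot connect, and that --fail makes it exit non-zero on HTTP error responses.

```shell
# Hypothetical sketch, not the real healthcheck script.
# check_with_retry URL MAX_ATTEMPTS DELAY_SECONDS
#   Returns 0 as soon as a probe succeeds.
#   Retries only when curl exits with 7 (failed to connect), which is
#   what happens while Nginx is still starting or restarting.
#   Any other curl failure is returned immediately to the caller.
check_with_retry() {
  local url="$1" max_attempts="$2" delay="$3"
  local attempt=1 rc=0

  while [ "$attempt" -le "$max_attempts" ]; do
    rc=0
    curl --silent --fail --max-time 5 --output /dev/null "$url" || rc=$?

    if [ "$rc" -eq 0 ]; then
      return 0            # endpoint answered, healthy
    fi
    if [ "$rc" -ne 7 ]; then
      return "$rc"        # a real failure, let the caller (monit) react
    fi

    # Exit code 7: nothing is listening yet, so wait and try again.
    sleep "$delay"
    attempt=$((attempt + 1))
  done

  return 7                # still unreachable after all attempts
}
```

A caller might invoke it as check_with_retry "$url" 10 1 and exit with the returned status. Bounding the number of attempts is a design choice of this sketch, so that a permanently unreachable Nginx is still reported eventually; the issue itself only asks that the script keep retrying on exit code 7.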

@a-b a-b changed the title Healthcheck script exits when nginx is not available resulting deadlock. Healthcheck script exits when nginx is not available resulting deadlock Dec 27, 2023
a-b added a commit to a-b/capi-release that referenced this issue Dec 27, 2023
a-b added a commit to a-b/capi-release that referenced this issue Dec 29, 2023
Addresses cloudfoundry#364

If Nginx restarts, the healthcheck curl exits with code 7, triggering
another Nginx restart. To avoid this, we need to stay in the loop and
retry to allow Nginx to complete the restart.
@philippthun
Member

PR got merged, so I'm going to close this issue.
