Commit

Fix monit connection failure when doing routeCheck/containerCheck monitor enable after config reload (#3698)

Fix:
sonic-net/sonic-buildimage#21268

How I did it:
Reordered the sequence so that monitoring of container_checker and routeCheck is enabled before the monit reload, which avoids the transient monit socket error.

Also added a 1-second sleep so that the monitor-enable configuration takes effect before the reload.
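
The reordering described above can be sketched in isolation. This is a minimal illustration, not the code from `config/main.py`: the function name `enable_monitors_then_reload` and the injectable `run`, `probe`, and `sleep` parameters are hypothetical, introduced only so the ordering can be exercised without a running monit daemon. The monit commands themselves are the ones used in the commit.

```python
import subprocess
import time
from typing import Callable, Sequence


def enable_monitors_then_reload(
    run: Callable[[Sequence[str]], None],
    probe: Callable[[Sequence[str]], None],
    settle_seconds: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> None:
    """Enable the monit checks first, pause briefly, then reload monit.

    `run` and `probe` are hypothetical stand-ins for the command runners
    (clicommon.run_command and subprocess.check_call in the commit).
    """
    try:
        # Probe the daemon; the probe is expected to raise
        # CalledProcessError when monit is not reachable.
        probe(['sudo', 'monit', 'status'])
        run(['sudo', 'monit', 'monitor', 'routeCheck'])
        run(['sudo', 'monit', 'monitor', 'container_checker'])
        # Give monit a moment to record the monitor-enable state.
        sleep(settle_seconds)
    except subprocess.CalledProcessError:
        # Monit is not running: skip enabling and go straight to reload.
        pass
    # The reload now happens last, after monitoring has been enabled.
    run(['sudo', 'monit', 'reload'])


if __name__ == "__main__":
    issued = []
    enable_monitors_then_reload(
        run=issued.append, probe=issued.append, sleep=lambda s: None
    )
    for cmd in issued:
        print(' '.join(cmd))
```

Passing the runners in as parameters is only a convenience for the sketch; it lets a test record the command order instead of invoking `sudo monit`.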
abdosi authored Dec 26, 2024
1 parent 34d2c8c commit 428d6da
Showing 2 changed files with 6 additions and 7 deletions.
11 changes: 5 additions & 6 deletions config/main.py
@@ -949,18 +949,17 @@ def _restart_services():
     # If load_minigraph exit before eth0 restart, commands after load_minigraph may failed
     wait_service_restart_finish('interfaces-config', last_interface_config_timestamp)
     wait_service_restart_finish('networking', last_networking_timestamp)

-    # Reload Monit configuration to pick up new hostname in case it changed
-    click.echo("Reloading Monit configuration ...")
-    clicommon.run_command(['sudo', 'monit', 'reload'])
-
     try:
         subprocess.check_call(['sudo', 'monit', 'status'], stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
         click.echo("Enabling container and routeCheck monitoring ...")
-        clicommon.run_command(['sudo', 'monit', 'monitor', 'container_checker'])
         clicommon.run_command(['sudo', 'monit', 'monitor', 'routeCheck'])
+        clicommon.run_command(['sudo', 'monit', 'monitor', 'container_checker'])
+        time.sleep(1)
     except subprocess.CalledProcessError as err:
         pass
+    # Reload Monit configuration to pick up new hostname in case it changed
+    click.echo("Reloading Monit configuration ...")
+    clicommon.run_command(['sudo', 'monit', 'reload'])

 def _per_namespace_swss_ready(service_name):
     out, _ = clicommon.run_command(['systemctl', 'show', str(service_name), '--property', 'ActiveState', '--value'], return_cmd=True)
2 changes: 1 addition & 1 deletion tests/config_test.py
@@ -56,8 +56,8 @@
 Running command: config qos reload --no-dynamic-buffer --no-delay
 Running command: pfcwd start_default
 Restarting SONiC target ...
-Reloading Monit configuration ...
 Enabling container and routeCheck monitoring ...
+Reloading Monit configuration ...
 Please note setting loaded from minigraph will be lost after system reboot. To preserve setting, run `config save`.
 Released lock on {0}
 """