Skip Interlocked.Exchange and FlushConfig when no update detected #905

Merged on Jan 9, 2025 (5 commits)

@vazois vazois commented Jan 9, 2025

This PR fixes message flooding across the cluster. The flooding happened because the config object was updated on every merge, even when the received config changed nothing in the local config.

The check for an updated config happens implicitly, by comparing the new and current config objects by reference:

    var currentCopy = current.Copy();
    var next = currentCopy.Merge(senderConfig, workerBanList, logger).HandleConfigEpochCollision(senderConfig, logger);
    if (currentCopy == next) return false;
    if (Interlocked.CompareExchange(ref currentConfig, next, current) == current)
        break;
}
FlushConfig();
return true;

This means that currentCopy.Merge must return the same object when no update has happened. It already does so when merging the worker info:

if (worker.ConfigEpoch <= workers[i].ConfigEpoch) return this;

However, the equivalent check was missing where the object is updated while merging the slotMap information.
This PR adds that check, which fixes the message flooding when the cluster configuration is stable.
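
To illustrate the pattern described above, here is a minimal, self-contained sketch. It is not the code from this PR: SketchConfig, SketchManager, MergeSlots, and FlushCount are hypothetical names, and the per-slot epoch comparison is an assumed stand-in for the real slotMap merge rules. It only shows why the merge has to return the same object when nothing changes, so that the reference comparison can skip both the exchange and the flush.

```csharp
using System;
using System.Threading;

// Hypothetical, simplified stand-in for the cluster config (not Garnet's actual type).
sealed class SketchConfig
{
    // slotOwner[i] is the worker id that owns slot i; slotEpoch[i] is the epoch of that claim.
    readonly int[] slotOwner;
    readonly long[] slotEpoch;

    public SketchConfig(int[] slotOwner, long[] slotEpoch)
    {
        this.slotOwner = slotOwner;
        this.slotEpoch = slotEpoch;
    }

    public SketchConfig Copy() =>
        new SketchConfig((int[])slotOwner.Clone(), (long[])slotEpoch.Clone());

    // Returns `this` when the sender's view changes nothing, so that the
    // caller's reference comparison can detect "no update".
    public SketchConfig MergeSlots(SketchConfig sender)
    {
        int[] newOwner = null;
        long[] newEpoch = null;
        for (var i = 0; i < slotOwner.Length; i++)
        {
            // Assumed rule: the sender wins a slot only with a strictly newer epoch.
            if (sender.slotEpoch[i] <= slotEpoch[i]) continue;

            // Allocate the new arrays lazily, only once a real change is found.
            newOwner ??= (int[])slotOwner.Clone();
            newEpoch ??= (long[])slotEpoch.Clone();
            newOwner[i] = sender.slotOwner[i];
            newEpoch[i] = sender.slotEpoch[i];
        }
        return newOwner == null ? this : new SketchConfig(newOwner, newEpoch);
    }
}

// Hypothetical manager that publishes a new config with a compare-and-swap loop.
sealed class SketchManager
{
    SketchConfig currentConfig;

    // Stand-in for FlushConfig(): counts how often a flush would have happened.
    public int FlushCount;

    public SketchManager(SketchConfig initial) => currentConfig = initial;

    public bool TryMerge(SketchConfig senderConfig)
    {
        while (true)
        {
            var current = currentConfig;
            var currentCopy = current.Copy();
            var next = currentCopy.MergeSlots(senderConfig);

            // Same reference: the merge changed nothing, so skip the exchange and the flush.
            if (currentCopy == next) return false;

            // Publish only if no other thread replaced the config in the meantime.
            if (Interlocked.CompareExchange(ref currentConfig, next, current) == current)
                break;
        }
        FlushCount++;
        return true;
    }
}
```

In this sketch, merging a sender config that carries no newer slot epochs leaves FlushCount unchanged and returns false. If MergeSlots always allocated a new object, every received message would look like an update and trigger a flush.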

Fixes #900

@vazois vazois force-pushed the vazois/config-update branch 3 times, most recently from 894c344 to 49c0edd Compare January 9, 2025 02:37
@badrishc badrishc left a comment

LGTM, please update version for a new release as well.

@vazois vazois marked this pull request as draft January 9, 2025 19:15
@vazois vazois force-pushed the vazois/config-update branch from e45088d to 952bc14 Compare January 9, 2025 19:49
@vazois vazois marked this pull request as ready for review January 9, 2025 20:17
@vazois vazois merged commit e9d7906 into microsoft:main Jan 9, 2025
15 checks passed
Successfully merging this pull request may close these issues.

Unresponsive Garnet server upon multiple cluster meets
2 participants