feat: Add CancelHealthCheckOnNewRevision feature to avoid getting stuck on failing commits #1518
c1db411 to
d2bf2fc
Compare
stefanprodan
left a comment
Thanks for this PR, @trashhalo. Can you please sign off your commit and address my comments?
@stefanprodan thanks for taking a pass! I'll get this all cleaned up based on your feedback over the next day or so.
@trashhalo we are releasing kustomize-controller today, so if we want this to be part of Flux 2.7, this PR needs to be merged today. Otherwise this feature will have to wait until next year, when we'll release Flux 2.8.
…ck on failing commits

This feature allows health checks to be cancelled when a new source revision becomes available, preventing the controller from getting stuck waiting for full timeout durations when fixes are already available.

Features:
- New opt-in feature flag: CancelHealthCheckOnNewRevision (default: false)
- Health checks are cancelled early when new revisions are detected (~5s vs 5min timeout)
- Uses the new WaitForSetWithContext method for clean context-based cancellation
- Preserves existing behavior when feature is disabled

The implementation monitors source revisions during health checks and cancels ongoing checks when new revisions are available, allowing immediate processing of potential fixes instead of waiting for full timeout periods.

Signed-off-by: Stephen Solka <[email protected]>
d2bf2fc to
ecfdfea
Compare
OK, we have had enough of Claude Code. Reviewing this so that you can feed the review into an AI and push the result here takes too much of our time. @matheuscscp is going to rewrite this PR, so you can stop pushing changes.
Superseded by #1520
I'm sorry.
Summary
This PR introduces the CancelHealthCheckOnNewRevision feature flag to prevent kustomize-controller from getting stuck waiting for health check timeouts when new source revisions containing potential fixes are available.
Problem
Currently, when a Kustomization fails health checks (e.g., due to a bad deployment), the controller waits for the full timeout duration (typically 30 seconds) before processing any new revisions. This means that even if a fix is pushed immediately after the failing commit, users must wait for the full timeout before the fix is applied.
Solution
CancelHealthCheckOnNewRevision (default: false)
Behavior Change
Before (existing behavior):
After (with feature enabled):
Implementation Details
Testing
Comprehensive test coverage includes:
TestKustomizationReconciler_CancelHealthCheckOnNewRevision:
TestKustomizationReconciler_NoHealthCheckCancellation_WhenFeatureDisabled:
Usage
Enable the feature by starting kustomize-controller with:
Benefits
🤖 Generated with Claude Code