Open
Description
This is a follow-up to #4167, which laid the groundwork for more transparent and consistent reporting of errors in resource statuses, by:
- distinguishing between retryable and non-retryable errors:
- retryable errors are typically those related to interacting with the Kubernetes API server. Failures in those contexts may be related to transient network failures, timeouts, resources not being in the expected state yet... Therefore retries with exponential back-off can be suitable, without errors needing to be surfaced to users who may not be able to do anything to mitigate them. When such an error occurs, Fleet should:
- log it, to keep a trace of it somewhere, but without leading to repeated status updates
- return a controller `Result` with a non-zero `RequeueAfter` and a `nil` error, allowing a reconcile of the resource to be requeued
- non-retryable errors may come from e.g. configuration issues or invalid input data, and are not expected to be resolved unless the user takes action. Propagating such errors to a resource status is therefore particularly important, especially in cases where users do not have access to Fleet logs. A non-retryable error should lead to:
- the reconciler returning:
  - an empty controller `Result`
  - a `TerminalError`, instructing the reconciler not to requeue the resource, as per controller-runtime docs.
- the error being propagated to the resource's status, making it visible to users
This logic has already been implemented in #4167, where it is being used for bundles.
This should be implemented for GitRepos as well.
Acceptance criteria:
- the gitOps reconciler treats retryable and non-retryable errors differently, as outlined above
- it does so by reusing, and possibly extending, common logic implemented in #4167 (Improve bundle and GitRepo status error reporting).