Hi HyperShift maintainers,
While looking at the dedicated request-serving cleanup path, I noticed that deletePairConfigMaps lists matching ConfigMaps and then deletes each one, but it appears to return any Delete error directly.
The helper lists pair ConfigMaps in the placeholder namespace:
dedicated_request_serving_nodes.go:422-L428
It then deletes the ConfigMaps whose data points back to the HostedCluster:
dedicated_request_serving_nodes.go:429-L433
This cleanup runs during HostedCluster deletion before the scheduler finalizer is removed:
dedicated_request_serving_nodes.go:529-L545
The race I am wondering about is:
1. reconcile A lists a pair ConfigMap for the HostedCluster
2. reconcile B or another cleanup path deletes that ConfigMap
3. reconcile A calls Delete on the stale listed object
4. Delete returns NotFound
5. handleDeletion returns the error before removing the scheduler finalizer
For this kind of list-then-delete cleanup, NotFound after the list is usually an idempotent success: the desired cleanup state has already been reached. Returning the error can make the deletion path fail for a transient stale-list race and delay finalizer removal until a later reconcile.
Would it make sense to ignore apierrors.IsNotFound(err) around this r.Delete call, similar to other Kubernetes cleanup paths that treat an already-deleted object as success?
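To illustrate the behavior I have in mind, here is a minimal, self-contained sketch. It is not the HyperShift code. fakeStore, errNotFound, ignoreNotFound, and deleteListed are hypothetical names that stand in for the API server, a Kubernetes NotFound error, and the cleanup loop. In the real helper the check would presumably be apierrors.IsNotFound(err), or client.IgnoreNotFound from controller-runtime, wrapped around the existing r.Delete call.

```go
package main

import (
	"errors"
	"fmt"
)

// errNotFound is a hypothetical sentinel standing in for a Kubernetes
// NotFound API error; real code would check apierrors.IsNotFound(err).
var errNotFound = errors.New("not found")

// fakeStore is a hypothetical in-memory stand-in for the API server.
type fakeStore struct {
	objects map[string]bool
}

// Delete removes the named object, or returns a wrapped errNotFound
// if it is already gone.
func (s *fakeStore) Delete(name string) error {
	if !s.objects[name] {
		return fmt.Errorf("configmap %q: %w", name, errNotFound)
	}
	delete(s.objects, name)
	return nil
}

// ignoreNotFound returns nil for NotFound errors and passes every
// other error through unchanged.
func ignoreNotFound(err error) error {
	if errors.Is(err, errNotFound) {
		return nil
	}
	return err
}

// deleteListed deletes every name from a previously listed snapshot,
// treating "already deleted" as success.
func deleteListed(s *fakeStore, listed []string) error {
	for _, name := range listed {
		if err := ignoreNotFound(s.Delete(name)); err != nil {
			return fmt.Errorf("failed to delete pair ConfigMap %q: %w", name, err)
		}
	}
	return nil
}

func main() {
	s := &fakeStore{objects: map[string]bool{"pair-a": true, "pair-b": true}}
	listed := []string{"pair-a", "pair-b"} // reconcile A's list snapshot

	_ = s.Delete("pair-a") // reconcile B deletes one object first

	fmt.Println(deleteListed(s, listed)) // prints "<nil>"
	fmt.Println(len(s.objects))          // prints "0"
}
```

With this shape, only errors other than NotFound would stop the cleanup, so a stale list by itself would not block finalizer removal.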