We used to handle this with a timeout, but we also had a test to make sure the timeout never actually had to fire. The timeout could in fact need to fire, so the test was flaky. I tried to get Anthropic Claude to solve this, and it noticed the race, but it was unable to come up with a design I liked for fixing it, so I did it myself. I don't *really* like my design either, but at least it's mine now.

This moves responsibility for marking a WES workflow as CANCELED from the Celery task alone to the Celery task if it can do it, and the ToilWorkflow get_state() method otherwise. When somebody asks for the state of a workflow, we ask Celery whether the task has actually stopped. If it has stopped without error and the workflow is supposedly CANCELING, we declare it CANCELED.

I'm removing the timeout-based way to go from CANCELING to CANCELED, because if the task *is* still there and doing stuff, the workflow can't really be canceled yet. It would still be nicer to have the responsibility in one place, but at least this way I'm reducing and not increasing the number of weird methods.

To test this I added a sleep that can make the cancel attempt win the race. That involved adding an ugly argument to the fake-Celery code, because we can't monkey-patch at class scope and expect a multiprocessing process to see it.
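For reviewers, here is a rough sketch of the state check described above. It is not the code in this PR: `TaskStatus` and `resolve_state` are hypothetical names made up for illustration, and the real `get_state()` asks Celery for the task status rather than taking it as an argument. What happens when the task stopped *with* an error is not covered in the description, so the sketch just leaves the recorded state alone in that case.

```python
from enum import Enum


class TaskStatus(Enum):
    """Hypothetical stand-in for what Celery reports about the task."""
    RUNNING = "running"
    FINISHED_OK = "finished_ok"
    FINISHED_ERROR = "finished_error"


def resolve_state(recorded_state: str, task_status: TaskStatus) -> str:
    """
    Decide which WES state to report for a workflow.

    recorded_state is the state last written down for the workflow;
    task_status is what the task runner says about the task right now.
    """
    if recorded_state == "CANCELING" and task_status is TaskStatus.FINISHED_OK:
        # The task is gone and did not fail, so the cancel has taken effect.
        return "CANCELED"
    # Otherwise report what was recorded. In particular, a task that is
    # still running stays CANCELING; no timeout promotes it to CANCELED.
    # (Assumption: an errored task is handled elsewhere.)
    return recorded_state
```

With this shape, a workflow whose task is still running keeps reporting CANCELING no matter how long it takes, which is the point of dropping the timeout.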
This is on top of #5480 so that CI had a faint hope of passing. Only the last commit really goes here.
This will fix #5448.
Changelog Entry
To be copied to the draft changelog by merger:
CANCELING for several seconds when canceled before they fully start.
Reviewer Checklist
issues/XXXX-fix-the-thing in the Toil repo, or from an external repo.
camelCase that want to be in snake_case.
docs/running/{cliOptions,cwl,wdl}.rst
Merger Checklist