ci: harden aws-actions/configure-aws-credentials (v6.1.0 + retry) #5981

Merged
Fedr merged 1 commit into master from ci/aws-credentials-v6-with-retry
Apr 25, 2026

@Fedr Fedr commented Apr 24, 2026

Summary

Unify every aws-actions/configure-aws-credentials invocation in this repo on the SHA-pinned v6.1.0 release and add retry-max-attempts: 5 to each. Motivated by a silent failure of the step on a windows-build-test leg in run 24911558626: the older v4 SHA swallowed the error without printing anything, leaving no way to tell whether the GitHub OIDC fetch or the AWS STS call was responsible.

Changes

Eight invocations across six files, all touched uniformly:

| File | Line | Before | After |
|---|---|---|---|
| .github/workflows/build-test-windows.yml | 101 | @4ce2bbcf # v4 | @ec61189d # v6.1.0 |
| .github/workflows/pip-build.yml | 281 | @4ce2bbcf # v4 | @ec61189d # v6.1.0 |
| .github/workflows/prepare-images.yml | 175 | @4ce2bbcf # v4 | @ec61189d # v6.1.0 |
| .github/actions/python-regression-tests/action.yml | 63 | @e3dd6a42 # v4 | @ec61189d # v6.1.0 |
| .github/workflows/build-test-distribute.yml | 326 | @v6 (unpinned) | @ec61189d # v6.1.0 |
| .github/workflows/unity-nuget-test.yml | 34, 55, 133 | @v6 (unpinned) | @ec61189d # v6.1.0 |

retry-max-attempts: 5 is added to the with: block of each invocation. Default is 3.
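
As a rough sketch, each updated step would look something like the following. The role ARN and region are hypothetical placeholders, not values from this repo, and the SHA is left abbreviated as it appears in the table above; a real ref needs the full 40-character commit SHA.

```yaml
# Illustrative only: role ARN and region are made-up placeholders.
- name: Configure AWS Credentials
  # Replace the placeholder with the full 40-character SHA for v6.1.0 (begins ec61189d).
  uses: aws-actions/configure-aws-credentials@<full-sha-beginning-ec61189d> # v6.1.0
  with:
    role-to-assume: arn:aws:iam::123456789012:role/example-ci-role # hypothetical
    aws-region: us-east-1 # hypothetical
    retry-max-attempts: 5 # added by this PR
```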

Why v6.1.0

  • Released 2026-04-06, stable.
  • Two major-version jumps ahead of the v4.0.2-era SHA we had pinned in build-test-windows et al. The v5 and v6 branches both improved diagnostics for the OIDC-token-fetch / STS-AssumeRole paths. v4's quiet-fail on those paths is the exact behaviour we want to stop hiding.
  • Matches the major version that build-test-distribute.yml and unity-nuget-test.yml were already using (just unpinned), so those workflows cross no major-version boundary; everything converges on one SHA and stops drifting.

Why retry-max-attempts: 5

  • Applies to the STS AssumeRoleWithWebIdentity call only. Retries on transient 5xx / throttle responses.
  • Does not mask legitimate IAM denies (those return 4xx on the first call and are not retried).
  • Does not help with GitHub's OIDC-token-fetch step failing upstream of STS — that's a separate failure mode; the v6.1.0 bump is what surfaces those.
  • Cost: nothing on the happy path; at worst a few extra seconds on a degraded-AWS path that would otherwise have been a hard fail.

What this PR deliberately does NOT do

  • No permissions: block changes — id-token: write was already set everywhere it's needed.
  • No role-to-assume / aws-region / output-credentials changes.
  • No workflow-step env: ACTIONS_STEP_DEBUG: true — kept for a follow-up if we still see silent failures after this lands. (Would double log volume, worth paying only if v6.1.0 alone isn't enough.)
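
For context, the existing setting this PR relies on looks roughly like the sketch below at workflow or job level. Only id-token: write comes from the description above; contents: read is an assumed companion permission, shown purely for illustration.

```yaml
# Sketch of the pre-existing permissions that this PR leaves untouched.
permissions:
  id-token: write # allows the job to request a GitHub OIDC token
  contents: read  # assumption: typical companion permission for actions/checkout
```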

Test plan

  • All 8 AWS-credentials invocations updated uniformly (grep -rho 'configure-aws-credentials@ec61189d' .github | wc -l returns 8).
  • Full-ci run on this PR shows every AWS-credentials step completing successfully on the first try.
  • No regressions on the dependent steps (vcpkg S3 cache restore, aws s3 sync, aws ec2 start/stop-instances, etc.).
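
If a quick sanity check of the assumed role is ever wanted, one option (not part of this PR) is a step placed right after the credentials step that asks STS who the caller is. The step name is hypothetical, and the sketch assumes the AWS CLI is available on the runner.

```yaml
- name: Verify assumed identity # hypothetical step, not in this PR
  shell: bash
  run: aws sts get-caller-identity # prints the account and role ARN the job is now using
```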

🤖 Generated with Claude Code

…attempts: 5

Addresses a silent failure of the Configure AWS Credentials step on a
windows-build-test leg in run 24911558626 job 72954213459: the step
produced no log output between its setup group and end-group, was
marked failed, and skipped every subsequent build/test step in that
job. Classic pattern for an OIDC-to-STS flake that the older action
version was swallowing without any surfaced error.

Two changes here, both small:

1. Pin every invocation of aws-actions/configure-aws-credentials to
   v6.1.0 (commit ec61189d). Previously we had a mix:
     - three v4 SHAs (4ce2bbcf, in build-test-windows / pip-build /
       prepare-images)
     - one different v4 SHA (e3dd6a42, in the python-regression-tests
       composite action)
     - four floating @v6 tag refs (build-test-distribute +
       unity-nuget-test ×3)
   v6.1.0 carries improved error reporting for OIDC-token-fetch and
   STS-call failures -- the exact modes that failed silently on v4. The
   unpinned @v6 refs are now SHA-pinned for supply-chain hygiene.

2. Add `retry-max-attempts: 5` to every invocation (default is 3).
   Retries the STS AssumeRoleWithWebIdentity call on transient 5xx /
   throttle responses from AWS. No effect on legitimate IAM denies,
   which fail on the first call with 4xx and skip the retry loop.

Scope: 6 files, 8 step instances, no behavioural change on the happy
path. No workflow-permissions / role-to-assume / region changes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@Fedr Fedr merged commit d3bf5a8 into master Apr 25, 2026
35 checks passed
@Fedr Fedr deleted the ci/aws-credentials-v6-with-retry branch April 25, 2026 07:03