feat(mit-learn-nextjs): remove blue/green EFS deployment, scope Fastly purge #4652
Open
blarghmatey wants to merge 4 commits into
…p_yarn for mit-learn-nextjs
- Change all three purge_all calls to purge/html-pages surrogate key so immutable /_next/static/ chunks are never invalidated at deploy time
- Remove build_target="build_skip_yarn" from mit-learn-nextjs AppPipelineParams now that next build is baked into the Docker image via standalone output

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…g update

Remove the blue/green deployment mechanism and EFS PVC-based build approach in favour of a standard Kubernetes rolling update:
- Remove: PVC creation (blue/green EFS volumes), Kubernetes build Job, blue and green Deployments, deployment-state ConfigMap, get_last_active_from_configmap(), determine_colors(), create_deployment_for_color(), auto_toggle logic, and all color-dependent exports
- Add: single Deployment with RollingUpdate strategy (maxUnavailable=0, maxSurge=1), static Service selector, single PodDisruptionBudget
- Drop kubernetes-client import (no longer reads live cluster state during pulumi up)

The next build is now baked into the Docker image (standalone output), so there is no need for an EFS volume or a build Job at deploy time.

BREAKING: must be deployed together with the corresponding mit-open Dockerfile change that adds the standalone runner stage.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
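The rolling-update settings named in this commit can be sketched as plain data. This is a minimal sketch, not the PR's Pulumi code: the dictionary mirrors the field names of the Kubernetes Deployment `spec.strategy`, while the helper names and the replica count of 2 are hypothetical and chosen only for illustration.

```python
from typing import Any


def rolling_update_strategy(max_unavailable: int = 0, max_surge: int = 1) -> dict[str, Any]:
    """Return a Deployment strategy stanza using Kubernetes field names."""
    return {
        "type": "RollingUpdate",
        "rollingUpdate": {
            "maxUnavailable": max_unavailable,
            "maxSurge": max_surge,
        },
    }


def pod_count_bounds(replicas: int, strategy: dict[str, Any]) -> tuple[int, int]:
    """Return (minimum available pods, maximum total pods) during a rollout.

    Assumes absolute integer values; Kubernetes also accepts percentages,
    which this sketch does not handle.
    """
    rolling = strategy["rollingUpdate"]
    return replicas - rolling["maxUnavailable"], replicas + rolling["maxSurge"]


strategy = rolling_update_strategy()
# Hypothetical replica count, used only to make the bounds concrete.
low, high = pod_count_bounds(replicas=2, strategy=strategy)
print(f"available pods never drop below {low}; total pods never exceed {high}")
```

With `maxUnavailable=0` the number of available pods is not allowed to fall below the desired replica count during the rollout, and `maxSurge=1` limits the rollout to one extra pod at a time.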
Pull request overview
This PR simplifies the MIT Learn Next.js Kubernetes deployment by removing the blue/green EFS-based build/cache mechanism in favor of a standard rolling Deployment, and updates the Concourse pipeline to scope Fastly cache invalidation to HTML pages only.
Changes:
- Replace blue/green EFS build + dual-Deployment toggle logic with a single rolling `Deployment`, static `Service` selector, and one `PodDisruptionBudget` (mit_learn_nextjs/__main__.py).
- Remove the pipeline’s special `build_skip_yarn` build target for `mit-learn-nextjs` so it uses the default Dockerfile stage (pipeline.py).
- Change Fastly invalidation from `purge_all` to `purge/html-pages` for `mit-learn-nextjs` deploys (pipeline.py).
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| src/ol_infrastructure/applications/mit_learn_nextjs/__main__.py | Removes blue/green + EFS build workflow and defines a single rolling Deployment/Service/PDB for Next.js. |
| src/ol_concourse/pipelines/infrastructure/k8s_apps/pipeline.py | Updates mit-learn-nextjs pipeline params and scopes Fastly purges to the html-pages surrogate key. |
Comments suppressed due to low confidence (2)
src/ol_concourse/pipelines/infrastructure/k8s_apps/pipeline.py:484
- The Fastly purge command string ends with an extra empty argument (`""`). This is interpreted by the shell as an additional (empty) URL argument to `curl`, which can make the purge task fail. Remove the trailing `""` from the command.
"-exc",
# Purge only HTML pages (tagged with surrogate key "html-pages").
# /_next/static/ assets are content-addressed and immutable —
# purging them causes missing-chunk errors during rolling deployments.
f"""curl -H "Fastly-Key: ((fastly.fastly_api_token))" -H "Accept: application/json" -i -X POST "https://api.fastly.com/service/((fastly.{pipeline_parameters.fastly_service_prefix}service_id_qa))/purge/html-pages" """,
],
src/ol_concourse/pipelines/infrastructure/k8s_apps/pipeline.py:506
- The Fastly purge command includes a trailing empty-string argument (`""`). When run via `sh -exc`, `curl` receives an extra empty URL argument and may exit non-zero, breaking the deployment pipeline. Drop the trailing `""` so the command only contains the intended purge URL.
"-exc",
# Purge only HTML pages (tagged with surrogate key "html-pages").
# /_next/static/ assets are content-addressed and immutable —
# purging them causes missing-chunk errors during rolling deployments.
f"""curl -H "Fastly-Key: ((fastly.fastly_api_token))" -H "Accept: application/json" -i -X POST "https://api.fastly.com/service/((fastly.{pipeline_parameters.fastly_service_prefix}service_id_production))/purge/html-pages" """,
],
- Remove stale phase-coordination comment from mit-learn-nextjs params; the Kubernetes build Job is already removed in this PR so the comment describing future Phase 3d work no longer applies
- Remove trailing whitespace from curl command strings in all three purge-fastly-cache task steps (CI, QA, Production); the trailing space in the triple-quoted f-strings was benign but misleading

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
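The difference between a trailing space and a trailing empty-string argument can be checked with Python's `shlex.split`, which applies POSIX-style word splitting. This is a rough sketch only: `shlex` approximates how `sh` tokenizes a command and is not the pipeline's task runner, and the URL uses a placeholder service ID instead of the Concourse vars from the real command.

```python
import shlex

# Placeholder URL; the real command interpolates Concourse vars for the service ID.
URL = "https://api.fastly.com/service/SERVICE_ID/purge/html-pages"

# Three variants of the same command string.
with_trailing_space = f'curl -i -X POST "{URL}" '
without_trailing_space = f'curl -i -X POST "{URL}"'
with_empty_argument = f'curl -i -X POST "{URL}" ""'

args_space = shlex.split(with_trailing_space)
args_clean = shlex.split(without_trailing_space)
args_empty = shlex.split(with_empty_argument)

# Trailing whitespace is discarded during word splitting.
print("trailing space changes argv:", args_space != args_clean)
# A quoted empty string survives as a real (empty) argument.
print("extra arguments from empty quotes:", len(args_empty) - len(args_clean))
```

Under this approximation the trailing space produces the same argument list as the cleaned-up string, which matches the commit's description of it as benign, whereas a literal `""` would add one empty argument.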
feoh approved these changes May 22, 2026
…rams
Add a new fastly_purge_scope field to AppPipelineParams (default: "purge_all")
that controls which Fastly purge endpoint is called when purge_fastly_cache
is enabled.
- "purge_all" (default) maps to POST /service/{id}/purge_all, preserving
the existing full-cache purge behaviour for all current consumers
- Any other string maps to POST /service/{id}/purge/{scope}, purging only
objects tagged with that surrogate key
Update mit-learn-nextjs to explicitly set fastly_purge_scope="html-pages"
rather than hardcoding the surrogate key in comments inside the curl strings.
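The scope-to-endpoint mapping described in this commit message can be sketched as follows. This is an illustrative sketch, not the code in `pipeline.py`: the function names `fastly_purge_path` and `fastly_purge_command` are hypothetical, and only the two endpoint shapes and the `"purge_all"` default come from the commit message. The curl string follows the one quoted in the review excerpt.

```python
FASTLY_API = "https://api.fastly.com"


def fastly_purge_path(service_id: str, scope: str = "purge_all") -> str:
    """Map a purge scope to the Fastly endpoint path (hypothetical helper).

    "purge_all" selects the full-cache purge; any other value is treated
    as a surrogate key.
    """
    if scope == "purge_all":
        return f"/service/{service_id}/purge_all"
    return f"/service/{service_id}/purge/{scope}"


def fastly_purge_command(service_id: str, scope: str = "purge_all") -> str:
    """Build the curl command string for a purge task (hypothetical helper)."""
    url = f"{FASTLY_API}{fastly_purge_path(service_id, scope)}"
    return (
        'curl -H "Fastly-Key: ((fastly.fastly_api_token))" '
        '-H "Accept: application/json" '
        f'-i -X POST "{url}"'
    )


# Placeholder service ID; the pipeline resolves the real one from Concourse vars.
print(fastly_purge_command("SERVICE_ID"))
print(fastly_purge_command("SERVICE_ID", scope="html-pages"))
```

Keeping `"purge_all"` as the default means existing consumers keep their current behaviour without any change to their parameters.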
What are the relevant tickets?
N/A
Description (What does it do?)
Simplifies the MIT Learn Next.js deployment by removing the blue/green EFS build mechanism and replacing it with a standard rolling update. Accompanies app-side changes in mitodl/mit-learn#3364.
1. Scoped Fastly cache invalidation (`pipeline.py`)
Changes all three `purge_all` Concourse task calls (CI, QA, Production) to `purge/html-pages` so immutable `/_next/static/` content-addressed chunks are never invalidated at deploy time. The mit-learn app now tags all HTML routes with `Surrogate-Key: html-pages`.
2. Build baked into Docker image (`pipeline.py`)
Removes `build_target="build_skip_yarn"` from the pipeline params. `yarn build` is now run inside the Docker build (baked into the image's final `runner` stage via `output: "standalone"` in `next.config.js`), not as a Kubernetes Job at deploy time. Without a `build_target`, the pipeline builds the full Dockerfile to its last stage (`runner`).
3. Replace blue/green EFS deployment with rolling update (`__main__.py`)
Removed (~350 lines):
- PVCs (`nextjs-build-cache-efs-blue/green`) backed by EFS
- Kubernetes build Job running `yarn build` at deploy time
- `create_deployment_for_color()`, `create_pvc_for_color()`, `determine_colors()`
- `get_last_active_from_configmap()` (live Kubernetes API calls during `pulumi up`)
- `deployment_state_configmap` ConfigMap
- `auto_toggle` / `last_active` / color-toggle logic
- `from kubernetes import client, config` import
Added (~100 lines):
- Single `Deployment` with `RollingUpdate` (`maxUnavailable: 0`, `maxSurge: 1`)
- No `volumes` or `volumeMounts` (no EFS dependency)
- Static `Service` selector on `k8s_app_labels`
- Single `PodDisruptionBudget`
- Exports `domain` and `image` only
4. Env vars now consumed at runtime (not build time)
Previously, the `NEXT_PUBLIC_*` env vars set here were passed to the Kubernetes build Job at deploy time, where webpack's `DefinePlugin` inlined them as literals into the JS bundle. With the standalone build, that Job no longer exists.
The mit-learn app (PR #3364) introduces a `PublicEnvScript` Server Component that reads `process.env` at request time and renders a synchronous `<script>window.__ENV={...}</script>` in `<head>` before any JS bundle loads. Client code reads env vars via an `env()` helper that reads `window.__ENV` instead of using static `process.env.NEXT_PUBLIC_*` dot-access (which DefinePlugin would inline to empty strings). The `NEXT_PUBLIC_*` env vars defined in `__main__.py` are therefore still required and correct: they are now runtime inputs rather than build-time inputs.
How can this be tested?
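- Run `pulumi preview` on the CI stack and confirm the plan shows:
  - removal of the PVCs (`nextjs-build-cache-efs-blue/green`)
  - removal of the build Job (`mit-learn-nextjs-build-*`)
  - removal of the blue/green Deployments (`mit-learn-nextjs-blue/green`)
  - removal of the deployment-state ConfigMap (`mit-learn-nextjs-deployment-state`)
  - creation of the single Deployment (`mit-learn-nextjs`)
  - an updated Service selector (no `deployment-color` label)
- `ruff check` and `mypy` pass on `__main__.py`.
Additional Context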
Deployment coordination: this PR and mitodl/mit-learn#3364 must be merged and applied together as a single `pulumi up`. The new standalone Dockerfile runner stage is incompatible with the old EFS volume mount path.
Fastly purge transition: the first deployment after merging will have no cached objects tagged `html-pages` yet (nothing was tagged before). A one-time manual `purge_all` can be run after the first deployment to reset the Fastly cache if needed.
Checklist:
- … `pulumi up`
- Run `pulumi preview` on CI stack before merging to confirm the resource diff