vlab scripts#1361

Closed
daniel-noland wants to merge 4 commits into main from pr/daniel-noland/vlab-scripts

@daniel-noland
Collaborator

@daniel-noland daniel-noland commented Mar 20, 2026

Basically an "easy mode" for quickly starting a vlab.

It uses an "unlikely to be used" IP address, 192.168.19.1, to host zot and vlab in a container.

The whole show can be started up with a simple `just vlab-up`.

You can use `just vlab-control` to drop into k9s in the vlab. You can use `just oci_insecure=true version=$X vlab-patch-dataplane` or `just oci_insecure=true version=$X vlab-patch-frr` to build (sterile) containers, push them to the local zot, and patch them into the vlab instance.
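
A rough sketch of the workflow described above, not part of the PR. The recipe names and the oci_insecure/version variables come from the description; the vlab_patch_cmd helper and the placeholder version string are hypothetical. The helper only prints the just invocation instead of running it.

```shell
#!/bin/sh
# Hypothetical helper: prints the `just` invocation that patches one
# component into the vlab. Only the recipe and variable names are from the PR.
vlab_patch_cmd() {
    component="$1"  # "dataplane" or "frr"
    version="$2"    # image version to build, push, and patch in
    case "$component" in
        dataplane|frr) ;;
        *) echo "unknown component: $component" >&2; return 1 ;;
    esac
    printf 'just oci_insecure=true version=%s vlab-patch-%s\n' "$version" "$component"
}

# A typical session, printed rather than executed:
echo "just vlab-up"        # start zot and the vlab
echo "just vlab-control"   # drop into k9s
vlab_patch_cmd dataplane example-version  # placeholder version string
```

Printing the command keeps the sketch safe to run on a machine without just or a vlab container.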

The recipes aren't complex so I didn't add much in the way of docs for them.

Claude was mostly used to check my work and manage minor rebases.

It also wrote commit messages.

@daniel-noland daniel-noland force-pushed the pr/daniel-noland/vlab-scripts branch from 20d8c29 to 89eda46 Compare March 23, 2026 00:03
@daniel-noland daniel-noland changed the title Pr/daniel noland/vlab scripts vlab scripts Mar 23, 2026
@daniel-noland daniel-noland changed the base branch from main to pr/daniel-noland/auto-bump March 23, 2026 00:03
@daniel-noland daniel-noland force-pushed the pr/daniel-noland/vlab-scripts branch 2 times, most recently from 2622c30 to 4e7853c Compare March 27, 2026 01:51
@daniel-noland daniel-noland changed the base branch from pr/daniel-noland/auto-bump to main March 27, 2026 01:52
@daniel-noland daniel-noland marked this pull request as ready for review March 27, 2026 01:56
@daniel-noland daniel-noland requested a review from a team as a code owner March 27, 2026 01:56
@daniel-noland daniel-noland requested review from Copilot and sergeymatov and removed request for a team March 27, 2026 01:57
@daniel-noland daniel-noland self-assigned this Mar 27, 2026
@daniel-noland daniel-noland added the enhancement New feature or request label Mar 27, 2026
Contributor

Copilot AI left a comment

Pull request overview

Adds a set of scripts and just recipes to spin up a local “vlab easy mode” environment, including a local Zot registry bound to 192.168.19.1, and helper commands to control and patch the running vlab.

Changes:

  • Introduces a scripts/vlab/ container image + runtime scripts to start Zot + bootstrap hhfab inside the container.
  • Adds just vlab-up|vlab-down|vlab-control|vlab-patch-* recipes to manage the environment and patch images into the vlab cluster.
  • Adds Zot config/cert template files and updates .gitignore for generated cert artifacts.

Reviewed changes

Copilot reviewed 6 out of 7 changed files in this pull request and generated 7 comments.

| File | Description |
| --- | --- |
| scripts/vlab/run.sh | Generates TLS certs, creates Docker network, builds/starts vlab container, runs zot + bootstraps hhfab/vlab. |
| scripts/vlab/root/etc/zot/config.json | Configures Zot registry (TLS, storage, ghcr sync-on-demand). |
| scripts/vlab/root/etc/zot/cert.ini | OpenSSL CSR config used by run.sh for generating the Zot cert. |
| scripts/vlab/control.sh | Runs k9s (default) or arbitrary commands on the vlab control plane via ssh-from-container. |
| scripts/vlab/Dockerfile | Defines the vlab container image and installs required tooling (docker, qemu, jq/yq, zot, oras). |
| justfile | Adds vlab recipes and changes default oci_repo to the vlab registry address. |
| .gitignore | Ignores generated Zot cert/key/csr/serial artifacts and creds file. |

Comment thread scripts/vlab/run.sh Outdated
Comment thread scripts/vlab/root/etc/zot/config.json Outdated
Comment thread scripts/vlab/Dockerfile Outdated
Comment thread scripts/vlab/Dockerfile Outdated
Comment thread scripts/vlab/control.sh
Comment thread justfile Outdated
# OCI repo to push images to
-oci_repo := "127.0.0.1:30000"
+oci_repo := "192.168.19.1:30000"
Copilot AI Mar 27, 2026

confidence: 7
tags: [other]

Changing the global default oci_repo to 192.168.19.1:30000 affects all container push workflows, not just vlab. This is a behavior change for anyone who isn’t running the vlab registry and may break existing local dev setups that rely on 127.0.0.1:30000.

Consider keeping the previous default and setting oci_repo only within the vlab-related recipes (or introduce a vlab_oci_repo variable that vlab recipes use).
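
One way to picture the suggestion, as a minimal shell sketch rather than the actual justfile change. The two addresses come from the diff above; the resolve_oci_repo function, the OCI_REPO override, and the non-vlab recipe name "push" are hypothetical and only illustrate scoping the new address to vlab recipes.

```shell
#!/bin/sh
# Addresses taken from the diff; the variable names are illustrative.
default_oci_repo="127.0.0.1:30000"
vlab_oci_repo="192.168.19.1:30000"

# Pick the registry for a recipe: vlab-* recipes get the vlab registry,
# everything else keeps the previous default. An explicit OCI_REPO
# (hypothetical override) takes precedence over both.
resolve_oci_repo() {
    recipe="$1"
    if [ -n "${OCI_REPO:-}" ]; then
        printf '%s\n' "$OCI_REPO"
        return 0
    fi
    case "$recipe" in
        vlab-*) printf '%s\n' "$vlab_oci_repo" ;;
        *)      printf '%s\n' "$default_oci_repo" ;;
    esac
}

resolve_oci_repo vlab-patch-dataplane
resolve_oci_repo push  # "push" is a made-up non-vlab recipe name
```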

Collaborator Author

The trouble with this logic is that it assumes the end user has a zot registry running on 127.0.0.1.

If they do then it won't work in the containerized setup because the containerized setup deliberately does not run in the host network namespace.

I don't think 127.0.0.1:30000 was ever a needed or relied on default and this arbitrary address is as good as any other (save setting it up with a private ipv6, but that is hard because docker has blithe contempt for ipv6 as a concept).

Collaborator Author

I'm going to leave this up in case any other reviewers want to express concern.

I don't think I can realistically do better without making this so complex that we may as well skip vlab entirely.

I would need to somehow search for and confirm the non-existence / use of a specific address and that seems... dumb

Comment thread justfile Outdated
Comment thread scripts/vlab/run.sh Outdated
| yq -y '.[]' \
| tee fab.yaml
"
docker exec vlab /vlab/hhfab vlab gen
Member

Since this targets dp dev, you can lower the number of switches and servers to a minimum so it consumes fewer resources and runs a bit faster.

Collaborator Author

that is a really good idea. Will do

Member

@Frostman Frostman Apr 19, 2026

You'd want to have externals automatically set up too, with the --externals-bgp=1 --externals-static=1 --external-orphan-connections=1 flags, so more use cases can be tested.

As for decreasing the number of switches, you probably just want a pair of spines and a pair of standalone switches: --mclag-leafs-count=0 --eslag-leaf-groups="" --orphan-leafs-count=2

Collaborator Author

done
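
For reference, a sketch of how the flags suggested in this thread might combine with the `hhfab vlab gen` call from run.sh. The docker exec prefix and every flag are quoted from the thread; the vlab_gen_cmd wrapper is hypothetical, and the final command line in the PR is not shown here, so treat this as an assumption about the result. No spine-count flag is included because none was named in the review.

```shell
#!/bin/sh
# Prints (does not run) the `hhfab vlab gen` invocation with the flags
# suggested in the review. The function name is illustrative only.
vlab_gen_cmd() {
    printf '%s' 'docker exec vlab /vlab/hhfab vlab gen'
    # Externals, so more use cases can be tested.
    printf ' %s' '--externals-bgp=1' '--externals-static=1' '--external-orphan-connections=1'
    # Smaller topology: no MCLAG leaves, no ESLAG groups, two standalone leaves.
    printf ' %s' '--mclag-leafs-count=0' '--eslag-leaf-groups=""' '--orphan-leafs-count=2'
    printf '\n'
}

vlab_gen_cmd
```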

@mvachhar
Contributor

Can you address or resolve the Copilot comments? Please make sure to reply to Copilot explaining why you took the action you did.

@daniel-noland daniel-noland marked this pull request as draft March 27, 2026 22:26
@daniel-noland daniel-noland force-pushed the pr/daniel-noland/vlab-scripts branch 2 times, most recently from 40ccf3e to e622ce5 Compare April 18, 2026 23:52
@daniel-noland daniel-noland requested a review from Copilot April 19, 2026 00:46
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 13 out of 14 changed files in this pull request and generated 3 comments.

Comment thread scripts/vlab/run.sh Outdated
Comment thread justfile Outdated
Comment thread scripts/vlab/run.sh Outdated
Member

@Frostman Frostman left a comment

I think having zot simplified is nice but it's done in a way specific to dp so I'm not so excited about it. I really don't like wrapping vlab into scripts especially the ones that hardcode configs and etc - devs MUST know how to use vlab. I won't be blocking it, but I don't think it's a good idea to encourage it.

CC @mvachhar

Comment thread scripts/vlab/run.sh Outdated

docker exec vlab /bin/bash -c \
"curl -fsSL 'https://i.hhdev.io/hhfab' | USE_SUDO=false INSTALL_DIR=. VERSION=v0-master-${fabricator_rev} bash"
docker exec vlab /vlab/hhfab init --dev --registry-repo 192.168.19.1:30000 --gateway --import-host-upstream --force
Member

You should probably pass --gateways=2 so we always run a pair of gateways.

Collaborator Author

it is done :D
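
A sketch of what the adjusted init call could look like. The thread does not show the final command, so this assumes --gateways=2 is passed in place of the bare --gateway flag; whether both are needed is not stated. The other flags and the registry address are copied from the run.sh excerpt above, and the hhfab_init_cmd wrapper is hypothetical.

```shell
#!/bin/sh
# Prints (does not run) an `hhfab init` invocation with a configurable
# gateway count. Assumption: --gateways=N replaces the bare --gateway flag.
hhfab_init_cmd() {
    gateways="${1:-2}"  # default to a pair of gateways, per the review
    printf 'docker exec vlab /vlab/hhfab init --dev --registry-repo %s --gateways=%s --import-host-upstream --force\n' \
        '192.168.19.1:30000' "$gateways"
}

hhfab_init_cmd
```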

Comment thread scripts/vlab/run.sh Outdated
| tee fab.yaml
"
docker exec vlab /vlab/hhfab vlab gen
docker exec vlab /vlab/hhfab vlab up -v --controls-restricted=false -m=manual --recreate
Member

I think this is one of the main issues: why always re-create? You can stop and start vlab again, which is way faster than recreating from scratch.

Collaborator Author

@daniel-noland daniel-noland Apr 19, 2026

I suppose I can make it not recreate easily enough. Just remove the --recreate.

The original motivation was to make the script declarative, but that may be overkill
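
A minimal sketch of making recreation opt-in instead of unconditional. The `hhfab vlab up` flags are copied from the run.sh excerpt above; the vlab_up_cmd wrapper and its true/false argument are hypothetical, and the command is printed rather than executed.

```shell
#!/bin/sh
# Prints the `hhfab vlab up` invocation, appending --recreate only on request.
vlab_up_cmd() {
    recreate="${1:-false}"  # hypothetical toggle; default reuses the existing vlab
    cmd='docker exec vlab /vlab/hhfab vlab up -v --controls-restricted=false -m=manual'
    if [ "$recreate" = "true" ]; then
        cmd="$cmd --recreate"
    fi
    printf '%s\n' "$cmd"
}

vlab_up_cmd        # fast path: no --recreate
vlab_up_cmd true   # declarative path: rebuild from scratch
```

Defaulting to the fast path keeps the declarative behavior available for cases where a clean rebuild is actually wanted.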

Comment thread scripts/gen-pins.sh
npins add github KaTeX KaTeX

npins add github project-zot zot
npins add github githedgehog fabricator --branch master # floats with branch on pin bump
Member

Does this mean it won't always be using the latest master? If that's the case, it's quite bad: in dev you should use either the last release or the actual latest master, not a pin to some intermediate commit.

Collaborator Author

I don't really understand this logic, but I can make it use the latest master easily.

It is actually easier that way. It's just that we pin literally every other thing. All of it. Including the api and everything

Collaborator Author

I can also very easily make it use latest release, which makes more sense to me (but might cause DX problems)

Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 13 out of 14 changed files in this pull request and generated 3 comments.

Comment thread scripts/vlab/run.sh
Comment thread scripts/vlab/entrypoint.sh Outdated
Comment thread scripts/vlab/control.sh
daniel-noland and others added 2 commits April 18, 2026 19:04
Add pins for two upstream projects used by the vlab development
environment that follows in the next commit:

- zot: OCI-native container registry used as a pull-through proxy for
  ghcr.io. Packaged in nix/pkgs/zot/ from a pinned upstream release
  binary (SRI-hashed, refreshed by scripts/bump.sh) and wired through
  the dataplane-dev overlay.

- fabricator: provides the hhfab test CLI. Tracked against master so
  the vlab startup script can install a matching hhfab build at
  runtime.

Neither is used by the dataplane crates themselves; both are strictly
for vlab tooling.

Co-authored-by: Claude (Opus 4.7, 1M context) <noreply@anthropic.com>
Add a local development environment for bringing up a vlab test
fabric. The environment runs as a privileged docker container built
from a nix closure (containers.vlab), which ships zot (configured as
a caching pull-through proxy for ghcr.io/githedgehog/*), hhfab
(installed at runtime, pinned to the fabricator npins revision), and
the usual toolchain (qemu, openssh, git, docker-client, etc.).

scripts/vlab/entrypoint.sh runs in three modes:

- check:     validate existing ghcr.io credentials (one-shot probe)
- provision: interactively prompt for a classic PAT with read:packages,
             verify against api.github.com, and persist the creds into
             the root-owned vlab-secrets docker volume
- run:       generate TLS material into a tmpfs, build a merged CA
             bundle at SSL_CERT_FILE, then exec zot

scripts/vlab/run.sh drives the lifecycle: ensures creds exist via a
one-shot provision container, starts zot detached, waits for the CA
bundle to land, installs hhfab, then brings the vlab fabric up.

Justfile recipes expose the workflow:

- vlab-up:              build containers.vlab and run the startup script
- vlab-control:         open k9s or run a command on the control plane
- vlab-down:            stop and remove the container + network
- vlab-purge:           vlab-down + drop vlab/zot/vlab-secrets volumes
- vlab-patch-dataplane: push the current dataplane image and patch the
                        running fabric to use it
- vlab-patch-frr:       same flow for the FRR image

TLS material is regenerated every startup into a tmpfs (never persists
on the host). Only the ghcr.io credential persists, and it lives in a
root-owned docker volume provisioned via a paste-token flow; the PAT
never touches a host-side file.

Co-authored-by: Claude (Opus 4.7, 1M context) <noreply@anthropic.com>
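
The three entrypoint.sh modes listed in the commit message above could be dispatched roughly as follows. This is a sketch under assumptions, not the script from the PR: the mode names and their one-line descriptions come from the commit message, while the entrypoint function, the stubbed echo bodies, and the exit code for an unknown mode are hypothetical.

```shell
#!/bin/sh
# Stub dispatcher for the three modes; real work is replaced by echo.
entrypoint() {
    mode="${1:-}"
    case "$mode" in
        check)     echo "check: validate existing ghcr.io credentials" ;;
        provision) echo "provision: prompt for a PAT and persist it to the vlab-secrets volume" ;;
        run)       echo "run: generate TLS material, build the CA bundle, exec zot" ;;
        *)
            echo "usage: entrypoint.sh {check|provision|run}" >&2
            return 2  # hypothetical exit code for an unknown mode
            ;;
    esac
}

entrypoint check
```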
@daniel-noland daniel-noland force-pushed the pr/daniel-noland/vlab-scripts branch from 93e1103 to 171e028 Compare April 19, 2026 02:20
@daniel-noland
Collaborator Author

> I think having zot simplified is nice but it's done in a way specific to dp so I'm not so excited about it.

I'd actually be quite happy to recast this as a PR to fabricator. I don't care where the zot automation logic lives, so long as we can spin it up without a bunch of work. I just wanted to make that process efficient.

> I really don't like wrapping vlab into scripts especially the ones that hardcode configs and etc - devs MUST know how to use vlab.

I honestly didn't even consider that, but it is a fair point. My only goal with this whole PR was to make a quick-start button so I could use vlab more efficiently. I feel like I have been massively underusing it and that needs to stop, especially with DPDK coming up.

> I won't be blocking it, but I don't think it's a good idea to encourage it.

If your concern is the hard-coded config, then I can likely work with that. I could make a YAML file (or something) that captures the config and lets users spin up whatever they want. My main goal was to get a semi-declarative quick vlab test for things that are a pain or impossible to test in unit tests.

Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 13 out of 14 changed files in this pull request and generated 1 comment.

Comment thread scripts/vlab/run.sh Outdated
@daniel-noland
Collaborator Author

Closing PR as won't do

4 participants