LCORE-1872: Fix llama-stack container startup issues by anik120 · Pull Request #1800 · lightspeed-core/lightspeed-stack

anik120 · 2026-05-26T14:00:35Z

Description

Fixes two issues that prevented the llama-stack container from starting:

1. SQLite compatibility error: switch the base image from ubi9/ubi-minimal to ubi9/python-312. The minimal image ships with SQLite < 3.30.0, which lacks the sqlite3_deserialize support required by Python 3.12's _sqlite3 module. The python-312 base image includes a compatible SQLite version and properly configured Python dependencies.
2. Read-only filesystem error: the config enrichment script now writes to /tmp/enriched-run.yaml instead of /opt/app-root/run.yaml. The latter is mounted read-only from the host, so the enrichment script raised OSError when it tried to write the enriched configuration.
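
The read-only failure described above can be illustrated with a minimal Python sketch. It is not the project's enrichment script: `write_enriched_config` and the candidate-list approach are hypothetical, and the PR itself simply writes to /tmp/enriched-run.yaml. The sketch tries each candidate path in order and moves on when a write raises OSError, which is the error a read-only mount produces.

```python
from pathlib import Path


def write_enriched_config(text, candidates):
    """Write text to the first candidate path that accepts the write.

    Hypothetical helper for illustration; not part of lightspeed-stack.
    """
    errors = []
    for candidate in candidates:
        path = Path(candidate)
        try:
            path.write_text(text, encoding="utf-8")
            return path
        except OSError as exc:
            # Read-only mounts, missing directories, and permission
            # problems all surface as OSError subclasses.
            errors.append(f"{path}: {exc}")
    raise OSError("no writable location: " + "; ".join(errors))
```

Calling `write_enriched_config(text, ["/opt/app-root/run.yaml", "/tmp/enriched-run.yaml"])` would fall through to the second path when the first is mounted read-only. Writing directly to /tmp, as this PR does, is simpler and never touches the mounted file.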

Checklist before requesting a review

I have performed a self-review of my code.
PR has passed all pre-merge test jobs.
If it is a core feature, I have added thorough tests.

Summary by CodeRabbit

Release Notes

Chores
- Updated container build to use Python 3.12 UBI base image for improved runtime consistency.
- Adjusted entrypoint behavior to write enriched configuration to a different runtime location.
Tests
- Test manifests updated to run the enrichment step using the project's virtualenv Python interpreter.

coderabbitai · 2026-05-26T14:05:40Z

Walkthrough

Rebase test container onto Red Hat UBI Python 3.12 with dnf-installed build tools, change entrypoint to write enriched config to /tmp/enriched-run.yaml, and update test/prow manifests to invoke the enrichment script using the virtualenv Python.

Changes

Infrastructure and Deployment Updates

| Layer / File(s) | Summary |
| --- | --- |
| Container base image and build tools: `deploy/llama-stack/test.containerfile` | Containerfile rebased to Red Hat UBI Python 3.12 image; build dependencies (`git`, `tar`, `gcc`, `gcc-c++`, `make`) installed via `dnf install` with `dnf clean all`, replacing prior microdnf-based Python/tool installation. |
| Enriched config output path: `scripts/llama-stack-entrypoint.sh` | `ENRICHED_CONFIG` updated to `/tmp/enriched-run.yaml`; enrichment block writes enriched config to this path. |
| Manifests: use venv Python for enrichment: `tests/e2e-prow/rhoai/manifests/lightspeed/llama-stack-openai.yaml`, `tests/e2e-prow/rhoai/manifests/lightspeed/llama-stack-prow.yaml` | Pod/container startup commands updated to run `llama_stack_configuration.py` via `/opt/app-root/.venv/bin/python3` instead of `python3` during Lightspeed config enrichment. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

lightspeed-core/lightspeed-stack#1760: Updates to Makefile targets that build the test container image affected by the base image and build dependency changes in this PR.

Suggested reviewers

radofuchs
tisnik

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title directly reflects the two main fixes in the changeset: replacing the base image to resolve SQLite compatibility and updating the config path to fix the read-only filesystem error. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@scripts/llama-stack-entrypoint.sh`:
- Line 8: The ENRICHED_CONFIG path is inconsistent: the script sets
ENRICHED_CONFIG="/tmp/enriched-run.yaml" but the E2E manifests still hardcode
the old "/opt/app-root/run.yaml"; update the manifests to use
ENRICHED_CONFIG="/tmp/enriched-run.yaml" to match the script (or refactor to a
single shared source for the enrichment path) so the enrichment behavior cannot
diverge; look for the ENRICHED_CONFIG variable and any hardcoded
"/opt/app-root/run.yaml" occurrences in the llama-stack entrypoint and the E2E
manifest templates and make them use the same "/tmp/enriched-run.yaml" value (or
reference the centralized variable).
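
One way to check for the divergence this comment describes is to search the scripts and manifests for the old path. The following is a rough sketch rather than tooling from this repository: `find_hardcoded_paths` is a hypothetical helper, and the file suffixes it scans are an assumption.

```python
from pathlib import Path


def find_hardcoded_paths(root, needle, suffixes=(".sh", ".yaml", ".yml")):
    """Return (file, line_number, line) for each line that contains needle."""
    hits = []
    for path in sorted(Path(root).rglob("*")):
        # Only scan regular files with the assumed script/manifest suffixes.
        if not path.is_file() or path.suffix not in suffixes:
            continue
        lines = path.read_text(encoding="utf-8", errors="replace").splitlines()
        for number, line in enumerate(lines, start=1):
            if needle in line:
                hits.append((path, number, line.strip()))
    return hits
```

Run from a checkout, `find_hardcoded_paths(".", "/opt/app-root/run.yaml")` would list every remaining occurrence of the old path. Not every hit is necessarily a write target, so each one still needs a manual look.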

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 0ab071ab-5175-4264-aa92-1c69444cb139

📥 Commits

Reviewing files that changed from the base of the PR and between 42844d0 and 9c1b725.

📒 Files selected for processing (2)

deploy/llama-stack/test.containerfile
scripts/llama-stack-entrypoint.sh

📜 Review details

🧰 Additional context used

🧠 Learnings (1)

📚 Learning: 2026-05-12T15:14:34.788Z

Learnt from: syedriko
Repo: lightspeed-core/lightspeed-stack PR: 1727
File: scripts/konflux_requirements.sh:9-15
Timestamp: 2026-05-12T15:14:34.788Z
Learning: In this repo, the `.konflux/` directory is committed/tracked and is guaranteed to exist in a fresh clone. Therefore, shell scripts that write output under `.konflux/` (e.g., create files like `.konflux/<...>`) should not waste effort by calling `mkdir -p .konflux` first. Only add directory-creation logic if the script may run in an environment/repo state where `.konflux/` might not be present.

Applied to files:

scripts/llama-stack-entrypoint.sh

🔇 Additional comments (2)

deploy/llama-stack/test.containerfile (2)

1-2: Clarify SQLite compatibility claim for the ubi9/python-312 base image

Red Hat docs I found don’t state the bundled SQLite version or guarantee the sqlite3_deserialize capability, so the “>= 3.30.0 with sqlite3_deserialize support” PR objective needs an in-container check for the exact image tag used at deploy/llama-stack/test.containerfile (lines 1-2):

python -c "import sqlite3; print(sqlite3.sqlite_version)"

python -c "import sqlite3; print(hasattr(sqlite3.Connection, 'deserialize'))"
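
The two one-liners can also be combined into a single script to run inside the built image. This is a sketch only: `sqlite_report` is a hypothetical name, and the 3.30.0 threshold is taken from the PR description rather than verified independently.

```python
import sqlite3


def sqlite_report(minimum=(3, 30, 0)):
    """Summarize the SQLite library linked into this interpreter."""
    return {
        "version": sqlite3.sqlite_version,
        # sqlite_version_info is a tuple of ints, so tuple comparison works.
        "meets_minimum": sqlite3.sqlite_version_info >= minimum,
        "has_deserialize": hasattr(sqlite3.Connection, "deserialize"),
    }


for key, value in sqlite_report().items():
    print(f"{key}: {value}")
```

If `import sqlite3` itself fails in the image, that failure is already the compatibility problem this PR targets, and the report never gets printed.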

6-9: ⚡ Quick win

llslibdev dependencies likely won’t need cmake/cargo in deploy/llama-stack/test.containerfile (lines 6-9)
The llslibdev dependency group in pyproject.toml doesn’t directly include fastuuid/maturin/cargo, and uv.lock provides manylinux wheels for fastuuid (so uv sync --locked --group llslibdev should not require a Rust toolchain on typical Linux platforms). Only targets without matching wheels would fall back to the fastuuid sdist and need cargo/build tooling.

tisnik

LGTM

Fixes two issues preventing llama-stack container from starting:

1. SQLite compatibility error - Switch base image from ubi9/ubi-minimal to ubi9/python-312. The minimal image ships with SQLite < 3.30.0 which lacks sqlite3_deserialize support required by Python 3.12's _sqlite3 module. The python-312 base image includes a compatible SQLite version and properly configured Python dependencies.
2. Read-only filesystem error - Config enrichment script now writes to /tmp/enriched-run.yaml instead of /opt/app-root/run.yaml. The latter is mounted read-only from the host, causing OSError when the enrichment script attempts to write the enriched configuration.

Signed-off-by: Anik Bhattacharjee <anbhatta@redhat.com>

The enrichment script was failing with `ModuleNotFoundError: No module named 'yaml'` because it was invoked with the system Python instead of the virtualenv Python, where the dependencies are installed.

Signed-off-by: Anik Bhattacharjee <anbhatta@redhat.com>
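
A quick way to confirm which interpreter a startup command actually uses is to run a short diagnostic with it. The sketch below is illustrative and not part of the repository; `describe_interpreter` is a hypothetical name.

```python
import importlib.util
import sys


def describe_interpreter(module_name="yaml"):
    """Report which interpreter is running and whether module_name is importable."""
    return {
        "executable": sys.executable,
        # Inside a virtualenv, sys.prefix differs from sys.base_prefix.
        "in_virtualenv": sys.prefix != sys.base_prefix,
        "module_available": importlib.util.find_spec(module_name) is not None,
    }


for key, value in describe_interpreter().items():
    print(f"{key}: {value}")
```

Running it once with `python3` and once with `/opt/app-root/.venv/bin/python3` should show whether `yaml` is visible only to the virtualenv interpreter, which is the situation this commit message describes.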

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@deploy/llama-stack/test.containerfile`:
- Line 2: The Dockerfile uses the mutable tag
"registry.access.redhat.com/ubi9/python-312" in the FROM instruction; replace
that tag with the corresponding immutable digest by finding the correct sha256
for the image and updating the FROM line to use the digest form (e.g., FROM
registry.access.redhat.com/ubi9/python-312@sha256:<digest>), ensuring you pick
the exact digest that matches the desired image variant/architecture for
reproducible builds and supply-chain traceability.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 8b7f2dad-9e5e-4d8d-97a8-6475fba17b5c

📥 Commits

Reviewing files that changed from the base of the PR and between 9c1b725 and 5dd5008.

📒 Files selected for processing (4)

deploy/llama-stack/test.containerfile
scripts/llama-stack-entrypoint.sh
tests/e2e-prow/rhoai/manifests/lightspeed/llama-stack-openai.yaml
tests/e2e-prow/rhoai/manifests/lightspeed/llama-stack-prow.yaml

📜 Review details

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (12)

GitHub Check: unit_tests (3.12)
GitHub Check: build-pr
GitHub Check: unit_tests (3.13)
GitHub Check: Pylinter
GitHub Check: E2E: server mode / ci / group 1
GitHub Check: E2E: library mode / ci / group 1
GitHub Check: E2E: library mode / ci / group 3
GitHub Check: E2E: server mode / ci / group 3
GitHub Check: E2E: library mode / ci / group 2
GitHub Check: E2E: server mode / ci / group 2
GitHub Check: E2E Tests for Lightspeed Evaluation job
GitHub Check: Konflux kflux-prd-rh02 / lightspeed-stack-on-pull-request

🧰 Additional context used

🧠 Learnings (3)

📚 Learning: 2026-02-19T10:06:50.647Z

Learnt from: radofuchs
Repo: lightspeed-core/lightspeed-stack PR: 1181
File: tests/e2e-prow/rhoai/manifests/lightspeed/mock-jwks.yaml:32-34
Timestamp: 2026-02-19T10:06:50.647Z
Learning: In the rhoai tests under tests/e2e-prow/rhoai/manifests, avoid static ConfigMap definitions for mock-jwks-script and mcp-mock-server-script since these ConfigMaps are created dynamically by the pipeline.sh deployment script using 'oc create configmap'. Ensure there are no static ConfigMap resources for these names in the manifests. If such ConfigMaps are added in the future, coordinate with the pipeline to reflect dynamic creation or adjust tests to rely on the dynamic provisioning.

Applied to files:

tests/e2e-prow/rhoai/manifests/lightspeed/llama-stack-openai.yaml
tests/e2e-prow/rhoai/manifests/lightspeed/llama-stack-prow.yaml

📚 Learning: 2026-05-20T08:09:30.641Z

Learnt from: max-svistunov
Repo: lightspeed-core/lightspeed-stack PR: 1580
File: docs/design/llama-stack-config-merge/poc-results/library-mode/synthesized-run.yaml:107-110
Timestamp: 2026-05-20T08:09:30.641Z
Learning: In Llama-stack config YAMLs, when defining a Llama Guard safety shield entry, set `provider_shield_id` to the *guard model identifier* (e.g., `meta-llama/Llama-Guard-3-8B`). Do not use a chat/generative model id (e.g., `openai/gpt-4o-mini`): a chat-model id (or `native_override`) indicates only an override landed and does **not** mean the safety shield is actually gating queries. Ensure any E2E coverage for the related implementation (JIRA/E2E tests) exercises a real Llama Guard model to verify that the shield is effective.

Applied to files:

tests/e2e-prow/rhoai/manifests/lightspeed/llama-stack-openai.yaml
tests/e2e-prow/rhoai/manifests/lightspeed/llama-stack-prow.yaml

📚 Learning: 2026-05-12T15:14:34.788Z

Learnt from: syedriko
Repo: lightspeed-core/lightspeed-stack PR: 1727
File: scripts/konflux_requirements.sh:9-15
Timestamp: 2026-05-12T15:14:34.788Z
Learning: In this repo, the `.konflux/` directory is committed/tracked and is guaranteed to exist in a fresh clone. Therefore, shell scripts that write output under `.konflux/` (e.g., create files like `.konflux/<...>`) should not waste effort by calling `mkdir -p .konflux` first. Only add directory-creation logic if the script may run in an environment/repo state where `.konflux/` might not be present.

Applied to files:

scripts/llama-stack-entrypoint.sh

🪛 Trivy (0.69.3)

deploy/llama-stack/test.containerfile

[warning] 2-2: ':latest' tag used

Specify a tag in the 'FROM' statement for image 'registry.access.redhat.com/ubi9/python-312'

Rule: DS-0001

(IaC/Dockerfile)

[info] 1-1: No HEALTHCHECK defined

Add HEALTHCHECK instruction in your Dockerfile

Rule: DS-0026

(IaC/Dockerfile)

🔇 Additional comments (4)

deploy/llama-stack/test.containerfile (1)

7-9: LGTM!

scripts/llama-stack-entrypoint.sh (1)

8-8: LGTM!

Also applies to: 15-15

tests/e2e-prow/rhoai/manifests/lightspeed/llama-stack-openai.yaml (1)

160-160: LGTM!

tests/e2e-prow/rhoai/manifests/lightspeed/llama-stack-prow.yaml (1)

150-150: LGTM!

coderabbitai · 2026-05-26T16:03:37Z

-# Upstream llama-stack built from Red Hat UBI
-FROM registry.access.redhat.com/ubi9/ubi-minimal
+# Upstream llama-stack built from Red Hat UBI Python 3.12 image
+FROM registry.access.redhat.com/ubi9/python-312


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

🧩 Analysis chain

🌐 Web query:

What is the current sha256 digest for registry.access.redhat.com/ubi9/python-312, and does Red Hat recommend digest pinning for reproducible builds?

💡 Result:

The container image registry.access.redhat.com/ubi9/python-312 is a mutable tag that points to updated image versions over time [1][2]. Consequently, it does not have a single, static SHA256 digest. Because Red Hat frequently updates these images to address security vulnerabilities and bug fixes, the digest associated with the ubi9/python-312 tag changes whenever a new image is pushed to the registry [1][3].

To find the current digest for your specific environment, you can use the command:

podman inspect registry.access.redhat.com/ubi9/python-312 --format '{{.Digest}}'

Red Hat explicitly recommends digest pinning for scenarios requiring reproducible builds and robust supply chain security [4][2][5]. While tags are convenient for general development, they can change at any time, which introduces ambiguity and prevents reproducibility [2]. By referencing an image by its immutable SHA256 digest, you ensure that your builds consistently use the exact same image content, effectively eliminating discrepancies caused by tag updates [2][5]. Furthermore, for certain Red Hat ecosystem requirements, such as Operator bundle certification, using image digests instead of tags is mandatory [5].

Citations:

1: https://access.redhat.com/errata/RHBA-2024:5682

2: https://developers.redhat.com/articles/2025/01/28/how-name-version-and-reference-container-images

3: https://access.redhat.com/errata/RHBA-2026:16041

4: https://docs.redhat.com/en/documentation/red_hat_enterprise_linux/10/html/building_running_and_managing_containers/introduction-to-reproducible-container-builds

5: https://github.com/redhat-openshift-ecosystem/certification-releases/blob/main/4.9/ga/troubleshooting.md

Pin the base image to an immutable digest (deploy/llama-stack/test.containerfile:2).

registry.access.redhat.com/ubi9/python-312 is a mutable tag, so the image contents/digest can change over time, hurting reproducibility and supply-chain traceability. Red Hat recommends digest pinning for reproducible builds and stronger supply-chain security; replace the tag with the appropriate sha256 digest.

Suggested change

-FROM registry.access.redhat.com/ubi9/python-312
+FROM registry.access.redhat.com/ubi9/python-312@sha256:<resolved_digest>
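
Whether a FROM line is already pinned can be checked mechanically, for example in a lint step. The following is a small sketch under stated assumptions: `is_digest_pinned` is a hypothetical helper, it handles only single-line FROM instructions, and it checks the digest's shape (64 hex characters) rather than resolving it against the registry.

```python
import re

# Matches "FROM [--flag=value ...] image@sha256:<64 hex chars>",
# optionally followed by "AS <stage-name>".
_PINNED_FROM = re.compile(
    r"^\s*FROM\s+(?:--\S+\s+)*\S+@sha256:[0-9a-f]{64}(?:\s+AS\s+\S+)?\s*$",
    re.IGNORECASE,
)


def is_digest_pinned(from_line):
    """Return True if a Containerfile FROM line references an image by sha256 digest."""
    return bool(_PINNED_FROM.match(from_line))
```

With this helper, the current `FROM registry.access.redhat.com/ubi9/python-312` line reports as unpinned, while the suggested form with a full resolved digest reports as pinned.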

🧰 Tools

🪛 Trivy (0.69.3)

[warning] 2-2: ':latest' tag used

Specify a tag in the 'FROM' statement for image 'registry.access.redhat.com/ubi9/python-312'

Rule: DS-0001

(IaC/Dockerfile)

radofuchs

LGTM

coderabbitai Bot reviewed May 26, 2026

Comment thread scripts/llama-stack-entrypoint.sh

tisnik approved these changes May 26, 2026

tisnik requested a review from radofuchs May 26, 2026 14:15

anik120 force-pushed the fix-llama-stack-container-start-issue branch from 9c1b725 to da6a5cf Compare May 26, 2026 14:16

anik120 force-pushed the fix-llama-stack-container-start-issue branch from da6a5cf to 3b50327 Compare May 26, 2026 14:41

coderabbitai Bot reviewed May 26, 2026

radofuchs approved these changes May 27, 2026

tisnik merged commit 9700c70 into lightspeed-core:main May 27, 2026
31 checks passed

tisnik merged 2 commits into lightspeed-core:main from anik120:fix-llama-stack-container-start-issue
