(WIP) feat: add l4t package #1166
|
Hey @mmalyska thanks for this PR. Is this something you're still working on? It looks useful for other users but we don't have hardware to test or verify functionality. |
|
Hi @rothgar, I'm still working on it (not that intensively, as I don't have ARM hardware to build things efficiently; just building the kernel takes me ~8h on my Windows machine). Right now I'm looking for a solution, because I'm stuck: the GPU module needs changes inside the kernel (I really don't want to rewrite GPU drivers to use |
|
@rothgar I just updated the PR description with steps to reproduce and the error I'm facing. If you need more info, just reach out :) |
|
This PR is stale because it has been open 45 days with no activity. |
|
Sorry, I completely missed the problem in the description. I think your extension filesystem may be incorrect. I think a system extension would merge with the base filesystem and overlay your module file (replacing the existing one). Maybe @frezbo can correct me on how that should work. |
l4t needs to be a pkg first to ship the modules
This PR is trying to add a l4t pkg and siderolabs/extensions#624 is adding the extension. @mmalyska Is the reason you're trying to replace host1x.ko because it's already provided by the base OS image and you need an updated one? |
Hi, I need to provide the host1x module built by the package (I hope it is the only one that needs replacing), because it includes changes from a patch that the NVIDIA driver needs on the Jetson Orin board. Those changes are not upstream; they are provided by NVIDIA developers. |
|
This PR is stale because it has been open 45 days with no activity. |
|
I think I have a working solution to the Instead of trying to build host1x out-of-tree (which conflicts with the built-in), I wrote
No Full working setup: https://github.com/schwankner/talos-jetson-orin Running on Talos v1.12.6 / kernel 6.18.18-talos / nvgpu 5.10.7 (OE4T) with verified CUDA inference in Kubernetes pods (no |
Adds OE4T-patched GPU driver stack for NVIDIA Jetson Orin NX (Tegra234 / GA10B):

- OE4T host1x + host1x-fence: GA10B syncpoint support with ERRATA_SYNCPT_INVALID_ID_0 fix
- nvmap, mc-utils, governor_pod_scaling: standard Tegra support modules
- nvhost-ctrl-shim: /dev/nvhost-ctrl userspace interface for JetPack 6 CUDA runtime
- nvgpu: main GA10B GPU driver (OE4T patches, Clang build, kernel 6.18 compat)

The nvhost-ctrl-shim provides hardware syncpoint interrupt support for cudaStreamSynchronize via NVHOST_IOCTL_CTRL_SYNC_FENCE_CREATE + SYNC_FILE_EXTRACT, enabling full CUDA throughput instead of CPU semaphore polling.

Built with Clang (LLVM=1), requires OE4T linux-nv-oot (wip-r36.5-take-2) for kernel 6.18 compatibility. CONFIG_TEGRA_GK20A_NVHOST=y uses OE4T host1x with HOST1X_SYNCPT_GPU support.

Tested: ~60 tok/s qwen2.5:0.5b on Jetson Orin NX 16GB with Talos Linux v1.13.

Continues: siderolabs#1166

Signed-off-by: Alexander Schwankner <mrmoor4@googlemail.com>
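
For readers unfamiliar with the "Clang (LLVM=1)" note in the commit message: `LLVM=1` is the standard kbuild switch for building with the LLVM toolchain, and `M=` points kbuild at an out-of-tree module directory. The sketch below only prints such an invocation. `KDIR` and `SRC` are hypothetical paths, not values taken from this PR, and the real pkg build likely passes additional variables.

```shell
#!/bin/sh
# Sketch only: compose a kbuild command for an out-of-tree module with Clang/LLVM.
# KDIR (kernel build tree) and SRC (module source dir) are hypothetical paths.
KDIR="${KDIR:-/src/linux}"
SRC="${SRC:-/src/nvgpu}"

oot_build_cmd() {
  # -C enters the kernel tree, M= selects the external module directory,
  # LLVM=1 tells kbuild to use clang, ld.lld and the other LLVM tools.
  echo make -C "$KDIR" M="$SRC" ARCH=arm64 LLVM=1 modules
}

# Print the command instead of running it, so nothing is built by accident.
oot_build_cmd
```

Run the printed command inside the build environment once the paths match your setup.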
Nice! I'll try to test it on my Orin NX 16GB as soon as I update it to the newest JetPack firmware. |
Add support for the Jetson Orin SBC. These are modules and display drivers for Tegra chips.
Problem:
I don't see the host1x module being loaded from the extension.

The host1x module is already built into the kernel, so it cannot be replaced by an extension module with the same name.
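
One way to confirm that a driver is built in rather than modular is to look it up in the kernel's `modules.builtin` list, which kbuild generates with one path per line. The helper below is a minimal sketch; the function name `is_builtin` is made up for illustration, and the file path in the usage comment is an assumption about where the kernel's module tree lives.

```shell
#!/bin/sh
# Sketch only: report whether a module name appears in a modules.builtin file.
# $1 = module name (e.g. host1x), $2 = path to a modules.builtin file.
is_builtin() {
  # Entries look like: kernel/drivers/gpu/host1x/host1x.ko
  grep -Eq "(^|/)$1\.ko\$" "$2"
}

# Usage (path is an assumption; point it at the kernel you actually built):
#   is_builtin host1x "/lib/modules/$(uname -r)/modules.builtin" \
#     && echo "host1x is built in; a module with the same name will not replace it"
```

If the lookup succeeds, the driver is compiled into the kernel image and a same-named module shipped by an extension will not take its place.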
How to run it
Building Talos
Create a custom buildx builder
docker buildx create --driver docker-container --driver-opt network=host --name local1 --buildkitd-flags '--allow-insecure-entitlement security.insecure' --use
Run local docker registry + UI
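
The original registry command is not preserved here, so the following is only a sketch of a common way to start a local registry from the official `registry:2` image. The container name and port are illustrative defaults, and the UI container is left out because the thread does not say which one was used.

```shell
#!/bin/sh
# Sketch only: start a local registry that the buildx builder can push to.
# REGISTRY_NAME and REGISTRY_PORT are illustrative defaults, not values from the PR.
REGISTRY_NAME="${REGISTRY_NAME:-registry}"
REGISTRY_PORT="${REGISTRY_PORT:-5000}"

registry_cmd() {
  echo docker run -d --restart always \
    -p "${REGISTRY_PORT}:5000" --name "${REGISTRY_NAME}" registry:2
}

# Print the command by default; set RUN=1 to actually start the container.
if [ "${RUN:-0}" = "1" ]; then
  $(registry_cmd)
else
  registry_cmd
fi
```

Because the builder above uses `network=host`, images can then be pushed to `localhost:5000` from inside the build.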
Building the pkgs
Building extensions
Building the Talos image
Create profile.yaml file
Create installer image:
List of changes to make in the node config:
Apply changes and install new image:
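
The original config changes and commands are not preserved here. As a rough sketch of what these last two steps can look like with `talosctl`: a machine-config patch listing kernel modules under `machine.kernel.modules`, followed by an upgrade to the custom installer image. `NODE`, `INSTALLER_IMAGE`, the patch file name, and the module name `nvgpu` are placeholders chosen for illustration, not values from this PR.

```shell
#!/bin/sh
# Sketch only: NODE and INSTALLER_IMAGE are hypothetical placeholders.
NODE="${NODE:-192.0.2.10}"
INSTALLER_IMAGE="${INSTALLER_IMAGE:-localhost:5000/installer:dev}"

# Emit a machine-config patch that asks Talos to load the given kernel modules.
modules_patch() {
  echo "machine:"
  echo "  kernel:"
  echo "    modules:"
  for m in "$@"; do
    echo "      - name: $m"
  done
}

# Print (not run) the talosctl commands that apply the patch and install the image.
apply_cmds() {
  echo "talosctl -n $NODE patch machineconfig --patch @modules-patch.yaml"
  echo "talosctl -n $NODE upgrade --image $INSTALLER_IMAGE"
}

modules_patch nvgpu   # example module name only
apply_cmds
```

Save the patch output as `modules-patch.yaml`, then run the two printed commands against the node once the placeholders are replaced.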