28 changes: 28 additions & 0 deletions .github/workflows/publiccodeyml-check.yml
@@ -0,0 +1,28 @@
name: Validate publiccode.yml

on:
  push:
    paths:
      - "publiccode.yml"
      - ".github/workflows/publiccodeyml-check.yml"
  pull_request:
    paths:
      - "publiccode.yml"
      - ".github/workflows/publiccodeyml-check.yml"

permissions: {}

jobs:
  validate:
    runs-on: ubuntu-latest
    permissions:
      contents: read
    name: publiccode.yml validation
    steps:
      - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4

      - uses: italia/publiccode-parser-action@56e1200cba853b1efa73ee871600284d0705ab4d # v1
        with:
          publiccode: "publiccode.yml"
          no-network: true

98 changes: 39 additions & 59 deletions publiccode.yml
@@ -1,10 +1,9 @@
publiccodeYmlVersion: "0.2"
publiccodeYmlVersion: "0"

name: Onyxia
applicationSuite: Onyxia
url: "https://github.com/InseeFrLab/onyxia"
landingURL: "https://onyxia.sh"
creationDate: "2017-01-01"
releaseDate: "2020-03-22"
logo: https://inseefrlab.github.io/onyxia/icon.svg

@@ -24,9 +23,7 @@ usedBy:

fundedBy:
- name: Insee
url: https://lannuaire.service-public.fr/gouvernement/278bf3d4-9cf3-4a36-a317-ec56fb0abc52

roadmap: "https://docs.onyxia.sh/roadmap"
uri: https://lannuaire.service-public.fr/gouvernement/278bf3d4-9cf3-4a36-a317-ec56fb0abc52

developmentStatus: development

@@ -38,73 +35,65 @@ intendedAudience:

description:
en:
genericName: Onyxia
shortDescription: >
Web app to simplify data science environment setup on Kubernetes
Web app to simplify data science environment setup on Kubernetes

longDescription: >
Onyxia is a powerful open-source web application designed to streamline the creation of state-of-the-art data science environments. Its core mission is to make advanced cloud-based resources accessible—even to users without prior expertise in cloud technologies.
What distinguishes Onyxia is its intuitive, user-friendly interface that abstracts away the complexity of cloud infrastructure. Data scientists can effortlessly configure their preferred tools—such as Jupyter, RStudio, and VSCode—while selecting computational resources (GPU, CPU, RAM), defining environment variables, and setting up persistent storage. With just a few clicks, Onyxia automates the entire deployment process: it launches containerized environments, connects to S3-compatible storage systems, and manages secure credentials through integrations with Vault and OIDC providers.
Importantly, Onyxia is not the environment where data analysis itself takes place. Instead, it acts as the gateway—a preparatory layer—where data professionals define and launch their technical stack before entering their actual workspace (e.g., Jupyter or RStudio). This separation of concerns keeps the experience streamlined and focused.
A built-in file explorer facilitates the handling of large datasets, while Onyxia’s transparent architecture fosters learning and trust. Every action performed through the interface is logged in a terminal-style viewer, showing users the exact commands being executed (e.g., Helm, Kubernetes, Docker). This visibility not only promotes reproducibility and confidence but also serves as an educational bridge for those curious about the infrastructure powering their work.
For deployment, Onyxia is installed by system administrators on a Kubernetes cluster—either on-premises or through a cloud provider—and exposed via a web UI for the data science team. Once in place, it dramatically reduces onboarding friction, encourages best practices, and empowers teams to focus entirely on data-driven innovation.
Onyxia is a powerful open-source web application designed to streamline the creation of state-of-the-art data science environments. Its core mission is to make advanced cloud-based resources accessible—even to users without prior expertise in cloud technologies.
What distinguishes Onyxia is its intuitive, user-friendly interface that abstracts away the complexity of cloud infrastructure. Data scientists can effortlessly configure their preferred tools—such as Jupyter, RStudio, and VSCode—while selecting computational resources (GPU, CPU, RAM), defining environment variables, and setting up persistent storage. With just a few clicks, Onyxia automates the entire deployment process: it launches containerized environments, connects to S3-compatible storage systems, and manages secure credentials through integrations with Vault and OIDC providers.
Importantly, Onyxia is not the environment where data analysis itself takes place. Instead, it acts as the gateway—a preparatory layer—where data professionals define and launch their technical stack before entering their actual workspace (e.g., Jupyter or RStudio). This separation of concerns keeps the experience streamlined and focused.
A built-in file explorer facilitates the handling of large datasets, while Onyxia’s transparent architecture fosters learning and trust. Every action performed through the interface is logged in a terminal-style viewer, showing users the exact commands being executed (e.g., Helm, Kubernetes, Docker). This visibility not only promotes reproducibility and confidence but also serves as an educational bridge for those curious about the infrastructure powering their work.
For deployment, Onyxia is installed by system administrators on a Kubernetes cluster—either on-premises or through a cloud provider—and exposed via a web UI for the data science team. Once in place, it dramatically reduces onboarding friction, encourages best practices, and empowers teams to focus entirely on data-driven innovation.

documentation: "https://docs.onyxia.sh"

features:
- UI for launching docker images (Helm charts)
- Users can define the amount of RAM, CPU and GPU they would like to allocate to their containers
- Define environnement variables to be made available in the containers.
- Save and restore your service service configurations.
- View datasets
- UI for launching docker images (Helm charts)
- Users can define the amount of RAM, CPU and GPU they would like to allocate to their containers
- Define environnement variables to be made available in the containers.
- Save and restore your service service configurations.
Comment on lines +53 to +54

⚠️ Potential issue | 🟡 Minor

Fix English feature-list typos.

Line 53 has “environnement” and Line 54 repeats “service” twice; both are user-facing text defects.

✏️ Suggested text fix
-      - Define environnement variables to be made available in the containers.
-      - Save and restore your service service configurations.
+      - Define environment variables to be made available in the containers.
+      - Save and restore your service configurations.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@publiccode.yml` around lines 53 - 54, Fix the two typos in the user-facing
bullet list: change "environnement" to "environment" in the bullet "- Define
environnement variables to be made available in the containers." and remove the
duplicate word in the second bullet by changing "Save and restore your service
service configurations." to "Save and restore your service configurations."
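
A check like the following could catch the doubled-word class of typo before review. This is a minimal sketch, not part of the PR: `find_repeated_words` is a hypothetical helper name, and it assumes a grep that supports back-references and `\b` in extended regular expressions (GNU grep does). It does not detect misspellings such as "environnement"; that would need a spell checker.

```shell
#!/usr/bin/env bash
# Print (with line numbers) every input line that contains the same
# alphabetic word twice in a row, e.g. "service service".
# Reads the files given as arguments, or stdin when none are given.
# Exit status is grep's: 0 if any line matched, 1 if none did.
find_repeated_words() {
  grep -nEi '\b([[:alpha:]]+)[[:space:]]+\1\b' "$@"
}
```

Run as `find_repeated_words publiccode.yml`, it would list the offending lines and leave the rest of the file alone.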

- View datasets
screenshots:
- https://github.com/InseeFrLab/onyxia/assets/6702424/f07e91e7-d597-4eca-b9df-2ddf457afb19
- https://github.com/InseeFrLab/onyxia/assets/6702424/77eb58e1-6f5d-43c4-8447-90f5c5aad5d2
- https://github.com/InseeFrLab/onyxia/assets/6702424/ae32ccab-e295-4079-b06e-c4035e67d7a4
- https://github.com/InseeFrLab/onyxia/assets/6702424/e8ec58ad-7dc8-410d-9cd3-5f0996e8f8f8
- https://github.com/InseeFrLab/onyxia/assets/6702424/b5256014-2af3-4e39-9ecb-f3f120aa920a
- https://github-production-user-asset-6210df.s3.amazonaws.com/6702424/273940140-f07e91e7-d597-4eca-b9df-2ddf457afb19.png
- https://github-production-user-asset-6210df.s3.amazonaws.com/6702424/273939460-77eb58e1-6f5d-43c4-8447-90f5c5aad5d2.png
- https://github-production-user-asset-6210df.s3.amazonaws.com/6702424/273939517-ae32ccab-e295-4079-b06e-c4035e67d7a4.png
- https://github-production-user-asset-6210df.s3.amazonaws.com/6702424/273939583-e8ec58ad-7dc8-410d-9cd3-5f0996e8f8f8.png
- https://github-production-user-asset-6210df.s3.amazonaws.com/6702424/273939654-b5256014-2af3-4e39-9ecb-f3f120aa920a.png
Comment on lines +57 to +61

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Expected: zero flagged lines.
# Flags screenshot URLs that do not appear to be repo-relative or repo-owned.
awk '
  /^[[:space:]]{4}screenshots:[[:space:]]*$/ {in_ss=1; next}
  in_ss && /^[[:space:]]{4}(videos|features|documentation):/ {in_ss=0}
  in_ss && /^[[:space:]]{6}- /{
    url=$0
    sub(/^[[:space:]]{6}- /, "", url)
    if (url ~ /^https?:\/\// &&
        url !~ /github\.com\/InseeFrLab\/onyxia/ &&
        url !~ /raw\.githubusercontent\.com\/InseeFrLab\/onyxia/) {
      print "Flag external screenshot URL: " url
      bad=1
    }
  }
  END { exit bad }
' publiccode.yml

Repository: InseeFrLab/onyxia

Length of output: 1561


🏁 Script executed:

#!/bin/bash
# Get the file size first
wc -l publiccode.yml

# Show lines 50-95 to see both sections mentioned (57-61 and 86-90)
echo "=== Lines 50-95 of publiccode.yml ==="
sed -n '50,95p' publiccode.yml | cat -n

Repository: InseeFrLab/onyxia

Length of output: 4891


🏁 Script executed:

#!/bin/bash
# Check if there's any documentation or schema reference in the repo about publiccode.yml
find . -type f -name "*.md" -o -name "*.yml" -o -name "*.yaml" | \
  xargs grep -l "publiccode" | head -20

Repository: InseeFrLab/onyxia

Length of output: 122


🏁 Script executed:

#!/bin/bash
# Check the publiccodeyml validation workflow
cat .github/workflows/publiccodeyml-check.yml

Repository: InseeFrLab/onyxia

Length of output: 695


🌐 Web query:

publiccode.yml schema screenshot URLs repository-relative paths documentation

💡 Result:

In the publiccode.yml schema, screenshot URLs are specified under the key description/[lang]/screenshots, where [lang] is a language code like 'en' or 'it'. This is an optional array of strings representing paths to screenshot files.

These paths are repository-relative, starting from the root of the repository where the publiccode.yml file is located (e.g., "img/sshot1.jpg" or "data/screenshots/configuration.png"). They can also be absolute URLs, but the files must reside in the same repository.

Supported formats are PNG and JPG. Suggested sizes: Desktop 1280x800 @1x, Tablet 1024x768 @2x, Mobile 375x667 @2x.

Example from official schema:

    description:
      en:
        screenshots:
          - img/sshot1.jpg
          - img/sshot2.jpg

Official documentation: https://yml.publiccode.tools/schema.core.html#1-1-22-8-key-description-lang-screenshots


Move screenshot images to the repository and use repo-relative paths or raw.githubusercontent.com URLs.

Lines 57-61 and 86-90 use externally hosted github-production-user-asset URLs. The publiccode.yml schema expects repository-relative paths (e.g., img/screenshot.png) or absolute URLs where files reside in the same repository. These temporary CDN URLs are brittle for long-term maintenance—they can expire or become inaccessible. With no-network: true in CI validation, broken links won't be caught automatically.

Replace with repo-relative paths (checked into the repository) or raw.githubusercontent.com URLs under InseeFrLab/onyxia.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@publiccode.yml` around lines 57 - 61, The listed external
github-production-user-asset URLs in publiccode.yml should be replaced with
repository-hosted references: add the five screenshot files into the repo (e.g.,
img/ or docs/assets/) and update the entries in publiccode.yml to use
repo-relative paths (like img/screenshot1.png) or raw.githubusercontent.com URLs
pointing to the same repository and commit; locate the URL strings currently on
lines containing "github-production-user-asset-6210df..." in publiccode.yml and
replace each with the new repo-relative path or raw.githubusercontent.com URL
after committing the image files.
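
The rename step described above could be sketched as follows. This is an illustration only: `screenshot_repo_path` is a hypothetical helper, and `docs/assets/screenshots` is an assumed target directory that does not appear in the PR. The helper only derives the repo-relative path for an external URL; downloading and committing the image remain separate manual steps.

```shell
#!/usr/bin/env bash
# Map an externally hosted screenshot URL to a repo-relative path.
# The default directory is a hypothetical choice, not taken from the PR.
screenshot_repo_path() {
  local url=$1
  local dir=${2:-docs/assets/screenshots}
  local name=${url##*/}   # keep everything after the last slash
  name=${name%%\?*}       # drop a query string, if any
  printf '%s/%s\n' "$dir" "$name"
}
```

Each printed path would replace the corresponding entry under `screenshots:` once the file exists at that location in the repository.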

videos:
- https://youtu.be/FvpNfVrxBFM?si=goZHdAkOegWjrXBw
- https://youtu.be/FvpNfVrxBFM?si=goZHdAkOegWjrXBw

fr:
genericName: Onyxia
shortDescription: >
Application web pour simplifier la mise en place d'un environnement de data science sur Kubernetes
Application web pour simplifier la mise en place d'un environnement de data science sur Kubernetes

longDescription: >
Onyxia est une application web open-source conçue pour simplifier la mise en place d’environnements de travail avancés dédiés à la science des données. Sa vocation première est de rendre les ressources cloud accessibles, même aux utilisateurs n’ayant pas de compétences techniques poussées en infrastructure.
Ce qui distingue Onyxia, c’est son interface intuitive et ergonomique, qui masque toute la complexité des technologies cloud. En quelques clics, les utilisateurs peuvent sélectionner leurs outils de prédilection (Jupyter, RStudio, VSCode, etc.), allouer les ressources nécessaires (GPU, CPU, mémoire vive), configurer des variables d’environnement, et activer un stockage persistant. Onyxia prend ensuite en charge l’ensemble du processus : lancement des conteneurs, connexion à un stockage compatible S3, gestion sécurisée des identifiants via Vault et OIDC.
Onyxia n’est pas l’endroit où les data scientists effectuent leur analyse, mais plutôt la porte d’entrée de leur environnement technique. Il s’agit d’une étape intermédiaire dans leur workflow, leur permettant de configurer efficacement leur stack de travail avant d’accéder à leurs outils habituels.
L’explorateur de fichiers intégré facilite la manipulation de grands volumes de données, tandis que l’architecture transparente d’Onyxia en fait également un outil pédagogique. Chaque action réalisée via l’interface est affichée sous forme de commandes dans un terminal simulé. Cette transparence permet aux utilisateurs de comprendre les mécanismes sous-jacents (Kubernetes, Helm, Docker, etc.) et, s’ils le souhaitent, de répliquer les opérations eux-mêmes.
L’installation d’Onyxia est assurée par les administrateurs système sur un cluster Kubernetes — en local ou chez un fournisseur cloud — avant d’être mise à disposition de l’équipe data via une interface web. Une fois déployé, Onyxia réduit considérablement le temps d’onboarding, encourage les bonnes pratiques et permet aux équipes de se concentrer pleinement sur leurs projets de données.
Onyxia est une application web open-source conçue pour simplifier la mise en place d’environnements de travail avancés dédiés à la science des données. Sa vocation première est de rendre les ressources cloud accessibles, même aux utilisateurs n’ayant pas de compétences techniques poussées en infrastructure.
Ce qui distingue Onyxia, c’est son interface intuitive et ergonomique, qui masque toute la complexité des technologies cloud. En quelques clics, les utilisateurs peuvent sélectionner leurs outils de prédilection (Jupyter, RStudio, VSCode, etc.), allouer les ressources nécessaires (GPU, CPU, mémoire vive), configurer des variables d’environnement, et activer un stockage persistant. Onyxia prend ensuite en charge l’ensemble du processus : lancement des conteneurs, connexion à un stockage compatible S3, gestion sécurisée des identifiants via Vault et OIDC.
Onyxia n’est pas l’endroit où les data scientists effectuent leur analyse, mais plutôt la porte d’entrée de leur environnement technique. Il s’agit d’une étape intermédiaire dans leur workflow, leur permettant de configurer efficacement leur stack de travail avant d’accéder à leurs outils habituels.
L’explorateur de fichiers intégré facilite la manipulation de grands volumes de données, tandis que l’architecture transparente d’Onyxia en fait également un outil pédagogique. Chaque action réalisée via l’interface est affichée sous forme de commandes dans un terminal simulé. Cette transparence permet aux utilisateurs de comprendre les mécanismes sous-jacents (Kubernetes, Helm, Docker, etc.) et, s’ils le souhaitent, de répliquer les opérations eux-mêmes.
L’installation d’Onyxia est assurée par les administrateurs système sur un cluster Kubernetes — en local ou chez un fournisseur cloud — avant d’être mise à disposition de l’équipe data via une interface web. Une fois déployé, Onyxia réduit considérablement le temps d’onboarding, encourage les bonnes pratiques et permet aux équipes de se concentrer pleinement sur leurs projets de données.

documentation: "https://docs.onyxia.sh"

features:
- UI de lancement d'images docker (Helm charts)
- Les utilisateurs peuvent définir la quantité de RAM, de CPU et de GPU qu’ils souhaitent allouer à leurs conteneurs.
- Définir des variables d’environnement à rendre disponibles dans les conteneurs.
- Sauvegardez et restaurez vos configurations de services.
- Visualiser les jeux de données
- UI de lancement d'images docker (Helm charts)
- Les utilisateurs peuvent définir la quantité de RAM, de CPU et de GPU qu’ils souhaitent allouer à leurs conteneurs.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Expected: no output. Any output indicates a feature item > 100 chars.
awk '
  /^[[:space:]]{2}(en|fr):[[:space:]]*$/ {lang=$1; sub(":","",lang)}
  /^[[:space:]]{4}features:[[:space:]]*$/ {in_features=1; next}
  in_features && /^[[:space:]]{4}(screenshots|videos|documentation):/ {in_features=0}
  in_features && /^[[:space:]]{6}- /{
    text=$0
    sub(/^[[:space:]]{6}- /, "", text)
    if (length(text) > 100) {
      printf "Feature too long [%s]: %d chars | %s\n", lang, length(text), text
      bad=1
    }
  }
  END { exit bad }
' publiccode.yml

Repository: InseeFrLab/onyxia

Length of output: 212


Shorten the French feature item on line 80 to comply with publiccode standard (max 100 characters per feature).

The feature description "Les utilisateurs peuvent définir la quantité de RAM, de CPU et de GPU qu'ils souhaitent allouer à leurs conteneurs." is 120 characters, exceeding the 100-character limit. This violates the publiccode.yml schema constraint for description/[lang]/features items per https://docs.italia.it/italia/developers-italia/publiccodeyml-en/en/master/schema.core.html. Reduce the text to pass validation.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@publiccode.yml` at line 80, The feature description "Les utilisateurs peuvent
définir la quantité de RAM, de CPU et de GPU qu’ils souhaitent allouer à leurs
conteneurs." exceeds 100 characters; replace that string in publiccode.yml with
a concise French variant under 100 characters (e.g., "Les utilisateurs
définissent la quantité de RAM, CPU et GPU allouée aux conteneurs") preserving
the original meaning and keeping total characters ≤100 so the
`description/[lang]/features` item passes publiccode schema validation.
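
One caveat when measuring this limit with awk: depending on the implementation and locale, `length()` may count bytes rather than characters, which inflates the result for accented French text. The flagged line as it appears in the diff (with the typographic apostrophe) comes to 115 characters but 120 UTF-8 bytes, so it exceeds the limit either way. The sketch below counts code points independently of the locale. `count_chars` and `feature_too_long` are hypothetical helper names, and the default of 100 is the limit quoted in the comment above.

```shell
#!/usr/bin/env bash
# Count Unicode code points in a UTF-8 string, independent of the locale.
# UTF-8 continuation bytes (0x80-0xBF) are deleted, which leaves exactly
# one byte per code point for wc -c to count.
count_chars() {
  local n
  n=$(printf '%s' "$1" | LC_ALL=C tr -d '\200-\277' | LC_ALL=C wc -c)
  echo $(( n ))   # arithmetic expansion strips any padding from wc
}

# Succeed (exit 0) when the text is longer than the limit (default 100).
feature_too_long() {
  local text=$1
  local limit=${2:-100}
  [ "$(count_chars "$text")" -gt "$limit" ]
}
```

With these helpers, the shortened variant suggested above passes the check, while the original French line still fails it.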

- Définir des variables d’environnement à rendre disponibles dans les conteneurs.
- Sauvegardez et restaurez vos configurations de services.
- Visualiser les jeux de données

screenshots:
- https://github.com/InseeFrLab/onyxia/assets/6702424/f07e91e7-d597-4eca-b9df-2ddf457afb19
- https://github.com/InseeFrLab/onyxia/assets/6702424/77eb58e1-6f5d-43c4-8447-90f5c5aad5d2
- https://github.com/InseeFrLab/onyxia/assets/6702424/ae32ccab-e295-4079-b06e-c4035e67d7a4
- https://github.com/InseeFrLab/onyxia/assets/6702424/e8ec58ad-7dc8-410d-9cd3-5f0996e8f8f8
- https://github.com/InseeFrLab/onyxia/assets/6702424/b5256014-2af3-4e39-9ecb-f3f120aa920a
- https://github-production-user-asset-6210df.s3.amazonaws.com/6702424/273940140-f07e91e7-d597-4eca-b9df-2ddf457afb19.png
- https://github-production-user-asset-6210df.s3.amazonaws.com/6702424/273939460-77eb58e1-6f5d-43c4-8447-90f5c5aad5d2.png
- https://github-production-user-asset-6210df.s3.amazonaws.com/6702424/273939517-ae32ccab-e295-4079-b06e-c4035e67d7a4.png
- https://github-production-user-asset-6210df.s3.amazonaws.com/6702424/273939583-e8ec58ad-7dc8-410d-9cd3-5f0996e8f8f8.png
- https://github-production-user-asset-6210df.s3.amazonaws.com/6702424/273939654-b5256014-2af3-4e39-9ecb-f3f120aa920a.png
videos:
- https://youtu.be/FvpNfVrxBFM?si=goZHdAkOegWjrXBw

- https://youtu.be/FvpNfVrxBFM?si=goZHdAkOegWjrXBw

legal:
license: MIT
mainCopyrightOwner: INSEE
repoOwner: INSEE

authors:
distinctAuthorsCount: 0
distinctOrganizationsCount: 0

maintenance:
type: "internal"
@@ -117,15 +106,6 @@ maintenance:
email: "joseph.garrone@protonmail.com"
affiliation: INSEE

metadataFiles:
readme: README.md
license: LICENSE
contributing: CONTRIBUTING.md
changelog: null
codeOfConduct: CODE_OF_CONDUCT.md
governance: GOVERNANCE.md
funding: null

localisation:
localisationReady: true
availableLanguages:
@@ -141,5 +121,5 @@ localisation:

dependsOn:
open:
- name: Kubernetes
- name: Helm
- name: Kubernetes
- name: Helm