Currently, predefined suggestion lists for License and Programming Languages are shown to the user for selection. Users cannot choose values outside these lists. This creates two problems:
- License value mismatch:
  Extracted metadata may provide an SPDX license URL (e.g., https://spdx.org/licenses/MIT.html), which we display on the extraction page as a tag. In the license suggestion list, however, the available options are SPDX names/IDs, not URLs. As a result, users can add a license only by name/ID, not by URL, and cannot select a URL value because it does not appear in the suggestion list.
- Programming language coverage gap:
  Extracted metadata can include languages that are not in our suggestion list (e.g., HTML). If the user removes such an extracted value, they cannot re-add it because it is missing from the suggestions. We need a reliable way to cover all programming languages used on GitHub, GitLab, and other sources.
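
One possible direction for both problems, as a rough sketch and not a description of the current implementation: map SPDX license URLs to their IDs before matching them against the suggestion list, and merge extracted values into the options so a removed value can be selected again. The function names `normalize_license` and `merge_suggestions` are hypothetical, and the URL pattern assumes SPDX license pages of the form `https://spdx.org/licenses/<ID>.html`.

```python
import re

# Assumption: SPDX license URLs look like https://spdx.org/licenses/<ID>.html
# (the .html/.json suffix and a trailing slash are treated as optional).
SPDX_URL = re.compile(
    r"^https?://spdx\.org/licenses/(?P<id>[^/]+?)(?:\.html|\.json)?/?$",
    re.IGNORECASE,
)


def normalize_license(value: str) -> str:
    """Return the SPDX ID for an SPDX license URL; other values pass through."""
    cleaned = value.strip()
    match = SPDX_URL.match(cleaned)
    return match.group("id") if match else cleaned


def merge_suggestions(predefined: list[str], extracted: list[str]) -> list[str]:
    """Keep predefined options first, then append extracted values that are
    not already present (compared case-insensitively)."""
    merged = list(predefined)
    seen = {option.casefold() for option in predefined}
    for value in extracted:
        if value.casefold() not in seen:
            merged.append(value)
            seen.add(value.casefold())
    return merged


if __name__ == "__main__":
    print(normalize_license("https://spdx.org/licenses/MIT.html"))
    print(merge_suggestions(["Python", "JavaScript"], ["HTML", "python"]))
```

With this approach, a license extracted as a URL would match the existing name/ID options, and a language such as HTML would remain selectable after the user removes its tag. It does not by itself solve full language coverage; that still needs an authoritative language source.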