Add whatprovides resolution to verify-install plugin#4804
Add whatprovides resolution to verify-install plugin#4804vaibhavdaren wants to merge 2 commits intomainfrom
whatprovides resolution to verify-install plugin#4804Conversation
There was a problem hiding this comment.
Code Review
This pull request implements the resolution of virtual RPM provides to actual package names using 'rpm -q --whatprovides' during artifact preparation and installation verification. In 'verify_installation.py', replace the dictionary comprehension for 'verify_map' with a loop using 'setdefault' and 'tmt.utils.uniq' to prevent overwriting entries when multiple virtual provides resolve to the same package name.
725fbcf to
1d6cf0d
Compare
9914613 to
d189b00
Compare
d1de079 to
e08283d
Compare
whatprovides resolution to verify-install plugin
|
The test failures are unrelated to the change. Flaky Test : /tests/execute/restart/with-reboot 🔎 /packit retest-failed |
| The script must emit one line per capability in the same order as the input. | ||
| Each line is either a valid NEVRA string for a found capability, or an error | ||
| message (e.g. ``no package provides <cap>``) for one that is not provided by | ||
| any installed package. |
There was a problem hiding this comment.
The capabilities name is neither generic enough, nor is it correct in the dnf context. For dnf context, provides could be a better fit for what this is.
There was a problem hiding this comment.
changed to provides.
context: for choosing capability. We already have a provider.
| :param capabilities: Capabilities to resolve — package names, file paths, or | ||
| virtual provides (e.g. ``make``, ``/usr/bin/make``, ``pkgconfig(openssl)``). |
There was a problem hiding this comment.
Why the em-dash. Personally, if one uses em-dash, it feels like they did not consider the wording at all.
|
|
||
| :param capabilities: Capabilities to resolve — package names, file paths, or | ||
| virtual provides (e.g. ``make``, ``/usr/bin/make``, ``pkgconfig(openssl)``). | ||
| :returns: A shell script whose stdout contains one NEVRA line per capability. |
There was a problem hiding this comment.
Again, too technical. The whose stdout is not relevant and too literal. See the other docstrings
| :param capabilities: Capabilities to resolve — package names, file paths, or | ||
| virtual provides (e.g. ``make``, ``/usr/bin/make``, ``pkgconfig(openssl)``). | ||
| :returns: A shell script whose stdout contains one NEVRA line per capability. | ||
| :raises NotImplementedError: If the package manager does not support this query. |
There was a problem hiding this comment.
See the raises just below here. Be consistent.
| result[package] = parts[1] if len(parts) == 2 else SpecialPackageOrigin.UNKNOWN | ||
| return result | ||
|
|
||
| def resolve_capabilities(self, capabilities: Iterable[str]) -> 'dict[str, Optional[Version]]': |
There was a problem hiding this comment.
Incorrect quotations of the return type
|
|
||
| def resolve_capabilities(self, capabilities: Iterable[str]) -> 'dict[str, Optional[Version]]': | ||
| """ | ||
| Map each capability to the :py:class:`Version` of the installed package providing it. |
There was a problem hiding this comment.
| Map each capability to the :py:class:`Version` of the installed package providing it. | |
| Map each capability to the :py:class:`Version` of the packages providing it. |
- The package is not installed when you need to call this
- Multiple packages can provide the same provides, particularly if the user did not add relevant constraints
| try: | ||
| output = self.guest.execute(script) | ||
| stdout = output.stdout | ||
| except tmt.utils.RunError as exc: | ||
| stdout = exc.stdout |
There was a problem hiding this comment.
This is just wrong isn't it? Can't figure out what you are trying to catch here.
| except tmt.utils.RunError as exc: | ||
| stdout = exc.stdout | ||
|
|
||
| result: dict[str, Optional[Version]] = dict.fromkeys(caps, None) |
There was a problem hiding this comment.
Move this at the top and you can avoid the intermediate caps. I commented about this very recently didn't I?
| with contextlib.suppress(ValueError): | ||
| result[cap] = RpmVersion.from_nevra(line.strip()) |
There was a problem hiding this comment.
No contextlib.suppress. There is a special case being handled, then document it properly
There was a problem hiding this comment.
Sorry - just seeing this now, but prolly made a duplicate comment: #4804 (comment)
| def resolve_capabilities(self, *capabilities: str) -> ShellScript: | ||
| # Reuse the existing rpm --whatprovides script; output is one NEVRA per line, | ||
| return self._construct_presence_script(*[Package(c) for c in capabilities]) |
There was a problem hiding this comment.
Not a good approach, see the other comments. The repoquery might give you more flexibility on the format and allow you to get more information to process further like the repoid.
There was a problem hiding this comment.
We now , run repoquery for each of the packages in a script. Which is actually easier to parse. Addressed in 0fb9b50.
There was a problem hiding this comment.
Try running dnf repoquery --whatprovides=make,cmake and see what you get
There was a problem hiding this comment.
@LecrisUT , When multiple arguments are passed to --whatprovides, unresolved items are silently ignored and there isn’t a flag to make the command fail or report them explicitly. It also returns a merged result set without preserving input-to-output mapping.
➜ tmt git:(vaibhav-discover-only-providers-whatprovides) ✗ dnf repoquery --whatprovides=make,cmake,pikachu
Updating and loading repositories:
Repositories loaded.
cmake-0:3.31.11-1.fc42.aarch64
cmake-0:3.31.6-2.fc42.aarch64
make-1:4.4.1-10.fc42.aarch64Is there something clever we could do at this point?
There was a problem hiding this comment.
Check the queryformat and what it can do. fedrq does some complex handling also if you need more reference
There was a problem hiding this comment.
@LecrisUT 🥇 ,
I looked into queryformat and fedrq, but doing a bulk dnf repoquery creates a couple of blockers since we're just executing a shell script on the guest.
My findings,
- Lost order: DNF merges and sorts the final list, which completely destroys the 1:1 input-to-output mapping we need to track things.
- Silently missing packages: If DNF can't resolve an item, it just drops it from the output without throwing an error. We're left with no way to know which specific capability failed.
There was a problem hiding this comment.
- Lost order: DNF merges and sorts the final list, which completely destroys the 1:1 input-to-output mapping we need to track things.
Note that there is never a 1:1 mapping, e.g. look at the output of --whatprovides cmake there having multiple items that provide it, and that is expected and we have to work around them. What should be done, I don't know.
A concrete example case are the copr_repo provider.
- Silently missing packages: If DNF can't resolve an item, it just drops it from the output without throwing an error. We're left with no way to know which specific capability failed.
Yes certainly, that's why I mentioned looking in the queryformat tags. fedrq does provided the structured information when you do the multi-query, but I don't know how it does it if it's from the tags that repoquery provides or not. I will try to look on my end as well to see what I can gather
There was a problem hiding this comment.
Provides should return a set. did some refactoring in ad05ffb.
|
|
||
| result: dict[str, Optional[Version]] = dict.fromkeys(caps, None) | ||
| for cap, line in zip(caps, (stdout or '').splitlines()): | ||
| with contextlib.suppress(ValueError): |
There was a problem hiding this comment.
This supresses parsing errors (and others too) - ambigous. Suggest adding a debug() log here:
for cap, line in zip(caps, (stdout or '').splitlines()):
try:
result[cap] = RpmVersion.from_nevra(line.strip())
except ValueError:
self.debug(f"Could not resolve capability '{cap}': {line.strip()}")There was a problem hiding this comment.
thanks for the suggestion, addressed in f68411a.
0fb9b50 to
ad2eed7
Compare
ad2eed7 to
f68411a
Compare
f68411a to
bcab309
Compare
| """ | ||
| raise NotImplementedError | ||
|
|
||
| def resolve_provides(self, *provides: str) -> ShellScript: |
There was a problem hiding this comment.
Should be decorated as abstractmethod.
| result[package] = parts[1] if len(parts) == 2 else SpecialPackageOrigin.UNKNOWN | ||
| return result | ||
|
|
||
| def resolve_provides(self, provides: Iterable[str]) -> dict[str, Optional[Version]]: |
There was a problem hiding this comment.
What is the reason for different type of provides here and in the engine?
|
|
||
| provides = list(provides) | ||
| if not provides: | ||
| return result |
There was a problem hiding this comment.
Empty mapping is not mentioned as a possible return value. You could use defaultdict instead of dict so that all keys would be mapped to None by default.
| dependency_to_version = guest.package_manager.resolve_provides(self.data.verify.keys()) | ||
| except NotImplementedError: | ||
| dependency_to_version = {} | ||
| verify: dict[str, list[str]] = {} |
There was a problem hiding this comment.
Related to the note about defaultdict above, dependency_to_version as an empty mapping does not match what resolve_provides() says it would return. verify can also be defaultdict, it would be easier to understand than setdefault() below.
|
/packit retest-failed |
| # This guarantees O(1) alignment for the zip() pairing in the caller, | ||
| # while executing in milliseconds compared to looping 'dnf repoquery'. | ||
|
|
||
| qf = r"%{name}-%{epoch}:%{version}-%{release}.%{arch}\n" |
There was a problem hiding this comment.
%{epoch} can potentially output the literal string (none) under rpm -q for packages without an explicit epoch - no? I cannot remember correctly - in some provider while testing, if an epoch doesnt exist, none is returned @LecrisUT correct me please if Im wrong.
There was a problem hiding this comment.
@AthreyVinay thankyou.
%{epoch} outputs the literal string (none) for packages without
an explicit epoch (the majority of packages). With the old queryformat,
from_nevra silently parsed (none) as part of the version field — so bash would
appear to have version (none):5.2.37 instead of 5.2.37
qf -> spills out (none). Fixed in 7e978fc.
There was a problem hiding this comment.
%{epoch} is 0 if unset, confirmed it:
$ dnf repoquery --queryformat="%{epoch}\n" tmt
Updating and loading repositories:
Repositories loaded.
0There was a problem hiding this comment.
(dev) ➜ tmt git:(vaibhav-discover-only-providers-whatprovides) rpm -q tmt --queryformat "%{epoch}"
(none)%
(dev) ➜ tmt git:(vaibhav-discover-only-providers-whatprovides) dnf repoquery --queryformat="%{epoch}\n" tmt
Updating and loading repositories:
Repositories loaded.
0
(dev) ➜ tmt git:(vaibhav-discover-only-providers-whatprovides)There was a problem hiding this comment.
rpm -q vs dnf repoquery are completely different things. Which one should we use 🤷. rpm -q has more tags, but they provide less info
There was a problem hiding this comment.
%{epoch} is 0 if unset, confirmed it:
Thanks @LecrisUT , however, rpm -q is used here which outputs (none)
There was a problem hiding this comment.
Actually rpm -q is the wrong tool to use:
$ rpm -q --whatprovides fish --queryformat="%{nevra}\n"
no package provides fish
$ dnf repoquery --whatprovides=fish --queryformat="%{full_nevra}\n"
Updating and loading repositories:
Repositories loaded.
fish-0:4.0.2-5.fc43.x86_64
fish-0:4.2.0-2.fc43.x86_64It only queries what is already installed, but at this point we want to query what is available
There was a problem hiding this comment.
There are two scenarios to consider:
1. Before the RPM is installed
rpm -qdoes not work at this stage.- Only
dnf repoquerycan be used. - Note: For
repoqueryto work, repositories such astmt-artifact-shared(and any other required repos) must already be configured. Which is the case and hence we can go with repoquery.
2. After the RPM is installed
- Both
rpm -qanddnf repoquerywork as expected.
There was a problem hiding this comment.
- Note: For
repoqueryto work, repositories such astmt-artifact-shared(and any other required repos) must already be configured. Which is the case and hence we can go with repoquery.
Not really, we can point it to a repo with no /etc/yum/repos.d. The design would have to be changed to link this to repository to know which one to use if this is needed. I don't think it would be needed for now.
- After the RPM is installed
But that is too late:
- If the package was not installed, then this is in a weird state
- The original issue is to resolve the
requires/recommendsthat could use non-package names. This can either be used for constructing the verify step in the artifacts step or when handling the special case of downloaded rpm artifacts and re-installing pre-installed packages - This could be used to allow users to use the arbitrary package names in a custom
prepare/verifyphase, but should we be giving them that much flexibility in the first place?
| # path, we query the file list directly using `rpm -qf` to identify the package. | ||
| return ShellScript(f""" | ||
| for _provide in {provides_str}; do | ||
| _result=$(rpm -q --whatprovides "$_provide" --queryformat "{qf}" 2>/dev/null | head -1) |
There was a problem hiding this comment.
The head -1 is incorrect here. It is perfectly fine for multiple packages to provide the same provide.
| # This guarantees O(1) alignment for the zip() pairing in the caller, | ||
| # while executing in milliseconds compared to looping 'dnf repoquery'. | ||
|
|
||
| qf = r"%{name}-%|epoch?{%{epoch}}:{0}|:%{version}-%{release}.%{arch}\n" |
There was a problem hiding this comment.
You are replicating %{full_nevra} here
| # path, we query the file list directly using `rpm -qf` to identify the package. | ||
| return ShellScript(f""" | ||
| for _provide in {provides_str}; do | ||
| _result=$(rpm -q --whatprovides "$_provide" --queryformat "{qf}" 2>/dev/null | head -1) |
There was a problem hiding this comment.
Do you have a reason for hiding stderr?
There was a problem hiding this comment.
When I use repoquery
➜ ~ dnf repoquery --whatprovides=cmake --queryformat="%{full_nevra}\n"
Updating and loading repositories:
Repositories loaded.
cmake-0:3.31.11-1.fc42.aarch64
cmake-0:3.31.6-2.fc42.aarch64
➜ ~ dnf repoquery --whatprovides=cmake --queryformat="%{full_nevra}\n" 2>/dev/null
cmake-0:3.31.11-1.fc42.aarch64
cmake-0:3.31.6-2.fc42.aarch64
There was a problem hiding this comment.
But the stdout and stderr are already separated when you run the script with tmt, see CommandOutput
Allow resolving packages from file paths (e.g. /usr/bin/make) by leveraging DNF provides/whatprovides instead of requiring explicit package names.
7e978fc to
bf2d6e5
Compare
bf2d6e5 to
ad05ffb
Compare
Use whatprovides lookup for file-based dependencies
Allow resolving packages from file paths (e.g. /usr/bin/make)
by leveraging DNF provides/whatprovides instead of requiring
explicit package names.