Skip to content

Add whatprovides resolution to verify-install plugin#4804

Open
vaibhavdaren wants to merge 2 commits intomainfrom
vaibhav-discover-only-providers-whatprovides
Open

Add whatprovides resolution to verify-install plugin#4804
vaibhavdaren wants to merge 2 commits intomainfrom
vaibhav-discover-only-providers-whatprovides

Conversation

@vaibhavdaren
Copy link
Copy Markdown
Contributor

@vaibhavdaren vaibhavdaren commented Apr 17, 2026

Use whatprovides lookup for file-based dependencies

Allow resolving packages from file paths (e.g. /usr/bin/make)
by leveraging DNF provides/whatprovides instead of requiring
explicit package names.

@vaibhavdaren vaibhavdaren self-assigned this Apr 17, 2026
@vaibhavdaren vaibhavdaren moved this to implement in planning Apr 17, 2026
@vaibhavdaren vaibhavdaren added ci | full test Pull request is ready for the full test execution plugin | artifact Related to the `prepare/artifact` plugin. labels Apr 17, 2026
Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request implements the resolution of virtual RPM provides to actual package names using 'rpm -q --whatprovides' during artifact preparation and installation verification. In 'verify_installation.py', replace the dictionary comprehension for 'verify_map' with a loop using 'setdefault' and 'tmt.utils.uniq' to prevent overwriting entries when multiple virtual provides resolve to the same package name.

Comment thread tmt/steps/prepare/verify_installation.py Outdated
@vaibhavdaren vaibhavdaren force-pushed the vaibhav-discover-only-providers-whatprovides branch from 725fbcf to 1d6cf0d Compare April 17, 2026 16:05
Base automatically changed from vaibhav-discover-only-providers to main April 17, 2026 16:32
@vaibhavdaren vaibhavdaren force-pushed the vaibhav-discover-only-providers-whatprovides branch 5 times, most recently from 9914613 to d189b00 Compare April 20, 2026 10:32
@psss psss added this to the 1.72 milestone Apr 20, 2026
@vaibhavdaren vaibhavdaren force-pushed the vaibhav-discover-only-providers-whatprovides branch 5 times, most recently from d1de079 to e08283d Compare April 21, 2026 00:32
@vaibhavdaren vaibhavdaren moved this from implement to review in planning Apr 21, 2026
@vaibhavdaren vaibhavdaren marked this pull request as ready for review April 21, 2026 00:37
@psss psss changed the title add whatprovides resolution to verify-install plugin Add whatprovides resolution to verify-install plugin Apr 21, 2026
@vaibhavdaren
Copy link
Copy Markdown
Contributor Author

The test failures are unrelated to the change. Flaky Test : /tests/execute/restart/with-reboot 🔎

/packit retest-failed

Comment thread tmt/package_managers/__init__.py Outdated
The script must emit one line per capability in the same order as the input.
Each line is either a valid NEVRA string for a found capability, or an error
message (e.g. ``no package provides <cap>``) for one that is not provided by
any installed package.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The capabilities name is neither generic enough, nor is it correct in the dnf context. For dnf context, provides could be a better fit for what this is.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

changed to provides.
context: for choosing capability. We already have a provider.

Comment thread tmt/package_managers/__init__.py Outdated
Comment on lines +412 to +413
:param capabilities: Capabilities to resolve — package names, file paths, or
virtual provides (e.g. ``make``, ``/usr/bin/make``, ``pkgconfig(openssl)``).
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why the em-dash. Personally, if one uses em-dash, it feels like they did not consider the wording at all.

Comment thread tmt/package_managers/__init__.py Outdated

:param capabilities: Capabilities to resolve — package names, file paths, or
virtual provides (e.g. ``make``, ``/usr/bin/make``, ``pkgconfig(openssl)``).
:returns: A shell script whose stdout contains one NEVRA line per capability.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, too technical. The whose stdout is not relevant and too literal. See the other docstrings

:param capabilities: Capabilities to resolve — package names, file paths, or
virtual provides (e.g. ``make``, ``/usr/bin/make``, ``pkgconfig(openssl)``).
:returns: A shell script whose stdout contains one NEVRA line per capability.
:raises NotImplementedError: If the package manager does not support this query.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See the raises just below here. Be consistent.

Comment thread tmt/package_managers/__init__.py Outdated
result[package] = parts[1] if len(parts) == 2 else SpecialPackageOrigin.UNKNOWN
return result

def resolve_capabilities(self, capabilities: Iterable[str]) -> 'dict[str, Optional[Version]]':
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Incorrect quotations of the return type

Comment thread tmt/package_managers/__init__.py Outdated

def resolve_capabilities(self, capabilities: Iterable[str]) -> 'dict[str, Optional[Version]]':
"""
Map each capability to the :py:class:`Version` of the installed package providing it.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Map each capability to the :py:class:`Version` of the installed package providing it.
Map each capability to the :py:class:`Version` of the packages providing it.
  • The package is not installed when you need to call this
  • Multiple packages can provide the same provides, particularly if the user did not add relevant constraints

Comment thread tmt/package_managers/__init__.py Outdated
Comment on lines +571 to +575
try:
output = self.guest.execute(script)
stdout = output.stdout
except tmt.utils.RunError as exc:
stdout = exc.stdout
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is just wrong isn't it? Can't figure out what you are trying to catch here.

Comment thread tmt/package_managers/__init__.py Outdated
except tmt.utils.RunError as exc:
stdout = exc.stdout

result: dict[str, Optional[Version]] = dict.fromkeys(caps, None)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Move this at the top and you can avoid the intermediate caps. I commented about this very recently didn't I?

Comment thread tmt/package_managers/__init__.py Outdated
Comment on lines +579 to +580
with contextlib.suppress(ValueError):
result[cap] = RpmVersion.from_nevra(line.strip())
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No contextlib.suppress. There is a special case being handled, then document it properly

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry - just seeing this now, but prolly made a duplicate comment: #4804 (comment)

Comment thread tmt/package_managers/dnf.py Outdated
Comment on lines +212 to +214
def resolve_capabilities(self, *capabilities: str) -> ShellScript:
# Reuse the existing rpm --whatprovides script; output is one NEVRA per line,
return self._construct_presence_script(*[Package(c) for c in capabilities])
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not a good approach, see the other comments. The repoquery might give you more flexibility on the format and allow you to get more information to process further like the repoid.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We now , run repoquery for each of the packages in a script. Which is actually easier to parse. Addressed in 0fb9b50.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Try running dnf repoquery --whatprovides=make,cmake and see what you get

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@LecrisUT , When multiple arguments are passed to --whatprovides, unresolved items are silently ignored and there isn’t a flag to make the command fail or report them explicitly. It also returns a merged result set without preserving input-to-output mapping.

➜  tmt git:(vaibhav-discover-only-providers-whatprovides) ✗ dnf repoquery --whatprovides=make,cmake,pikachu
Updating and loading repositories:
Repositories loaded.
cmake-0:3.31.11-1.fc42.aarch64
cmake-0:3.31.6-2.fc42.aarch64
make-1:4.4.1-10.fc42.aarch64

Is there something clever we could do at this point?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Check the queryformat and what it can do. fedrq does some complex handling also if you need more reference

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@LecrisUT 🥇 ,
I looked into queryformat and fedrq, but doing a bulk dnf repoquery creates a couple of blockers since we're just executing a shell script on the guest.
My findings,

  • Lost order: DNF merges and sorts the final list, which completely destroys the 1:1 input-to-output mapping we need to track things.
  • Silently missing packages: If DNF can't resolve an item, it just drops it from the output without throwing an error. We're left with no way to know which specific capability failed.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • Lost order: DNF merges and sorts the final list, which completely destroys the 1:1 input-to-output mapping we need to track things.

Note that there is never a 1:1 mapping, e.g. look at the output of --whatprovides cmake there having multiple items that provide it, and that is expected and we have to work around them. What should be done, I don't know.

A concrete example case are the copr_repo provider.

  • Silently missing packages: If DNF can't resolve an item, it just drops it from the output without throwing an error. We're left with no way to know which specific capability failed.

Yes certainly, that's why I mentioned looking in the queryformat tags. fedrq does provided the structured information when you do the multi-query, but I don't know how it does it if it's from the tags that repoquery provides or not. I will try to look on my end as well to see what I can gather

Copy link
Copy Markdown
Contributor Author

@vaibhavdaren vaibhavdaren Apr 22, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Provides should return a set. did some refactoring in ad05ffb.

@LecrisUT LecrisUT self-assigned this Apr 21, 2026
Comment thread tmt/package_managers/__init__.py Outdated

result: dict[str, Optional[Version]] = dict.fromkeys(caps, None)
for cap, line in zip(caps, (stdout or '').splitlines()):
with contextlib.suppress(ValueError):
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This supresses parsing errors (and others too) - ambigous. Suggest adding a debug() log here:

for cap, line in zip(caps, (stdout or '').splitlines()):
    try:
        result[cap] = RpmVersion.from_nevra(line.strip())
    except ValueError:
        self.debug(f"Could not resolve capability '{cap}': {line.strip()}")

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks for the suggestion, addressed in f68411a.

@vaibhavdaren vaibhavdaren requested a review from psss April 21, 2026 17:09
@vaibhavdaren vaibhavdaren force-pushed the vaibhav-discover-only-providers-whatprovides branch from 0fb9b50 to ad2eed7 Compare April 21, 2026 17:28
@vaibhavdaren vaibhavdaren requested a review from LecrisUT April 21, 2026 17:32
@vaibhavdaren vaibhavdaren force-pushed the vaibhav-discover-only-providers-whatprovides branch from ad2eed7 to f68411a Compare April 21, 2026 20:23
@vaibhavdaren vaibhavdaren added the status | discuss Needs more discussion before closing label Apr 21, 2026
@vaibhavdaren vaibhavdaren force-pushed the vaibhav-discover-only-providers-whatprovides branch from f68411a to bcab309 Compare April 21, 2026 23:16
Comment thread tmt/package_managers/__init__.py Outdated
"""
raise NotImplementedError

def resolve_provides(self, *provides: str) -> ShellScript:
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be decorated as abstractmethod.

Comment thread tmt/package_managers/__init__.py Outdated
result[package] = parts[1] if len(parts) == 2 else SpecialPackageOrigin.UNKNOWN
return result

def resolve_provides(self, provides: Iterable[str]) -> dict[str, Optional[Version]]:
Copy link
Copy Markdown
Contributor

@happz happz Apr 22, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the reason for different type of provides here and in the engine?


provides = list(provides)
if not provides:
return result
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Empty mapping is not mentioned as a possible return value. You could use defaultdict instead of dict so that all keys would be mapped to None by default.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed in cdc2b24.

dependency_to_version = guest.package_manager.resolve_provides(self.data.verify.keys())
except NotImplementedError:
dependency_to_version = {}
verify: dict[str, list[str]] = {}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Related to the note about defaultdict above, dependency_to_version as an empty mapping does not match what resolve_provides() says it would return. verify can also be defaultdict, it would be easier to understand than setdefault() below.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed in cdc2b24.

@vaibhavdaren
Copy link
Copy Markdown
Contributor Author

/packit retest-failed

Comment thread tmt/package_managers/dnf.py Outdated
# This guarantees O(1) alignment for the zip() pairing in the caller,
# while executing in milliseconds compared to looping 'dnf repoquery'.

qf = r"%{name}-%{epoch}:%{version}-%{release}.%{arch}\n"
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

%{epoch} can potentially output the literal string (none) under rpm -q for packages without an explicit epoch - no? I cannot remember correctly - in some provider while testing, if an epoch doesnt exist, none is returned @LecrisUT correct me please if Im wrong.

Copy link
Copy Markdown
Contributor Author

@vaibhavdaren vaibhavdaren Apr 22, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@AthreyVinay thankyou.

%{epoch} outputs the literal string (none) for packages without
an explicit epoch (the majority of packages). With the old queryformat,
from_nevra silently parsed (none) as part of the version field — so bash would
appear to have version (none):5.2.37 instead of 5.2.37

qf -> spills out (none). Fixed in 7e978fc.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

%{epoch} is 0 if unset, confirmed it:

$ dnf repoquery --queryformat="%{epoch}\n" tmt
Updating and loading repositories:
Repositories loaded.
0

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(dev) ➜  tmt git:(vaibhav-discover-only-providers-whatprovides) rpm -q tmt --queryformat "%{epoch}"
(none)%
(dev) ➜  tmt git:(vaibhav-discover-only-providers-whatprovides) dnf repoquery --queryformat="%{epoch}\n" tmt
Updating and loading repositories:
Repositories loaded.
0
(dev) ➜  tmt git:(vaibhav-discover-only-providers-whatprovides)

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rpm -q vs dnf repoquery are completely different things. Which one should we use 🤷. rpm -q has more tags, but they provide less info

Copy link
Copy Markdown
Member

@AthreyVinay AthreyVinay Apr 22, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

%{epoch} is 0 if unset, confirmed it:

Thanks @LecrisUT , however, rpm -q is used here which outputs (none)

Copy link
Copy Markdown
Member

@LecrisUT LecrisUT Apr 22, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually rpm -q is the wrong tool to use:

$ rpm -q --whatprovides fish --queryformat="%{nevra}\n" 
no package provides fish
$ dnf repoquery --whatprovides=fish --queryformat="%{full_nevra}\n" 
Updating and loading repositories:
Repositories loaded.
fish-0:4.0.2-5.fc43.x86_64
fish-0:4.2.0-2.fc43.x86_64

It only queries what is already installed, but at this point we want to query what is available

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No arguing that 👍

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are two scenarios to consider:

1. Before the RPM is installed

  • rpm -q does not work at this stage.
  • Only dnf repoquery can be used.
  • Note: For repoquery to work, repositories such as tmt-artifact-shared (and any other required repos) must already be configured. Which is the case and hence we can go with repoquery.

2. After the RPM is installed

  • Both rpm -q and dnf repoquery work as expected.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • Note: For repoquery to work, repositories such as tmt-artifact-shared (and any other required repos) must already be configured. Which is the case and hence we can go with repoquery.

Not really, we can point it to a repo with no /etc/yum/repos.d. The design would have to be changed to link this to repository to know which one to use if this is needed. I don't think it would be needed for now.

  1. After the RPM is installed

But that is too late:

  • If the package was not installed, then this is in a weird state
  • The original issue is to resolve the requires/recommends that could use non-package names. This can either be used for constructing the verify step in the artifacts step or when handling the special case of downloaded rpm artifacts and re-installing pre-installed packages
  • This could be used to allow users to use the arbitrary package names in a custom prepare/verify phase, but should we be giving them that much flexibility in the first place?

Comment thread tmt/package_managers/dnf.py Outdated
# path, we query the file list directly using `rpm -qf` to identify the package.
return ShellScript(f"""
for _provide in {provides_str}; do
_result=$(rpm -q --whatprovides "$_provide" --queryformat "{qf}" 2>/dev/null | head -1)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The head -1 is incorrect here. It is perfectly fine for multiple packages to provide the same provide.

Comment thread tmt/package_managers/dnf.py Outdated
# This guarantees O(1) alignment for the zip() pairing in the caller,
# while executing in milliseconds compared to looping 'dnf repoquery'.

qf = r"%{name}-%|epoch?{%{epoch}}:{0}|:%{version}-%{release}.%{arch}\n"
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are replicating %{full_nevra} here

Comment thread tmt/package_managers/dnf.py Outdated
# path, we query the file list directly using `rpm -qf` to identify the package.
return ShellScript(f"""
for _provide in {provides_str}; do
_result=$(rpm -q --whatprovides "$_provide" --queryformat "{qf}" 2>/dev/null | head -1)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you have a reason for hiding stderr?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When I use repoquery

➜  ~ dnf repoquery --whatprovides=cmake --queryformat="%{full_nevra}\n"
Updating and loading repositories:
Repositories loaded.
cmake-0:3.31.11-1.fc42.aarch64
cmake-0:3.31.6-2.fc42.aarch64
➜  ~ dnf repoquery --whatprovides=cmake --queryformat="%{full_nevra}\n"  2>/dev/null
cmake-0:3.31.11-1.fc42.aarch64
cmake-0:3.31.6-2.fc42.aarch64

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But the stdout and stderr are already separated when you run the script with tmt, see CommandOutput

Allow resolving packages from file paths (e.g. /usr/bin/make)
by leveraging DNF provides/whatprovides instead of requiring
explicit package names.
@vaibhavdaren vaibhavdaren force-pushed the vaibhav-discover-only-providers-whatprovides branch from 7e978fc to bf2d6e5 Compare April 22, 2026 19:11
Comment thread tmt/package_managers/__init__.py
@vaibhavdaren vaibhavdaren force-pushed the vaibhav-discover-only-providers-whatprovides branch from bf2d6e5 to ad05ffb Compare April 22, 2026 19:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

ci | full test Pull request is ready for the full test execution plugin | artifact Related to the `prepare/artifact` plugin. status | discuss Needs more discussion before closing

Projects

Status: review

Development

Successfully merging this pull request may close these issues.

5 participants