Add some logging to GPU detection by davidpanderson · Pull Request #6992 · BOINC/boinc

davidpanderson · 2026-04-13T00:22:13Z

In particular, for a generic OpenCL device, log the calculation of its FLOPS.

Also: add timestamp to GPU detection log messages.

Also: for CUDA, use cuMemGetInfo() instead of cuMemGetInfo_v2().
On my computer, the latter always returns a 201 error
('not context', even though we just created a context).

Summary by cubic

Add timestamps to GPU detection logs and improve diagnostics for CUDA and OpenCL. Prefer CUDA cuMemGetInfo() with fallback and clearer logging to avoid context errors.

New Features
- Timestamp all GPU detection warnings.
- Log OpenCL generic device peak FLOPS and inputs (compute units, clock MHz).
Bug Fixes
- Prefer cuMemGetInfo() and fall back to cuMemGetInfo_v2(); always log return code plus free/total memory, and warn if neither is available. Avoids error 201 ("no context") seen on some systems.

^{Written for commit b663abb. Summary will update on new commits.}

even though we just successfully created a context. cuMemGetInfo() doesn't get this error. So use it. Note: I'm not sure if GetInfo() works for > 4GB; my GPU is a bit less than 4GB OpenCL: for generic devices, write FLOPS info to log file

cubic-dev-ai

2 issues found across 3 files

Prompt for AI agents (unresolved issues)


Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="client/gpu_nvidia.cpp">

<violation number="1" location="client/gpu_nvidia.cpp:601">
P1: This forced `false` disables the only safe fallback and can null-dereference `p_cuMemGetInfo` when only `cuMemGetInfo_v2` is exported.</violation>

<violation number="2" location="client/gpu_nvidia.cpp:610">
P2: `gpu_warning()` records a persistent warning, so this new success log will mark normal CUDA RAM probes as warnings.</violation>
</file>

_{Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.}

Copilot

Pull request overview

This PR improves GPU detection diagnostics by adding more detailed logging around OpenCL generic device FLOPS estimation, timestamping GPU detection stderr output, and adjusting CUDA memory queries.

Changes:

Log computed peak FLOPS for generic OpenCL devices (and warn/fallback when out of range).
Add timestamps to GPU detection messages written via gpu_warning().
Switch NVIDIA available-RAM query path toward cuMemGetInfo() (away from _v2) and add a detailed return-value log.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 4 comments.

File	Description
`client/gpu_opencl.cpp`	Adds detailed logging for generic OpenCL peak-FLOPS calculation and fallback behavior.
`client/gpu_nvidia.cpp`	Alters CUDA memory info function selection and logs `cuMemGetInfo` return/free/total details.
`client/gpu_detect.cpp`	Prefixes `gpu_warning()` stderr output with a timestamp.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copilot · 2026-04-13T00:41:48Z

 void gpu_warning(vector<string> &warnings, const char* msg) {
-    fprintf(stderr, "%s\n", msg);
+    fprintf(stderr, "[%s] %s\n", time_to_string(dtime()), msg);
    warnings.push_back(msg);


gpu_warning() now calls time_to_string(dtime()), but this file doesn't include str_util.h where time_to_string() is declared. This will fail to compile; include str_util.h (or otherwise ensure time_to_string is declared) in this translation unit.

Copilot · 2026-04-13T00:41:48Z

+    // cuMemGetInfo_v2() always returns 201 (no context)
+    if (p_cuMemGetInfo) {
        retval = (*p_cuMemGetInfo)(&memfree, &memtotal);
-    }
-    if (retval) {
        snprintf(buf, sizeof(buf),
-            "[coproc] cuMemGetInfo(%d) returned %d", cc.device_num, retval
+            "cuMemGetInfo() returned %d; dev %d free %zu total %zu",
+            retval, cc.device_num, memfree, memtotal


The if (false && p_cuMemGetInfo_v2) branch disables cuMemGetInfo_v2() unconditionally. If only cuMemGetInfo_v2 is available (p_cuMemGetInfo is null), the else-branch will dereference a null function pointer and crash. Prefer selecting the available symbol (use v2 when v1 is missing), or try v2 first and fall back to v1 on CUDA_ERROR_INVALID_CONTEXT (201).

Copilot · 2026-04-13T00:41:48Z

+                snprintf(buf2, sizeof(buf2),
+                    "OpenCL generic: peak FLOPS %f; Max units %u, max freq %u MHz",
+					prop.peak_flops,
+                    prop.max_compute_units, prop.max_clock_frequency
+                );
+                gpu_warning(warnings, buf2);
                if (prop.peak_flops <= 0 || prop.peak_flops > GPU_MAX_PEAK_FLOPS) {
-                    char buf2[256];
-                    snprintf(buf2, sizeof(buf2),
-                        "OpenCL generic: bad peak FLOPS; Max units %u, max freq %u MHz",
-                        prop.max_compute_units, prop.max_clock_frequency
-                    );
-                    gpu_warning(warnings, buf2);
+					gpu_warning(warnings, "bad peak flops value; using default");


This hunk introduces tab-indented lines (prop.peak_flops argument and the subsequent gpu_warning call), which are inconsistent with the surrounding space indentation and make the block harder to read/maintain. Replace the tabs with spaces to match the file’s existing indentation style.

Copilot · 2026-04-13T00:41:49Z

                if (prop.peak_flops <= 0 || prop.peak_flops > GPU_MAX_PEAK_FLOPS) {
-                    char buf2[256];
-                    snprintf(buf2, sizeof(buf2),
-                        "OpenCL generic: bad peak FLOPS; Max units %u, max freq %u MHz",
-                        prop.max_compute_units, prop.max_clock_frequency
-                    );
-                    gpu_warning(warnings, buf2);
+					gpu_warning(warnings, "bad peak flops value; using default");
                    prop.peak_flops = GPU_DEFAULT_PEAK_FLOPS;


The new warning message "bad peak flops value; using default" is ambiguous on its own (it doesn’t identify the device/vendor or the rejected value). Consider including at least the computed peak_flops (and/or vendor/name) so logs make sense when multiple devices are enumerated.

cubic-dev-ai

1 issue found across 1 file (changes from recent commits).

Prompt for AI agents (unresolved issues)


Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="client/gpu_nvidia.cpp">

<violation number="1" location="client/gpu_nvidia.cpp:602">
P1: Handle `cuMemGetInfo*()` failures before updating available RAM; otherwise an error now turns `available_ram` into 0 instead of preserving the fallback total memory value.</violation>
</file>

_{Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.}

cubic-dev-ai · 2026-04-13T00:42:02Z

-    else {
+    // cuMemGetInfo_v2() always returns 201 (no context)
+    if (p_cuMemGetInfo) {
        retval = (*p_cuMemGetInfo)(&memfree, &memtotal);


P1: Handle cuMemGetInfo*() failures before updating available RAM; otherwise an error now turns available_ram into 0 instead of preserving the fallback total memory value.

Prompt for AI agents

Check if this issue is valid — if so, understand the root cause and fix it. At client/gpu_nvidia.cpp, line 602: <comment>Handle `cuMemGetInfo*()` failures before updating available RAM; otherwise an error now turns `available_ram` into 0 instead of preserving the fallback total memory value.</comment> <file context> @@ -598,19 +598,22 @@ static void get_available_nvidia_ram(COPROC_NVIDIA &cc, vector<string>& warnings // cuMemGetInfo_v2() always returns 201 (no context) - if (false && p_cuMemGetInfo_v2) { + if (p_cuMemGetInfo) { + retval = (*p_cuMemGetInfo)(&memfree, &memtotal); + snprintf(buf, sizeof(buf), + "cuMemGetInfo() returned %d; dev %d free %zu total %zu", </file context>

AenBleidd · 2026-04-13T00:56:57Z

On my computer, the latter always returns a 201 error
('not context', even though we just created a context).

I have nVidia GPU, I'll try to debug this.

davidpanderson added 2 commits April 12, 2026 14:20

client: timestamp GPU detect warnings

6861268

Copilot AI review requested due to automatic review settings April 13, 2026 00:22

Copilot started reviewing on behalf of davidpanderson April 13, 2026 00:23 View session

cubic-dev-ai Bot reviewed Apr 13, 2026

View reviewed changes

Comment thread client/gpu_nvidia.cpp Outdated

Comment thread client/gpu_nvidia.cpp Outdated

client: logic tweak

b663abb

Copilot AI reviewed Apr 13, 2026

View reviewed changes

cubic-dev-ai Bot reviewed Apr 13, 2026

View reviewed changes

AenBleidd added C: Client PR: Enhancement labels Apr 13, 2026

AenBleidd added this to Client/Manager Apr 13, 2026

github-project-automation Bot moved this to In progress in Client/Manager Apr 13, 2026

AenBleidd added this to the Client/Manager milestone Apr 13, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add some logging to GPU detection#6992

Add some logging to GPU detection#6992
davidpanderson wants to merge 3 commits into
masterfrom
dpa_gpu_detect3

davidpanderson commented Apr 13, 2026 •

edited by cubic-dev-ai Bot

Loading

Uh oh!

cubic-dev-ai Bot left a comment

Uh oh!

Uh oh!

Uh oh!

Copilot AI left a comment

Uh oh!

Copilot AI Apr 13, 2026

Uh oh!

Copilot AI Apr 13, 2026

Uh oh!

Copilot AI Apr 13, 2026

Uh oh!

Copilot AI Apr 13, 2026

Uh oh!

cubic-dev-ai Bot left a comment

Uh oh!

cubic-dev-ai Bot Apr 13, 2026 •

edited

Loading

Uh oh!

AenBleidd commented Apr 13, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

davidpanderson commented Apr 13, 2026 • edited by cubic-dev-ai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary by cubic

Uh oh!

cubic-dev-ai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Copilot AI Apr 13, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 13, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 13, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 13, 2026

Choose a reason for hiding this comment

Uh oh!

cubic-dev-ai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

cubic-dev-ai Bot Apr 13, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

AenBleidd commented Apr 13, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

davidpanderson commented Apr 13, 2026 •

edited by cubic-dev-ai Bot

Loading

cubic-dev-ai Bot Apr 13, 2026 •

edited

Loading