ONNXRuntime: update to version 1.25.1 #10516
smuzaffar wants to merge 2 commits into IB/CMSSW_17_0_X/master
|
A new Pull Request was created by @smuzaffar for branch IB/CMSSW_17_0_X/master. @akritkbehera, @cmsbuild, @iarspider, @raoatifshad, @smuzaffar can you please review it and eventually sign? Thanks. |
|
cms-bot internal usage |
|
please test |
|
@fwyzard, it looks like the newer version of ONNXRuntime (ORT) fails to build for CUDA arch 60 (Pascal). Only one ORT source file fails. For now, in this PR I propose building ORT without CUDA 6.x support. |
|
please test for el9_amd64_gcc14 |
|
+1
Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-0a93d4/52935/summary.html
Comparison Summary
Max Memory Comparisons exceeding threshold
@cms-sw/core-l2, I found 42 workflow step(s) with memory usage exceeding the error threshold.
|
|
-1
Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-0a93d4/52940/summary.html
Failed External Build
I found compilation error when building:
++ mv /data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/el9_amd64_gcc14/external/onnxruntime/1.25.1-67747e4c1218bf29cba2862949bd66ba/cuda_gcc_supported.txt /data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/tmp/cache/cuda_gcc_supported.txt
++ cat /data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/tmp/cache/cuda_gcc_supported.txt
+ '[' true = true ']'
+ USE_CUDA=ON
/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/tmp/rpm-tmp.9OHiTp: line 69: syntax error near unexpected token `<<<'
error: Bad exit status from /data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/tmp/rpm-tmp.9OHiTp (%build)
RPM build warnings:
Macro expanded in comment on line 488: %{pkginstroot}/${PYTHON3_LIB_SITE_PACKAGES}
|
We have not removed CUDA arch 6.0 from CMSSW (yet). Should we do that, then? |
Dropping 6.0 in general now makes sense to me. |
|
OK, let's re-remove Pascal globally. |
|
please test with #10493 |
|
Pull request #10516 was updated. |
|
please test with #10493 |
|
+1
Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-0a93d4/52980/summary.html
The following merge commits were also included on top of IB + this PR after doing git cms-merge-topic:
You can see more details here:
Comparison Summary
Max Memory Comparisons exceeding threshold
@cms-sw/core-l2, I found 42 workflow step(s) with memory usage exceeding the error threshold.
|
|
enable gpu |
|
please test with #10493 |
|
-1
Failed Tests: RelVals nvidia_l40sUnitTests
The following merge commits were also included on top of IB + this PR after doing git cms-merge-topic:
You can see more details here:
Failed RelVals
AMD_MI300X Comparison Summary
AMD_W7900 Comparison Summary
|
|
please test with #10493 |
|
-1
Failed Tests: RelVals-AMD_MI300X
Failed RelVals-AMD_MI300X
The relvals timed out after 4 hours.
Comparison Summary
AMD_W7900 Comparison Summary
NVIDIA_H100 Comparison Summary
NVIDIA_L40S Comparison Summary
Max Memory Comparisons exceeding threshold
@cms-sw/core-l2, I found 42 workflow step(s) with memory usage exceeding the error threshold.
|
This PR updates ONNXRuntime to the latest version, v1.25.1.
It also drops CUDA arch 60, as compilation of https://github.com/microsoft/onnxruntime/blob/v1.25.1/onnxruntime/contrib_ops/cuda/quantization/matmul_8bits.cu failed with error [a] when CUDA arch 60 is used.
[a]
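For illustration only, dropping one compute capability from a build's architecture list could look roughly like the sketch below. It is not taken from the actual onnxruntime spec file: the variable names CUDA_ARCHS and ORT_CUDA_ARCHS, the helper filter_cuda_archs, and the list of architectures are all assumptions. The only real name is CMake's CMAKE_CUDA_ARCHITECTURES variable, which takes a semicolon-separated list.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from the real spec file): print the given
# architectures, minus the one to drop, joined with ';' as CMake expects.
filter_cuda_archs() {
  local drop="$1"; shift
  local out=()
  local arch
  for arch in "$@"; do
    # Keep every architecture except the one being dropped.
    [ "$arch" = "$drop" ] || out+=("$arch")
  done
  local IFS=';'
  echo "${out[*]}"
}

# Assumed example values; the real list used by CMSSW may differ.
CUDA_ARCHS="60 70 75 80 89 90"

# Intentionally unquoted so the list splits into separate arguments.
ORT_CUDA_ARCHS="$(filter_cuda_archs 60 $CUDA_ARCHS)"

echo "-DCMAKE_CUDA_ARCHITECTURES=${ORT_CUDA_ARCHS}"
```

With the assumed list above, this prints -DCMAKE_CUDA_ARCHITECTURES=70;75;80;89;90. Filtering in one place keeps the decision to drop an architecture separate from how the list is passed to the build.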