
[llvm-profgen] Read build ID from binary for perfscript address filtering #190862

Open
aaupov wants to merge 4 commits into main from
users/aaupov/spr/llvm-profgen-read-build-id-from-binary-for-perfscript-address-filtering-6


@aaupov aaupov commented Apr 7, 2026

For shared libraries (file names ending in .so or containing .so.), read
the binary's build ID during load() using object::getBuildID() and store
it, hex-encoded in lowercase, as FilterBuildID. Main executables keep
FilterBuildID empty, matching the convention that their perfscript
addresses have no buildid prefix.

This enables automatic build ID-based filtering of perfscript
addresses in [buildid:]0xaddr format without requiring a CLI option.
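
As an illustration of the filtering this enables, here is a minimal sketch of how a consumer might match one perfscript address token against FilterBuildID. The function `parseFilteredAddress` is hypothetical and not part of this PR or of llvm-profgen; it uses only the C++ standard library and assumes the token has exactly the `[buildid:]0xaddr` shape described above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string_view>

// Hypothetical helper (assumption, not llvm-profgen code): parse one token in
// [buildid:]0xaddr form. Returns the address only when the token's build ID
// prefix equals FilterBuildID. A token without a prefix has an empty build
// ID, so it matches only an empty filter (the main-executable convention).
std::optional<uint64_t> parseFilteredAddress(std::string_view Token,
                                             std::string_view FilterBuildID) {
  std::string_view BuildID; // stays empty when there is no "buildid:" prefix
  std::size_t Colon = Token.find(':');
  if (Colon != std::string_view::npos) {
    BuildID = Token.substr(0, Colon);
    Token.remove_prefix(Colon + 1);
  }
  if (BuildID != FilterBuildID)
    return std::nullopt;

  // Require "0x" followed by 1 to 16 hex digits (fits in 64 bits).
  if (Token.size() < 3 || Token.size() > 18 || Token.substr(0, 2) != "0x")
    return std::nullopt;

  uint64_t Addr = 0;
  for (char C : Token.substr(2)) {
    unsigned Digit;
    if (C >= '0' && C <= '9')
      Digit = C - '0';
    else if (C >= 'a' && C <= 'f')
      Digit = C - 'a' + 10;
    else if (C >= 'A' && C <= 'F')
      Digit = C - 'A' + 10;
    else
      return std::nullopt; // not a hex digit
    Addr = (Addr << 4) | Digit;
  }
  return Addr;
}
```

Under these assumptions, a shared library whose FilterBuildID is `abcd` would accept `abcd:0x1f` and reject a bare `0x1f`, while a main executable (empty FilterBuildID) would do the opposite.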

Created using spr 1.3.4
@aaupov aaupov requested review from HighW4y2H3ll and apolloww April 7, 2026 22:27
@aaupov aaupov marked this pull request as ready for review April 7, 2026 22:27
@llvmbot llvmbot added the PGO Profile Guided Optimizations label Apr 7, 2026

llvmbot commented Apr 7, 2026

@llvm/pr-subscribers-pgo

Author: Amir Ayupov (aaupov)


Full diff: https://github.com/llvm/llvm-project/pull/190862.diff

2 Files Affected:

  • (modified) llvm/tools/llvm-profgen/ProfiledBinary.cpp (+11)
  • (modified) llvm/tools/llvm-profgen/ProfiledBinary.h (+9)
diff --git a/llvm/tools/llvm-profgen/ProfiledBinary.cpp b/llvm/tools/llvm-profgen/ProfiledBinary.cpp
index 915e991e4068c..f6be04a4790fd 100644
--- a/llvm/tools/llvm-profgen/ProfiledBinary.cpp
+++ b/llvm/tools/llvm-profgen/ProfiledBinary.cpp
@@ -11,6 +11,7 @@
 #include "MissingFrameInferrer.h"
 #include "Options.h"
 #include "ProfileGenerator.h"
+#include "llvm/ADT/StringExtras.h"
 #include "llvm/DebugInfo/PDB/IPDBSession.h"
 #include "llvm/DebugInfo/PDB/PDB.h"
 #include "llvm/DebugInfo/PDB/PDBSymbolFunc.h"
@@ -246,6 +247,16 @@ void ProfiledBinary::load() {
   // Find the preferred load address for text sections.
   setPreferredTextSegmentAddresses(Obj);
 
+  // For shared libraries, read build ID to filter perfscript addresses
+  // in [buildid:]addr format. Main executables use empty FilterBuildID
+  // since their addresses have no buildid prefix.
+  StringRef FileName = llvm::sys::path::filename(Path);
+  if (FileName.ends_with(".so") || FileName.contains(".so.")) {
+    auto BID = object::getBuildID(Obj);
+    if (!BID.empty())
+      FilterBuildID = llvm::toHex(BID, /*LowerCase=*/true);
+  }
+
   // Load debug info of subprograms from DWARF section.
   // If path of debug info binary is specified, use the debug info from it,
   // otherwise use the debug info from the executable binary.
diff --git a/llvm/tools/llvm-profgen/ProfiledBinary.h b/llvm/tools/llvm-profgen/ProfiledBinary.h
index 16218d430d78c..d5ad61d870770 100644
--- a/llvm/tools/llvm-profgen/ProfiledBinary.h
+++ b/llvm/tools/llvm-profgen/ProfiledBinary.h
@@ -28,6 +28,7 @@
 #include "llvm/MC/MCRegisterInfo.h"
 #include "llvm/MC/MCSubtargetInfo.h"
 #include "llvm/MC/MCTargetOptions.h"
+#include "llvm/Object/BuildID.h"
 #include "llvm/Object/ELFObjectFile.h"
 #include "llvm/ProfileData/SampleProf.h"
 #include "llvm/Support/CommandLine.h"
@@ -337,6 +338,11 @@ class ProfiledBinary {
 
   bool IsCOFF = false;
 
+  // Build ID used to filter perfscript addresses in [buildid:]addr format.
+  // For shared libraries, set to the binary's build ID.
+  // For main executables, kept empty (addresses have no buildid prefix).
+  std::string FilterBuildID;
+
   void setPreferredTextSegmentAddresses(const object::ObjectFile *O);
 
   // LLVMSymbolizer's symbolize{Code, Data} interfaces requires a section index
@@ -425,6 +431,9 @@ class ProfiledBinary {
 
   bool isCOFF() const { return IsCOFF; }
 
+  // Return the build ID used for filtering perfscript addresses.
+  StringRef getFilterBuildID() const { return FilterBuildID; }
+
   // Canonicalize to use preferred load address as base address.
   uint64_t canonicalizeVirtualAddress(uint64_t Address) {
     return Address - BaseAddress + getPreferredBaseAddress();
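
To make the behavior of the load() change easier to check in isolation, the following standalone sketch reproduces its two steps with the standard library only: the file name test and the lowercase hex encoding. `looksLikeSharedLibrary` and `toLowerHex` are hypothetical names introduced here; the second stands in for `llvm::toHex(BID, /*LowerCase=*/true)` and is not LLVM's implementation.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <string_view>
#include <vector>

// Mirrors the file name test in the diff: a name is treated as a shared
// library if it ends with ".so" or contains ".so." (e.g. libc.so.6).
bool looksLikeSharedLibrary(std::string_view FileName) {
  constexpr std::string_view Ext = ".so";
  bool EndsWithSo = FileName.size() >= Ext.size() &&
                    FileName.substr(FileName.size() - Ext.size()) == Ext;
  return EndsWithSo || FileName.find(".so.") != std::string_view::npos;
}

// Lowercase hex encoding of raw build ID bytes (stand-in for llvm::toHex
// with LowerCase=true). Each byte becomes two hex digits.
std::string toLowerHex(const std::vector<uint8_t> &Bytes) {
  static constexpr char Digits[] = "0123456789abcdef";
  std::string Out;
  Out.reserve(Bytes.size() * 2);
  for (uint8_t B : Bytes) {
    Out.push_back(Digits[B >> 4]);
    Out.push_back(Digits[B & 0xF]);
  }
  return Out;
}
```

Because the test is purely name-based, it also accepts names such as `libfoo.so.debug`, and it would not recognize a shared object that lacks `.so` in its file name.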


github-actions bot commented Apr 8, 2026

✅ With the latest revision this PR passed the C/C++ code formatter.


@HighW4y2H3ll HighW4y2H3ll left a comment


LGTM!
