cmake: Add VOLK_STATIC_DISPATCH for compile time machine selection #858
xerpi wants to merge 1 commit into gnuradio:main
Only "Build on ubuntu22.04 armv7 g++" failed. It could be a flaky test. EDIT: After re-running, it now passes.
498ab7d to 706b774
Again, |
jdemel left a comment
So far looks good. I couldn't review it completely yet, though.
Yes, unfortunately it is.
I like those additions. Though I'm worried this will break the API in some way. If we break the API, we can only add this feature in a new major release "v4".
Can you add a CI test for your case? From prior experience: everything that does not receive a CI test will break soon.
Updates:
Regarding API compatibility:
I've added a |
4c9de05 to c2d8ff5
When VOLK_STATIC_DISPATCH is set to a machine name (e.g. neonv8,
avx2_64_mmx_orc), the build generates a header-only dispatch layer
that maps generic kernel names directly to the best implementation
via #define and static inline wrappers. No runtime CPU detection,
no function pointer indirection, no filesystem access.
This is useful for baremetal/embedded targets where the CPU is known
at compile time and the runtime dispatch infrastructure (cpu_features,
volk_prefs, volk_rank_archs) is not available or desirable.
The generated volk_dispatch.h contains either:
  - Static dispatch: LV_HAVE_* defines, kernel header includes,
    #define aliases and static inline dispatchers
  - Dynamic dispatch: extern function pointer declarations (unchanged)
The common parts (includes, VOLK_OR_PTR) live in a static volk.h
that includes the generated volk_dispatch.h.
When static dispatch is active, ENABLE_APPS, ENABLE_TESTING,
ENABLE_PROFILING and ENABLE_MODTOOL are automatically disabled.
cpu_features, fmt, ORC and dlfcn dependencies are all skipped.
Signed-off-by: Sergi Granell Escalfet <xerpi.g.12@gmail.com>
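
To make the difference between the two modes described in the commit message concrete, here is a minimal sketch in C. It is not the generated volk_dispatch.h: every demo_* name and the DEMO_STATIC_DISPATCH macro are hypothetical, and the "unrolled" function merely stands in for a machine-specific (e.g. SIMD) kernel. It only illustrates resolving a generic kernel name at compile time (via a #define alias or a static inline wrapper) versus going through a function pointer.

```c
#include <stddef.h>

/* Hypothetical switch for this sketch; comment it out to get the
 * function-pointer path instead. Not a real VOLK macro. */
#define DEMO_STATIC_DISPATCH 1

/* Portable reference implementation (placeholder kernel). */
static inline void demo_add_32f_generic(float* out,
                                        const float* a,
                                        const float* b,
                                        size_t n)
{
    for (size_t i = 0; i < n; ++i) {
        out[i] = a[i] + b[i];
    }
}

/* Stand-in for a machine-specific kernel: four elements per iteration,
 * followed by a scalar tail loop for the remainder. */
static inline void demo_add_32f_unrolled(float* out,
                                         const float* a,
                                         const float* b,
                                         size_t n)
{
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        out[i + 0] = a[i + 0] + b[i + 0];
        out[i + 1] = a[i + 1] + b[i + 1];
        out[i + 2] = a[i + 2] + b[i + 2];
        out[i + 3] = a[i + 3] + b[i + 3];
    }
    for (; i < n; ++i) {
        out[i] = a[i] + b[i];
    }
}

#ifdef DEMO_STATIC_DISPATCH
/* Static dispatch: the generic name is an alias fixed at compile time,
 * so the call can be inlined and needs no runtime setup. */
#define demo_add_32f demo_add_32f_unrolled

/* The same idea expressed as a static inline wrapper instead of a macro. */
static inline void demo_add_32f_wrapped(float* out,
                                        const float* a,
                                        const float* b,
                                        size_t n)
{
    demo_add_32f_unrolled(out, a, b, n);
}
#else
/* Dynamic dispatch: the generic name is a function pointer that some
 * runtime logic would normally repoint after inspecting the CPU. */
typedef void (*demo_add_32f_fn)(float*, const float*, const float*, size_t);
static demo_add_32f_fn demo_add_32f = demo_add_32f_generic;
#endif
```

In both configurations the call site is the same, demo_add_32f(out, a, b, n), which is why a compile-time selection of this kind can stay source compatible for callers that only use the generic name.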
Actually, I realized that some of the options such as disabling apps and testing are not related to static dispatch but to cross-compiling. We can still generate a static dispatch build for the host architecture and run tests without any problem. So I switched some options to be gated on |
When VOLK_STATIC_DISPATCH is set to a machine name (e.g. neonv8, avx2_64_mmx_orc), the build generates a header-only static dispatch layer that maps generic kernel names directly to the best implementation via #define and static inline wrappers. No runtime CPU detection, no function pointer indirection, no filesystem access.

This is useful for baremetal/embedded targets where the CPU is known at compile time and the runtime dispatch infrastructure (cpu_features, volk_prefs, volk_rank_archs) is not available or desirable.

The generated volk_dispatch.h contains either:
  - Static dispatch: LV_HAVE_* defines, kernel header includes, #define aliases and static inline dispatchers.
  - Dynamic dispatch: extern function pointer declarations (unchanged).

The common parts (#includes, VOLK_OR_PTR) live in a static volk.h that includes the generated volk_dispatch.h.

When static dispatch is active, ENABLE_APPS, ENABLE_TESTING, ENABLE_PROFILING and ENABLE_MODTOOL are automatically disabled. cpu_features, fmt, ORC and dlfcn dependencies are all skipped.

For example, compiling with -DVOLK_STATIC_DISPATCH="avx2_64_mmx_orc" produces a libvolk whose only VOLK-exported symbols are volk_free and volk_malloc, and a volk_dispatch.h header (included by volk.h) that looks like:

With VOLK_STATIC_DISPATCH="neonv8":