cmake: Add VOLK_STATIC_DISPATCH for compile time machine selection #858
xerpi wants to merge 1 commit into
Only "Build on ubuntu22.04 armv7 g++" failed; it could be a flaky test. EDIT: After re-running, it now passes.
498ab7d to 706b774
Again,
jdemel left a comment
So far looks good. I couldn't review it completely yet, though.
Yes, unfortunately it is.
I like those additions, though I'm worried this will break the API in some way. If we break the API, we can only add this feature in a new major release, "v4".
Can you add a CI test for your case? From prior experience, everything that does not receive a CI test will break soon.
Updates:
Regarding API compatibility:
I've added a
4c9de05 to c2d8ff5
When VOLK_STATIC_DISPATCH is set to a machine name (e.g. neonv8,
avx2_64_mmx_orc), the build generates a header-only dispatch layer
that maps generic kernel names directly to the best implementation
via #define and static inline wrappers. No runtime CPU detection,
no function pointer indirection, no filesystem access.

This is useful for baremetal/embedded targets where the CPU is known
at compile time and the runtime dispatch infrastructure (cpu_features,
volk_prefs, volk_rank_archs) is not available or desirable.

The generated volk_dispatch.h contains either:
- Static dispatch: LV_HAVE_* defines, kernel header includes,
  #define aliases and static inline dispatchers
- Dynamic dispatch: extern function pointer declarations (unchanged)

The common parts (includes, VOLK_OR_PTR) live in a static volk.h
that includes the generated volk_dispatch.h.

When static dispatch is active, ENABLE_APPS, ENABLE_TESTING,
ENABLE_PROFILING and ENABLE_MODTOOL are automatically disabled.
cpu_features, fmt, ORC and dlfcn dependencies are all skipped.

Signed-off-by: Sergi Granell Escalfet <xerpi.g.12@gmail.com>
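The dispatch pattern the commit message describes (`#define` aliases plus a `static inline` dispatcher, with no function pointers) can be pictured with a small C sketch. This is not the generated `volk_dispatch.h`: every `sketch_`/`SKETCH_` name, the 32-byte alignment value, and the scalar loops standing in for SIMD kernels are assumptions made for illustration only.

```c
/* Hypothetical sketch of a statically dispatched kernel.
 * All sketch_/SKETCH_ names are illustrative, not VOLK's real API. */
#include <stddef.h>
#include <stdint.h>

/* Assumed alignment requirement of the selected machine (e.g. 32 bytes). */
#define SKETCH_ALIGNMENT 32u

/* OR two pointers together so one alignment test covers both. */
#define SKETCH_OR_PTR(p0, p1) ((uintptr_t)(p0) | (uintptr_t)(p1))

static inline int sketch_is_aligned(uintptr_t bits)
{
    return (bits & (SKETCH_ALIGNMENT - 1u)) == 0;
}

/* Stand-ins for the best aligned and unaligned implementations that the
 * build would pick for the chosen machine. Real kernels would use SIMD. */
static inline void sketch_32f_x2_add_32f_a_best(float* out, const float* a,
                                                const float* b, unsigned int n)
{
    for (unsigned int i = 0; i < n; i++)
        out[i] = a[i] + b[i];
}

static inline void sketch_32f_x2_add_32f_u_best(float* out, const float* a,
                                                const float* b, unsigned int n)
{
    for (unsigned int i = 0; i < n; i++)
        out[i] = a[i] + b[i];
}

/* Explicit aligned/unaligned entry points become plain #define aliases. */
#define sketch_32f_x2_add_32f_a sketch_32f_x2_add_32f_a_best
#define sketch_32f_x2_add_32f_u sketch_32f_x2_add_32f_u_best

/* The generic name is a static inline dispatcher: the only runtime decision
 * left is the pointer alignment check, and the call is a direct call. */
static inline void sketch_32f_x2_add_32f(float* out, const float* a,
                                         const float* b, unsigned int n)
{
    if (sketch_is_aligned(SKETCH_OR_PTR(SKETCH_OR_PTR(out, a), b)))
        sketch_32f_x2_add_32f_a(out, a, b, n);
    else
        sketch_32f_x2_add_32f_u(out, a, b, n);
}
```

Under these assumptions, a caller writes `sketch_32f_x2_add_32f(out, a, b, n)` exactly as it would with dynamic dispatch, but the compiler sees a direct call it can inline, and nothing needs to be initialised at startup.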
Actually, I realized that some of the options, such as disabling apps and testing, are related not to static dispatch but to cross-compiling. We can still generate a static dispatch build for the host architecture and run tests without any problem. So I switched some options to be gated on
jdemel left a comment
Thanks for your PR. It looks good, but there's still something broken with the static compile test. I suggest fixing this. It would also help a lot in reproducing what you implemented. Thanks.
if(NOT ENABLE_TESTING)
    return()
endif(NOT ENABLE_TESTING)
The intention here is that we can gate tests behind that common switch. The common place for these decisions is at the beginning of the corresponding CMakeLists.txt file. I assume this was part of some debug change and might need to be reverted?
add_subdirectory(tests)
if(ENABLE_TESTING)
    add_subdirectory(tests)
endif()
Ah. That relates to my earlier comment. I suggest keeping the checks in the folder where they are needed. Thanks =D
When `VOLK_STATIC_DISPATCH` is set to a machine name (e.g. `neonv8`, `avx2_64_mmx_orc`), the build generates a header-only static dispatch layer that maps generic kernel names directly to the best implementation via `#define` and `static inline` wrappers. No runtime CPU detection, no function pointer indirection, no filesystem access.

This is useful for baremetal/embedded targets where the CPU is known at compile time and the runtime dispatch infrastructure (`cpu_features`, `volk_prefs`, `volk_rank_archs`) is not available or desirable.

The generated `volk_dispatch.h` contains either:
- Static dispatch: `LV_HAVE_*` defines, kernel header includes, `#define` aliases and `static inline` dispatchers.
- Dynamic dispatch: extern function pointer declarations (unchanged).

The common parts (`#include`s, `VOLK_OR_PTR`) live in a static `volk.h` that includes the generated `volk_dispatch.h`.

When static dispatch is active, `ENABLE_APPS`, `ENABLE_TESTING`, `ENABLE_PROFILING` and `ENABLE_MODTOOL` are automatically disabled. `cpu_features`, `fmt`, `ORC` and `dlfcn` dependencies are all skipped.

For example, compiling with `-DVOLK_STATIC_DISPATCH="avx2_64_mmx_orc"` produces a `libvolk` with just `volk_free` and `volk_malloc` VOLK-exported symbols, and a `volk_dispatch.h` header (included by `volk.h`) that looks like:

With `VOLK_STATIC_DISPATCH="neonv8"`: