granular parallel generic kernel for 64u_byteswap #679
marcusmueller wants to merge 1 commit into gnuradio:main
Signed-off-by: Marcus Müller <mmueller@gnuradio.org>
This is a good addition to make the byteswap somewhat performant on non-x86 platforms, especially in light of #680.
Does this PR touch the intent of #606? I know they concern different implementations.
No, it's unrelated. However, #680 addressed that quite directly. |
jdemel
left a comment
This PR LGTM.
However, this whole kernel has issues. The include guards are confusing at the very least: the _a include guard is around the _u kernels and vice versa. Tail handling creates copy-pasted code. Loop variables are defined outside of loops. All in all, this kernel needs even more cleanup. That is beyond the scope of this PR, though.
One concern: We just removed all the _a_generic kernels. The diff looks as if you renamed one of these kernels.
Could you rebase your PR first before we merge it?
jdemel
left a comment
Could you rebase this PR onto the current main? I'd say we can go with it then.
This simplifies (at least to me) understanding what the generic kernel does, and it's also about 1.5 times faster on my machines.
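For readers without the diff at hand, here is a minimal sketch of one way a portable 64-bit byteswap can work on several byte lanes at once with plain masks and shifts. This is an assumption about what "granular parallel" could mean, not the code from this PR; the names `swap64_sketch` and `byteswap64_buffer_sketch` are hypothetical and not part of the VOLK API.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not VOLK API): reverse the byte order of one
 * 64-bit word in three mask-and-shift steps. Each step swaps all
 * neighbouring lanes of one granularity at the same time. */
static inline uint64_t swap64_sketch(uint64_t v)
{
    /* Step 1: swap neighbouring bytes within each 16-bit lane. */
    v = ((v & 0x00FF00FF00FF00FFULL) << 8) | ((v >> 8) & 0x00FF00FF00FF00FFULL);
    /* Step 2: swap neighbouring 16-bit lanes within each 32-bit lane. */
    v = ((v & 0x0000FFFF0000FFFFULL) << 16) | ((v >> 16) & 0x0000FFFF0000FFFFULL);
    /* Step 3: swap the two 32-bit halves. */
    return (v << 32) | (v >> 32);
}

/* Hypothetical in-place buffer variant; the loop variable is scoped
 * to the loop, and there is no separate tail path to duplicate. */
static void byteswap64_buffer_sketch(uint64_t* buf, size_t num_points)
{
    for (size_t i = 0; i < num_points; ++i) {
        buf[i] = swap64_sketch(buf[i]);
    }
}
```

Because each step is branch-free and touches the whole word, a compiler on a non-x86 target has a fair chance of mapping it to a native byte-reverse instruction, though whether that happens depends on the compiler and flags.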