remove buggy and slow neonv8 kernel#680
Conversation
e952f32 to
34f4575
Compare
|
Looks like this kernel has been slower than generic since it was added in #196. Good riddance. |
|
For the record, the bug in the implementation was that the load (
|
Signed-off-by: Marcus Müller <mmueller@gnuradio.org>
49d8822 to
d49a8cd
Compare
|
this is a bit strange, it seems I need to figure out a method to avoid the |
|
Why do you need to avoid building the neon kernel if neonv8 is defined? I didn't think that made sense in #668 so I removed the nested |
|
Because on arm64 machines, the neon kernel malfunctions: |
|
Ah. That seems like a bug that should be fixed. |
|
https://github.com/gnuradio/volk/actions/runs/6617156936/job/17972917000?pr=680#step:3:2453 Seeing that on our armv7 machine it doesn't run at all, do we test that Neon non-v8 kernels on anything, @jdemel ? |
|
In #668 I found that nested ifdefs prevent the neon kernel from running even on 32-bit ARM, so it's possible it's broken on 32-bit ARM too. |
|
Yep, it's broken. Change |
I too noticed that. It seems on the armv7 build, NEON is not detected at all. Perhaps a bug in platform detection? |
|
By the way, neon is slower than generic on my Raspberry Pi, 2345.84 ms vs. 1977.6 ms. |
|
Since that Pi and maybe an E310 would be the main target for that kernel: should we maybe just eradicate both? |
|
Yeah, I'd say get rid of it. As far as I can tell, it's been broken since it was created in 2014 (158a6b2). It only correctly swaps the first two integers in the input vector, so I can't imagine it's ever done anyone any good. |
This was hidden for 9 years du to being shadowed when NEONV8 was available Signed-off-by: Marcus Müller <mmueller@gnuradio.org>
Unfortunately, no. If you happen to find some CI infrastructure for this, please add it. I didn't. We used to test these when TravisCI was still working. |
jdemel
left a comment
There was a problem hiding this comment.
LGTM. The discussion concluded that these kernels were always broken.
as noticed by argilo when fixing the integer generation in #677 , that kernel was buggy. It seems compilers are better at building byte-swapping code than people writing SIMD intrinsics, so falling back on generic doesn't hurt.