felix/bpf/ut: drain shared ring buffer in BPF UT helpers #12515
lucastigera wants to merge 1 commit into projectcalico:master
cali_rb_evnt is a pinned, shared BPF_MAP_TYPE_RINGBUF. Its consumer
and producer positions are kernel state that persist across
ringbuf.New() calls, so a reader opened by one test sees any records
left unread by earlier tests. ParsePolicyVerdict then slices past the
end of, e.g., an 84-byte EVENT_PROTO_STATS record (emitted by the
enterprise TestKprobe) and panics:
panic: runtime error: slice bounds out of range [:88] with capacity 76
Introduce a newTestRingBuf() helper in bpf_prog_test.go that opens a
reader on cali_rb_evnt and immediately drains any leftover records, so
rb.Next() only returns events produced after the call. Convert all six
ring-buffer-consuming tests (flow_log_events_test.go, ringbuf_events_test.go)
to use it so future tests get the same protection for free. Also assert
the expected event type in the flow log tests, so a mismatch fails with
a clear message instead of panicking inside ParsePolicyVerdict.
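The stale-record problem and the drain-on-open pattern can be sketched with a small in-memory model. This is only an illustration, not the Felix code: `sharedRing`, `reader`, `open` and `openDrained` are hypothetical stand-ins for the pinned map, the `ringbuf` reader and `newTestRingBuf()`, and the consumer position is modelled as a plain integer that lives on the shared object rather than on the reader.

```go
package main

import "fmt"

// sharedRing is a hypothetical stand-in for the pinned ring buffer map.
// The records and the consumer position belong to the shared object,
// so they survive across readers.
type sharedRing struct {
	records     [][]byte
	consumerPos int
}

// reader is a hypothetical user-space handle. Opening a new reader does
// not reset the consumer position of the ring it points at.
type reader struct{ ring *sharedRing }

func open(r *sharedRing) *reader { return &reader{ring: r} }

// Next returns the next unread record, or false when nothing is pending.
func (rd *reader) Next() ([]byte, bool) {
	if rd.ring.consumerPos >= len(rd.ring.records) {
		return nil, false
	}
	rec := rd.ring.records[rd.ring.consumerPos]
	rd.ring.consumerPos++
	return rec, true
}

// Drain discards every pending record and reports how many were dropped.
func (rd *reader) Drain() int {
	n := 0
	for {
		if _, ok := rd.Next(); !ok {
			return n
		}
		n++
	}
}

// openDrained mirrors the idea behind newTestRingBuf(): open, then drain,
// so the caller only sees records produced after this call.
func openDrained(r *sharedRing) *reader {
	rd := open(r)
	rd.Drain()
	return rd
}

func main() {
	ring := &sharedRing{}
	// An earlier test produced a record and never consumed it.
	ring.records = append(ring.records, []byte("stale"))

	rd := openDrained(ring)
	// The current test produces its own record after opening the reader.
	ring.records = append(ring.records, []byte("fresh"))

	rec, _ := rd.Next()
	fmt.Println(string(rec)) // prints "fresh"
}
```

With `open` instead of `openDrained`, the first `Next()` in the model returns the stale record, which is the situation the flow log tests ran into.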
Pull request overview
This PR hardens Felix BPF unit tests that consume the shared pinned ring buffer (cali_rb_evnt) by ensuring each test starts from a clean read position, preventing leftover records from earlier tests (including enterprise-only producers) from causing mis-parsing and panics.
Changes:
- Added a `newTestRingBuf()` helper that opens the shared ring buffer reader and drains any pre-existing events.
- Updated ring-buffer-consuming UTs to use `newTestRingBuf()` instead of constructing readers directly.
- Added explicit event-type assertions in flow log tests to fail clearly on unexpected event types.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| felix/bpf/ut/bpf_prog_test.go | Adds newTestRingBuf() helper that drains the shared pinned ring buffer on open. |
| felix/bpf/ut/ringbuf_events_test.go | Switches ring buffer tests to use the draining helper; updates comments around clean-state assumptions. |
| felix/bpf/ut/flow_log_events_test.go | Uses the draining helper and asserts event types before parsing policy verdicts. |
    rb := newTestRingBuf()
TestRingBufReaderRecovery creates a ring buffer reader without a defer rb.Close(). If an assertion fails before the explicit rb.Close() call, the reader can be leaked (mmap/epoll fds), which may impact subsequent tests. Consider adding a deferred close immediately after creation (even if you still close early to test re-opening).
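One way to combine a deferred close with an explicit early close is to make the close idempotent. The sketch below is an assumption-heavy illustration: `fakeReader` and `onceCloser` are hypothetical, and whether the real reader's `Close()` tolerates being called twice is not stated in this thread, which is why the wrapper guards it with `sync.Once`.

```go
package main

import (
	"fmt"
	"sync"
)

// fakeReader is a hypothetical reader that counts how often it is closed.
type fakeReader struct{ closes int }

func (f *fakeReader) Close() { f.closes++ }

// onceCloser makes an arbitrary close function safe to call repeatedly.
type onceCloser struct {
	once    sync.Once
	closeFn func()
}

func (c *onceCloser) Close() { c.once.Do(c.closeFn) }

// runTest closes early on purpose (as a re-open test would) and still
// keeps a deferred close as a safety net.
func runTest(f *fakeReader) {
	c := &onceCloser{closeFn: f.Close}
	defer c.Close()

	c.Close() // explicit early close; the deferred call becomes a no-op
}

// runFailingTest simulates an assertion failing before the early close;
// the deferred close still releases the reader.
func runFailingTest(f *fakeReader) {
	c := &onceCloser{closeFn: f.Close}
	defer c.Close()

	panic("assertion failed")
}

func main() {
	f := &fakeReader{}
	runTest(f)
	fmt.Println(f.closes) // prints 1
}
```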
    func newTestRingBuf() *ringbuf.RingBuffer {
        rb, err := ringbuf.New(ringBufMap, rbSize)
        Expect(err).NotTo(HaveOccurred())
        rb.Drain()
        return rb
    }
newTestRingBuf() depends on rbSize, which is currently declared in ringbuf_events_test.go, and initMapsOnce() hard-codes the ring buffer map size as 1024*1024. That cross-file coupling makes it easy for these values to drift or for the helper to stop compiling if the constant is moved/guarded by build tags. Consider defining a single shared constant (e.g., in bpf_prog_test.go) and using it both for ringbuf.Map(..., size) and ringbuf.New(..., size) (and have tests reference that constant).
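The suggestion amounts to having one constant feed both the map creation and the reader. A minimal sketch, with hypothetical names throughout: `ringBufSize`, `fakeMap`, `createMap` and `openReader` are invented for the example, and the size-mismatch error is an assumed behaviour of the model, not a documented property of the `ringbuf` package.

```go
package main

import "fmt"

// ringBufSize is the single shared size. 1024*1024 is the value the
// comment above says initMapsOnce() currently hard-codes.
const ringBufSize = 1024 * 1024

// fakeMap is a hypothetical stand-in for the ring buffer map.
type fakeMap struct{ size int }

func createMap(size int) *fakeMap { return &fakeMap{size: size} }

// openReader is a hypothetical stand-in for opening a reader. In this
// model a size that differs from the map's size is reported as an error.
func openReader(m *fakeMap, size int) error {
	if size != m.size {
		return fmt.Errorf("reader size %d does not match map size %d", size, m.size)
	}
	return nil
}

func main() {
	// Both call sites reference the same constant, so they cannot drift.
	m := createMap(ringBufSize)
	fmt.Println(openReader(m, ringBufSize))
}
```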
Summary
`cali_rb_evnt` is a pinned, shared `BPF_MAP_TYPE_RINGBUF`. Its consumer/producer positions are kernel state and persist across `ringbuf.New()` calls, so a reader opened by one test sees any records left unread by an earlier test. `ParsePolicyVerdict` then slices past the end of, e.g., an 84-byte `EVENT_PROTO_STATS` record (emitted by the enterprise `TestKprobe`) and panics:
Fix:
The OSS test suite alone cannot reproduce the panic because only the enterprise fork has kprobe programs that emit stray records into `cali_rb_evnt` during earlier tests, but the fix hardens the pattern for everyone.
Test plan
Release note: