Flaky E2E: paginated exporter listing fails with etcd request timeout #596

@raballew

Summary

The E2E test "paginated exporter listing returns all exporters" (e2e/test/e2e_test.go:343) is flaky. It fails intermittently with an etcd timeout when creating many exporters for pagination testing.

Failure Details

  • Test: Core E2E Tests > Lease operations > paginated exporter listing returns all exporters
  • File: e2e/test/e2e_test.go:343
  • Duration before failure: 2m52s (likely hit a timeout)
  • Root cause: Kubernetes API returns HTTP 500 while creating exporter pagination-exp-93:
    ApiException: (500)
    Reason: Internal Server Error
    HTTP response body: {"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"etcdserver: request timed out","code":500}
    

Observed In

Analysis

The test creates ~100 exporters sequentially to test pagination. Under CI resource constraints, the kind cluster's etcd can time out around exporter 93 of 100. This is an infrastructure/resource pressure issue, not a code bug.

Possible Mitigations

  • Reduce the number of exporters created (e.g., from 100 to 25, with a smaller page size)
  • Add retry logic around exporter creation in the test
  • Add a brief sleep between creation batches to reduce etcd pressure
  • Create exporters in parallel with a semaphore to control concurrency

Metadata

Labels: bug
