pimd: retry MSDP listener bind after VRF master interface up#22506
pimd: retry MSDP listener bind after VRF master interface up#22506hnattamaisub wants to merge 2 commits into
Conversation
Greptile SummaryThis PR fixes a startup race in pimd where MSDP listener peers configured inside a VRF block fail to bind because
Confidence Score: 3/5Safe to merge for the common non-auth VRF MSDP case, but auth-configured MSDP peers in VRFs will remain broken after this change The retry helper unconditionally calls the global non-auth socket path for every listener peer it finds — including auth-configured ones. For an auth peer this creates the wrong shared listener, sets PIM_MSDPF_LISTENER, and permanently blocks the per-peer pim_msdp_sock_auth_listen() path from ever being retried. Operators combining MSDP peer authentication with VRF isolation will see the same silent bind failure the PR set out to fix. pimd/pim_msdp.c — specifically the pim_msdp_vrf_iface_up() function where the auth/non-auth branch is missing Important Files Changed
Sequence Diagram%%{init: {'theme': 'neutral'}}%%
sequenceDiagram
participant Conf as Config Parser
participant MSDP as pim_msdp.c
participant Iface as pim_iface.c
participant Sock as pim_msdp_socket.c
Conf->>MSDP: ip msdp peer X source Y (in VRF)
MSDP->>MSDP: pim_msdp_peer_add() → pim_msdp_peer_listen()
MSDP->>Sock: "pim_msdp_sock_listen(pim) [vrf_id==VRF_UNKNOWN]"
Sock->>Sock: "if_lookup_by_name(vrf->name, VRF_UNKNOWN) → NULL"
Sock-->>MSDP: return -1 (bind fails, no listener)
Note over Conf,Sock: Later: libvrf activates VRF
Iface->>Iface: pim_ifp_create(vrf_master_ifp) or pim_ifp_up(vrf_master_ifp)
Iface->>MSDP: pim_msdp_vrf_iface_up(pim)
MSDP->>MSDP: check PIM_MSDPF_LISTENER (not set yet)
MSDP->>MSDP: find first listener peer in peer_list
alt "auth_type == MSDP_AUTH_NONE"
MSDP->>Sock: pim_msdp_sock_listen(pim) [vrf_id now valid]
Sock->>Sock: if_lookup_by_name succeeds, vrf_bind + bind + event_add_read
Sock-->>MSDP: fd (sets PIM_MSDPF_LISTENER)
else "auth_type != MSDP_AUTH_NONE (bug: not handled)"
MSDP->>Sock: pim_msdp_sock_listen(pim) [wrong path]
Note over MSDP,Sock: Should call pim_msdp_sock_auth_listen(mp)
end
Note over Conf,Sock: On VRF delete
Iface->>MSDP: pim_msdp_exit(pim)
MSDP->>MSDP: event_cancel + close(listener.fd) + clear flag
MSDP->>MSDP: pim_msdp_mg_free() → free all peers
%%{init: {'theme': 'base', 'themeVariables': {"darkMode": true, "background": "#0d1117", "primaryColor": "#21262d", "primaryTextColor": "#e6edf3", "primaryBorderColor": "#8b949e", "lineColor": "#8b949e", "textColor": "#e6edf3", "edgeLabelBackground": "#161b22", "actorBkg": "#21262d", "actorBorder": "#8b949e", "actorTextColor": "#e6edf3", "actorLineColor": "#8b949e", "signalColor": "#8b949e", "signalTextColor": "#e6edf3", "noteBkgColor": "#373320", "noteBorderColor": "#d4a72c", "noteTextColor": "#f0e6c0", "labelBoxBkgColor": "#21262d", "labelBoxBorderColor": "#8b949e", "labelTextColor": "#e6edf3", "loopTextColor": "#e6edf3", "activationBkgColor": "#30363d", "activationBorderColor": "#8b949e"}}}%%
sequenceDiagram
participant Conf as Config Parser
participant MSDP as pim_msdp.c
participant Iface as pim_iface.c
participant Sock as pim_msdp_socket.c
Conf->>MSDP: ip msdp peer X source Y (in VRF)
MSDP->>MSDP: pim_msdp_peer_add() → pim_msdp_peer_listen()
MSDP->>Sock: "pim_msdp_sock_listen(pim) [vrf_id==VRF_UNKNOWN]"
Sock->>Sock: "if_lookup_by_name(vrf->name, VRF_UNKNOWN) → NULL"
Sock-->>MSDP: return -1 (bind fails, no listener)
Note over Conf,Sock: Later: libvrf activates VRF
Iface->>Iface: pim_ifp_create(vrf_master_ifp) or pim_ifp_up(vrf_master_ifp)
Iface->>MSDP: pim_msdp_vrf_iface_up(pim)
MSDP->>MSDP: check PIM_MSDPF_LISTENER (not set yet)
MSDP->>MSDP: find first listener peer in peer_list
alt "auth_type == MSDP_AUTH_NONE"
MSDP->>Sock: pim_msdp_sock_listen(pim) [vrf_id now valid]
Sock->>Sock: if_lookup_by_name succeeds, vrf_bind + bind + event_add_read
Sock-->>MSDP: fd (sets PIM_MSDPF_LISTENER)
else "auth_type != MSDP_AUTH_NONE (bug: not handled)"
MSDP->>Sock: pim_msdp_sock_listen(pim) [wrong path]
Note over MSDP,Sock: Should call pim_msdp_sock_auth_listen(mp)
end
Note over Conf,Sock: On VRF delete
Iface->>MSDP: pim_msdp_exit(pim)
MSDP->>MSDP: event_cancel + close(listener.fd) + clear flag
MSDP->>MSDP: pim_msdp_mg_free() → free all peers
Prompt To Fix All With AIFix the following 1 code review issue. Work through them one at a time, proposing concise fixes.
---
### Issue 1 of 1
pimd/pim_msdp.c:948-960
**Auth-configured listener peers silently skipped by retry**
`pim_msdp_vrf_iface_up` unconditionally calls `pim_msdp_sock_listen(pim)` — the shared global (non-auth) listener — even when the first `PIM_MSDP_PEER_IS_LISTENER` peer has `auth_type != MSDP_AUTH_NONE`. That call succeeds, sets `PIM_MSDPF_LISTENER`, and returns. The early-return guard at line 945 then silently prevents any future retry for the *per-peer* auth socket that should have been set up via `pim_msdp_sock_auth_listen(mp)`. Auth VRF peers are left permanently without a listening socket after the first VRF master interface-up event.
The original `pim_msdp_peer_listen()` (line 979) already has the correct branch: call `pim_msdp_sock_listen` for `MSDP_AUTH_NONE`, `pim_msdp_sock_auth_listen` otherwise. The retry helper should mirror that logic, and should *not* `break` early for auth peers since each one needs its own per-peer socket.
Reviews (1): Last reviewed commit: "pimd: retry MSDP listener bind after VRF..." | Re-trigger Greptile |
Issue: When pimd parses an "ip msdp peer X source Y" inside a VRF block at config time, libvrf has not yet activated the VRF and pim->vrf is a name-only stub with vrf_id == VRF_UNKNOWN. For passive (listener-side) peers, pim_msdp_peer_listen() immediately calls pim_msdp_sock_listen(), whose if_lookup_by_name(vrf->name, VRF_UNKNOWN) deterministically returns NULL. The function logs "Unable to lookup vrf interface" and returns -1. No code path ever retries it The race is timing-dependent. Some startup orderings happen to populate pim->vrf->vrf_id before MSDP peer config is processed and the listener binds normally. Fix: 1)Adding a small pim_msdp_vrf_iface_up() helper that retries pim_msdp_sock_listen() when pim_iface.c sees the VRF-master interface appear (pim_ifp_create / pim_ifp_up). 2)tear down MSDP listener in pim_msdp_exit() Signed-off-by: harini <hnattamaisub@nvidia.com>
Handled the same logic for Auth enabled peers also. Initial commit: pim_msdp_vrf_iface_up() called the shared pim_msdp_sock_listen() unconditionally and broke early. Auth peers (auth_type != MSDP_AUTH_NONE) need a per-peer socket via pim_msdp_sock_auth_listen() and would otherwise be left without a listener. This commit handles for auth peers also. Signed-off-by: harini <hnattamaisub@nvidia.com>
Issue:
When pimd parses an "ip msdp peer X source Y" inside a VRF block at config time, libvrf has not yet activated the VRF and pim->vrf is a name-only stub with vrf_id == VRF_UNKNOWN. For passive (listener-side) peers, pim_msdp_peer_listen() immediately calls pim_msdp_sock_listen(), whose if_lookup_by_name(vrf->name, VRF_UNKNOWN) deterministically returns NULL. The function logs "Unable to lookup vrf interface" and returns -1. No code path ever retries it
The race is timing-dependent. Some startup orderings happen to populate pim->vrf->vrf_id before MSDP peer config is processed and the listener binds normally.
Fix:
1)Adding a small pim_msdp_vrf_iface_up() helper that retries pim_msdp_sock_listen() when pim_iface.c sees the VRF-master interface appear (pim_ifp_create / pim_ifp_up).
2)tear down MSDP listener in pim_msdp_exit()