Pcep more testing #256
The validation code crashes because it asks the nb code to assert if the path does not exist yet. Modify the code so that a policy that does not exist yet is not a crashable event in validation.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Currently the testing coverage of pathd/pceplib is abysmally small. Let's add a bit more testing to get greater code coverage of this code. This test suite also adds a basic harness to allow for interactions with pceplib.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
config.json details
Settings:
- Strictness 2, status check enabled, re-reviews on every push
- Comment types: logic, syntax, best_practices, security
Rule structure:
Each rule has an id, description, scope, and severity.
- Scope: glob patterns that limit which files the rule applies to
(e.g. **/*.c, **/*.yang, lib/**)
- Severity: high (P0/P1 — must fix), medium (P2 — should fix), low
(NOTE — suggestion)
Rules cover:
- Memory safety: XMALLOC/XCALLOC/XFREE enforcement, no raw malloc/free
- String safety: no strcpy/strcat/sprintf, enforce strlcpy/snprintf
- Banned functions: no system(), no fork()+exec()
- Added popen() to banned-functions rule (same issue: SIGINT gets
ignored, which breaks daemon shutdown)
- Typesafe containers: reject legacy linklist.h/hash.h, enforce lib/typesafe.h
- CLI conventions: DEFPY required, YANG model required for YANGified daemons
- YANG/northbound: API-agnostic callbacks, JSON output backed by YANG models
- Packet parsing: bounds checking on all network message parsing
- RCU safety: lock discipline, no direct XFREE on RCU-protected data
- Logging: zlog_* only, debug guarded by CLI flags
- Licensing: SPDX headers, GPL compatibility checks
- Formatting: include order, whitespace discipline, printfrr usage
- Commit messages: subsystem prefix, imperative mood, Signed-off-by
- Topotest coverage (P0): blocks merge if daemon code changes without
tests/topotests/ changes
- User docs (P0): blocks merge if CLI/YANG changes without doc/user/ updates
- Mark User docs as Priority 1 checks to run first
- Flag doc updates as mandatory for features
- Topotest coverage: check for coverage as priority 2 (after user docs)
- Flag Topotests coverage as mandatory for features
Signed-off-by: Suphawit Wanlaung <suphawitw@netdef.org>
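The rule structure described in the commit message above (id, description, scope globs, severity) can be pictured with a small sketch. This is not Greptile's implementation or the actual contents of config.json: the rule ids, the scopes assigned to each rule, and the use of Python's `fnmatch` semantics for glob matching are all assumptions made for illustration.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase

# Severity labels as listed in the commit message.
SEVERITY_LABEL = {"high": "P0/P1", "medium": "P2", "low": "NOTE"}


@dataclass(frozen=True)
class Rule:
    id: str            # hypothetical rule id, for illustration only
    description: str
    scope: tuple       # glob patterns limiting which files the rule applies to
    severity: str      # "high", "medium" or "low"


def applies_to(rule: Rule, path: str) -> bool:
    """True if any scope pattern matches the path (fnmatch semantics, assumed)."""
    return any(fnmatchcase(path, pattern) for pattern in rule.scope)


# Hypothetical rules; the scope and severity chosen for each are assumptions.
RULES = [
    Rule("memory-safety", "Use XMALLOC/XCALLOC/XFREE, no raw malloc/free",
         ("**/*.c",), "high"),
    Rule("typesafe-containers", "Prefer lib/typesafe.h over linklist.h/hash.h",
         ("lib/**",), "medium"),
]


def rules_for(path: str) -> list:
    """Return (rule id, severity label) pairs for every rule whose scope matches."""
    return [(r.id, SEVERITY_LABEL[r.severity]) for r in RULES if applies_to(r, path)]
```

With `fnmatch`, the pattern `**/*.c` requires at least one directory separator, so a top-level `foo.c` would not match; a real glob engine may treat `**` differently.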
Greptile Summary
This PR fixes a crash in pathd's northbound config callback when a policy
| Filename | Overview |
|---|---|
| pathd/path_nb_config.c | Bug fix for crash in pathd_srte_policy_name_modify during NB_EV_VALIDATE on a not-yet-applied policy. The VALIDATE path is correctly fixed but the APPLY path now dereferences policy without a NULL guard. |
| tests/topotests/pathd_pcep_topo1/test_pathd_pcep_topo1.py | New topotest exercising pathd PCEP PCC session lifecycle; uses json_cmp, run_and_expect, and pytest framework correctly. One test deliberately avoids a crashing command, documented with an inline comment. |
| tests/topotests/pathd_pcep_topo1/pce/pce_sim.py | New minimal PCEP PCE simulator script; correctly implements Open/Keepalive exchange with Stateful+SR TLVs; threaded per-connection model is appropriate for a test helper. |
| tests/topotests/pathd_pcep_topo1/r1/frr.conf | New FRR configuration for the topotest router; defines SR-TE policy, segment-list, and PCEP PCE config matching the simulator parameters. |
| tests/topotests/pathd_pcep_topo1/__init__.py | Empty Python package marker; required by pytest for topotest directory discovery. |
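The table mentions that the new topotest relies on `json_cmp` and `run_and_expect`. The general pattern, polling a command until its JSON output contains an expected subset, might look roughly like the sketch below. The helpers `is_subset` and `poll_until` are hypothetical stand-ins written for this illustration; they are not the FRR topotest library functions and do not reproduce their exact signatures or return values.

```python
import time


def is_subset(expected, actual) -> bool:
    """True if every key/value in expected also appears in actual (recursively)."""
    if isinstance(expected, dict):
        return (isinstance(actual, dict)
                and all(k in actual and is_subset(v, actual[k])
                        for k, v in expected.items()))
    return expected == actual


def poll_until(fetch, expected, count=20, wait=0.0):
    """Call fetch() up to count times until its result contains expected.

    Returns (success, last_result) so the caller can report what was seen.
    """
    result = None
    for _ in range(count):
        result = fetch()
        if is_subset(expected, result):
            return True, result
        time.sleep(wait)
    return False, result
```

A test could pass a `fetch` callable that runs a show command and decodes its JSON, then assert on the returned success flag while printing the last result on failure.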
Sequence Diagram
sequenceDiagram
participant NB as FRR Northbound
participant CB as name_modify callback
participant RC as Running Config
Note over NB,RC: BEFORE fix — crash on first-time policy name set
NB->>CB: NB_EV_VALIDATE
CB->>RC: "nb_running_get_entry abort=true"
RC-->>CB: NULL (policy not yet applied)
CB->>CB: assert/abort crash
Note over NB,RC: AFTER fix
NB->>CB: NB_EV_VALIDATE
CB->>RC: "nb_running_get_entry abort=false"
RC-->>CB: NULL
CB->>CB: "policy==NULL so name not fixed, return NB_OK"
NB->>CB: NB_EV_APPLY
CB->>RC: "nb_running_get_entry abort=false"
RC-->>CB: policy ptr (parent already applied)
CB->>CB: "strlcpy policy->name — no NULL guard"
Note over NB,CB: topotest PCEP session lifecycle
participant R1 as FRR r1 pathd PCC
participant PCE as pce_sim.py
R1->>PCE: TCP connect port 4189
R1->>PCE: PCEP Open with SR and Stateful TLVs
PCE->>R1: PCEP Open with SR and Stateful TLVs
PCE->>R1: Keepalive ack PCC Open
R1->>PCE: Keepalive ack PCE Open
Note over R1,PCE: Session UP verified by test
loop Periodic keepalives
R1->>PCE: Keepalive
PCE->>R1: Keepalive
end
Note over PCE: pce_sim killed
PCE--xR1: TCP closed or dead timer expires
Note over R1: pcepSessionsConnected drops to 0
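To make the Open/Keepalive exchange in the diagram concrete, here is a minimal sketch of PCEP message framing, following the common header and OPEN object layouts in RFC 5440. It is not the code in pce_sim.py, and it deliberately omits the Stateful and SR capability TLVs that the simulator sends; the function names are chosen for this illustration only.

```python
import struct

PCEP_VERSION = 1
MSG_OPEN = 1
MSG_KEEPALIVE = 2
OPEN_OBJECT_CLASS = 1
OPEN_OBJECT_TYPE = 1


def common_header(msg_type: int, length: int) -> bytes:
    """4-byte common header: version (3 bits) + flags (5 bits), type, total length."""
    return struct.pack("!BBH", PCEP_VERSION << 5, msg_type, length)


def keepalive() -> bytes:
    """A Keepalive message consists of the common header only."""
    return common_header(MSG_KEEPALIVE, 4)


def open_message(keepalive_s: int, deadtimer_s: int, sid: int) -> bytes:
    """Open message carrying a single OPEN object with no TLVs."""
    # OPEN object body: version/flags, keepalive, dead timer, session id.
    body = struct.pack("!BBBB", PCEP_VERSION << 5, keepalive_s, deadtimer_s, sid)
    # Object header: class, object type in the high 4 bits, length incl. header.
    obj = struct.pack("!BBH", OPEN_OBJECT_CLASS, OPEN_OBJECT_TYPE << 4,
                      4 + len(body)) + body
    return common_header(MSG_OPEN, 4 + len(obj)) + obj


def parse_header(data: bytes):
    """Return (version, msg_type, length) decoded from the first 4 bytes."""
    ver_flags, msg_type, length = struct.unpack("!BBH", data[:4])
    return ver_flags >> 5, msg_type, length
```

A simulator built this way would read 4 bytes from the TCP socket, call `parse_header`, then read the remaining `length - 4` bytes before replying with its own Open and a Keepalive.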
Comments Outside Diff (2)
- pathd/path_nb_config.c, line 303-307: Unguarded NULL dereference in NB_EV_APPLY path
`nb_running_get_entry(..., false)` is now used for both VALIDATE and APPLY events. The VALIDATE path is correctly guarded with `if (policy && ...)`. However, the APPLY path at line 304 calls `strlcpy(policy->name, ...)` with no NULL check on `policy`. If `nb_running_get_entry` returns NULL during APPLY (unlikely in normal flow, but possible if the parent policy's APPLY step was somehow skipped or failed), this silently segfaults instead of producing the explicit abort-with-diagnostics that the original `true` flag provided. The proper fix is to look up the entry with `true` only in the APPLY branch, or to add an explicit NULL check with `flog_err` + `return NB_ERR_RESOURCE` before the `strlcpy`.
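The first suggested fix, a strict lookup only in the APPLY branch, can be modelled in a few lines. This is a toy Python model of the control flow, not FRR code: `get_entry` stands in for `nb_running_get_entry`, the exception stands in for the abort, the running config is a plain dict, and the "name already fixed" validation condition is an assumption based on the sequence diagram.

```python
NB_EV_VALIDATE = "validate"
NB_EV_APPLY = "apply"
NB_OK = "ok"
NB_ERR_VALIDATION = "err-validation"


class MissingEntry(Exception):
    """Stands in for the abort triggered by a strict lookup."""


def get_entry(running: dict, key: str, abort_if_not_found: bool):
    """Model of a running-config lookup with an optional hard failure."""
    entry = running.get(key)
    if entry is None and abort_if_not_found:
        raise MissingEntry(key)
    return entry


def name_modify(event: str, running: dict, key: str, new_name: str) -> str:
    if event == NB_EV_VALIDATE:
        # The policy may not be in the running config yet: tolerate that.
        policy = get_entry(running, key, abort_if_not_found=False)
        if policy is not None and policy["name"] not in (None, new_name):
            return NB_ERR_VALIDATION  # assumed rule: an existing name is fixed
        return NB_OK
    if event == NB_EV_APPLY:
        # A missing entry here indicates a real bug, so fail loudly.
        policy = get_entry(running, key, abort_if_not_found=True)
        policy["name"] = new_name
    return NB_OK
```

The point of the split is that validation stays crash-free for a first-time policy, while apply keeps a diagnosable failure instead of a bare NULL dereference.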
- tests/topotests/pathd_pcep_topo1/test_pathd_pcep_topo1.py, line 1246-1255: Known crash in `show sr-te pcep counters` left unexercised
The comment at lines 1248–1255 documents that `show sr-te pcep counters` crashes pathd in `mt_count_free()` on ASAN/development builds via `path_pcep_cli.c::path_pcep_cli_show_srte_pcep_counters`. The test works around this by using the textual per-session output instead. While the workaround is reasonable for unblocking test landing, the underlying memory-management bug in `copy_counter_group()` should be tracked and fixed so this test can be strengthened to validate counter values directly.
Reviews (1): Last reviewed commit: "greptile: Add Greptile configuration for..."
@greptileai what are the rules listed in ./greptile?
There is no If you're referring to Greptile's custom review rules, those are typically defined in
Fix a crash in pathd when it receives commands in a certain order, and add a test that provides more coverage of pathd and pceplib.