Pcep more testing by suphawitwanl · Pull Request #256 · opensourcerouting/frr

suphawitwanl · 2026-05-14T08:37:11Z

Fix a crash in pathd when it receives commands in a certain order, add a test that provides more coverage to pathd and pceplib

The validation code crashes, because the validation code is asking for the nb code to assert if the path does not exist yet. Modify the code such that the non-existence of the policy yet is not a crashable event in validation. Signed-off-by: Donald Sharp <sharpd@nvidia.com>

Currently the testing coverage of pathd/pceplib is abysmmally small. Let's add a bit more testing to get greater code coverage of this code. This test suite adds a basic harness to allow for interactions with pceplib as well. Signed-off-by: Donald Sharp <sharpd@nvidia.com>

config.json details Settings: - Strictness 2, status check enabled, re-reviews on every push - Comment types: logic, syntax, best_practices, security Rule structure: Each rule has an id, description, scope, and severity. - Scope: glob patterns that limit which files the rule applies to (e.g. **/*.c, **/*.yang, lib/**) - Severity: high (P0/P1 — must fix), medium (P2 — should fix), low (NOTE — suggestion) Rules cover: - Memory safety: XMALLOC/XCALLOC/XFREE enforcement, no raw malloc/free - String safety: no strcpy/strcat/sprintf, enforce strlcpy/snprintf - Banned functions: no system(), no fork()+exec() - Added popen() to banned-functions rule (same issue with SIGINT to be ignored and break daemon shutdown) - Typesafe containers: reject legacy linklist.h/hash.h, enforce lib/typesafe.h - CLI conventions: DEFPY required, YANG model required for YANGified daemons - YANG/northbound: API-agnostic callbacks, JSON output backed by YANG models - Packet parsing: bounds checking on all network message parsing - RCU safety: lock discipline, no direct XFREE on RCU-protected data - Logging: zlog_* only, debug guarded by CLI flags - Licensing: SPDX headers, GPL compatibility checks - Formatting: include order, whitespace discipline, printfrr usage - Commit messages: subsystem prefix, imperative mood, Signed-off-by - Topotest coverage (P0): blocks merge if daemon code changes without tests/topotests/ changes - User docs (P0): blocks merge if CLI/YANG changes without doc/user/ updates - Mark User docs as Prority 1 checks to run first - Flag doc updates as mandatory for features - Topotest coverage: check for coverage as priority 2 (after used docs) - Flag Topotests coverage as mandatory for features Signed-off-by: Suphawit Wanlaung <suphawitw@netdef.org>

greptile-apps · 2026-05-14T08:43:34Z

Greptile Summary

This PR fixes a crash in pathd's northbound config callback when a policy name is set at the same time as the parent policy is created (VALIDATE fires before APPLY, so the policy isn't yet in the running config), and adds a new topotest that brings the PCEP PCC session up against a lightweight Python PCE simulator and tears it down.

pathd/path_nb_config.c: changes nb_running_get_entry(..., true) to false in pathd_srte_policy_name_modify and adds a policy && guard in the VALIDATE branch — correct fix, but the subsequent APPLY branch still dereferences policy without a NULL check.
tests/topotests/pathd_pcep_topo1/: new topotest directory with a Python PCE simulator (pce_sim.py), FRR config, and a pytest file covering session-up, message-counter verification, standalone pcep_pcc binary, and session-down-on-PCE-death scenarios. One test function deliberately skips show sr-te pcep counters because that command crashes pathd (pre-existing bug, documented inline).

Confidence Score: 3/5

The C change fixes a real crash but leaves a residual NULL dereference risk in the APPLY path; the topotest additions are well-structured with no blocking concerns.

The core bug fix in path_nb_config.c converts a controlled abort (abort_if_not_found=true) into a silent potential NULL dereference on the APPLY code path. In normal operation the policy entry will be present during APPLY, so the crash won't reproduce under typical conditions — but the code no longer has an explicit guard. The topotest files are solid: they use json_cmp, run_and_expect, and the pytest framework correctly, and they cover session lifecycle end-to-end.

pathd/path_nb_config.c — specifically the APPLY branch of pathd_srte_policy_name_modify around the strlcpy call.

Important Files Changed

Filename	Overview
pathd/path_nb_config.c	Bug fix for crash in pathd_srte_policy_name_modify during NB_EV_VALIDATE on a not-yet-applied policy. The VALIDATE path is correctly fixed but the APPLY path now dereferences policy without a NULL guard.
tests/topotests/pathd_pcep_topo1/test_pathd_pcep_topo1.py	New topotest exercising pathd PCEP PCC session lifecycle; uses json_cmp, run_and_expect, and pytest framework correctly. One test deliberately avoids a crashing command, documented with an inline comment.
tests/topotests/pathd_pcep_topo1/pce/pce_sim.py	New minimal PCEP PCE simulator script; correctly implements Open/Keepalive exchange with Stateful+SR TLVs; threaded per-connection model is appropriate for a test helper.
tests/topotests/pathd_pcep_topo1/r1/frr.conf	New FRR configuration for the topotest router; defines SR-TE policy, segment-list, and PCEP PCE config matching the simulator parameters.
tests/topotests/pathd_pcep_topo1/init.py	Empty Python package marker; required by pytest for topotest directory discovery.

Sequence Diagram

sequenceDiagram
    participant NB as FRR Northbound
    participant CB as name_modify callback
    participant RC as Running Config

    Note over NB,RC: BEFORE fix — crash on first-time policy name set
    NB->>CB: NB_EV_VALIDATE
    CB->>RC: "nb_running_get_entry abort=true"
    RC-->>CB: NULL (policy not yet applied)
    CB->>CB: assert/abort crash

    Note over NB,RC: AFTER fix
    NB->>CB: NB_EV_VALIDATE
    CB->>RC: "nb_running_get_entry abort=false"
    RC-->>CB: NULL
    CB->>CB: "policy==NULL so name not fixed, return NB_OK"
    NB->>CB: NB_EV_APPLY
    CB->>RC: "nb_running_get_entry abort=false"
    RC-->>CB: policy ptr (parent already applied)
    CB->>CB: "strlcpy policy->name — no NULL guard"

    Note over NB,CB: topotest PCEP session lifecycle
    participant R1 as FRR r1 pathd PCC
    participant PCE as pce_sim.py

    R1->>PCE: TCP connect port 4189
    R1->>PCE: PCEP Open with SR and Stateful TLVs
    PCE->>R1: PCEP Open with SR and Stateful TLVs
    PCE->>R1: Keepalive ack PCC Open
    R1->>PCE: Keepalive ack PCE Open
    Note over R1,PCE: Session UP verified by test
    loop Periodic keepalives
        R1->>PCE: Keepalive
        PCE->>R1: Keepalive
    end
    Note over PCE: pce_sim killed
    PCE--xR1: TCP closed or dead timer expires
    Note over R1: pcepSessionsConnected drops to 0

Comments Outside Diff (2)

pathd/path_nb_config.c, line 303-307 (link)

Unguarded NULL dereference in NB_EV_APPLY path

nb_running_get_entry(..., false) is now used for both VALIDATE and APPLY events. The VALIDATE path is correctly guarded with if (policy && ...). However, the APPLY path at line 304 calls strlcpy(policy->name, ...) with no NULL check on policy. If nb_running_get_entry returns NULL during APPLY — which, while unlikely in normal flow, can happen if the parent policy's APPLY step was somehow skipped or failed — this silently segfaults instead of the explicit abort-with-diagnostics that the original true flag provided. The proper fix is to look up the entry with true only in the APPLY branch, or to add an explicit NULL check with flog_err + return NB_ERR_RESOURCE before the strlcpy.

Prompt To Fix With AI

This is a comment left during a code review.
Path: pathd/path_nb_config.c
Line: 303-307

Comment:
**Unguarded NULL dereference in NB_EV_APPLY path**

`nb_running_get_entry(..., false)` is now used for both VALIDATE and APPLY events. The VALIDATE path is correctly guarded with `if (policy && ...)`. However, the APPLY path at line 304 calls `strlcpy(policy->name, ...)` with no NULL check on `policy`. If `nb_running_get_entry` returns NULL during APPLY — which, while unlikely in normal flow, can happen if the parent policy's APPLY step was somehow skipped or failed — this silently segfaults instead of the explicit abort-with-diagnostics that the original `true` flag provided. The proper fix is to look up the entry with `true` only in the APPLY branch, or to add an explicit NULL check with `flog_err` + `return NB_ERR_RESOURCE` before the `strlcpy`.

How can I resolve this? If you propose a fix, please make it concise.

tests/topotests/pathd_pcep_topo1/test_pathd_pcep_topo1.py, line 1246-1255 (link)

Known crash in show sr-te pcep counters left unexercised

The comment at lines 1248–1255 documents that show sr-te pcep counters crashes pathd in mt_count_free() on ASAN/development builds via path_pcep_cli.c::path_pcep_cli_show_srte_pcep_counters. The test works around this by using the textual per-session output instead. While the workaround is reasonable for unblocking test landing, the underlying memory-management bug in copy_counter_group() should be tracked and fixed so this test can be strengthened to validate counter values directly.

Prompt To Fix With AI

This is a comment left during a code review.
Path: tests/topotests/pathd_pcep_topo1/test_pathd_pcep_topo1.py
Line: 1246-1255

Comment:
**Known crash in `show sr-te pcep counters` left unexercised**

The comment at lines 1248–1255 documents that `show sr-te pcep counters` crashes pathd in `mt_count_free()` on ASAN/development builds via `path_pcep_cli.c::path_pcep_cli_show_srte_pcep_counters`. The test works around this by using the textual per-session output instead. While the workaround is reasonable for unblocking test landing, the underlying memory-management bug in `copy_counter_group()` should be tracked and fixed so this test can be strengthened to validate counter values directly.

How can I resolve this? If you propose a fix, please make it concise.

Prompt To Fix All With AI

Fix the following 2 code review issues. Work through them one at a time, proposing concise fixes.

---

### Issue 1 of 2
pathd/path_nb_config.c:303-307
**Unguarded NULL dereference in NB_EV_APPLY path**

`nb_running_get_entry(..., false)` is now used for both VALIDATE and APPLY events. The VALIDATE path is correctly guarded with `if (policy && ...)`. However, the APPLY path at line 304 calls `strlcpy(policy->name, ...)` with no NULL check on `policy`. If `nb_running_get_entry` returns NULL during APPLY — which, while unlikely in normal flow, can happen if the parent policy's APPLY step was somehow skipped or failed — this silently segfaults instead of the explicit abort-with-diagnostics that the original `true` flag provided. The proper fix is to look up the entry with `true` only in the APPLY branch, or to add an explicit NULL check with `flog_err` + `return NB_ERR_RESOURCE` before the `strlcpy`.

### Issue 2 of 2
tests/topotests/pathd_pcep_topo1/test_pathd_pcep_topo1.py:1246-1255
**Known crash in `show sr-te pcep counters` left unexercised**

The comment at lines 1248–1255 documents that `show sr-te pcep counters` crashes pathd in `mt_count_free()` on ASAN/development builds via `path_pcep_cli.c::path_pcep_cli_show_srte_pcep_counters`. The test works around this by using the textual per-session output instead. While the workaround is reasonable for unblocking test landing, the underlying memory-management bug in `copy_counter_group()` should be tracked and fixed so this test can be strengthened to validate counter values directly.

_{Reviews (1): Last reviewed commit: "greptile: Add Greptile configuration for..." | Re-trigger Greptile}

suphawitwanl · 2026-05-14T08:45:00Z

@greptileai what is the rules that list in ./greptile

greptile-apps · 2026-05-14T08:45:26Z

There is no .greptile directory or file in the opensourcerouting/frr repository. It doesn't appear to exist.

If you're referring to Greptile's custom review rules, those are typically defined in .greptile/rules.md. You can create one to customize how I review code — see the docs at greptile.com/docs/code-review/custom-standards.

donaldsharp and others added 3 commits May 14, 2026 15:33

suphawitwanl closed this May 14, 2026

suphawitwanl reopened this May 14, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Pcep more testing#256

Pcep more testing#256
suphawitwanl wants to merge 3 commits into
aitest-base-pr-21896from
aitest-sw-pr-21896

suphawitwanl commented May 14, 2026

Uh oh!

greptile-apps Bot commented May 14, 2026 •

edited

Loading

Confidence Score: 3/5

Important Files Changed

Comments Outside Diff (2)

Uh oh!

suphawitwanl commented May 14, 2026

Uh oh!

greptile-apps Bot commented May 14, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Uh oh!

Conversation

suphawitwanl commented May 14, 2026

Uh oh!

greptile-apps Bot commented May 14, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Greptile Summary

Confidence Score: 3/5

Important Files Changed

Sequence Diagram

Comments Outside Diff (2)

Uh oh!

suphawitwanl commented May 14, 2026

Uh oh!

greptile-apps Bot commented May 14, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

greptile-apps Bot commented May 14, 2026 •

edited

Loading