[caclmgrd]: Preserve externally-owned INPUT rules across CACL rebuild#396
[caclmgrd]: Preserve externally-owned INPUT rules across CACL rebuild#396Xichen96 wants to merge 1 commit into
Conversation
|
/azp run |
|
Azure Pipelines successfully started running 1 pipeline(s). |
149cbfe to
68f7e81
Compare
|
/azp run |
|
Azure Pipelines successfully started running 1 pipeline(s). |
68f7e81 to
7a1650c
Compare
|
/azp run |
|
Azure Pipelines successfully started running 1 pipeline(s). |
There was a problem hiding this comment.
Pull request overview
This PR updates caclmgrd (the host control-plane ACL manager) to preserve specific externally-owned iptables INPUT rules across the daemon’s flush-and-rebuild cycle, preventing those kernel-only rules (e.g., the dhcp_server docker0 syslog exception) from being silently removed after a CACL update.
Changes:
- Add an allowlist (
IGNORED_INPUT_RULE_COMMENTS) and logic to snapshot matching commentedINPUTrules before a rebuild and re-add them early in the rebuilt chain. - Insert preserved rules into the generated iptables command sequence before any DROP rules are appended.
- Add unit tests covering snapshot filtering, graceful failure, ordering relative to the catch-all DROP, and namespace scoping.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
scripts/caclmgrd |
Snapshots allowlisted commented INPUT rules pre-flush and re-inserts them early during CACL rebuild; adds helper get_preserved_input_rules(). |
tests/caclmgrd/caclmgrd_preserved_input_rules_test.py |
Adds unit tests validating preservation behavior, ordering before DROP, failure handling, and host-namespace-only behavior. |
| for line in output.splitlines(): | ||
| tokens = line.split() | ||
| if len(tokens) < 2 or tokens[0] != '-A' or tokens[1] != 'INPUT': | ||
| continue | ||
| if '--comment' not in tokens: | ||
| continue | ||
| comment = tokens[tokens.index('--comment') + 1].strip('"') | ||
| if comment not in self.IGNORED_INPUT_RULE_COMMENTS: | ||
| continue | ||
| preserved.append(self.iptables_cmd_ns_prefix[namespace] + ['iptables'] + tokens) |
There was a problem hiding this comment.
Good catch — switched to shlex.split() with defensive --comment extraction so comments containing spaces (quoted by iptables -S) stay byte-identical on re-add and the owning service's -C/-D keep matching. Added a unit test covering a spaced comment. Fixed in the latest push.
caclmgrd owns the host INPUT chain and flushes/rebuilds it from CONFIG_DB on every control-plane ACL change. Rules installed by OTHER services live only in the kernel and are silently wiped by that rebuild. The concrete case is the dhcp_server container docker0 syslog (UDP 514) ACCEPT (sonic-net/sonic-buildimage#27584): after any CACL update it is dropped until dhcp_server restarts, breaking syslog from the bridge-mode container. Make caclmgrd not clobber externally-owned INPUT rules. caclmgrd never comments its own rules, so an iptables --comment is an unambiguous marker of external ownership. Before the flush, snapshot any INPUT rule whose --comment is in an explicit ignore list (IGNORED_INPUT_RULE_COMMENTS, currently just "dhcp_server_syslog") and re-add it verbatim early in the rebuild -- before the ip2me and catch-all DROPs. The INPUT policy is ACCEPT throughout the rebuild and the catch-all DROP is appended last, so there is no window where the rule is missing while a DROP exists; no syslog is lost. The owning service keeps full lifecycle control: caclmgrd never creates or deletes the rule, it only mirrors it, and re-adds it byte-identically so the container own -C/-D keep matching. Host namespace / IPv4 only. No change to the dhcp_server container. Fixes sonic-net/sonic-buildimage#27584 Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> Signed-off-by: Xichen96 <lukelin0907@gmail.com>
7a1650c to
80744ee
Compare
|
/azp run |
|
Azure Pipelines successfully started running 1 pipeline(s). |
|
@Bojun-Feng this addresses sonic-net/sonic-buildimage#27584 with a different approach than #390 — instead of caclmgrd regenerating the rule from FEATURE state, it preserves the externally-owned INPUT rule (identified by its iptables cc @tirupatihemanth @yejianquan @vmittal-msft @Sourabh-Kumar7 — review requests sent, feedback welcome. |
|
@StormLiangMS adding you for visibility — caclmgrd fix for the |
|
Hi @Xichen96 I understand that we are trying to preserve specific commented filtering rules here. Since dhcp container is in bridge networking and we will always need this rule for dhcp syslog traffic - can we make this as part of default CONFIG_DB instead of adding the rule preservation logic in caclmgrd? The pov I am looking at is dhcp_server should modify kernel ip table rules through CONFIG_DB and not directly. |
Why I did it
Fixes sonic-net/sonic-buildimage#27584
caclmgrdflushes and rebuilds the hostINPUTchain fromCONFIG_DBon every control-plane ACL (CACL) change. Thedhcp_servercontainer (the only bridge-mode container) installs adocker0syslog exception directly in the kernel:That rule is not in
CONFIG_DB, so the rebuild wipes it and the catch-all DROP then silently dropsdhcp_serversyslog until the container restarts.Work item tracking
How I did it
Make
caclmgrdpreserve INPUT rules it does not own instead of clobbering them:caclmgrdnever comments its own rules, so an iptables--commentmarks an externally-owned rule. An explicit ignore listIGNORED_INPUT_RULE_COMMENTS(currentlydhcp_server_syslog) says which to keep.iptables -S INPUT) and re-add them verbatim early in the rebuild — before any DROP.Design notes (concerns considered)
caclmgrdonly mirrors these rules (never creates/deletes them), sodhcp_serverstays the sole owner via its existing start/stop add/remove. No container change.ACCEPTduring the rebuild and the catch-all DROP is appended last, while the preserved rule is re-added early, so a DROP never coexists with the rule missing. Verified step-by-step that syslog is accepted at every point of the rebuild.iptables -S), so the container's own-C/-Dstill match it after a rebuild. Verified.caclmgrd-only: it never writesCONFIG_DB, signals, or restartsdhcp_server. (A rare simultaneous flush/container-update can momentarily duplicate the rule; harmless and self-healing.)How to verify it
iptables -S INPUT | grep dhcp_server_syslog— rule present.sonic-db-cli CONFIG_DB HSET "ACL_TABLE|TEST_CACL" "policy_desc" "test" "type" "CTRLPLANE" "stage" "ingress" "services@" "SSH"iptables -S INPUT | grep dhcp_server_syslog— still present, above the catch-all DROP (previously it disappeared).pytest tests/caclmgrd/caclmgrd_preserved_input_rules_test.pyWhich release branch to backport (provide reason below if selected)
The rule and the CACL flush-and-rebuild both exist on 202511 (the rule was added by sonic-net/sonic-buildimage#26637, included in 202511), so the same drop happens there.
Description for the changelog
[caclmgrd] Preserve externally-owned INPUT rules (e.g. the dhcp_server docker0 syslog rule, tagged by an iptables comment) across the control-plane ACL flush-and-rebuild.