Skip to content

sonic-swss-common: port to libyang3#973

Merged
lguohan merged 3 commits into
sonic-net:masterfrom
bradh352:bradh352/libyang3
May 27, 2026
Merged

sonic-swss-common: port to libyang3#973
lguohan merged 3 commits into
sonic-net:masterfrom
bradh352:bradh352/libyang3

Conversation

@bradh352

@bradh352 bradh352 commented Feb 15, 2025

Copy link
Copy Markdown
Contributor

Port from libyang1 to libyang3.

This PR currently integrates #1196 due to libyang3 being impacted by the race condition (its unrelated but seems to affect timing)

Main Tracking PR here: sonic-net/sonic-buildimage#22385

Fixes sonic-net/sonic-buildimage#22385

@mssonicbld

Copy link
Copy Markdown
Collaborator

/azp run

@azure-pipelines

Copy link
Copy Markdown
Azure Pipelines successfully started running 1 pipeline(s).

Comment thread common/defaultvalueprovider.h Outdated
@bradh352 bradh352 force-pushed the bradh352/libyang3 branch from b08a922 to 42f329d Compare March 1, 2025 19:57
@mssonicbld

Copy link
Copy Markdown
Collaborator

/azp run

@azure-pipelines

Copy link
Copy Markdown
Azure Pipelines successfully started running 1 pipeline(s).

bradh352 added a commit to bradh352/sonic-buildimage that referenced this pull request Mar 6, 2025
@qiluo-msft qiluo-msft requested a review from liuh-80 April 18, 2025 18:04
@mssonicbld

Copy link
Copy Markdown
Collaborator

/azp run

@azure-pipelines

Copy link
Copy Markdown
Azure Pipelines successfully started running 1 pipeline(s).

@mssonicbld

Copy link
Copy Markdown
Collaborator

/azp run

@azure-pipelines

Copy link
Copy Markdown
Azure Pipelines successfully started running 1 pipeline(s).

@bradh352 bradh352 force-pushed the bradh352/libyang3 branch from 0fc8d81 to ce8d9fa Compare May 18, 2026 14:27
@mssonicbld

Copy link
Copy Markdown
Collaborator

/azp run

@azure-pipelines

Copy link
Copy Markdown
Azure Pipelines successfully started running 1 pipeline(s).

@mssonicbld

Copy link
Copy Markdown
Collaborator

/azp run

@azure-pipelines

Copy link
Copy Markdown
Azure Pipelines successfully started running 1 pipeline(s).

@mssonicbld

Copy link
Copy Markdown
Collaborator

/azp run

@azure-pipelines

Copy link
Copy Markdown
Azure Pipelines successfully started running 1 pipeline(s).

@bradh352 bradh352 force-pushed the bradh352/libyang3 branch from 5a5cf68 to abc0cab Compare May 22, 2026 16:43
@mssonicbld

Copy link
Copy Markdown
Collaborator

/azp run

@azure-pipelines

Copy link
Copy Markdown
Azure Pipelines successfully started running 1 pipeline(s).

@mssonicbld

Copy link
Copy Markdown
Collaborator

/azp run

@azure-pipelines

Copy link
Copy Markdown
Azure Pipelines successfully started running 1 pipeline(s).

@bradh352 bradh352 force-pushed the bradh352/libyang3 branch from a86fda4 to 3ad009f Compare May 22, 2026 18:28
@mssonicbld

Copy link
Copy Markdown
Collaborator

/azp run

@azure-pipelines

Copy link
Copy Markdown
Azure Pipelines successfully started running 1 pipeline(s).

@mssonicbld

Copy link
Copy Markdown
Collaborator

/azp run

@azure-pipelines

Copy link
Copy Markdown
Azure Pipelines successfully started running 1 pipeline(s).

bradh352 and others added 3 commits May 22, 2026 17:09
…te CI

Port the YANG schema-walking code in defaultvalueprovider.{h,cpp} from
libyang1 to libyang3. sonic-swss-common is a submodule whose libyang
dependency is pinned by the parent project (sonic-buildimage), so the
source needs to compile against either version. Use LY_ARRAY_COUNT
(defined only by libyang3) as a preprocessor sentinel:

  * Trivial spelling/accessor differences (node types, ->child,
    leaf->dflt extraction, module->data layout, ly_ctx_destroy,
    ly_ctx_load_module) are hidden behind SWSS_LYS_* macros so the
    call sites stay near-identical to the libyang1 source.
  * Structurally divergent code (list-key extraction via lysc_is_key,
    leaf-list default values, the ly_ctx_new return-by-out-param
    signature) still needs explicit #ifdef branches.

SWIG's preprocessor cannot see <libyang/libyang.h>, so when it parses
defaultvalueprovider.h it always takes the libyang1 branch of the
version shim and generates wrappers that fail to compile against
libyang3 headers. `%ignore swss::DefaultValueHelper` in
pyext/swsscommon.i suppresses those wrappers (the helper is internal —
no Python or Go caller uses it) and keeps the generated bindings
version-agnostic.

CI changes wired up for the libyang3 packages:

  * Switch the .deb download/install patterns from libyang1
    (libyang_1.0*, libyang-*_1.0*, libyang-cpp_*, python3-yang_*) to
    libyang3 (libyang3_*, libyang-dev_3*) across azure-pipelines.yml
    and the build/sairedis/swss/test-docker templates so CI pulls the
    libyang3 artifacts the parent project produces.
  * In the amd64/ubuntu-22.04 host job, replace the python3-libyang
    .deb install with `pip install libyang==3.3.0`. The bookworm-
    built python3-libyang_3.1.0 .deb pins python3 (>= 3.11~,
    << 3.12) so it only installs on systems with Python 3.11; the
    vmImage ships Python 3.10. PyPI's libyang ships sdist only, so
    pip has to build the CFFI extension — use --no-build-isolation
    with apt's python3-cffi to sidestep a cffi version-mismatch
    Exception that jammy's pip 22.0.2 build-isolation env triggers
    (newer cffi in the overlay vs. the system /usr/lib/python3/
    dist-packages/_cffi_backend.so that still wins on sys.path).
    3.3.0 is the last 3.x release and includes patches we
    upstreamed against 3.1.0.
  * Drop the amd64/ubuntu-20.04 job, which has been pinned to
    `condition: false` (i.e. never runs) for some time.

Signed-off-by: Brad House <brad@brad-house.com>
ZmqServer dispatches incoming messages on its poll thread by looking up
a ZmqMessageHandler pointer in a (db, table) map populated by
ZmqConsumerStateTable's constructor via registerMessageHandler() and
calling handler->handleReceivedData(). There was no symmetric
removeMessageHandler and no ZmqConsumerStateTable destructor — so when
a consumer went out of scope, the map kept the now-dangling pointer.

That bites in test code that declares its ZmqServer before its
ZmqConsumerStateTable (the natural order — server has to exist to be
passed by reference into the consumer's constructor). Reverse-order
destruction then runs the consumer's destructor first, which destroys
m_selectableEvent and closes the underlying eventfd, while the server's
mqPollThread is still running and may be partway through a dispatch
that ends in m_selectableEvent.notify(). write() on the closed fd
returns EBADF, SelectableEvent::notify() throws
runtime_error("write failed"), the exception escapes the poll thread,
and std::terminate aborts the process — visible in CI as a crash
attributed to whichever test happened to be RUNing next
(ZmqServerLazzyBind in practice, since gtest prints [ RUN ] before the
test body executes).

The fix moves the handler map and its mutex out of ZmqServer into a
small, separately-allocated ZmqHandlerRegistry class. The server and
every registered consumer co-own a std::shared_ptr<ZmqHandlerRegistry>,
so:

  * The mutex is held across handler dispatch, so the consumer's
    destructor (which calls removeHandler() on the registry) blocks
    until any in-flight callback into it has returned — after which
    its other members can safely destruct.

  * The registry survives as long as either the server OR any
    registered handler still holds a reference. Destruction order
    between server and consumer no longer matters: whichever goes
    away last leaves a live registry behind for the other's
    destructor to use without touching freed memory.

The server's public surface — registerMessageHandler() and the new
removeMessageHandler() — is unchanged in signature; both delegate to
the registry. A new (internal) ZmqServer::getHandlerRegistry()
accessor returns the shared_ptr so ZmqMessageHandler implementations
can co-own the registry.

ZmqConsumerStateTable now caches its db name as std::string at
construction and stores std::shared_ptr<ZmqHandlerRegistry> instead of
ZmqServer&, so its destructor never dereferences the DBConnector or
the ZmqServer it was created with — only the shared registry.

Handler callbacks still must not themselves call register/removeHandler
on the registry (would self-deadlock); ZmqConsumerStateTable::
handleReceivedData only enqueues and notifies, so the restriction is
harmless in tree. Documented on the ZmqHandlerRegistry class.

Signed-off-by: Brad House <bhouse@nexthop.ai>
sonic-swss commit e6c482ca ("...") added -lnexthopgroup to fpmsyncd's
link line, expecting libnexthopgroup_*.deb to already be installed
when the swss build runs. sonic-swss-common's downstream "Compile
sonic swss" stage installs everything it needs from the common-lib
artifact, but its pattern list never picked up the new package, so
the link now fails with `cannot find -lnexthopgroup`. Add the
libnexthopgroup pattern next to the other Azure.sonic-buildimage.
common_libs entries in build-swss-template.yml so dpkg picks it up
in the existing install step.

Signed-off-by: Brad House <brad@brad-house.com>
@bradh352 bradh352 force-pushed the bradh352/libyang3 branch from c558f41 to 608472e Compare May 22, 2026 21:10
@mssonicbld

Copy link
Copy Markdown
Collaborator

/azp run

@azure-pipelines

Copy link
Copy Markdown
Azure Pipelines successfully started running 1 pipeline(s).

@bhouse-nexthop

Copy link
Copy Markdown
Contributor

/azpw run

@mssonicbld

Copy link
Copy Markdown
Collaborator

⚠️ Notice: /azpw run only runs failed jobs now. If you want to trigger a whole pipline run, please rebase your branch or close and reopen the PR.
💡 Tip: You can also use /azpw retry to retry failed jobs directly.

Retrying failed(or canceled) jobs...

@mssonicbld

Copy link
Copy Markdown
Collaborator

Retrying failed(or canceled) stages in build 1120488:

✅Stage Test:

  • Job vstest: retried.

@lguohan

lguohan commented May 27, 2026

Copy link
Copy Markdown
Contributor

in favor of #1194 , this is simplier to maintain.

@lguohan lguohan merged commit 3595131 into sonic-net:master May 27, 2026
17 of 19 checks passed
prsunny pushed a commit to sonic-net/sonic-swss that referenced this pull request May 28, 2026
* Download libyang3 in the build pipeline
What I did
When building in the PR check pipeline, download and install libyang3.

Why I did it
sonic-net/sonic-swss-common#973 changed swss-common to depend on libyang3
lolyu pushed a commit to sonic-net/sonic-sairedis that referenced this pull request May 29, 2026
…pgroup (#1916)

Approach
What is the motivation for this PR?
sonic-net/sonic-swss-common#973 changed swss-common to depend on libyang3

Work item tracking
Microsoft ADO (number only):
How did you do it?
When building in the PR check pipeline, download and install libyang3. A similar change was made for sonic-swss: sonic-net/sonic-swss#4618

How did you verify/test it?
Any platform specific information?
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Enhancement: Upgrade to libyang3

6 participants