[Debugging] Add lldb data formatter for RigidArray #607

Draft
kastiglione wants to merge 13 commits into apple:main from kastiglione:Debugging-Add-lldb-data-formatter-for-RigidArray

@kastiglione kastiglione commented Mar 19, 2026

This introduces a debugging data formatter for RigidArray using a new feature of LLDB:
[Formatter Bytecode][1]. The scope of this PR is a single initial data type, with the
short and long term plan being to support all types vended by Swift Collections.

The bytecode for this implementation is produced using a Python -> Formatter Bytecode
compiler, contained in the [`formatter_bytecode`][2] module.

The workflow for authoring and shipping the formatter is:

1. Update RigidArray.py as needed
2. Compile RigidArray.py to RigidArray+Formatter.swift (again, `formatter_bytecode`)
3. Compile RigidArray+Formatter.swift as part of the BasicContainers module

By including the bytecode in the binary, lldb can load RigidArray's data formatter
automatically, without any friction that comes with installation, setup, config,
updating, or versioning.

[1]: https://lldb.llvm.org/resources/formatterbytecode.html
[2]: https://github.com/llvm/llvm-project/blob/main/lldb/examples/python/formatter_bytecode.py
@kastiglione kastiglione marked this pull request as draft March 19, 2026 02:10
Author

kastiglione commented Mar 19, 2026

Draft mode until completion of the following:

  • Add script to invoke compilation of Python formatters to formatter bytecode
  • Add separate lldb integration test for verifying the formatter works as expected

To manage the scope of this first PR, I'm leaving the following desirable features for future followups:

  1. An SPM plugin that generates the formatter sources on the fly, so that they don't have to be checked into the project
  2. Running the formatter verification tests in CI

@@ -0,0 +1,38 @@
#if swift(>=6.3)
Member

What do you think about adding a comment at the top of this file with instructions for how to regenerate it from the Python source?
I guess that could even be a feature of the compiler script...

Author

@kastiglione kastiglione Mar 31, 2026

A comment is now added as part of the new generate_formatters.sh script.

Comment on lines +2 to +6
#if os(macOS) || os(iOS) || os(watchOS) || os(tvOS) || os(visionOS)
@section("__DATA_CONST,__lldbformatters")
#else
@section(".lldbformatters")
#endif
Contributor

Should this be using:

  • #if objectFormat(…) from SE-0492 — with section names for MachO, ELF, COFF, and Wasm?

  • #if os(anyAppleOS) when using a Swift 6.4 compiler?

Author

TIL about these. Thank you. I will update it accordingly.

Author

done

Comment on lines +7 to +8
@used
let __BasicContainers___RigidArray______formatter: (UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8) = (
Contributor

Could this use a raw identifier from SE-0451, instead of the replacement underscores?

I was also going to ask about using an InlineArray literal, but then I found your previous discussion.

Author

I will look into whether it would be feasible to allow InlineArray to be used in the future as a constant.

Author

A raw identifier is another good suggestion. The name of this is irrelevant to lldb.

Author

done
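
For readers wondering why the declared type is such a long `(UInt8, UInt8, ...)` list: the bytecode is stored as a fixed-size tuple constant, one element per byte. The following is a minimal Python sketch of how such a declaration could be rendered from a byte string. `swift_tuple_decl` is a hypothetical helper written for illustration and is not the `formatter_bytecode` API; the sample name and bytes are made up, and the name check is deliberately conservative rather than a full statement of the raw identifier rules.

```python
def swift_tuple_decl(name: str, payload: bytes) -> str:
    """Render `payload` as a Swift `let` whose type is a tuple of UInt8.

    Hypothetical helper for illustration; not the formatter_bytecode API.
    """
    if not payload:
        raise ValueError("payload must not be empty")
    # Conservatively reject characters that would break a backtick-quoted name.
    if "`" in name or "\\" in name or "\n" in name:
        raise ValueError("name cannot be used as a raw identifier")
    # One UInt8 in the type and one literal in the value per payload byte.
    tuple_type = ", ".join(["UInt8"] * len(payload))
    tuple_value = ", ".join(f"0x{b:02x}" for b in payload)
    return f"let `{name}`: ({tuple_type}) = ({tuple_value})"


# Made-up name and bytes, for illustration only.
print(swift_tuple_decl("example formatter", bytes([0x01, 0x20, 0xFF])))
```

The type and the value grow together with the payload, which is why the generated line in the diff is so wide.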

kastiglione added a commit to llvm/llvm-project that referenced this pull request Mar 31, 2026
Following feedback from @benrimmington in
apple/swift-collections#607, this changes the
following:

1. Uses `objectFormat()` compiler conditional instead of `os()` (see
"Cross-platform object file format support" in
[SE-0492](https://github.com/swiftlang/swift-evolution/blob/main/proposals/0492-section-control.md#cross-platform-object-file-format-support))
2. Uses a raw identifier for the generated Swift symbol name, instead of
an escaped name (see
[SE-0451](https://github.com/swiftlang/swift-evolution/blob/main/proposals/0451-escaped-identifiers.md))
llvm-sync bot pushed a commit to arm/arm-toolchain that referenced this pull request Mar 31, 2026
…(#189425)

//
// This source file is part of the Swift Collections open source project
//
// Copyright (c) 2024 - 2026 Apple Inc. and the Swift project authors
Member

Suggested change
- // Copyright (c) 2024 - 2026 Apple Inc. and the Swift project authors
+ // Copyright (c) 2026 Apple Inc. and the Swift project authors

Author

done

@section(".lldbformatters")
#endif
@used
let `^BasicContainers[.]RigidArray<.+>$ formatter`: (UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8, UInt8) = (
Member

This will fail to work in single-module mode (COLLECTIONS_SINGLE_MODULE), where the sources are combined into a single module. (The Xcode project inside the Xcode directory is a good way to exercise that. The module name in that configuration is Collections in this repo, but we edit that when we build for adopters inside Apple OSes.)

We could surround the template with #if !COLLECTIONS_SINGLE_MODULE, but that would make debugging code less pleasant for OS engineers, so the right move is probably to duplicate this:

#if COLLECTIONS_SINGLE_MODULE
...
let `^Collections[.]RigidArray<.+>$ formatter`: ... = ...
#else
...
let `^BasicContainers[.]RigidArray<.+>$ formatter`: ... = ...
#endif

That's yucky, but it works. (I don't think we can do #if conditions just around the symbol name... 😞)

Author

We could put the set of known module names into the regex:

^(BasicContainers|Collections)[.]RigidArray<.+>$

We've discussed having the module name be implicit, but I don't yet know if that's doable in practice.
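
The combined pattern can be sanity-checked outside the debugger. The following is a minimal Python sketch, not part of the PR: it uses Python's `re` module, whereas lldb uses its own regex engine, so it only illustrates the intent of the pattern. The sample type names are assumptions chosen for illustration.

```python
import re

# Pattern proposed above: accept either known module name.
PATTERN = re.compile(r"^(BasicContainers|Collections)[.]RigidArray<.+>$")


def matches_formatter(type_name: str) -> bool:
    """Return True if the fully qualified type name matches the pattern."""
    return PATTERN.match(type_name) is not None


# Hypothetical type names, for illustration only.
samples = [
    "BasicContainers.RigidArray<Swift.Int>",
    "Collections.RigidArray<Swift.String>",
    "OtherModule.RigidArray<Swift.Int>",
    "BasicContainers.RigidArray",
]
for name in samples:
    print(name, matches_formatter(name))
```

The first two names match and the last two do not: the alternation limits the module name to the two known values, and `<.+>` requires a non-empty generic argument list.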

Member

That's a much better solution! 👍

Author

Updated

##
## This source file is part of the Swift Collections open source project
##
## Copyright (c) YEARS Apple Inc. and the Swift project authors
Member

Suggested change
- ## Copyright (c) YEARS Apple Inc. and the Swift project authors
+ ## Copyright (c) 2026 Apple Inc. and the Swift project authors

Author

done

Comment on lines +22 to +24
curl -fsSL \
"https://raw.githubusercontent.com/llvm/llvm-project/$COMPILER_VERSION/lldb/examples/python/formatter_bytecode.py" \
-o "$compiler"
Member

I have reservations about downloading scripts sight unseen for execution on engineers' machines. Can we just copy the formatter directly into this repo, like we used to do with gyb?

Author

Updated

When `COLLECTIONS_SINGLE_MODULE` is set, a single `Collections` module is
built. Type names need to match accordingly.
sriyalamar pushed a commit to sriyalamar/cpullvm-toolchain that referenced this pull request Apr 7, 2026

(cherry picked from commit b66d98a)
4 participants