feat(stdlib): Rewrite toString to use a layout tree#2385

Open
spotandjake wants to merge 6 commits into grain-lang:oscar/gc-rebased from spotandjake:spotandjake/toString

spotandjake (Member) commented Apr 29, 2026

This PR refactors toString to improve the formatting of printed items. Previously, we naively concatenated strings while stringifying. While that approach was relatively performant and very compact, it meant we couldn't support high-quality formatting akin to how we format Grain code itself. To solve this, I ported the doc printing engine used in the formatter into Grain and rewrote our toString implementation on top of it.
[Image: Screenshot 2026-04-29 at 5:09:02 PM]

More information on the implementation itself and design decisions can be found here: https://github.com/spotandjake/Grain-toString/tree/spotandjake/gc, along with some more in-depth notes on both the size and performance changes.
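
To illustrate the difference between naive concatenation and a layout tree, here is a minimal sketch of a Wadler-style doc printer. It is written in Python rather than Grain purely for illustration, and every name in it (`Text`, `Line`, `Concat`, `Indent`, `Group`, `fits`, `render`, `list_doc`) is hypothetical and not the API of doc.gr. It also simplifies the fit check to look only at the group itself, not at whatever follows it on the line.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Text:
    value: str


@dataclass(frozen=True)
class Line:
    # Text emitted when the enclosing group is laid out flat;
    # in break mode a newline plus the current indent is emitted instead.
    flat: str = " "


@dataclass(frozen=True)
class Concat:
    parts: Tuple["Doc", ...]


@dataclass(frozen=True)
class Indent:
    amount: int
    doc: "Doc"


@dataclass(frozen=True)
class Group:
    doc: "Doc"


Doc = Union[Text, Line, Concat, Indent, Group]


def fits(width, doc):
    """Return True if `doc`, laid out flat, needs at most `width` columns."""
    stack = [doc]
    while stack:
        if width < 0:
            return False
        d = stack.pop()
        if isinstance(d, Text):
            width -= len(d.value)
        elif isinstance(d, Line):
            width -= len(d.flat)
        elif isinstance(d, Concat):
            stack.extend(reversed(d.parts))
        else:  # Indent or Group: only the child matters in flat mode
            stack.append(d.doc)
    return width >= 0


def render(doc, width=80):
    out, column = [], 0
    stack = [(0, False, doc)]  # (indent, flat mode?, doc)
    while stack:
        indent, flat, d = stack.pop()
        if isinstance(d, Text):
            out.append(d.value)
            column += len(d.value)
        elif isinstance(d, Line):
            if flat:
                out.append(d.flat)
                column += len(d.flat)
            else:
                out.append("\n" + " " * indent)
                column = indent
        elif isinstance(d, Concat):
            stack.extend((indent, flat, p) for p in reversed(d.parts))
        elif isinstance(d, Indent):
            stack.append((indent + d.amount, flat, d.doc))
        else:  # Group: decide once whether the whole group fits on this line
            stack.append((indent, flat or fits(width - column, d.doc), d.doc))
    return "".join(out)


def list_doc(items):
    """Build a doc for `[a, b, c]` that breaks one item per line if too wide."""
    parts = []
    for i, item in enumerate(items):
        if i > 0:
            parts += [Text(","), Line()]
        parts.append(Text(item))
    inner = Indent(2, Concat((Line(""), *parts)))
    return Group(Concat((Text("["), inner, Line(""), Text("]"))))


def naive_list(items):
    """The concatenation approach: compact, but with no layout decisions."""
    return "[" + ", ".join(items) + "]"
```

With a wide enough line, `render(list_doc(["1", "2", "3"]))` produces the same text as `naive_list`. With a narrow `width`, the same tree renders one item per line, indented by two spaces, which plain concatenation cannot do without re-walking the value.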

A few more notable things:

  • When testing as a library, I was only seeing about a 40% increase in the size of Hello World.
    • Our smallest Grain module grew by about 100 bytes. Looking at the wasm itself, this seems to be related to the builders in doc.gr; these could be optimized away, but I don't think our current passes handle them very well.
  • Performance for small prints is unchanged; however, printing very long lists (around 750k items) is slightly slower than before.
    • I tested this in Wasmtime, and most of the time seemed to be spent in GC. I think this is much less of an issue in V8 and will improve in Wasmtime in the future.
    • Most of the time, users are going to be printing small items or singular large dumps when debugging. The performance I was seeing seemed very acceptable under these workloads.
  • This refactor adds a few very useful libraries to the runtime: doc, miniBuffer, and vector, all of which are generic enough to be used elsewhere.
  • While I did optimize the doc library for Grain, I didn't change the API from the formatter; this means we can use it in other places, such as improving JSON stringification in the future.
  • Each datatype exposes a separate printer, which will allow us to provide more specific implementations or, in the future, maybe have the compiler monomorphize things like print("Hello World").
  • Instead of just printing unknown when we don't have type metadata for a variant or record, we now print the data inside and leave the fields or the variant itself as unknown.
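
The last two points (a separate printer per datatype, and the fallback when type metadata is missing) can be sketched roughly as follows. This is an illustration in Python, not the Grain implementation: the function names, the `<unknown>` placeholder text, and the exact output shape are all assumptions made for the example.

```python
from typing import Optional, Sequence

# Hypothetical placeholder; the text actually printed by the runtime may differ.
UNKNOWN = "<unknown>"


def print_record(values: Sequence[str],
                 field_names: Optional[Sequence[str]] = None) -> str:
    """Printer for records: still shows the data when field names are missing."""
    if field_names is None:
        # No type metadata: keep the values, mark each field name as unknown.
        field_names = [UNKNOWN] * len(values)
    fields = ", ".join(f"{n}: {v}" for n, v in zip(field_names, values))
    return "{ " + fields + " }"


def print_variant(args: Sequence[str], name: Optional[str] = None) -> str:
    """Printer for enum variants: shows the payload even without a name."""
    label = name if name is not None else UNKNOWN
    if not args:
        return label
    return f"{label}({', '.join(args)})"
```

Keeping one small printer per datatype, as in this sketch, means a caller that already knows it holds a record or a variant can call the specific printer directly instead of going through a generic dispatcher.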

This work is based on: #2378

Closes: #2087
Work towards: #1794, #2127

spotandjake force-pushed the spotandjake/toString branch from cadd6d9 to 6e4f63b on April 29, 2026 22:11
spotandjake force-pushed the spotandjake/toString branch from 6e4f63b to e7b1b19 on April 29, 2026 22:20
It seems that refmt removes extra whitespace in raw strings, which breaks our testing, so I had to go back to using a slightly uglier regular string.
I noticed that the builders being global allocations was causing some optimization issues, which can easily be avoided by turning them into functions. This has a slight performance drawback; however, the builders are used very sparingly, so it should be reasonable.
spotandjake force-pushed the spotandjake/toString branch from c390917 to 84e6f27 on April 30, 2026 19:15