Conversation
|
The latest updates on your projects. Learn more about Vercel for GitHub.
|
📝 WalkthroughWalkthroughThis PR transitions BAML's Python codegen from an ObjectPool-based, template-driven architecture to a SymbolPool-based emitter system with structured leaf-path routing. It restructures the bridge-python package from Changes
Estimated code review effort🎯 5 (Critical) | ⏱️ ~120 minutes Possibly related PRs
Suggested reviewers
Poem
✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
|
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 442605df8f
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| cached = _RESOLVE_TYPE_CACHE.get(baml_type_expr) | ||
| if cached is not None: | ||
| return cached |
There was a problem hiding this comment.
Scope resolved-type cache by runtime sdk_root
This cache is keyed only by baml_type_expr, but initialize_runtime explicitly allows replacing the process runtime (and its sdk_root). After a second SDK is initialized in the same process, _resolve_type can return class objects cached from the previous SDK, leading to mismatched types (isinstance failures and wrong model classes) during outbound decode. Include sdk_root in the cache key or clear the cache on runtime reinit.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Actionable comments posted: 11
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (3)
baml_language/crates/baml_compiler2_emit/src/lib.rs (1)
925-941:⚠️ Potential issue | 🔴 CriticalApply namespace-aware type lowering consistently across class fields, function parameters, and stream return types.
The justification at line 925–931 applies equally to class field types (line 305) and stream return types (line 980): using bare
lower_type_exprwith&[]loses unqualified references in non-root namespaces toTy::Unknown → Ty::Void.
Class field types (line 305): Replace
lower_type_expr(db, &te.expr, pkg_items, &[], &mut diags)withlower_type_expr_in_ns(db, &te.expr, pkg_items, &pkg_info.namespace_path, &[], &mut diags).Stream return type (line 980): Replace
lower_type_expr(db, type_expr, pkg_items, &pkg_info.namespace_path, &mut diags)withlower_type_expr_in_ns(db, type_expr, pkg_items, &pkg_info.namespace_path, &[], &mut diags). The current code passesnamespace_pathas thegeneric_paramsargument, which is a parameter-order bug.Unit tests should verify non-root-namespace class fields whose type is a sibling class in the same namespace.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@baml_language/crates/baml_compiler2_emit/src/lib.rs` around lines 925 - 941, Replace uses of lower_type_expr that pass an empty or mis-ordered namespace/generic-params with the namespace-aware API lower_type_expr_in_ns so unqualified type names in non-root namespaces resolve correctly: update the class field lowering site (where lower_type_expr(db, &te.expr, pkg_items, &[], &mut diags) is used) to call lower_type_expr_in_ns(db, &te.expr, pkg_items, &pkg_info.namespace_path, &[], &mut diags) and update the stream return-type lowering site (where lower_type_expr(db, type_expr, pkg_items, &pkg_info.namespace_path, &mut diags) is used) to call lower_type_expr_in_ns(db, type_expr, pkg_items, &pkg_info.namespace_path, &[], &mut diags); ensure you pass pkg_items, &pkg_info.namespace_path, an empty generic params slice, and &mut diags in that order and then feed the returned tir type into the same cache.convert call as before; add/adjust unit tests to cover a non-root namespace class field referencing a sibling type.baml_language/crates/bridge_python/tests/test_tracing.py (1)
388-396:⚠️ Potential issue | 🟡 MinorTrace-file fixture wipes externally-set
BAML_TRACE_FILEper test.When
BAML_TRACE_FILEis set externally (the inspection scenario the docstring describes), this fixture still callsos.remove(path)at the start of every test, so a multi-test run only leaves the last test's events on disk. The example invocation (BAML_TRACE_FILE=debug.jsonl pytest -k test_name) implicitly assumes a single-test run; consider either documenting the single-test caveat more loudly or skipping the unlink whenalready_setis true and instead recording the byte offset to slice from afteryield.Not a release blocker — just surprising for someone who reads the docstring without the
-kfilter.baml_language/rig_tests/crates/empty/build.rs (1)
1-256: 🛠️ Refactor suggestion | 🟠 MajorSignificant duplication with
llm_functions/build.rs— extract to a shared helper.This
build.rsis nearly byte-for-byte identical tobaml_language/rig_tests/crates/llm_functions/build.rs(sections 5-8 in particular: symlink, conftest, pyproject, test.sh/test.ps1, rerun-if-changed, walk_files). The only meaningful deltas are thepyproject.tomlnamefield and an extraassert!(!baml_files.is_empty(), …)inllm_functions. The README already documents that "Crates are hand-maintained" and that new crates copyllm_functions/'s shape, so each new fixture currently inherits ~250 lines of copy/paste.Consider factoring the shared scaffolding into a small internal helper crate (e.g.
rig_build_support) exposing something likerig_build_support::run(BuildConfig { project_name, require_nonempty: bool, .. }), so each per-fixturebuild.rscollapses to a few lines. This both reduces the per-fixture surface area for bugs and makes future changes (e.g. tweaking theconftest.pystub ortest.sh) a single edit instead of N.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@baml_language/rig_tests/crates/empty/build.rs` around lines 1 - 256, The build.rs duplicates large scaffolding present in llm_functions; extract the shared logic into a small helper crate (e.g. rig_build_support) that exposes a single entry point like rig_build_support::run(BuildConfig { project_name: String, require_nonempty: bool, customizable_dir: PathBuf, baml_src: PathBuf, generated_dir: PathBuf, .. }), move the symlink logic (currently using customizable_dir), conftest_py/pyproject_toml/test.sh/test.ps1 generation, rerun-if-changed printing, and helpers watch_dir/walk_files into that crate, and then simplify this main() to build the ProjectDatabase, diagnostics, symbol pool and call rig_build_support::run(...) with the project-specific pieces (pass generated_dir, project name "baml-test-empty", and whether to require non-empty baml_files); update llm_functions/build.rs similarly to call the same helper.
🧹 Nitpick comments (15)
baml_language/crates/bex_project/src/bex.rs (1)
34-56: Consider adding unit tests forcoerce_arg_to_declared_type.The coercion rules (Map→Instance, Instance class-name override, Variant enum-name override, fall-through) are simple but type-driven and central to host bridge correctness across Python/Node. A few small unit tests would lock the behavior in (e.g., Map+Class → Instance with overridden name; Instance with mismatched class name gets normalized; Variant enum-name override; non-matching value/type pair passes through unchanged).
As per coding guidelines: "Prefer writing Rust unit tests over integration tests where possible".
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@baml_language/crates/bex_project/src/bex.rs` around lines 34 - 56, Add unit tests for coerce_arg_to_declared_type covering: Map -> Instance coercion when Ty::Class(type_name, _) should produce BexExternalValue::Instance with class_name overridden to type_name and fields preserved; Instance input with a different class name should be normalized to class_name = type_name for Ty::Class; Variant input should have enum_name overridden to type_name for Ty::Enum; and a non-matching (e.g., Int or String) value with any Ty should be returned unchanged. Put tests in the bex.rs unit-test module using the BexExternalValue::{Map,Instance,Variant} constructors and Ty::{Class,Enum} so behavior is locked in and fails if the function regresses.baml_language/crates/baml_cli/src/generate.rs (1)
125-133: Optional: defend against path-escape in generator-emittedrel_path.
output_dir.join(rel_path)will silently escapeoutput_dirif a generator ever returns an absolute path or one containing... Today the codegen is in-tree and trusted, so this is not exploitable, but a one-line check thatrel_path.is_relative()and contains noParentDircomponents would catch a future regression where a leaf-router bug produces a malformed path.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@baml_language/crates/baml_cli/src/generate.rs` around lines 125 - 133, Validate generator paths before joining: ensure each rel_path from generated is not absolute and contains no ParentDir components to prevent escaping output_dir. In the loop over generated (rel_path, content) check rel_path.is_relative() and iterate rel_path.components() to reject any Component::ParentDir (or otherwise return an error/context) before calling output_dir.join(rel_path) and writing dest; keep the same error handling via with_context on std::fs::write and increment count only after a successful write.baml_language/crates/bridge_python/tests/test_sim_phase3.py (1)
19-21: Optional: consider scoping thesys.pathmutation.Mutating
sys.pathat module import time persists for the rest of the pytest session and can shadow modules in unrelated tests. A session-scoped autouse fixture (ormonkeypatch.syspath_prependper-test) would isolate the change. Not a blocker for a sim-only test file.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@baml_language/crates/bridge_python/tests/test_sim_phase3.py` around lines 19 - 21, The top-level mutation of sys.path using SIM_ROOT should be moved into a fixture to avoid global state leakage; replace the module-level block that calls sys.path.insert(0, str(SIM_ROOT)) with a pytest fixture (e.g., add_sim_root_to_syspath) that either uses monkeypatch.syspath_prepend(str(SIM_ROOT)) per-test or is declared autouse=True, scope="session" to prepend SIM_ROOT for the test(s) and automatically restore sys.path afterward — update test_sim_phase3.py to define and use that fixture instead of mutating sys.path at import time.baml_language/crates/bridge_python/tests/test_engine.py (1)
178-187: Loose regex inpytest.raises(match=...).
match="argument"will accept any exception whose message contains the substring "argument" anywhere — including e.g."too many positional arguments to function"or completely unrelated PyO3 errors. Since the docstring already documents both phrasings ("Invalid argument" vs "Missing argument"), prefer a regex that asserts which phrasings are actually accepted:♻️ Tighter match
- with pytest.raises(Exception, match="argument"): + with pytest.raises(Exception, match=r"(?i)(missing|invalid)\s+argument"): call_function_sync(rt,"ReturnNumber", {})🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@baml_language/crates/bridge_python/tests/test_engine.py` around lines 178 - 187, The test test_missing_argument_raises uses a too-loose regex match="argument"; change it to assert the expected phrasings by replacing the match with a tighter regex such as r"(Invalid|Missing) argument" (or equivalent) when calling pytest.raises in the test that calls call_function_sync(rt, "ReturnNumber", {}) after creating rt via make_runtime(EXPR_FUNCS_BAML), so the failure only matches the documented "Invalid argument" or "Missing argument" messages.baml_language/crates/baml_codegen_python/src/emit/function.rs (1)
18-34: Consider addingDebug(and possiblyClone) toPyFunction.
SyncAsyncderivesClone, Copy, Debug, PartialEq, Eq, butPyFunctionhas no derives at all. Per the AI summary, this struct is constructed inemit/mod.rs, fanned out into multiple variants per BAML function, then consumed inleaf.rsfor rendering — at minimumDebugis helpful fordbg!/log output during emitter development, andClonemay be needed once companion fan-out lands.♻️ Proposed derives
/// Top-level factory binding. Renders as a `__define_function(...)` /// three-arg call per 09b §3 / 09b2 §2. One `PyFunction` per emitted /// line — sync and async are distinct stubs that share their FQN and /// `param_names` (the call-time positional parameter names, sourced /// from `Function.arguments` for free functions and from each /// companion's own `arguments` for companions). +#[derive(Debug, Clone)] pub(crate) struct PyFunction {🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@baml_language/crates/baml_codegen_python/src/emit/function.rs` around lines 18 - 34, Add derives to the PyFunction struct so it can be easily inspected and duplicated during emit/fan-out; update the declaration of PyFunction to derive Debug and Clone (e.g., #[derive(Debug, Clone)]) so dbg! logging in emit/mod.rs and subsequent consumption in leaf.rs work smoothly and cloning companion stubs is supported; retain SyncAsync's current derives and ensure no other trait collisions occur.baml_language/crates/bridge_python/src/runtime.rs (1)
76-83: Failure-atomicity: engine init can succeed whileSDK_ROOTwrite fails.If
bridge_cffi::engine::initialize_runtimesucceeds andstore_sdk_rootthen errors (RwLock poisoning is the only realistic failure here, but it's reachable if a prior writer panicked), the engine singleton is live butSDK_ROOTis empty/stale. Subsequentget_runtime()calls will raise the canned "did baml_sdk/init.py fail to import?" error, which is misleading — the SDK root file did import, the lock just got poisoned.Low probability, but cheap to fix: write
SDK_ROOTbefore surfacing the runtime, or fold the lock-poisoned path into a distinct error message.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@baml_language/crates/bridge_python/src/runtime.rs` around lines 76 - 83, The initialize_runtime success path can expose the engine singleton before store_sdk_root may fail, leaving a live BamlRuntime while SDK_ROOT wasn't written; fix by calling store_sdk_root(sdk_root.clone()) and handling its error before constructing/returning Ok(BamlRuntime { bex, sdk_root }) (i.e., move the store_sdk_root call to precede creation/return of BamlRuntime), and when store_sdk_root returns an error detect RwLock poisoning and convert it into a clear Err (instead of leaving the engine live) so get_runtime() won't later produce the misleading import error—adjust error mapping accordingly (reference: bridge_cffi::engine::initialize_runtime, store_sdk_root, BamlRuntime, get_runtime).baml_language/crates/baml_codegen_python/src/routing.rs (1)
25-30: Doc comment grammar nit."Empty segments means the root leaf" is awkward. Suggest "An empty
segmentslist represents the root leaf …".-/// Leaf path under `baml_sdk/`. Empty segments means the root leaf -/// (i.e. `baml_sdk/__init__.py`). +/// Leaf path under `baml_sdk/`. An empty `segments` list represents +/// the root leaf (i.e. `baml_sdk/__init__.py`).🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@baml_language/crates/baml_codegen_python/src/routing.rs` around lines 25 - 30, The doc comment for struct LeafPath is grammatically awkward; update it to say that an empty segments list represents the root leaf. Specifically, change the sentence "Empty segments means the root leaf" in the comment above the LeafPath definition to something like "An empty `segments` list represents the root leaf (i.e. `baml_sdk/__init__.py`)." Keep the rest of the comment intact and ensure it references the LeafPath struct and its pub(crate) segments: Vec<String> field.baml_language/rig_tests/crates/empty/build.rs (2)
91-106: Symlink fallback for unsupported targets.
#[cfg(unix)]and#[cfg(windows)]together cover all current build targets, but if the crate is ever compiled for, say,wasior anothercfg, the symlink step silently turns into dead code andcustomizable/files won't appear ingenerated/. A#[cfg(not(any(unix, windows)))] compile_error!(...)(or a fallback tofs::copy) makes the contract explicit.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@baml_language/rig_tests/crates/empty/build.rs` around lines 91 - 106, The unix/windows-specific symlink branches in build.rs (the std::os::unix::fs::symlink and std::os::windows::fs::symlink_file calls that use src, dst, and file_name) leave unsupported targets silently without generating the customizable/ files; add a third branch for unsupported targets using #[cfg(not(any(unix, windows)))] that either emits a compile_error! with a clear message or implements a fallback (e.g., fs::copy from src to dst) so the build contract is explicit and files in generated/ are produced on non-unix/windows platforms.
67-74: Top-level output paths can hitcreate_dir_all("").For an output entry whose
pathhas no directory component (e.g. just"__init__.py"at the SDK root),file_path.parent()will beSome(&baml_sdk_dir)which is fine — but ifto_source_codeever returns apaththat is itself empty or already absolute, behavior gets surprising (joinwith an absolute path discardsbaml_sdk_dir). A small defensive guard plus an explicit assertion that returned paths are relative would catch upstream regressions early.for (path, content) in output { + assert!(path.is_relative(), "codegen returned non-relative path: {}", path.display()); let file_path = baml_sdk_dir.join(&path); if let Some(parent) = file_path.parent() { - fs::create_dir_all(parent).unwrap(); + if !parent.as_os_str().is_empty() { + fs::create_dir_all(parent).unwrap(); + } } fs::write(&file_path, &content).unwrap(); }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@baml_language/rig_tests/crates/empty/build.rs` around lines 67 - 74, Guard against empty or absolute paths returned from baml_codegen_python::to_source_code by validating each path before joining: assert the returned path is non-empty and relative (not absolute), and handle the parent directory safely instead of calling fs::create_dir_all("") — e.g., check file_path.parent() and if None treat baml_sdk_dir as the parent or fail fast with a clear panic; add an explicit assertion/error when a path is absolute so upstream regressions are caught early (references: to_source_code, output, file_path, baml_sdk_dir, file_path.parent(), fs::create_dir_all).baml_language/crates/baml_project/src/client_codegen.rs (1)
305-319: Silent drop of orphan companions makes diagnostics harder.If a companion function with a
$suffix is encountered but no matching parent is inparents(e.g. the parent was filtered out as generic, or because of an upstream invariant break), the companion is silently discarded with only a comment. A debug-only log/tracing::warn!here would surface the case during dev without affecting release builds. Currently a regression that drops a parent (e.g. accidental generic-param mis-handling) would manifest only as missing companions in generated code, which is hard to root-cause.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@baml_language/crates/baml_project/src/client_codegen.rs` around lines 305 - 319, The loop that matches companions to parents currently drops orphan companions silently; update the block that iterates over companions (the for companion in companions { ... } loop) so that when parents.iter_mut().find(...) returns None you emit a diagnostic (e.g., tracing::warn!) with identifying fields: companion.pkg, companion.ns_path, companion.bare_name and companion.companion_suffix (or suffix) so developers can see which companion was dropped; ensure the crate imports/tracing usage is added if missing and keep the original push into parent_entry.func.companions unchanged when a match is found.baml_language/integ_tests/python/baml_sdk/__init__.py (1)
1-10: Subpackage namebamlcollides with the top-levelbamlruntime package.
baml_sdkships its ownbaml/subdirectory (used here asfrom .baml import _inlinedbamlandfrom . import baml) while also importing from the top-levelbaml.baml_core. Both are resolvable today (absolute vs. relative), but the name collision makes readers wonder whichbamlis meant in a given expression and is fragile if someone forgets a leading dot or rearranges imports. Consider renaming the inner directory (e.g.,_baml_inlined/or_generated/) so the SDK's private inlined-source tree doesn't shadow the runtime package name.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@baml_language/integ_tests/python/baml_sdk/__init__.py` around lines 1 - 10, The SDK's package-local "baml" subpackage conflicts with the top-level runtime package; rename the internal inlined package directory (e.g., to "_baml_inlined" or "_generated") and update all references in this module: change "from .baml import _inlinedbaml" to import from the new package name, update the BamlRuntime.initialize_runtime call if it references the package (keeping sdk_root=__name__ as appropriate), and update "from . import baml, stream_types, test003" to import the renamed package (or re-export under a non-conflicting alias) so all uses of _inlinedbaml, baml, stream_types, and test003 point to the renamed internal package.baml_language/integ_tests/python/baml_sdk/test003/__init__.py (1)
23-23: EmitLiteral["positive","negative","neutral"]instead of aUnionof single-elementLiterals.
typing.Union[Literal["positive"], Literal["negative"], Literal["neutral"]]is semantically equivalent totyping.Literal["positive", "negative", "neutral"]but the latter is the idiomatic form, easier to read, and produces simpler errors from Pydantic. Worth fixing intranslate_tyso all generated SDKs benefit.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@baml_language/integ_tests/python/baml_sdk/test003/__init__.py` at line 23, The code generator emits typing.Union[Literal["a"], Literal["b"], ...] instead of the idiomatic typing.Literal["a","b",...]; in translate_ty detect when the node is a Union whose all alternatives are Literal single-value types, collect those literal values and return a single typing.Literal with all values combined (instead of wrapping each as its own Literal), preserving existing handling for other Union cases and keeping the Literal value types intact.baml_language/crates/bridge_python/python_src/baml/baml_core/proto.py (2)
229-236: Deadfieldslookup before the handle-backed guard.
fields = cls.model_fields(and thetry/except) is computed but never read; the actual handle-backed check usescls.__private_attributes__. Either drop the lookup or wire it back into a real validation.♻️ Suggested cleanup
- # Handle-backed classes never arrive via `class_value` — BEX emits - # them through `handle_value`. If one slipped through, it's a bug in - # Rust-side encoding or a stale codegen; fail loud rather than build - # a half-initialized instance. - try: - fields = cls.model_fields # type: ignore[attr-defined] - except AttributeError: - fields = {} - if "_handle" in getattr(cls, "__private_attributes__", {}): + # Handle-backed classes never arrive via `class_value` — BEX emits + # them through `handle_value`. If one slipped through, it's a bug in + # Rust-side encoding or a stale codegen; fail loud rather than build + # a half-initialized instance. + if "_handle" in getattr(cls, "__private_attributes__", {}): raise BamlError( f"BEX emitted class_value for handle-backed class {fqn!r}" )🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@baml_language/crates/bridge_python/python_src/baml/baml_core/proto.py` around lines 229 - 236, The try/except that reads cls.model_fields is dead because its result (fields) is never used before the handle-backed guard; either remove the unused lookup or use it in the validation. Fix by deleting the unused block that assigns fields = cls.model_fields (and its except) OR incorporate fields into the guard/validation (e.g., ensure model_fields does not define public fields for a handle-backed class) so the check around cls.__private_attributes__ and "_handle" (and the BamlError raised with fqn) correctly accounts for model_fields.
178-193: Map-key fallback silently coerces unhashable/structured keys.Falling back to
entry.string_key = str(key)for any non-{bool,str,int,enum} key will produce nondescriptive strings (e.g.,"<object at 0x…>") and is recoverable on neither side of the wire. Consider raisingTypeError(consistent with_set_inbound_value) so callers see encoding errors at the callsite instead of after a roundtrip.🛡️ Suggested change
- else: - entry.string_key = str(key) # best-effort fallback + else: + raise TypeError( + f"Cannot encode map key of type {type(key).__name__} " + f"in argument {kwarg_name!r}" + )🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@baml_language/crates/bridge_python/python_src/baml/baml_core/proto.py` around lines 178 - 193, The fallback in _set_inbound_map_entry that does entry.string_key = str(key) silently coerces unhashable/structured keys; replace that fallback with a raised TypeError so callers see encoding errors immediately (mirror the behavior used in _set_inbound_value). Specifically, in function _set_inbound_map_entry, remove the final else that sets entry.string_key and instead raise TypeError with a clear message including kwarg_name and the offending key's type/value (so the caller can identify the bad key).baml_language/crates/baml_codegen_python/src/leaf.rs (1)
1-312: Add#[cfg(test)]unit tests for the renderer.
group_and_sort,render_factory_binding,render_param_list, and therender_leaf_bodyimport-block / fan-out spacing logic are all easy to drive with syntheticEmittedSymbol/SortKeyfixtures. Locking in the exact whitespace (sync/async LHS alignment, two-blank PEP-8 gap, fan-out coalescing,__all__ordering) as snapshot-style asserts protects against accidental drift.As per coding guidelines: "Prefer writing Rust unit tests over integration tests where possible".
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@baml_language/crates/baml_codegen_python/src/leaf.rs` around lines 1 - 312, Add a #[cfg(test)] module with unit tests that exercise group_and_sort, render_factory_binding, render_param_list, and render_leaf_body: create synthetic EmittedSymbol and SortKey fixtures to drive cases for (a) grouping/sorting of triples, (b) sync vs async LHS alignment and mode padding in render_factory_binding, (c) empty and multi-name lists in render_param_list, and (d) import-block presence, two-blank-line PEP-8 gaps, and function fan-out coalescing plus __all__ ordering in render_leaf_body; use snapshot-style string assertions (or exact equality checks) to lock the exact whitespace and ordering so regressions in these formatting rules are detected.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@baml_language/crates/baml_cli/src/generate.rs`:
- Around line 211-218: The current map over source_files uses sf.path(db) and
silently falls back to the absolute path when strip_prefix(baml_src) fails,
which can inject out-of-tree absolute keys into the generated FILES; change this
to fail fast or filter those entries: in the closure that calls sf.path(db) and
rel = path.strip_prefix(baml_src).unwrap_or(&path) replace the unwrap_or
behavior with either (a) return an Err or panic with a clear message if
strip_prefix fails (so the codegen halts and surfaces the bad source), or (b)
filter out such sources by only mapping entries where
strip_prefix(baml_src).is_ok(); update the mapping that builds the FILES dict
(the source_files.iter().map(...) block) accordingly so only canonical in-tree
relative paths are used.
In `@baml_language/crates/baml_codegen_python/src/lib.rs`:
- Around line 169-182: The generated dict keys in render_inlinedbaml are
platform-dependent because rel.to_string_lossy() can contain backslashes on
Windows; normalize path separators to forward slashes before stringifying so
_inlinedbaml.py contains consistent logical keys. In the loop in
render_inlinedbaml (for (rel, contents) in entries), convert the path to a
string with rel.to_string_lossy(), replace backslashes with '/' (or otherwise
normalize) and then pass that normalized key into py_string (same place HEADER
is used) so FILES keys are stable across platforms.
In `@baml_language/crates/bridge_python/python_src/baml/baml_core/__init__.py`:
- Around line 186-215: The cache _RESOLVE_TYPE_CACHE is keyed only by
baml_type_expr but resolution also depends on get_runtime()._sdk_root, so modify
_resolve_type to include the current sdk_root in the cache key (e.g. prefix or
tuple of (sdk_root, baml_type_expr)) or clear/invalidate _RESOLVE_TYPE_CACHE
whenever initialize_runtime changes the runtime; update references in
_resolve_type (uses get_runtime()._sdk_root, _routed_python_path, and the final
cache store) to use the new composite key or ensure runtime initialization
clears the cache to avoid stale bindings to old SDK classes.
- Around line 312-340: The factory in _make_call currently always passes None
for ctx/collectors/abort_controller; fix _sync and _async to pull reserved
kwargs (e.g., "__ctx", "__collectors", "__abort_controller" or a chosen trio)
out of kwargs before calling _build_kwargs/encode_call_args so they are not
encoded as function args, defaulting to None if absent, then pass those values
into rt.call_function_sync(engine_fqn, args_proto, ctx, collectors,
abort_controller) and await rt.call_function(..., ctx, collectors,
abort_controller) respectively; use the existing symbols _make_call, _sync,
_async, _build_kwargs, encode_call_args, get_runtime, rt.call_function_sync and
rt.call_function to locate and apply the change.
In `@baml_language/crates/bridge_python/src/lib.rs`:
- Around line 30-31: Update stale references to the old Python module name
`baml_py` to `baml_core`: change the README description that says the crate
"exposes a `baml_py` Python module" to mention `baml_core`, and update the code
comment in the collector type (the comment near the collector implementation
that currently imports from `baml_py`) to reference `baml_core`; keep the
existing function/module symbol `baml_core` (the #[pymodule] fn baml_core(...)
declaration) as the canonical name in docs and comments.
In `@baml_language/crates/bridge_python/tests/sim/baml_sdk/lorem/__init__.py`:
- Around line 7-10: Update the stale comment in baml_sdk.lorem to reflect the
actual FQN transformation performed by the bridge: remove or reword the claim
that the bridge translates `root.` → `user.` and instead state that
`_baml_fqn_to_engine_fqn` (in
baml_language/crates/bridge_python/python_src/baml/baml_core/__init__.py) strips
the `root.` prefix without rewriting it to `user.`; reference the
`_baml_fqn_to_engine_fqn` symbol in the comment so future readers can find the
implementation that performs the transform.
In `@baml_language/integ_tests/python/baml_sdk/__init__.py`:
- Around line 6-8: The module currently calls
BamlRuntime.initialize_runtime("baml_src", _inlinedbaml.FILES,
sdk_root=__name__) at import time which creates process-global state as a side
effect; change this to lazy initialization by removing the direct call and
adding a getter or explicit init function (e.g., implement get_runtime() that
checks if BamlRuntime.is_initialized()/a private _runtime variable exists and
calls BamlRuntime.initialize_runtime(...) on first use, or expose an explicit
init() that callers can invoke) and update any consumers to call get_runtime()
or init() instead of relying on import-time initialization.
In `@baml_language/rig_tests/crates/basic_type_shapes/build.rs`:
- Around line 228-235: The current watch_dir(...) invocation only emits
cargo:rerun-if-changed for each existing .baml file so adding/removing files
doesn't retrigger the build; update build.rs to also emit a
cargo:rerun-if-changed for the directories themselves (use the baml_src path and
customizable_dir when it exists) in addition to the per-file calls inside
watch_dir so Cargo sees directory mtime changes (adds/removals) and reruns the
build script; locate the logic around watch_dir, baml_src, and customizable_dir
and add a println!("cargo:rerun-if-changed={}", <dir_path>) for baml_src and for
customizable_dir when customizable_dir.exists().
- Around line 59-66: The current creation of user_baml_files uses rel (from
sf.path(&db).strip_prefix(&canonical)) which on Windows retains backslashes and
becomes a dict key that mismatches Python tests; modify the construction of
UserBamlFile in the mapping so that rel is normalized to a forward-slash joined
string (e.g. convert the PathBuf components or the stringified path to use '/'
separators) before pairing with sf.text(&db), ensuring the downstream
render_inlinedbaml / Python emitter sees keys like "ns_lorem/types.baml"
consistently across platforms.
- Around line 90-108: The Windows symlink step in build.rs (the
std::os::windows::fs::symlink_file call that panics on Err) will fail for
non-elevated/dev-mode users; change the Windows branch for symlinking (the block
referencing src, dst, file_name) to handle errors instead of unconditionally
panicking: attempt symlink_file, and if it returns an error due to privileges,
fall back to copying the file with std::fs::copy (or at minimum log a clear,
actionable error telling users to enable Developer Mode or run as admin),
preserving the existing panic only for unrecoverable errors; ensure you
reference and use the same dst/src/file_name variables so behavior matches the
unix branch.
In `@baml_language/rig_tests/crates/llm_functions/customizable/test_main.py`:
- Around line 137-142: The assertions in test_main.py assume POSIX separators
when checking keys in _inlinedbaml.FILES, which will fail on Windows because
build.rs currently produces backslash paths via
Path::strip_prefix(...).to_path_buf(); update build.rs to normalize stored keys
to use forward slashes (e.g., convert PathBuf to a UTF-8 string and replace
backslashes with '/') when populating FILES so the tests that assert membership
against "ns_lorem/types.baml" and similar keys work cross-platform, or
alternately change the test to normalize keys (join parts with "/") before
asserting; reference _inlinedbaml.FILES and Path::strip_prefix in build.rs to
locate the change.
---
Outside diff comments:
In `@baml_language/crates/baml_compiler2_emit/src/lib.rs`:
- Around line 925-941: Replace uses of lower_type_expr that pass an empty or
mis-ordered namespace/generic-params with the namespace-aware API
lower_type_expr_in_ns so unqualified type names in non-root namespaces resolve
correctly: update the class field lowering site (where lower_type_expr(db,
&te.expr, pkg_items, &[], &mut diags) is used) to call lower_type_expr_in_ns(db,
&te.expr, pkg_items, &pkg_info.namespace_path, &[], &mut diags) and update the
stream return-type lowering site (where lower_type_expr(db, type_expr,
pkg_items, &pkg_info.namespace_path, &mut diags) is used) to call
lower_type_expr_in_ns(db, type_expr, pkg_items, &pkg_info.namespace_path, &[],
&mut diags); ensure you pass pkg_items, &pkg_info.namespace_path, an empty
generic params slice, and &mut diags in that order and then feed the returned
tir type into the same cache.convert call as before; add/adjust unit tests to
cover a non-root namespace class field referencing a sibling type.
In `@baml_language/rig_tests/crates/empty/build.rs`:
- Around line 1-256: The build.rs duplicates large scaffolding present in
llm_functions; extract the shared logic into a small helper crate (e.g.
rig_build_support) that exposes a single entry point like
rig_build_support::run(BuildConfig { project_name: String, require_nonempty:
bool, customizable_dir: PathBuf, baml_src: PathBuf, generated_dir: PathBuf, ..
}), move the symlink logic (currently using customizable_dir),
conftest_py/pyproject_toml/test.sh/test.ps1 generation, rerun-if-changed
printing, and helpers watch_dir/walk_files into that crate, and then simplify
this main() to build the ProjectDatabase, diagnostics, symbol pool and call
rig_build_support::run(...) with the project-specific pieces (pass
generated_dir, project name "baml-test-empty", and whether to require non-empty
baml_files); update llm_functions/build.rs similarly to call the same helper.
---
Nitpick comments:
In `@baml_language/crates/baml_cli/src/generate.rs`:
- Around line 125-133: Validate generator paths before joining: ensure each
rel_path from generated is not absolute and contains no ParentDir components to
prevent escaping output_dir. In the loop over generated (rel_path, content)
check rel_path.is_relative() and iterate rel_path.components() to reject any
Component::ParentDir (or otherwise return an error/context) before calling
output_dir.join(rel_path) and writing dest; keep the same error handling via
with_context on std::fs::write and increment count only after a successful
write.
In `@baml_language/crates/baml_codegen_python/src/emit/function.rs`:
- Around line 18-34: Add derives to the PyFunction struct so it can be easily
inspected and duplicated during emit/fan-out; update the declaration of
PyFunction to derive Debug and Clone (e.g., #[derive(Debug, Clone)]) so dbg!
logging in emit/mod.rs and subsequent consumption in leaf.rs work smoothly and
cloning companion stubs is supported; retain SyncAsync's current derives and
ensure no other trait collisions occur.
In `@baml_language/crates/baml_codegen_python/src/leaf.rs`:
- Around line 1-312: Add a #[cfg(test)] module with unit tests that exercise
group_and_sort, render_factory_binding, render_param_list, and render_leaf_body:
create synthetic EmittedSymbol and SortKey fixtures to drive cases for (a)
grouping/sorting of triples, (b) sync vs async LHS alignment and mode padding in
render_factory_binding, (c) empty and multi-name lists in render_param_list, and
(d) import-block presence, two-blank-line PEP-8 gaps, and function fan-out
coalescing plus __all__ ordering in render_leaf_body; use snapshot-style string
assertions (or exact equality checks) to lock the exact whitespace and ordering
so regressions in these formatting rules are detected.
In `@baml_language/crates/baml_codegen_python/src/routing.rs`:
- Around line 25-30: The doc comment for struct LeafPath is grammatically
awkward; update it to say that an empty segments list represents the root leaf.
Specifically, change the sentence "Empty segments means the root leaf" in the
comment above the LeafPath definition to something like "An empty `segments`
list represents the root leaf (i.e. `baml_sdk/__init__.py`)." Keep the rest of
the comment intact and ensure it references the LeafPath struct and its
pub(crate) segments: Vec<String> field.
In `@baml_language/crates/baml_project/src/client_codegen.rs`:
- Around line 305-319: The loop that matches companions to parents currently
drops orphan companions silently; update the block that iterates over companions
(the for companion in companions { ... } loop) so that when
parents.iter_mut().find(...) returns None you emit a diagnostic (e.g.,
tracing::warn!) with identifying fields: companion.pkg, companion.ns_path,
companion.bare_name and companion.companion_suffix (or suffix) so developers can
see which companion was dropped; ensure the crate imports/tracing usage is added
if missing and keep the original push into parent_entry.func.companions
unchanged when a match is found.
In `@baml_language/crates/bex_project/src/bex.rs`:
- Around line 34-56: Add unit tests for coerce_arg_to_declared_type covering:
Map -> Instance coercion when Ty::Class(type_name, _) should produce
BexExternalValue::Instance with class_name overridden to type_name and fields
preserved; Instance input with a different class name should be normalized to
class_name = type_name for Ty::Class; Variant input should have enum_name
overridden to type_name for Ty::Enum; and a non-matching (e.g., Int or String)
value with any Ty should be returned unchanged. Put tests in the bex.rs
unit-test module using the BexExternalValue::{Map,Instance,Variant} constructors
and Ty::{Class,Enum} so behavior is locked in and fails if the function
regresses.
In `@baml_language/crates/bridge_python/python_src/baml/baml_core/proto.py`:
- Around line 229-236: The try/except that reads cls.model_fields is dead
because its result (fields) is never used before the handle-backed guard; either
remove the unused lookup or use it in the validation. Fix by deleting the unused
block that assigns fields = cls.model_fields (and its except) OR incorporate
fields into the guard/validation (e.g., ensure model_fields does not define
public fields for a handle-backed class) so the check around
cls.__private_attributes__ and "_handle" (and the BamlError raised with fqn)
correctly accounts for model_fields.
- Around line 178-193: The fallback in _set_inbound_map_entry that does
entry.string_key = str(key) silently coerces unhashable/structured keys; replace
that fallback with a raised TypeError so callers see encoding errors immediately
(mirror the behavior used in _set_inbound_value). Specifically, in function
_set_inbound_map_entry, remove the final else that sets entry.string_key and
instead raise TypeError with a clear message including kwarg_name and the
offending key's type/value (so the caller can identify the bad key).
In `@baml_language/crates/bridge_python/src/runtime.rs`:
- Around line 76-83: The initialize_runtime success path can expose the engine
singleton before store_sdk_root may fail, leaving a live BamlRuntime while
SDK_ROOT wasn't written; fix by calling store_sdk_root(sdk_root.clone()) and
handling its error before constructing/returning Ok(BamlRuntime { bex, sdk_root
}) (i.e., move the store_sdk_root call to precede creation/return of
BamlRuntime), and when store_sdk_root returns an error detect RwLock poisoning
and convert it into a clear Err (instead of leaving the engine live) so
get_runtime() won't later produce the misleading import error—adjust error
mapping accordingly (reference: bridge_cffi::engine::initialize_runtime,
store_sdk_root, BamlRuntime, get_runtime).
In `@baml_language/crates/bridge_python/tests/test_engine.py`:
- Around line 178-187: The test test_missing_argument_raises uses a too-loose
regex match="argument"; change it to assert the expected phrasings by replacing
the match with a tighter regex such as r"(Invalid|Missing) argument" (or
equivalent) when calling pytest.raises in the test that calls
call_function_sync(rt, "ReturnNumber", {}) after creating rt via
make_runtime(EXPR_FUNCS_BAML), so the failure only matches the documented
"Invalid argument" or "Missing argument" messages.
In `@baml_language/crates/bridge_python/tests/test_sim_phase3.py`:
- Around line 19-21: The top-level mutation of sys.path using SIM_ROOT should be
moved into a fixture to avoid global state leakage; replace the module-level
block that calls sys.path.insert(0, str(SIM_ROOT)) with a pytest fixture (e.g.,
add_sim_root_to_syspath) that either uses
monkeypatch.syspath_prepend(str(SIM_ROOT)) per-test or is declared autouse=True,
scope="session" to prepend SIM_ROOT for the test(s) and automatically restore
sys.path afterward — update test_sim_phase3.py to define and use that fixture
instead of mutating sys.path at import time.
In `@baml_language/integ_tests/python/baml_sdk/__init__.py`:
- Around line 1-10: The SDK's package-local "baml" subpackage conflicts with the
top-level runtime package; rename the internal inlined package directory (e.g.,
to "_baml_inlined" or "_generated") and update all references in this module:
change "from .baml import _inlinedbaml" to import from the new package name,
update the BamlRuntime.initialize_runtime call if it references the package
(keeping sdk_root=__name__ as appropriate), and update "from . import baml,
stream_types, test003" to import the renamed package (or re-export under a
non-conflicting alias) so all uses of _inlinedbaml, baml, stream_types, and
test003 point to the renamed internal package.
In `@baml_language/integ_tests/python/baml_sdk/test003/__init__.py`:
- Line 23: The code generator emits typing.Union[Literal["a"], Literal["b"],
...] instead of the idiomatic typing.Literal["a","b",...]; in translate_ty
detect when the node is a Union whose all alternatives are Literal single-value
types, collect those literal values and return a single typing.Literal with all
values combined (instead of wrapping each as its own Literal), preserving
existing handling for other Union cases and keeping the Literal value types
intact.
In `@baml_language/rig_tests/crates/empty/build.rs`:
- Around line 91-106: The unix/windows-specific symlink branches in build.rs
(the std::os::unix::fs::symlink and std::os::windows::fs::symlink_file calls
that use src, dst, and file_name) leave unsupported targets silently without
generating the customizable/ files; add a third branch for unsupported targets
using #[cfg(not(any(unix, windows)))] that either emits a compile_error! with a
clear message or implements a fallback (e.g., fs::copy from src to dst) so the
build contract is explicit and files in generated/ are produced on
non-unix/windows platforms.
- Around line 67-74: Guard against empty or absolute paths returned from
baml_codegen_python::to_source_code by validating each path before joining:
assert the returned path is non-empty and relative (not absolute), and handle
the parent directory safely instead of calling fs::create_dir_all("") — e.g.,
check file_path.parent() and if None treat baml_sdk_dir as the parent or fail
fast with a clear panic; add an explicit assertion/error when a path is absolute
so upstream regressions are caught early (references: to_source_code, output,
file_path, baml_sdk_dir, file_path.parent(), fs::create_dir_all).
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: b7049a69-f230-41a2-80cf-8d36cce53f19
⛔ Files ignored due to path filters (5)
baml_language/Cargo.lockis excluded by!**/*.lockbaml_language/crates/bridge_go/cffi/proto/baml/cffi/v1/baml_inbound.pb.gois excluded by!**/*.pb.gobaml_language/crates/bridge_go/cffi/proto/baml/cffi/v1/baml_outbound.pb.gois excluded by!**/*.pb.gobaml_language/crates/bridge_python/uv.lockis excluded by!**/*.locktools/uv.lockis excluded by!**/*.lock
📒 Files selected for processing (128)
baml_language/Cargo.tomlbaml_language/crates/baml_cli/src/generate.rsbaml_language/crates/baml_codegen_python/Cargo.tomlbaml_language/crates/baml_codegen_python/src/_askama/config.pybaml_language/crates/baml_codegen_python/src/_askama/globals.pybaml_language/crates/baml_codegen_python/src/_askama/runtime.pybaml_language/crates/baml_codegen_python/src/_askama/runtime.py.j2baml_language/crates/baml_codegen_python/src/_askama/tracing.pybaml_language/crates/baml_codegen_python/src/docstring.rsbaml_language/crates/baml_codegen_python/src/emit/class.rsbaml_language/crates/baml_codegen_python/src/emit/enum_.rsbaml_language/crates/baml_codegen_python/src/emit/function.rsbaml_language/crates/baml_codegen_python/src/emit/instance_method.rsbaml_language/crates/baml_codegen_python/src/emit/mod.rsbaml_language/crates/baml_codegen_python/src/emit/static_method.rsbaml_language/crates/baml_codegen_python/src/emit/type_alias.rsbaml_language/crates/baml_codegen_python/src/leaf.rsbaml_language/crates/baml_codegen_python/src/lib.rsbaml_language/crates/baml_codegen_python/src/objects.rsbaml_language/crates/baml_codegen_python/src/objects/render/class.rsbaml_language/crates/baml_codegen_python/src/objects/render/enum.rsbaml_language/crates/baml_codegen_python/src/objects/render/function.rsbaml_language/crates/baml_codegen_python/src/objects/render/type_alias.rsbaml_language/crates/baml_codegen_python/src/routing.rsbaml_language/crates/baml_codegen_python/src/translate_ty.rsbaml_language/crates/baml_codegen_python/src/ty.rsbaml_language/crates/baml_codegen_tests/Cargo.tomlbaml_language/crates/baml_codegen_tests/src/builders.rsbaml_language/crates/baml_codegen_tests/src/lib.rsbaml_language/crates/baml_codegen_types/src/lib.rsbaml_language/crates/baml_codegen_types/src/symbols.rsbaml_language/crates/baml_codegen_types/src/ty.rsbaml_language/crates/baml_compiler2_emit/src/lib.rsbaml_language/crates/baml_compiler2_hir/src/item_tree.rsbaml_language/crates/baml_project/src/client_codegen.rsbaml_language/crates/baml_project/src/lib.rsbaml_language/crates/bex_project/Cargo.tomlbaml_language/crates/bex_project/src/bex.rsbaml_language/crates/bridge_ctypes/types/baml/cffi/v1/baml_outbound.protobaml_language/crates/bridge_nodejs/typescript_src/proto/baml_cffi.d.tsbaml_language/crates/bridge_nodejs/typescript_src/proto/baml_cffi.jsbaml_language/crates/bridge_python/Cargo.tomlbaml_language/crates/bridge_python/pyproject.tomlbaml_language/crates/bridge_python/python_src/baml/__init__.pybaml_language/crates/bridge_python/python_src/baml/baml_core/__init__.pybaml_language/crates/bridge_python/python_src/baml/baml_core/__init__.pyibaml_language/crates/bridge_python/python_src/baml/baml_core/cffi/__init__.pybaml_language/crates/bridge_python/python_src/baml/baml_core/cffi/v1/__init__.pybaml_language/crates/bridge_python/python_src/baml/baml_core/cffi/v1/baml_inbound_pb2.pybaml_language/crates/bridge_python/python_src/baml/baml_core/cffi/v1/baml_outbound_pb2.pybaml_language/crates/bridge_python/python_src/baml/baml_core/ctx_manager.pybaml_language/crates/bridge_python/python_src/baml/baml_core/errors.pybaml_language/crates/bridge_python/python_src/baml/baml_core/proto.pybaml_language/crates/bridge_python/python_src/baml/cffi/v1/baml_inbound_pb2.pybaml_language/crates/bridge_python/python_src/baml/cffi/v1/baml_outbound_pb2.pybaml_language/crates/bridge_python/python_src/baml/py.typedbaml_language/crates/bridge_python/python_src/baml_py/__init__.pybaml_language/crates/bridge_python/python_src/baml_py/proto.pybaml_language/crates/bridge_python/src/errors.rsbaml_language/crates/bridge_python/src/lib.rsbaml_language/crates/bridge_python/src/runtime.rsbaml_language/crates/bridge_python/tests/conftest.pybaml_language/crates/bridge_python/tests/sim/baml_sdk/__init__.pybaml_language/crates/bridge_python/tests/sim/baml_sdk/baml/__init__.pybaml_language/crates/bridge_python/tests/sim/baml_sdk/baml/_inlinedbaml.pybaml_language/crates/bridge_python/tests/sim/baml_sdk/lorem/__init__.pybaml_language/crates/bridge_python/tests/sim/baml_src/ns_lorem/root.bamlbaml_language/crates/bridge_python/tests/test_collector.pybaml_language/crates/bridge_python/tests/test_engine.pybaml_language/crates/bridge_python/tests/test_sim_phase3.pybaml_language/crates/bridge_python/tests/test_tracing.pybaml_language/crates/tools_rig/Cargo.tomlbaml_language/crates/tools_rig/src/main.rsbaml_language/integ_tests/python/.gitignorebaml_language/integ_tests/python/baml_sdk/__init__.pybaml_language/integ_tests/python/baml_sdk/baml/__init__.pybaml_language/integ_tests/python/baml_sdk/baml/_inlinedbaml.pybaml_language/integ_tests/python/baml_sdk/py.typedbaml_language/integ_tests/python/baml_sdk/stream_types/__init__.pybaml_language/integ_tests/python/baml_sdk/stream_types/test003/__init__.pybaml_language/integ_tests/python/baml_sdk/test003/__init__.pybaml_language/rig_tests/README.mdbaml_language/rig_tests/crate_templates/python/Cargo.toml.templatebaml_language/rig_tests/crate_templates/python/build.rs.templatebaml_language/rig_tests/crate_templates/python/customizable/test_main.py.templatebaml_language/rig_tests/crates/basic_type_shapes/Cargo.tomlbaml_language/rig_tests/crates/basic_type_shapes/baml_src/ns_complex/types.bamlbaml_language/rig_tests/crates/basic_type_shapes/baml_src/ns_literals/types.bamlbaml_language/rig_tests/crates/basic_type_shapes/baml_src/ns_maps/types.bamlbaml_language/rig_tests/crates/basic_type_shapes/baml_src/ns_ty_variants/types.bamlbaml_language/rig_tests/crates/basic_type_shapes/baml_src/ns_unions/types.bamlbaml_language/rig_tests/crates/basic_type_shapes/build.rsbaml_language/rig_tests/crates/basic_type_shapes/customizable/test_main.pybaml_language/rig_tests/crates/basic_type_shapes/src/lib.rsbaml_language/rig_tests/crates/empty/Cargo.tomlbaml_language/rig_tests/crates/empty/build.rsbaml_language/rig_tests/crates/empty/customizable/test_main.pybaml_language/rig_tests/crates/empty/src/lib.rsbaml_language/rig_tests/crates/llm_functions/Cargo.tomlbaml_language/rig_tests/crates/llm_functions/baml_src/ns_ipsum/functions.bamlbaml_language/rig_tests/crates/llm_functions/baml_src/ns_ipsum/types.bamlbaml_language/rig_tests/crates/llm_functions/baml_src/ns_lorem/functions.bamlbaml_language/rig_tests/crates/llm_functions/baml_src/ns_lorem/types.bamlbaml_language/rig_tests/crates/llm_functions/build.rsbaml_language/rig_tests/crates/llm_functions/customizable/test_main.pybaml_language/rig_tests/crates/llm_functions/src/lib.rsbaml_language/rig_tests/crates/python_empty/Cargo.tomlbaml_language/rig_tests/crates/python_empty/build.rsbaml_language/rig_tests/crates/python_literal_types/Cargo.tomlbaml_language/rig_tests/crates/python_literal_types/build.rsbaml_language/rig_tests/crates/python_literal_types/customizable/test_main.pybaml_language/rig_tests/crates/python_map_types/Cargo.tomlbaml_language/rig_tests/crates/python_map_types/build.rsbaml_language/rig_tests/crates/python_map_types/customizable/test_main.pybaml_language/rig_tests/crates/python_map_types/src/lib.rsbaml_language/rig_tests/crates/python_mixed_complex_types/Cargo.tomlbaml_language/rig_tests/crates/python_mixed_complex_types/build.rsbaml_language/rig_tests/crates/python_mixed_complex_types/customizable/test_main.pybaml_language/rig_tests/crates/python_mixed_complex_types/src/lib.rsbaml_language/rig_tests/crates/python_semantic_streaming/Cargo.tomlbaml_language/rig_tests/crates/python_semantic_streaming/build.rsbaml_language/rig_tests/crates/python_semantic_streaming/customizable/test_main.pybaml_language/rig_tests/crates/python_semantic_streaming/src/lib.rsbaml_language/rig_tests/crates/python_union_types_extended/Cargo.tomlbaml_language/rig_tests/crates/python_union_types_extended/build.rsbaml_language/rig_tests/crates/python_union_types_extended/customizable/test_main.pybaml_language/rig_tests/crates/python_union_types_extended/src/lib.rsbaml_language/stow.toml
💤 Files with no reviewable changes (48)
- baml_language/crates/baml_codegen_python/Cargo.toml
- baml_language/rig_tests/crates/python_map_types/customizable/test_main.py
- baml_language/crates/baml_codegen_tests/Cargo.toml
- baml_language/crates/baml_codegen_python/src/_askama/tracing.py
- baml_language/crates/tools_rig/Cargo.toml
- baml_language/rig_tests/crates/python_union_types_extended/customizable/test_main.py
- baml_language/rig_tests/crate_templates/python/customizable/test_main.py.template
- baml_language/rig_tests/crates/python_semantic_streaming/src/lib.rs
- baml_language/rig_tests/crates/python_map_types/Cargo.toml
- baml_language/rig_tests/crates/python_literal_types/customizable/test_main.py
- baml_language/rig_tests/crates/python_mixed_complex_types/customizable/test_main.py
- baml_language/crates/bridge_python/python_src/baml/cffi/v1/baml_outbound_pb2.py
- baml_language/Cargo.toml
- baml_language/rig_tests/crates/python_semantic_streaming/customizable/test_main.py
- baml_language/rig_tests/crates/python_literal_types/Cargo.toml
- baml_language/rig_tests/crates/python_union_types_extended/Cargo.toml
- baml_language/crates/bridge_python/python_src/baml/cffi/v1/baml_inbound_pb2.py
- baml_language/rig_tests/crates/python_union_types_extended/src/lib.rs
- baml_language/crates/baml_codegen_python/src/objects/render/enum.rs
- baml_language/rig_tests/crates/python_mixed_complex_types/Cargo.toml
- baml_language/rig_tests/crates/python_empty/Cargo.toml
- baml_language/rig_tests/crates/python_mixed_complex_types/src/lib.rs
- baml_language/crates/baml_codegen_python/src/objects/render/class.rs
- baml_language/rig_tests/crates/python_literal_types/build.rs
- baml_language/crates/bridge_python/python_src/baml_py/proto.py
- baml_language/crates/baml_codegen_python/src/docstring.rs
- baml_language/crates/bridge_python/python_src/baml_py/init.py
- baml_language/rig_tests/crates/python_union_types_extended/build.rs
- baml_language/crates/baml_codegen_python/src/_askama/runtime.py.j2
- baml_language/rig_tests/crates/python_empty/build.rs
- baml_language/rig_tests/crates/python_semantic_streaming/build.rs
- baml_language/crates/baml_codegen_tests/src/lib.rs
- baml_language/crates/bridge_ctypes/types/baml/cffi/v1/baml_outbound.proto
- baml_language/crates/baml_codegen_python/src/_askama/config.py
- baml_language/rig_tests/crates/python_map_types/build.rs
- baml_language/crates/baml_codegen_python/src/objects.rs
- baml_language/crates/baml_codegen_python/src/ty.rs
- baml_language/crates/baml_codegen_python/src/objects/render/function.rs
- baml_language/crates/tools_rig/src/main.rs
- baml_language/crates/baml_codegen_python/src/_askama/runtime.py
- baml_language/crates/baml_codegen_python/src/_askama/globals.py
- baml_language/rig_tests/crate_templates/python/build.rs.template
- baml_language/rig_tests/crates/python_map_types/src/lib.rs
- baml_language/rig_tests/crates/python_mixed_complex_types/build.rs
- baml_language/rig_tests/crate_templates/python/Cargo.toml.template
- baml_language/crates/baml_codegen_tests/src/builders.rs
- baml_language/crates/baml_codegen_python/src/objects/render/type_alias.rs
- baml_language/rig_tests/crates/python_semantic_streaming/Cargo.toml
| source_files | ||
| .iter() | ||
| .map(|sf| { | ||
| let path = sf.path(db); | ||
| let rel = path.strip_prefix(baml_src).unwrap_or(&path).to_path_buf(); | ||
| (rel, sf.text(db).to_string()) | ||
| }) | ||
| .collect() |
There was a problem hiding this comment.
strip_prefix fallback silently uses absolute paths.
If sf.path(db) ever returns a path that isn't under baml_src (e.g., a synthesized/built-in source, a symlink resolved out-of-tree, or a path that differs only in canonicalization), strip_prefix(baml_src).unwrap_or(&path) falls back to the absolute path. That absolute PathBuf then becomes a key in the inlined FILES dict and is later joined into the output dir — effectively writing outside baml_sdk/baml/_inlinedbaml.py's expected layout, or producing keys the runtime's initialize_runtime won't recognize.
Consider failing fast (or filtering) when a source is not under baml_src, since that almost certainly indicates a bug upstream rather than something the codegen should silently absorb.
Suggested change
source_files
.iter()
- .map(|sf| {
+ .filter_map(|sf| {
let path = sf.path(db);
- let rel = path.strip_prefix(baml_src).unwrap_or(&path).to_path_buf();
- (rel, sf.text(db).to_string())
+ let rel = path.strip_prefix(baml_src).ok()?.to_path_buf();
+ Some((rel, sf.text(db).to_string()))
})
.collect()📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| source_files | |
| .iter() | |
| .map(|sf| { | |
| let path = sf.path(db); | |
| let rel = path.strip_prefix(baml_src).unwrap_or(&path).to_path_buf(); | |
| (rel, sf.text(db).to_string()) | |
| }) | |
| .collect() | |
| source_files | |
| .iter() | |
| .filter_map(|sf| { | |
| let path = sf.path(db); | |
| let rel = path.strip_prefix(baml_src).ok()?.to_path_buf(); | |
| Some((rel, sf.text(db).to_string())) | |
| }) | |
| .collect() |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@baml_language/crates/baml_cli/src/generate.rs` around lines 211 - 218, The
current map over source_files uses sf.path(db) and silently falls back to the
absolute path when strip_prefix(baml_src) fails, which can inject out-of-tree
absolute keys into the generated FILES; change this to fail fast or filter those
entries: in the closure that calls sf.path(db) and rel =
path.strip_prefix(baml_src).unwrap_or(&path) replace the unwrap_or behavior with
either (a) return an Err or panic with a clear message if strip_prefix fails (so
the codegen halts and surfaces the bad source), or (b) filter out such sources
by only mapping entries where strip_prefix(baml_src).is_ok(); update the mapping
that builds the FILES dict (the source_files.iter().map(...) block) accordingly
so only canonical in-tree relative paths are used.
| fn render_inlinedbaml(files: &[UserBamlFile]) -> String { | ||
| let mut entries: Vec<(&PathBuf, &String)> = files.iter().map(|(p, c)| (p, c)).collect(); | ||
| entries.sort_by(|a, b| a.0.cmp(b.0)); | ||
|
|
||
| let mut s = String::from(HEADER); | ||
| s.push('\n'); | ||
| s.push_str("FILES: dict[str, str] = {\n"); | ||
| for (rel, contents) in entries { | ||
| let key = rel.to_string_lossy(); | ||
| writeln!(s, " {}: {},", py_string(&key), py_string(contents)).unwrap(); | ||
| } | ||
| s.push_str("}\n"); | ||
| s | ||
| } |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Confirm callers do not normalize separators before constructing UserBamlFile entries.
rg -nP -C3 'UserBamlFile|to_source_code\(' --type=rustRepository: BoundaryML/baml
Length of output: 32840
🏁 Script executed:
rg -n "FILES\[" --type=python -A 2 -B 2Repository: BoundaryML/baml
Length of output: 90
🏁 Script executed:
rg -n "replace.*[\\\\]|normalize|separator" --type=rust | head -20Repository: BoundaryML/baml
Length of output: 6307
🏁 Script executed:
sed -n '650,670p' baml_language/crates/baml_codegen_python/src/lib.rsRepository: BoundaryML/baml
Length of output: 839
🏁 Script executed:
sed -n '59,67p' baml_language/rig_tests/crates/llm_functions/build.rsRepository: BoundaryML/baml
Length of output: 350
🏁 Script executed:
find . -name "*.baml" -type f 2>/dev/null | head -10Repository: BoundaryML/baml
Length of output: 752
🏁 Script executed:
sed -n '169,182p' baml_language/crates/baml_codegen_python/src/lib.rsRepository: BoundaryML/baml
Length of output: 554
🏁 Script executed:
find . -path ./target -prune -o -name "*.py" -type f -print 2>/dev/null | head -5Repository: BoundaryML/baml
Length of output: 437
🏁 Script executed:
rg -n "FILES\[|_inlinedbaml" --type=rust | head -20Repository: BoundaryML/baml
Length of output: 2334
Normalize path separators to forward slashes in _inlinedbaml.py keys for cross-platform consistency.
render_inlinedbaml uses rel.to_string_lossy() to generate dict keys, which on Windows produces backslash-separated paths (e.g., "ns_lorem\\types.baml"). The unit test passes because it manually constructs PathBuf entries with forward slashes, but real builds via build.rs strip prefixes from filesystem paths, yielding platform-dependent separators. This causes the generated FILES dict to have inconsistent keys across platforms. Normalize path separators to / before stringifying (e.g., rel.to_string_lossy().replace('\\', '/')), since these are logical keys consumed by the runtime, not filesystem paths.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@baml_language/crates/baml_codegen_python/src/lib.rs` around lines 169 - 182,
The generated dict keys in render_inlinedbaml are platform-dependent because
rel.to_string_lossy() can contain backslashes on Windows; normalize path
separators to forward slashes before stringifying so _inlinedbaml.py contains
consistent logical keys. In the loop in render_inlinedbaml (for (rel, contents)
in entries), convert the path to a string with rel.to_string_lossy(), replace
backslashes with '/' (or otherwise normalize) and then pass that normalized key
into py_string (same place HEADER is used) so FILES keys are stable across
platforms.
| _RESOLVE_TYPE_CACHE: Dict[str, type] = {} | ||
|
|
||
|
|
||
| def _resolve_type(baml_type_expr: str) -> type: | ||
| """Resolve a BAML type expression (FQN, optionally `$stream`-suffixed) | ||
| to the Python class under `sdk_root`. Phase-4 extends this; phase 3 | ||
| only wires up the memoization + import path so the decoder has a | ||
| seam to drop into. | ||
| """ | ||
| cached = _RESOLVE_TYPE_CACHE.get(baml_type_expr) | ||
| if cached is not None: | ||
| return cached | ||
|
|
||
| expr = baml_type_expr | ||
| is_stream = False | ||
| if expr.endswith("$stream"): | ||
| expr = expr[: -len("$stream")] | ||
| is_stream = True | ||
|
|
||
| subpath = _routed_python_path(expr) | ||
| if is_stream: | ||
| subpath = f"stream_types.{subpath}" | ||
|
|
||
| sdk_root = get_runtime()._sdk_root | ||
| full = f"{sdk_root}.{subpath}" | ||
| module_path, _, type_name = full.rpartition(".") | ||
| module = importlib.import_module(module_path) | ||
| resolved = getattr(module, type_name) | ||
| _RESOLVE_TYPE_CACHE[baml_type_expr] = resolved | ||
| return resolved |
There was a problem hiding this comment.
_RESOLVE_TYPE_CACHE is keyed only on the BAML expression — stale across initialize_runtime re-inits.
The cache stores resolved classes by baml_type_expr, but the resolution depends on get_runtime()._sdk_root. If a test (or hot-reload) re-calls BamlRuntime.initialize_runtime(...) with a different sdk_root, decoded values will silently bind to the old SDK's classes. Either include sdk_root in the cache key, or invalidate the cache from the runtime accessor.
🛡️ Suggested guard
-_RESOLVE_TYPE_CACHE: Dict[str, type] = {}
+_RESOLVE_TYPE_CACHE: Dict[tuple, type] = {}
@@
- cached = _RESOLVE_TYPE_CACHE.get(baml_type_expr)
+ sdk_root = get_runtime()._sdk_root
+ cache_key = (sdk_root, baml_type_expr)
+ cached = _RESOLVE_TYPE_CACHE.get(cache_key)
if cached is not None:
return cached
@@
- sdk_root = get_runtime()._sdk_root
full = f"{sdk_root}.{subpath}"
module_path, _, type_name = full.rpartition(".")
module = importlib.import_module(module_path)
resolved = getattr(module, type_name)
- _RESOLVE_TYPE_CACHE[baml_type_expr] = resolved
+ _RESOLVE_TYPE_CACHE[cache_key] = resolved
return resolved🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@baml_language/crates/bridge_python/python_src/baml/baml_core/__init__.py`
around lines 186 - 215, The cache _RESOLVE_TYPE_CACHE is keyed only by
baml_type_expr but resolution also depends on get_runtime()._sdk_root, so modify
_resolve_type to include the current sdk_root in the cache key (e.g. prefix or
tuple of (sdk_root, baml_type_expr)) or clear/invalidate _RESOLVE_TYPE_CACHE
whenever initialize_runtime changes the runtime; update references in
_resolve_type (uses get_runtime()._sdk_root, _routed_python_path, and the final
cache store) to use the new composite key or ensure runtime initialization
clears the cache to avoid stale bindings to old SDK classes.
| def _make_call( | ||
| baml_fqn: str, mode: Mode, param_names: List[str] | ||
| ) -> Callable[..., Any]: | ||
| names = list(param_names) | ||
| engine_fqn = _baml_fqn_to_engine_fqn(baml_fqn) | ||
| if mode == "sync": | ||
| def _sync(*args: Any, **kwargs: Any) -> Any: | ||
| merged = _build_kwargs(args, kwargs, names) | ||
| rt = get_runtime() | ||
| args_proto = encode_call_args(merged) | ||
| result_bytes = rt.call_function_sync(engine_fqn, args_proto, None, None, None) | ||
| return decode_call_result(result_bytes) | ||
|
|
||
| _sync.__name__ = baml_fqn.rsplit(".", 1)[-1] or baml_fqn | ||
| _sync.__qualname__ = _sync.__name__ | ||
| return _sync | ||
| elif mode == "async": | ||
| async def _async(*args: Any, **kwargs: Any) -> Any: | ||
| merged = _build_kwargs(args, kwargs, names) | ||
| rt = get_runtime() | ||
| args_proto = encode_call_args(merged) | ||
| result_bytes = await rt.call_function(engine_fqn, args_proto, None, None, None) | ||
| return decode_call_result(result_bytes) | ||
|
|
||
| _async.__name__ = baml_fqn.rsplit(".", 1)[-1] or baml_fqn | ||
| _async.__qualname__ = _async.__name__ | ||
| return _async | ||
| else: | ||
| raise ValueError(f"mode must be 'sync' or 'async', got {mode!r}") |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
# First, let's verify the file exists and examine lines 312-340
cd /tmp && find . -name "__init__.py" -path "*bridge_python*baml_core*" 2>/dev/null | head -5Repository: BoundaryML/baml
Length of output: 41
🏁 Script executed:
# Search for the file in the repository
find . -type f -name "__init__.py" | grep -i "bridge_python" | grep -i "baml_core"Repository: BoundaryML/baml
Length of output: 297
🏁 Script executed:
# List the structure to understand the repository layout
ls -laRepository: BoundaryML/baml
Length of output: 3971
🏁 Script executed:
# Search for bridge_python directory
fd "bridge_python" --type dRepository: BoundaryML/baml
Length of output: 95
🏁 Script executed:
# Look for the specific file
fd "__init__.py" | grep -i baml_coreRepository: BoundaryML/baml
Length of output: 365
🏁 Script executed:
# Broader search for Python files in bridge context
find . -type f -path "*bridge*python*" -name "*.py" 2>/dev/null | head -20Repository: BoundaryML/baml
Length of output: 1355
🏁 Script executed:
wc -l ./baml_language/crates/bridge_python/python_src/baml/baml_core/__init__.pyRepository: BoundaryML/baml
Length of output: 138
🏁 Script executed:
# Read lines 312-340 to verify the code snippet
sed -n '312,340p' ./baml_language/crates/bridge_python/python_src/baml/baml_core/__init__.pyRepository: BoundaryML/baml
Length of output: 1308
🏁 Script executed:
# Also read some context around this function (lines 305-345)
sed -n '305,345p' ./baml_language/crates/bridge_python/python_src/baml/baml_core/__init__.pyRepository: BoundaryML/baml
Length of output: 1617
🏁 Script executed:
# Let's search for the call_function_sync and call_function method signatures
# to understand what these parameters are supposed to be
rg "def call_function" ./baml_language/crates/bridge_python/ -A 5Repository: BoundaryML/baml
Length of output: 4204
🏁 Script executed:
# Search for runtime class definition or interface
rg "class.*Runtime|call_function_sync|call_function[^_]" ./baml_language/crates/bridge_python/ --type py -B 2 -A 2Repository: BoundaryML/baml
Length of output: 48332
🏁 Script executed:
# Look for ctx, collectors, abort_controller usage in the codebase
rg "ctx|collectors|abort_controller" ./baml_language/crates/bridge_python/ --type py | head -30Repository: BoundaryML/baml
Length of output: 2883
🏁 Script executed:
# Search for any documentation about these parameters
rg "ctx.*collector|abort" ./baml_language/crates/bridge_python/ --type py -B 2 -A 2 | head -40Repository: BoundaryML/baml
Length of output: 2435
🏁 Script executed:
# Check if there's any documentation about the difference between explicit helpers and generated code
rg "call_function|_make_call|reserved.*kwargs|__ctx|__collectors" ./baml_language/crates/bridge_python/ --type py -B 3 -A 3 | grep -i "doc\|reserved\|thread" | head -20Repository: BoundaryML/baml
Length of output: 41
🏁 Script executed:
# Search for comments or documentation around _make_call
sed -n '300,355p' ./baml_language/crates/bridge_python/python_src/baml/baml_core/__init__.pyRepository: BoundaryML/baml
Length of output: 2284
🏁 Script executed:
# Look for any mention of design docs or architecture notes about this
fd -e md -e txt | xargs grep -l "call_function\|ctx.*collectors" 2>/dev/null | head -5Repository: BoundaryML/baml
Length of output: 227
🏁 Script executed:
# Check the bridge_python README for any documentation about the design
cat ./baml_language/crates/bridge_python/README.md | head -100Repository: BoundaryML/baml
Length of output: 855
🏁 Script executed:
# Look for any integration test or example that shows the expected usage pattern
find ./integ-tests -name "*.py" -type f 2>/dev/null | head -5 | xargs grep -l "call_function\|define_function" 2>/dev/null | head -3Repository: BoundaryML/baml
Length of output: 111
🏁 Script executed:
# Check if _build_kwargs filters or processes ctx/collectors/abort_controller
rg "_build_kwargs" ./baml_language/crates/bridge_python/ --type py -A 10 | head -40Repository: BoundaryML/baml
Length of output: 4079
🏁 Script executed:
# Let me check the full _build_kwargs implementation to confirm it doesn't filter reserved kwargs
sed -n '260,295p' ./baml_language/crates/bridge_python/python_src/baml/baml_core/__init__.pyRepository: BoundaryML/baml
Length of output: 1501
🏁 Script executed:
# Check if there's any integration test showing generated SDK usage
cat ./integ-tests/python-v1/baml_client/async_client.py | head -80Repository: BoundaryML/baml
Length of output: 3056
Generated callables cannot access span context, collectors, or abort controllers—thread them through or document the limitation.
The factory closures in _make_call hardcode None for ctx, collectors, and abort_controller when calling rt.call_function_sync() and rt.call_function(). These parameters support observability and cancellation (confirmed by their presence in runtime method signatures and extensive test coverage), but generated baml_sdk.* functions provide no way for callers to pass them.
The explicit call_function_sync and call_function helpers do accept these parameters, but generated code uses only _make_call. Either extract and thread these through from reserved kwargs (e.g., __ctx, __collectors, __abort) before encoding function arguments, or document clearly that only the explicit helper path supports these features.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@baml_language/crates/bridge_python/python_src/baml/baml_core/__init__.py`
around lines 312 - 340, The factory in _make_call currently always passes None
for ctx/collectors/abort_controller; fix _sync and _async to pull reserved
kwargs (e.g., "__ctx", "__collectors", "__abort_controller" or a chosen trio)
out of kwargs before calling _build_kwargs/encode_call_args so they are not
encoded as function args, defaulting to None if absent, then pass those values
into rt.call_function_sync(engine_fqn, args_proto, ctx, collectors,
abort_controller) and await rt.call_function(..., ctx, collectors,
abort_controller) respectively; use the existing symbols _make_call, _sync,
_async, _build_kwargs, encode_call_args, get_runtime, rt.call_function_sync and
rt.call_function to locate and apply the change.
| #[pymodule] | ||
| fn baml_py(m: &Bound<'_, PyModule>) -> PyResult<()> { | ||
| fn baml_core(m: &Bound<'_, PyModule>) -> PyResult<()> { |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Confirm the cdylib name and any maturin/pyproject config match `baml_core`.
fd -t f Cargo.toml baml_language/crates/bridge_python --exec cat {}
echo "----"
fd -t f 'pyproject.toml' baml_language/crates/bridge_python --exec cat {}
echo "----"
# Look for any lingering baml_py references in this crate's manifests / build glue.
rg -nP '\bbaml_py\b' baml_language/crates/bridge_pythonRepository: BoundaryML/baml
Length of output: 1872
Update outdated documentation and code comments referencing the old baml_py module name.
Cargo.toml and pyproject.toml are correctly configured for the baml_core rename and will build successfully. However, stale references to the old module name remain:
README.mdline 3 still mentions "exposes abaml_pyPython module"src/types/collector.rsline 13 has a code comment importing frombaml_py
Update these to reflect the new baml_core name for consistency.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@baml_language/crates/bridge_python/src/lib.rs` around lines 30 - 31, Update
stale references to the old Python module name `baml_py` to `baml_core`: change
the README description that says the crate "exposes a `baml_py` Python module"
to mention `baml_core`, and update the code comment in the collector type (the
comment near the collector implementation that currently imports from `baml_py`)
to reference `baml_core`; keep the existing function/module symbol `baml_core`
(the #[pymodule] fn baml_core(...) declaration) as the canonical name in docs
and comments.
| BamlRuntime.initialize_runtime( | ||
| "baml_src", _inlinedbaml.FILES, sdk_root=__name__ | ||
| ) |
There was a problem hiding this comment.
initialize_runtime runs as an import side effect.
Importing baml_sdk (e.g., from a test runner or a tool that does inspect-style introspection) will eagerly call BamlRuntime.initialize_runtime(...), mutating process-global state and forcing the BAML compile path even when the importer never intends to call any BAML function. That makes test isolation fragile (re-initializing across tests, parallel pytest workers) and complicates static analysis tooling. Consider deferring init until first call (lazy via get_runtime()) or providing an explicit init() entrypoint.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@baml_language/integ_tests/python/baml_sdk/__init__.py` around lines 6 - 8,
The module currently calls BamlRuntime.initialize_runtime("baml_src",
_inlinedbaml.FILES, sdk_root=__name__) at import time which creates
process-global state as a side effect; change this to lazy initialization by
removing the direct call and adding a getter or explicit init function (e.g.,
implement get_runtime() that checks if BamlRuntime.is_initialized()/a private
_runtime variable exists and calls BamlRuntime.initialize_runtime(...) on first
use, or expose an explicit init() that callers can invoke) and update any
consumers to call get_runtime() or init() instead of relying on import-time
initialization.
| let user_baml_files: Vec<UserBamlFile> = source_files | ||
| .iter() | ||
| .map(|sf| { | ||
| let path = sf.path(&db); | ||
| let rel = path.strip_prefix(&canonical).unwrap_or(&path).to_path_buf(); | ||
| (rel, sf.text(&db).to_string()) | ||
| }) | ||
| .collect(); |
There was a problem hiding this comment.
Normalize path separators for inlined-file keys.
rel is built from Path::strip_prefix and turned directly into a PathBuf. On Windows it carries backslashes, which then flow into render_inlinedbaml (Python emitter) as the dict key — but the Python runtime/test expectations use forward slashes (e.g. "ns_lorem/types.baml" in test_main.py). Convert components to a forward-slash-joined string before building the UserBamlFile, e.g.:
♻️ Suggested normalization
- let rel = path.strip_prefix(&canonical).unwrap_or(&path).to_path_buf();
- (rel, sf.text(&db).to_string())
+ let rel = path.strip_prefix(&canonical).unwrap_or(&path);
+ let rel_normalized: PathBuf = rel
+ .components()
+ .map(|c| c.as_os_str().to_string_lossy().into_owned())
+ .collect::<Vec<_>>()
+ .join("/")
+ .into();
+ (rel_normalized, sf.text(&db).to_string())🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@baml_language/rig_tests/crates/basic_type_shapes/build.rs` around lines 59 -
66, The current creation of user_baml_files uses rel (from
sf.path(&db).strip_prefix(&canonical)) which on Windows retains backslashes and
becomes a dict key that mismatches Python tests; modify the construction of
UserBamlFile in the mapping so that rel is normalized to a forward-slash joined
string (e.g. convert the PathBuf components or the stringified path to use '/'
separators) before pairing with sf.text(&db), ensuring the downstream
render_inlinedbaml / Python emitter sees keys like "ns_lorem/types.baml"
consistently across platforms.
| if dst.exists() || dst.symlink_metadata().is_ok() { | ||
| let _ = fs::remove_file(&dst); | ||
| } | ||
| #[cfg(unix)] | ||
| std::os::unix::fs::symlink(&src, &dst).unwrap_or_else(|_| { | ||
| panic!( | ||
| "Failed to symlink {} from {}", | ||
| file_name.to_string_lossy(), | ||
| src.display() | ||
| ) | ||
| }); | ||
| #[cfg(windows)] | ||
| std::os::windows::fs::symlink_file(&src, &dst).unwrap_or_else(|_| { | ||
| panic!( | ||
| "Failed to symlink {} from {}", | ||
| file_name.to_string_lossy(), | ||
| src.display() | ||
| ) | ||
| }); |
There was a problem hiding this comment.
🧩 Analysis chain
🌐 Web query:
What privileges are required for std::os::windows::fs::symlink_file in Rust on Windows 10/11?
💡 Result:
The std::os::windows::fs::symlink_file function in Rust requires the SeCreateSymbolicLinkPrivilege (Create symbolic links user right) to create file symbolic links on Windows 10 and 11. By default, this privilege is granted only to members of the Administrators group. Without this privilege, the function fails with error 1314: "A required privilege is not held by the client." Workarounds to create symlinks without running as elevated administrator: - Enable Developer Mode (Settings > Privacy & security > For developers > Developer Mode), available since Windows 10 version 1703 (Creators Update), which allows non-elevated users to create symlinks. - Grant the SeCreateSymbolicLinkPrivilege to the user account via Local Security Policy (secpol.msc > Local Policies > User Rights Assignment > Create symbolic links) or Group Policy (gpedit.msc). Running the process as administrator (elevated) also works, as it provides the privilege. This applies identically to Windows 10 and 11, with no changes noted as of 2026. Note that junctions (for directories) or hard links (for files) do not require special privileges but are distinct from symbolic links.
Citations:
- 1: https://doc.rust-lang.org/std/os/windows/fs/fn.symlink_file.html
- 2: https://stackoverflow.com/questions/64991523/why-are-administrator-privileges-required-to-create-a-symlink-on-windows
- 3: https://doc.rust-lang.org/beta/std/os/windows/fs/fn.symlink_dir.html
- 4: https://learn.microsoft.com/en-us/previous-versions/windows/it-pro/windows-10/security/threat-protection/security-policy-settings/create-symbolic-links
- 5: https://dscho.github.io/git-for-windows.github.io/symbolic-links.html
- 6: https://www.techbloat.com/how-to-create-a-symlink-on-windows-11-10.html
- 7: https://blogs.windows.com/windowsdeveloper/2016/12/02/symlinks-windows-10
- 8: symlink-install does not work on Windows 11: OSError: [WinError 1314] A required privilege is not held by the client colcon/colcon-core#689
- 9: https://stackoverflow.com/questions/32877260/privlege-error-trying-to-create-symlink-using-python-on-windows-10
- 10: https://learn.microsoft.com/en-us/windows/win32/api/winbase/nf-winbase-createsymboliclinka
🏁 Script executed:
# First, let's find and read the build.rs file
find . -name "build.rs" -path "*basic_type_shapes*" | head -5Repository: BoundaryML/baml
Length of output: 119
🏁 Script executed:
# Check the file size first
wc -l "baml_language/rig_tests/crates/basic_type_shapes/build.rs" 2>/dev/null || echo "File not found, searching..."Repository: BoundaryML/baml
Length of output: 121
🏁 Script executed:
# Search for the build.rs file
fd "build.rs" -p "*basic_type_shapes*"Repository: BoundaryML/baml
Length of output: 166
🏁 Script executed:
cat -n "baml_language/rig_tests/crates/basic_type_shapes/build.rs"Repository: BoundaryML/baml
Length of output: 10398
Windows developers without elevated privileges or Developer Mode will fail builds at this symlink step.
The std::os::windows::fs::symlink_file call requires either administrator privileges or Developer Mode (Windows 10 1703+), otherwise it fails with error 1314. Currently, the build panics unconditionally on failure, breaking local builds for any Windows contributor lacking that setup. Consider either falling back to fs::copy on Windows, or provide a clearer error message directing users to enable Developer Mode via Settings > Privacy & security > For developers, or to run with administrator privileges.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@baml_language/rig_tests/crates/basic_type_shapes/build.rs` around lines 90 -
108, The Windows symlink step in build.rs (the
std::os::windows::fs::symlink_file call that panics on Err) will fail for
non-elevated/dev-mode users; change the Windows branch for symlinking (the block
referencing src, dst, file_name) to handle errors instead of unconditionally
panicking: attempt symlink_file, and if it returns an error due to privileges,
fall back to copying the file with std::fs::copy (or at minimum log a clear,
actionable error telling users to enable Developer Mode or run as admin),
preserving the existing panic only for unrecoverable errors; ensure you
reference and use the same dst/src/file_name variables so behavior matches the
unix branch.
| // 8. rerun-if-changed for build.rs + every BAML and customizable | ||
| // file. | ||
| println!("cargo:rerun-if-changed=build.rs"); | ||
| watch_dir(&baml_src); | ||
| if customizable_dir.exists() { | ||
| watch_dir(&customizable_dir); | ||
| } | ||
| } |
There was a problem hiding this comment.
rerun-if-changed won't fire when a .baml file is added or removed.
watch_dir emits one cargo:rerun-if-changed per existing file, but Cargo only re-runs the build script when one of the recorded paths changes. Adding a new .baml file or deleting one inside baml_src/ won't trigger a rebuild because that path was never registered. Watch the directories themselves (Cargo treats directory mtime changes — including additions/removals — as a change) in addition to (or instead of) individual files:
♻️ Suggested fix
println!("cargo:rerun-if-changed=build.rs");
+ println!("cargo:rerun-if-changed={}", baml_src.display());
watch_dir(&baml_src);
if customizable_dir.exists() {
+ println!("cargo:rerun-if-changed={}", customizable_dir.display());
watch_dir(&customizable_dir);
}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| // 8. rerun-if-changed for build.rs + every BAML and customizable | |
| // file. | |
| println!("cargo:rerun-if-changed=build.rs"); | |
| watch_dir(&baml_src); | |
| if customizable_dir.exists() { | |
| watch_dir(&customizable_dir); | |
| } | |
| } | |
| // 8. rerun-if-changed for build.rs + every BAML and customizable | |
| // file. | |
| println!("cargo:rerun-if-changed=build.rs"); | |
| println!("cargo:rerun-if-changed={}", baml_src.display()); | |
| watch_dir(&baml_src); | |
| if customizable_dir.exists() { | |
| println!("cargo:rerun-if-changed={}", customizable_dir.display()); | |
| watch_dir(&customizable_dir); | |
| } | |
| } |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@baml_language/rig_tests/crates/basic_type_shapes/build.rs` around lines 228 -
235, The current watch_dir(...) invocation only emits cargo:rerun-if-changed for
each existing .baml file so adding/removing files doesn't retrigger the build;
update build.rs to also emit a cargo:rerun-if-changed for the directories
themselves (use the baml_src path and customizable_dir when it exists) in
addition to the per-file calls inside watch_dir so Cargo sees directory mtime
changes (adds/removals) and reruns the build script; locate the logic around
watch_dir, baml_src, and customizable_dir and add a
println!("cargo:rerun-if-changed={}", <dir_path>) for baml_src and for
customizable_dir when customizable_dir.exists().
| paths = set(_inlinedbaml.FILES.keys()) | ||
| # Paths are relative to baml_src/. | ||
| assert "ns_lorem/types.baml" in paths | ||
| assert "ns_lorem/functions.baml" in paths | ||
| assert "ns_ipsum/types.baml" in paths | ||
| assert "ns_ipsum/functions.baml" in paths |
There was a problem hiding this comment.
Path key assertions assume / separators.
_inlinedbaml.FILES is keyed on the rel-path produced by build.rs via Path::strip_prefix(...).to_path_buf(). On Windows this produces backslash-separated keys, so these in paths assertions will fail there. Coupled with the path-normalization comment on build.rs — once the build script normalizes to /, these assertions will hold cross-platform.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@baml_language/rig_tests/crates/llm_functions/customizable/test_main.py`
around lines 137 - 142, The assertions in test_main.py assume POSIX separators
when checking keys in _inlinedbaml.FILES, which will fail on Windows because
build.rs currently produces backslash paths via
Path::strip_prefix(...).to_path_buf(); update build.rs to normalize stored keys
to use forward slashes (e.g., convert PathBuf to a UTF-8 string and replace
backslashes with '/') when populating FILES so the tests that assert membership
against "ns_lorem/types.baml" and similar keys work cross-platform, or
alternately change the test to normalize keys (join parts with "/") before
asserting; reference _inlinedbaml.FILES and Path::strip_prefix in build.rs to
locate the change.
There was a problem hiding this comment.
Actionable comments posted: 2
♻️ Duplicate comments (7)
baml_language/rig_tests/crates/empty/build.rs (3)
91-106:⚠️ Potential issue | 🟡 MinorDon't hard-fail on Windows symlink privilege errors.
symlink_filecommonly fails for non-admin users unless Developer Mode is enabled. Panicking here makes the fixture fail to build on a normal Windows setup even though copying the file is good enough for tests.♻️ Suggested fix
#[cfg(windows)] - std::os::windows::fs::symlink_file(&src, &dst).unwrap_or_else(|_| { - panic!( - "Failed to symlink {} from {}", - file_name.to_string_lossy(), - src.display() - ) - }); + { + if let Err(err) = std::os::windows::fs::symlink_file(&src, &dst) { + fs::copy(&src, &dst).unwrap_or_else(|copy_err| { + panic!( + "Failed to symlink or copy {} from {}: symlink error: {err}; copy error: {copy_err}", + file_name.to_string_lossy(), + src.display() + ) + }); + } + }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@baml_language/rig_tests/crates/empty/build.rs` around lines 91 - 106, The Windows branch currently calls std::os::windows::fs::symlink_file(&src, &dst).unwrap_or_else(|_| panic!(...)) which will hard-fail for non-admin users; change the error handling so that if symlink_file returns an error (e.g., privilege denied) you fall back to copying the file from src to dst (using std::fs::copy) and only panic for truly unrecoverable errors, preserving the same file_name/src context in any error logs; keep the unix branch untouched and reference symlink_file, src, dst, file_name when implementing this fallback.
230-234:⚠️ Potential issue | 🟡 MinorWatch the directories themselves too.
Right now Cargo only tracks the files that already exist, so adding or removing a
.bamlorcustomizablefile won't rerun the build script. Emitrerun-if-changedfor the directories as well.♻️ Suggested fix
println!("cargo:rerun-if-changed=build.rs"); + println!("cargo:rerun-if-changed={}", baml_src.display()); watch_dir(&baml_src); if customizable_dir.exists() { + println!("cargo:rerun-if-changed={}", customizable_dir.display()); watch_dir(&customizable_dir); }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@baml_language/rig_tests/crates/empty/build.rs` around lines 230 - 234, The build script currently calls watch_dir(&baml_src) and watch_dir(&customizable_dir) but only registers individual files, so addition/removal won't trigger reruns; update the build.rs logic (around the watch_dir calls and the customizable_dir check) to also emit Cargo's directory-level triggers by calling println!("cargo:rerun-if-changed={path}") for the baml_src directory and, if customizable_dir.exists(), for the customizable_dir directory as well so Cargo tracks changes to the directories themselves.
57-64:⚠️ Potential issue | 🟡 MinorNormalize
UserBamlFilekeys to forward slashes.
strip_prefix(...).to_path_buf()preserves backslashes on Windows, so the inlined-file keys diverge from the Python-side"foo/bar.baml"form. That breaks any lookup that compares generated keys across platforms.♻️ Suggested fix
let user_baml_files: Vec<UserBamlFile> = source_files .iter() .map(|sf| { let path = sf.path(&db); - let rel = path.strip_prefix(&canonical).unwrap_or(&path).to_path_buf(); - (rel, sf.text(&db).to_string()) + let rel = path.strip_prefix(&canonical).unwrap_or(&path); + let rel = rel + .components() + .map(|c| c.as_os_str().to_string_lossy().into_owned()) + .collect::<Vec<_>>() + .join("/"); + (PathBuf::from(rel), sf.text(&db).to_string()) }) .collect();🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@baml_language/rig_tests/crates/empty/build.rs` around lines 57 - 64, The generated UserBamlFile keys currently use platform-native separators (backslashes on Windows) because you build rel with strip_prefix(...).to_path_buf(); normalize those keys to use forward slashes so they match the Python-side "foo/bar.baml" form: inside the map for source_files (the closure constructing user_baml_files and the tuple currently built from rel and sf.text(&db)), convert rel into a canonical forward-slash string (e.g., stringify the path and replace backslashes with '/' or join components with '/') and use that normalized string as the key for UserBamlFile instead of the PathBuf.baml_language/crates/baml_project/src/client_codegen.rs (1)
117-118:⚠️ Potential issue | 🟡 MinorStore
Origin::source_file_pathin the same normalized form asUserBamlFile.
Originis documented as relative tobaml_src/and matching_inlinedbaml.pykeys, but this recordssource_file.path(db)directly. That gives you absolute paths and Windows backslashes, so any downstream origin-to-inlined-file lookup will miss.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@baml_language/crates/baml_project/src/client_codegen.rs` around lines 117 - 118, Origin::source_file_path is being set from source_file.path(db) which yields absolute Windows paths, causing mismatches with UserBamlFile lookups; change the assignment that creates source_file_path (the variable built from source_file.path(db)) to produce the same normalized form used by UserBamlFile: make the path relative to the baml_src root (remove any absolute prefix), convert backslashes to forward slashes, and ensure any canonicalization/normalization steps used by UserBamlFile are applied before storing into Origin::source_file_path so lookups against _inlinedbaml.py keys will match.baml_language/rig_tests/crates/basic_type_shapes/build.rs (3)
59-66:⚠️ Potential issue | 🟡 MinorNormalize
UserBamlFilekeys to forward slashes.
strip_prefix(...).to_path_buf()preserves backslashes on Windows, so the inlined-file keys diverge from the Python-side"foo/bar.baml"form. That breaks any lookup that compares generated keys across platforms.♻️ Suggested fix
let user_baml_files: Vec<UserBamlFile> = source_files .iter() .map(|sf| { let path = sf.path(&db); - let rel = path.strip_prefix(&canonical).unwrap_or(&path).to_path_buf(); - (rel, sf.text(&db).to_string()) + let rel = path.strip_prefix(&canonical).unwrap_or(&path); + let rel = rel + .components() + .map(|c| c.as_os_str().to_string_lossy().into_owned()) + .collect::<Vec<_>>() + .join("/"); + (PathBuf::from(rel), sf.text(&db).to_string()) }) .collect();🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@baml_language/rig_tests/crates/basic_type_shapes/build.rs` around lines 59 - 66, The generated UserBamlFile keys currently use a PathBuf (rel) which preserves platform backslashes on Windows; change the mapping that builds user_baml_files so the key is a normalized forward-slash string: after computing rel = path.strip_prefix(&canonical).unwrap_or(&path).to_path_buf(), convert it to a UTF-8-safe string and replace any backslashes with '/' (e.g. via to_string_lossy().replace('\\','/')) so the produced key matches the Python-style "foo/bar.baml" form; update the tuple type if needed to store a String instead of a PathBuf and keep the file contents as before (sf.text(&db).to_string()).
230-236:⚠️ Potential issue | 🟡 MinorWatch the directories themselves too.
Right now Cargo only tracks the files that already exist, so adding or removing a
.bamlorcustomizablefile won't rerun the build script. Emitrerun-if-changedfor the directories as well.♻️ Suggested fix
println!("cargo:rerun-if-changed=build.rs"); + println!("cargo:rerun-if-changed={}", baml_src.display()); watch_dir(&baml_src); if customizable_dir.exists() { + println!("cargo:rerun-if-changed={}", customizable_dir.display()); watch_dir(&customizable_dir); }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@baml_language/rig_tests/crates/basic_type_shapes/build.rs` around lines 230 - 236, The build script only calls watch_dir(&baml_src) and watch_dir(&customizable_dir) which emit rerun-if-changed for existing files but not the directories themselves, so update the logic around println!("cargo:rerun-if-changed=build.rs") and the watch_dir calls to also emit a rerun-if-changed for the directory paths (e.g., println!("cargo:rerun-if-changed={}", baml_src.display()) and similarly for customizable_dir) so Cargo will rerun the build when files are added or removed; modify the block that checks customizable_dir.exists() and the watch_dir(&baml_src) usage accordingly to include directory-level rerun-if-changed entries.
90-108:⚠️ Potential issue | 🟡 MinorDon't hard-fail on Windows symlink privilege errors.
symlink_filecommonly fails for non-admin users unless Developer Mode is enabled. Panicking here makes the fixture fail to build on a normal Windows setup even though copying the file is good enough for tests.♻️ Suggested fix
#[cfg(windows)] - std::os::windows::fs::symlink_file(&src, &dst).unwrap_or_else(|_| { - panic!( - "Failed to symlink {} from {}", - file_name.to_string_lossy(), - src.display() - ) - }); + { + if let Err(err) = std::os::windows::fs::symlink_file(&src, &dst) { + fs::copy(&src, &dst).unwrap_or_else(|copy_err| { + panic!( + "Failed to symlink or copy {} from {}: symlink error: {err}; copy error: {copy_err}", + file_name.to_string_lossy(), + src.display() + ) + }); + } + }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@baml_language/rig_tests/crates/basic_type_shapes/build.rs` around lines 90 - 108, The Windows branch currently calls std::os::windows::fs::symlink_file(&src, &dst).unwrap_or_else(|_| panic!(...)), which hard-fails on common non-admin/dev-mode permission errors; change this to attempt symlink_file and if it Errs, don’t panic—log or ignore the symlink error and fall back to copying the file with fs::copy(&src, &dst) (preserving the existing pre-remove logic that uses dst, src, file_name). Keep the unix branch as-is (std::os::unix::fs::symlink) but replace the unwrap_or_else panic in the Windows branch with logic that tries symlink_file, and on Err uses fs::copy(&src, &dst) instead of panicking.
🧹 Nitpick comments (1)
baml_language/crates/baml_compiler2_emit/src/lib.rs (1)
923-939: Add a focused unit test for namespaced signature resolution.This branch fixes a subtle namespace-dependent lowering path, but the local test module doesn't cover it yet. Locking it down here will be cheaper than relying on rig coverage to catch regressions.
As per coding guidelines, "Prefer writing Rust unit tests over integration tests where possible".
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@baml_language/crates/baml_compiler2_emit/src/lib.rs` around lines 923 - 939, Add a focused Rust unit test in the local test module that exercises namespaced signature resolution via baml_compiler2_tir::lower_type_expr::lower_type_expr_in_ns: construct a minimal DB/pkg_items/pkg_info with a non-root namespace_path, create a type expression (te) that contains an unqualified reference (e.g. MyLorem) inside a function signature under that namespace, call lower_type_expr_in_ns(db, te, pkg_items, &pkg_info.namespace_path, &[], &mut diags), run cache.convert(&tir_ty), and assert the resulting type retains the declared parameter type (not Ty::Unknown or Ty::Void) and resolves to the expected class; this ensures tir_ty and cache.convert behavior for namespaced lowering is covered.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@baml_language/crates/baml_codegen_python/src/routing.rs`:
- Around line 16-19: Module docs incorrectly state that all non-`baml` symbols
go under `vendor/<pkg>/` but the implementation in route() special-cases the
"user" package to remain at the repository root; update the module-level comment
text to reflect that behavior by noting that symbols named "baml" are routed
under `baml/`, "user" remains at the root, and any other package names are
routed under `vendor/<pkg>/`, and also mention the `$stream` suffix rule already
described.
In `@baml_language/crates/baml_codegen_types/src/symbols.rs`:
- Around line 117-149: The validate() and walk_all_unions() implementations for
the symbol type omit the watchers field, so include validation and union-walking
for each watcher Ty. In validate(), iterate over self.watchers and call
.ty.validate() (propagate errors as the other checks do). In walk_all_unions(),
chain the union sets from self.watchers similar to arguments/companions by
mapping each watcher to watcher.ty.walk_all_unions() and flattening into the
final collected HashSet so watcher unions are included.
---
Duplicate comments:
In `@baml_language/crates/baml_project/src/client_codegen.rs`:
- Around line 117-118: Origin::source_file_path is being set from
source_file.path(db) which yields absolute Windows paths, causing mismatches
with UserBamlFile lookups; change the assignment that creates source_file_path
(the variable built from source_file.path(db)) to produce the same normalized
form used by UserBamlFile: make the path relative to the baml_src root (remove
any absolute prefix), convert backslashes to forward slashes, and ensure any
canonicalization/normalization steps used by UserBamlFile are applied before
storing into Origin::source_file_path so lookups against _inlinedbaml.py keys
will match.
In `@baml_language/rig_tests/crates/basic_type_shapes/build.rs`:
- Around line 59-66: The generated UserBamlFile keys currently use a PathBuf
(rel) which preserves platform backslashes on Windows; change the mapping that
builds user_baml_files so the key is a normalized forward-slash string: after
computing rel = path.strip_prefix(&canonical).unwrap_or(&path).to_path_buf(),
convert it to a UTF-8-safe string and replace any backslashes with '/' (e.g. via
to_string_lossy().replace('\\','/')) so the produced key matches the
Python-style "foo/bar.baml" form; update the tuple type if needed to store a
String instead of a PathBuf and keep the file contents as before
(sf.text(&db).to_string()).
- Around line 230-236: The build script only calls watch_dir(&baml_src) and
watch_dir(&customizable_dir) which emit rerun-if-changed for existing files but
not the directories themselves, so update the logic around
println!("cargo:rerun-if-changed=build.rs") and the watch_dir calls to also emit
a rerun-if-changed for the directory paths (e.g.,
println!("cargo:rerun-if-changed={}", baml_src.display()) and similarly for
customizable_dir) so Cargo will rerun the build when files are added or removed;
modify the block that checks customizable_dir.exists() and the
watch_dir(&baml_src) usage accordingly to include directory-level
rerun-if-changed entries.
- Around line 90-108: The Windows branch currently calls
std::os::windows::fs::symlink_file(&src, &dst).unwrap_or_else(|_| panic!(...)),
which hard-fails on common non-admin/dev-mode permission errors; change this to
attempt symlink_file and if it Errs, don’t panic—log or ignore the symlink error
and fall back to copying the file with fs::copy(&src, &dst) (preserving the
existing pre-remove logic that uses dst, src, file_name). Keep the unix branch
as-is (std::os::unix::fs::symlink) but replace the unwrap_or_else panic in the
Windows branch with logic that tries symlink_file, and on Err uses
fs::copy(&src, &dst) instead of panicking.
In `@baml_language/rig_tests/crates/empty/build.rs`:
- Around line 91-106: The Windows branch currently calls
std::os::windows::fs::symlink_file(&src, &dst).unwrap_or_else(|_| panic!(...))
which will hard-fail for non-admin users; change the error handling so that if
symlink_file returns an error (e.g., privilege denied) you fall back to copying
the file from src to dst (using std::fs::copy) and only panic for truly
unrecoverable errors, preserving the same file_name/src context in any error
logs; keep the unix branch untouched and reference symlink_file, src, dst,
file_name when implementing this fallback.
- Around line 230-234: The build script currently calls watch_dir(&baml_src) and
watch_dir(&customizable_dir) but only registers individual files, so
addition/removal won't trigger reruns; update the build.rs logic (around the
watch_dir calls and the customizable_dir check) to also emit Cargo's
directory-level triggers by calling println!("cargo:rerun-if-changed={path}")
for the baml_src directory and, if customizable_dir.exists(), for the
customizable_dir directory as well so Cargo tracks changes to the directories
themselves.
- Around line 57-64: The generated UserBamlFile keys currently use
platform-native separators (backslashes on Windows) because you build rel with
strip_prefix(...).to_path_buf(); normalize those keys to use forward slashes so
they match the Python-side "foo/bar.baml" form: inside the map for source_files
(the closure constructing user_baml_files and the tuple currently built from rel
and sf.text(&db)), convert rel into a canonical forward-slash string (e.g.,
stringify the path and replace backslashes with '/' or join components with '/')
and use that normalized string as the key for UserBamlFile instead of the
PathBuf.
---
Nitpick comments:
In `@baml_language/crates/baml_compiler2_emit/src/lib.rs`:
- Around line 923-939: Add a focused Rust unit test in the local test module
that exercises namespaced signature resolution via
baml_compiler2_tir::lower_type_expr::lower_type_expr_in_ns: construct a minimal
DB/pkg_items/pkg_info with a non-root namespace_path, create a type expression
(te) that contains an unqualified reference (e.g. MyLorem) inside a function
signature under that namespace, call lower_type_expr_in_ns(db, te, pkg_items,
&pkg_info.namespace_path, &[], &mut diags), run cache.convert(&tir_ty), and
assert the resulting type retains the declared parameter type (not Ty::Unknown
or Ty::Void) and resolves to the expected class; this ensures tir_ty and
cache.convert behavior for namespaced lowering is covered.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: 7d7fb85b-bce7-41cf-8dc3-04236e20466b
⛔ Files ignored due to path filters (6)
baml_language/crates/baml_tests/snapshots/__baml_std__/baml_tests____baml_std____06_codegen.snapis excluded by!**/*.snapbaml_language/crates/baml_tests/snapshots/namespaces_bare_name_rejected/baml_tests__namespaces_bare_name_rejected__06_codegen.snapis excluded by!**/*.snapbaml_language/crates/baml_tests/snapshots/namespaces_basic/baml_tests__namespaces_basic__06_codegen.snapis excluded by!**/*.snapbaml_language/crates/baml_tests/snapshots/namespaces_root_fallback/baml_tests__namespaces_root_fallback__06_codegen.snapis excluded by!**/*.snapbaml_language/crates/baml_tests/snapshots/namespaces_stream_direct_ref/baml_tests__namespaces_stream_direct_ref__06_codegen.snapis excluded by!**/*.snapbaml_language/crates/baml_tests/snapshots/namespaces_type_resolution/baml_tests__namespaces_type_resolution__06_codegen.snapis excluded by!**/*.snap
📒 Files selected for processing (17)
baml_language/crates/baml_codegen_python/src/emit/class.rsbaml_language/crates/baml_codegen_python/src/emit/function.rsbaml_language/crates/baml_codegen_python/src/emit/method.rsbaml_language/crates/baml_codegen_python/src/emit/mod.rsbaml_language/crates/baml_codegen_python/src/leaf.rsbaml_language/crates/baml_codegen_python/src/lib.rsbaml_language/crates/baml_codegen_python/src/routing.rsbaml_language/crates/baml_codegen_python/src/translate_ty.rsbaml_language/crates/baml_codegen_types/src/symbols.rsbaml_language/crates/baml_compiler2_emit/src/lib.rsbaml_language/crates/baml_project/src/client_codegen.rsbaml_language/crates/baml_tests/tests/builtins.rsbaml_language/crates/bridge_python/python_src/baml/baml_core/__init__.pybaml_language/crates/bridge_python/python_src/baml/baml_core/proto.pybaml_language/rig_tests/crates/basic_type_shapes/build.rsbaml_language/rig_tests/crates/empty/build.rsbaml_language/rig_tests/crates/llm_functions/build.rs
✅ Files skipped from review due to trivial changes (1)
- baml_language/crates/baml_codegen_python/src/emit/class.rs
🚧 Files skipped from review as they are similar to previous changes (4)
- baml_language/crates/baml_codegen_python/src/emit/function.rs
- baml_language/crates/baml_codegen_python/src/emit/mod.rs
- baml_language/rig_tests/crates/llm_functions/build.rs
- baml_language/crates/baml_codegen_python/src/translate_ty.rs
| fn validate(&self) -> Result<(), CodegenTypeError> { | ||
| self.arguments | ||
| .iter() | ||
| .map(|args| args.ty.validate()) | ||
| .collect::<Result<Vec<_>, _>>()?; | ||
| self.return_type.validate()?; | ||
| if let Some(stream_type) = &self.stream_return_type { | ||
| stream_type.validate()?; | ||
| } | ||
| for (_, companion) in &self.companions { | ||
| companion.validate()?; | ||
| } | ||
|
|
||
| Ok(()) | ||
| } | ||
|
|
||
| fn walk_all_unions(&self) -> HashSet<super::Ty> { | ||
| self.arguments | ||
| .iter() | ||
| .flat_map(|args| args.ty.walk_all_unions().into_iter()) | ||
| .chain(self.return_type.walk_all_unions()) | ||
| .chain( | ||
| self.stream_return_type | ||
| .iter() | ||
| .flat_map(|ty| ty.walk_all_unions().into_iter()), | ||
| ) | ||
| .chain( | ||
| self.companions | ||
| .iter() | ||
| .flat_map(|(_, c)| c.walk_all_unions().into_iter()), | ||
| ) | ||
| .collect() | ||
| } |
There was a problem hiding this comment.
Validate and traverse watchers too.
watchers carries Ty values, but both validate() and walk_all_unions() skip it. That lets invalid watcher signatures bypass validation, and watcher unions won't be collected for downstream emitters.
♻️ Suggested fix
impl Function {
fn validate(&self) -> Result<(), CodegenTypeError> {
self.arguments
.iter()
.map(|args| args.ty.validate())
.collect::<Result<Vec<_>, _>>()?;
self.return_type.validate()?;
if let Some(stream_type) = &self.stream_return_type {
stream_type.validate()?;
}
+ for (_, watcher_ty) in &self.watchers {
+ watcher_ty.validate()?;
+ }
for (_, companion) in &self.companions {
companion.validate()?;
}
Ok(())
@@
fn walk_all_unions(&self) -> HashSet<super::Ty> {
self.arguments
.iter()
.flat_map(|args| args.ty.walk_all_unions().into_iter())
.chain(self.return_type.walk_all_unions())
.chain(
self.stream_return_type
.iter()
.flat_map(|ty| ty.walk_all_unions().into_iter()),
)
+ .chain(
+ self.watchers
+ .iter()
+ .flat_map(|(_, ty)| ty.walk_all_unions().into_iter()),
+ )
.chain(
self.companions
.iter()
.flat_map(|(_, c)| c.walk_all_unions().into_iter()),
)🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@baml_language/crates/baml_codegen_types/src/symbols.rs` around lines 117 -
149, The validate() and walk_all_unions() implementations for the symbol type
omit the watchers field, so include validation and union-walking for each
watcher Ty. In validate(), iterate over self.watchers and call .ty.validate()
(propagate errors as the other checks do). In walk_all_unions(), chain the union
sets from self.watchers similar to arguments/companions by mapping each watcher
to watcher.ty.walk_all_unions() and flattening into the final collected HashSet
so watcher unions are included.
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
baml_language/crates/baml_compiler2_emit/src/lib.rs (1)
975-1000:⚠️ Potential issue | 🔴 CriticalUse
lower_type_expr_in_nsto resolve stream-expanded types within the file's namespace context.Line 988 calls
lower_type_exprwith&pkg_info.namespace_pathas the 4th argument, butlower_type_exprexpectsgeneric_paramsthere and always uses&[]as the namespace context internally. This causes stream-expanded return types referencing classes in non-root namespaces to fail resolution, degrading toTy::Unknown→Ty::Void.Match the fix applied in
compute_function_metadata_from_item_tree(lines 933–939): uselower_type_expr_in_nsand pass&pkg_info.namespace_pathasns_contextand&[]asgeneric_params.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@baml_language/crates/baml_compiler2_emit/src/lib.rs` around lines 975 - 1000, In compute_stream_return_type replace the call to baml_compiler2_tir::lower_type_expr with baml_compiler2_tir::lower_type_expr_in_ns so the stream-expanded types resolve using the file's namespace: call lower_type_expr_in_ns(db, type_expr, pkg_items, &pkg_info.namespace_path /* ns_context */, &[] /* generic_params */, &mut diags) (preserving the diags Vec and subsequent conversion via baml_compiler2_mir::convert_tir2_ty) instead of the current lower_type_expr invocation.
🧹 Nitpick comments (2)
baml_language/crates/bridge_python/python_src/baml/baml_core/proto.py (2)
347-386: Consider raising an error for unhandled oneof variants instead of silently returningNone.Line 386 returns
Nonefor any unrecognizedwhichvalue. If the protobuf schema adds a new oneof variant that isn't handled here, the decoder will silently drop the data rather than surfacing the mismatch. This could lead to subtle bugs when the proto evolves.🛡️ Proposed defensive handling
if which in ("media_value", "prompt_ast_value"): raise BamlError( f"BEX emitted {which!r} on the FFI path — media/prompt AST " f"are expected via handle_value, not inline" ) - return None + # Unknown oneof variant — fail loud so proto/decoder mismatches surface. + raise BamlError(f"Unknown BamlOutboundValue variant: {which!r}")🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@baml_language/crates/bridge_python/python_src/baml/baml_core/proto.py` around lines 347 - 386, The decoder _decode_value_holder currently falls through to "return None" for unknown oneof variants, which can silently drop new proto fields; replace that silent return with a defensive error: raise a BamlError (or the module's canonical error) that includes the unexpected which value and the full holder for context so callers can surface proto mismatches; keep the existing special-case for ("media_value","prompt_ast_value") but ensure any truly unhandled variant triggers the error rather than returning None.
238-245: Remove dead code:fieldsvariable is assigned but never used.The
fieldsvariable is assigned on line 239 (or 241 in the except block) but is never referenced afterward. The handle-backed check on line 242 uses__private_attributes__instead. This appears to be leftover from a previous implementation.🧹 Proposed cleanup
# Handle-backed classes never arrive via `class_value` — BEX emits # them through `handle_value`. If one slipped through, it's a bug in # Rust-side encoding or a stale codegen; fail loud rather than build # a half-initialized instance. - try: - fields = cls.model_fields # type: ignore[attr-defined] - except AttributeError: - fields = {} if "_handle" in getattr(cls, "__private_attributes__", {}): raise BamlError( f"BEX emitted class_value for handle-backed class {fqn!r}" )🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@baml_language/crates/bridge_python/python_src/baml/baml_core/proto.py` around lines 238 - 245, The assignments to the local variable "fields" via "fields = cls.model_fields" (and the except block) are dead and unused; remove the try/except block that sets "fields" and leave only the handle-backed check that references getattr(cls, "__private_attributes__", {}) and raises BamlError for "_handle" (preserve the BamlError(f"BEX emitted class_value for handle-backed class {fqn!r}") and surrounding logic). Ensure no other references to "fields" remain in proto.py so the import/state is consistent.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@baml_language/demos/bridge_python_e2e/README.md`:
- Around line 82-92: The fenced code block that begins with ``` and shows the
project tree starting "bridge_python_e2e/" in the README is missing a language
identifier (markdownlint MD040); update the opening fence to include a language
such as text or bash (e.g., ```text) so the linter is satisfied and the snippet
remains unchanged, targeting the fenced block that contains the lines starting
with "bridge_python_e2e/".
---
Outside diff comments:
In `@baml_language/crates/baml_compiler2_emit/src/lib.rs`:
- Around line 975-1000: In compute_stream_return_type replace the call to
baml_compiler2_tir::lower_type_expr with
baml_compiler2_tir::lower_type_expr_in_ns so the stream-expanded types resolve
using the file's namespace: call lower_type_expr_in_ns(db, type_expr, pkg_items,
&pkg_info.namespace_path /* ns_context */, &[] /* generic_params */, &mut diags)
(preserving the diags Vec and subsequent conversion via
baml_compiler2_mir::convert_tir2_ty) instead of the current lower_type_expr
invocation.
---
Nitpick comments:
In `@baml_language/crates/bridge_python/python_src/baml/baml_core/proto.py`:
- Around line 347-386: The decoder _decode_value_holder currently falls through
to "return None" for unknown oneof variants, which can silently drop new proto
fields; replace that silent return with a defensive error: raise a BamlError (or
the module's canonical error) that includes the unexpected which value and the
full holder for context so callers can surface proto mismatches; keep the
existing special-case for ("media_value","prompt_ast_value") but ensure any
truly unhandled variant triggers the error rather than returning None.
- Around line 238-245: The assignments to the local variable "fields" via
"fields = cls.model_fields" (and the except block) are dead and unused; remove
the try/except block that sets "fields" and leave only the handle-backed check
that references getattr(cls, "__private_attributes__", {}) and raises BamlError
for "_handle" (preserve the BamlError(f"BEX emitted class_value for
handle-backed class {fqn!r}") and surrounding logic). Ensure no other references
to "fields" remain in proto.py so the import/state is consistent.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: 719ebc63-6dc7-448a-a0d8-b87a9d4ca3bc
📒 Files selected for processing (9)
baml_language/crates/baml_compiler2_emit/src/lib.rsbaml_language/crates/bridge_python/python_src/baml/baml_core/proto.pybaml_language/demos/bridge_python_e2e/.gitignorebaml_language/demos/bridge_python_e2e/README.mdbaml_language/demos/bridge_python_e2e/app/app.pybaml_language/demos/bridge_python_e2e/app/pyproject.tomlbaml_language/demos/bridge_python_e2e/baml_src/generators.bamlbaml_language/demos/bridge_python_e2e/baml_src/ns_lorem/types.bamlbaml_language/demos/bridge_python_e2e/run.sh
✅ Files skipped from review due to trivial changes (4)
- baml_language/demos/bridge_python_e2e/baml_src/generators.baml
- baml_language/demos/bridge_python_e2e/.gitignore
- baml_language/demos/bridge_python_e2e/app/pyproject.toml
- baml_language/demos/bridge_python_e2e/baml_src/ns_lorem/types.baml
| ``` | ||
| bridge_python_e2e/ | ||
| ├── run.sh # build + generate + maturin develop + run | ||
| ├── baml_src/ | ||
| │ ├── generators.baml # output_type "python/pydantic", output_dir "../app" | ||
| │ └── ns_lorem/types.baml # Address, Sentiment, PhoneNumber, Resume{transform} | ||
| └── app/ | ||
| ├── pyproject.toml # depends on baml (path ../../../crates/bridge_python) | ||
| ├── app.py # the user-shaped script | ||
| └── baml_sdk/ # ← written by `baml-cli generate` (gitignored) | ||
| ``` |
There was a problem hiding this comment.
Add a language identifier to the fenced code block (markdownlint MD040).
Line 82 opens a fenced block without a language tag, which triggers the linter warning.
🛠️ Minimal fix
-```
+```text
bridge_python_e2e/
├── run.sh # build + generate + maturin develop + run
├── baml_src/
│ ├── generators.baml # output_type "python/pydantic", output_dir "../app"
│ └── ns_lorem/types.baml # Address, Sentiment, PhoneNumber, Resume{transform}
└── app/
├── pyproject.toml # depends on baml (path ../../../crates/bridge_python)
├── app.py # the user-shaped script
└── baml_sdk/ # ← written by `baml-cli generate` (gitignored)</details>
<details>
<summary>🧰 Tools</summary>
<details>
<summary>🪛 markdownlint-cli2 (0.22.1)</summary>
[warning] 82-82: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
</details>
</details>
<details>
<summary>🤖 Prompt for AI Agents</summary>
Verify each finding against the current code and only fix it if needed.
In @baml_language/demos/bridge_python_e2e/README.md around lines 82 - 92, The
fenced code block that begins with and shows the project tree starting "bridge_python_e2e/" in the README is missing a language identifier (markdownlint MD040); update the opening fence to include a language such as text or bash (e.g.,text) so the linter is satisfied and the snippet remains
unchanged, targeting the fenced block that contains the lines starting with
"bridge_python_e2e/".
</details>
<!-- fingerprinting:phantom:poseidon:hawk:8e5364fb-5cc4-4f4e-afa3-87cabff29835 -->
<!-- d98c2f50 -->
<!-- This is an auto-generated comment by CodeRabbit -->
There was a problem hiding this comment.
🧹 Nitpick comments (3)
baml_language/crates/sys_ops/src/lib.rs (1)
340-367: Consider adding unit tests for the new suffix-scan lookup.The three-tier lookup logic is correct. However, there are no unit tests covering the new suffix-scan behavior for
lookup_llm_function. A few targeted tests would help:
- Bare name match
user.{name}fallback- Suffix scan with a single
.{name}match- Suffix scan with multiple matches (should return
None)As per coding guidelines: "Prefer writing Rust unit tests over integration tests where possible."
Would you like me to generate unit tests for
lookup_llm_function?🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@baml_language/crates/sys_ops/src/lib.rs` around lines 340 - 367, Add Rust unit tests for lookup_llm_function in the same module (lib.rs) that construct a HashMap<String, LlmFunctionInfo> and assert the four behaviors: (1) a bare-name key returns the matching LlmFunctionInfo, (2) when only "user.{name}" exists the function returns that entry, (3) when only a single key ending with ".{name}" exists the suffix-scan returns that entry, and (4) when multiple keys end with ".{name}" the function returns None; use small synthetic LlmFunctionInfo values (or a minimal stub) and assert pointer/equality on the returned &LlmFunctionInfo as appropriate. Ensure tests are annotated with #[cfg(test)] and #[test] and placed in the module's test scaffold so they run with cargo test.baml_language/crates/baml_codegen_python/src/leaf.rs (2)
256-863: Add crate-local tests for the new leaf renderer path.This file now owns a lot of deterministic formatting/order logic:
SortKeytie handling, cross-leafTYPE_CHECKINGimports, method fan-out spacing, and.py/.pyiparity. A few focused unit tests in this crate would make regressions much cheaper to catch, and this Rust change should also get acargo test --libpass before merge.As per coding guidelines, "Prefer writing Rust unit tests over integration tests where possible" and "Always run
cargo test --libif you changed any Rust code".🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@baml_language/crates/baml_codegen_python/src/leaf.rs` around lines 256 - 863, Add focused crate-local unit tests that exercise the deterministic formatting and ordering logic in this file: create tests calling group_and_sort (to check SortKey tie-breaking and symbol_kind_ord ordering), render_leaf_body and render_leaf_body_pyi (to verify stdlib/factory imports, cross-leaf TYPE_CHECKING blocks, two-blank-line separators, and __all__ emission), and render_symbol/render_symbol_pyi or helper flows (to cover class/enum/type-alias/function layouts and method fan-out spacing via push_method_block and push_method_signatures_pyi); implement small synthetic LeafBody/EmittedSymbol/SortKey/ PyMethodBinding/PyFunction fixtures to assert exact output strings, add them as #[cfg(test)] mod tests in the same file (unit tests), and run/fix until cargo test --lib passes.
721-734: Fail fast on parameter metadata mismatches.Both helpers rely on
zip, so ifparam_namesandarg_tysever drift, the generated.pyisilently drops part of the signature instead of tripping an invariant. In this emitter rewrite, that would be painful to diagnose later. Please add adebug_assert_eq!or checked error before rendering, including the instance-method case after accounting forself.Also applies to: 757-768
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@baml_language/crates/baml_codegen_python/src/leaf.rs` around lines 721 - 734, Add a length check to fail fast when parameter metadata drifts: before any zip over m.param_names and m.arg_tys in render_method_params_pyi, insert a debug_assert_eq!(m.param_names.len(), m.arg_tys.len(), "...") and for MethodKind::Instance assert (m.param_names.len().saturating_sub(1)) == m.arg_tys.len() (include m.name or similar identifier in the assertion message for context); do the same defensive check in the other helper that zips param_names and arg_tys (the helper referenced around the other block) so mismatched lengths are caught immediately rather than silently truncating signatures.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@baml_language/crates/baml_codegen_python/src/leaf.rs`:
- Around line 256-863: Add focused crate-local unit tests that exercise the
deterministic formatting and ordering logic in this file: create tests calling
group_and_sort (to check SortKey tie-breaking and symbol_kind_ord ordering),
render_leaf_body and render_leaf_body_pyi (to verify stdlib/factory imports,
cross-leaf TYPE_CHECKING blocks, two-blank-line separators, and __all__
emission), and render_symbol/render_symbol_pyi or helper flows (to cover
class/enum/type-alias/function layouts and method fan-out spacing via
push_method_block and push_method_signatures_pyi); implement small synthetic
LeafBody/EmittedSymbol/SortKey/ PyMethodBinding/PyFunction fixtures to assert
exact output strings, add them as #[cfg(test)] mod tests in the same file (unit
tests), and run/fix until cargo test --lib passes.
- Around line 721-734: Add a length check to fail fast when parameter metadata
drifts: before any zip over m.param_names and m.arg_tys in
render_method_params_pyi, insert a debug_assert_eq!(m.param_names.len(),
m.arg_tys.len(), "...") and for MethodKind::Instance assert
(m.param_names.len().saturating_sub(1)) == m.arg_tys.len() (include m.name or
similar identifier in the assertion message for context); do the same defensive
check in the other helper that zips param_names and arg_tys (the helper
referenced around the other block) so mismatched lengths are caught immediately
rather than silently truncating signatures.
In `@baml_language/crates/sys_ops/src/lib.rs`:
- Around line 340-367: Add Rust unit tests for lookup_llm_function in the same
module (lib.rs) that construct a HashMap<String, LlmFunctionInfo> and assert the
four behaviors: (1) a bare-name key returns the matching LlmFunctionInfo, (2)
when only "user.{name}" exists the function returns that entry, (3) when only a
single key ending with ".{name}" exists the suffix-scan returns that entry, and
(4) when multiple keys end with ".{name}" the function returns None; use small
synthetic LlmFunctionInfo values (or a minimal stub) and assert pointer/equality
on the returned &LlmFunctionInfo as appropriate. Ensure tests are annotated with
#[cfg(test)] and #[test] and placed in the module's test scaffold so they run
with cargo test.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: a76ff69b-f6c9-472e-b3e6-97f8b5855e1c
📒 Files selected for processing (6)
baml_language/crates/baml_codegen_python/src/leaf.rsbaml_language/crates/baml_codegen_python/src/lib.rsbaml_language/crates/sys_ops/src/lib.rsbaml_language/demos/bridge_python_e2e/README.mdbaml_language/demos/bridge_python_e2e/app/app.pybaml_language/demos/bridge_python_e2e/baml_src/ns_lorem/functions.baml
✅ Files skipped from review due to trivial changes (2)
- baml_language/demos/bridge_python_e2e/baml_src/ns_lorem/functions.baml
- baml_language/crates/baml_codegen_python/src/lib.rs
- Remove checked_value and streaming_state_value from BamlOutboundValue oneof - Delete BamlValueChecked, BamlValueStreamingState messages and BamlStreamState enum - Remove decoder branches in baml/proto.py and baml_py/proto.py - Regenerate all proto bindings (Rust, Python, Node.js, Go) - Update 09e documentation to remove deprecated table rows Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Restore the phase-3 sim to the spec layout (`ns_lorem/root.baml` with MyLorem + add_three_to_field_a) and fix two bugs uncovered along the way: - baml_compiler2_emit: param type resolution in compute_function_metadata_from_item_tree passed an empty ns_context, so unqualified class references in function signatures under sub-namespaces collapsed to Ty::Unknown → Ty::Void in the runtime metadata. Switch to lower_type_expr_in_ns(&pkg_info.namespace_path). - bex_project: coerce BexExternalValue::Map → Instance at the project boundary when the declared parameter type is Ty::Class. Host bridges (Python, Node, …) encode dicts as Map without per-call schema; the VM treats Map and Instance as distinct kinds and refused field access on Map values. Fixing this once at the project layer keeps host encoders simple across bridges. - bridge_python: translate Python-facing BAML FQNs `root.*` → `user.*` before calling the engine (codegen emits `root.<ns>.<name>`, the engine registers `user.<ns>.<name>` per file_package). `baml.*` and `vendor.*` pass through unchanged. The sim now exercises dict-in/dict-out through the spec's literal `ns_lorem/root.baml` + MyLorem layout; phase-3 exit criteria met.
Extends baml.baml_core.proto.encode_call_args (09d inbound) and decode_call_result (09e outbound) to marshal Pydantic BaseModel / Enum / handle-backed values, so a typed MyLorem round-trips as a MyLorem — the phase 4 exit bar. Also lands the minimal Rust-side complement of 09d §2: project-boundary coercion now rewrites Instance.class_name / Variant.enum_name to match the declared Ty, realizing the spec's claim that the inbound name is informational. Sim: renames ns_lorem -> lorem (09b §1 package routing; the ns_ prefix only appears on BAML source dirs) and adds MyLorem to the lorem SDK.
BEP-030 Python codegen prep — pure IR-shape refactor.
- baml_codegen_types::Name now carries (pkg, namespace_path, name),
mirroring baml_compiler2_tir::QualifiedTypeName. Drop the
Types/StreamTypes Namespace enum; encode `$stream` discrimination
on the name string with a `$stream` suffix.
- baml_project::client_codegen::build_object_pool preserves the full
QualifiedTypeName (pkg + namespace_path + name) when populating
the ObjectPool, instead of dropping pkg.
- baml_codegen_python consumes the new IR shape: the local Python
Namespace { Other, BamlPy, Types, StreamTypes } enum is retained
(it drives baml_types/ vs baml_stream_types/ file routing), with
Namespace::for_codegen_name() deriving Types vs StreamTypes from
Name::is_stream(). Filter sites updated; stream-ness flows via
the `$stream` suffix end-to-end.
- baml_codegen_tests test DSL adapted to construct names with the
new (pkg, namespace_path, name) shape; "stream_types.X" inputs
now translate to name "X$stream".
`baml generate` runs against integ_tests without panicking and emits
the same 15 files. All workspace tests pass.
Mechanical prep for the baml_codegen_python rewrite (G1–G6). No
codegen behavior changes; exit criterion is `cargo check --workspace
--tests` staying green.
- Delete `baml_codegen_tests/` crate and all workspace/dev-dep refs.
- Rename `baml_codegen_types::{ObjectPool, Object}` to
`{SymbolPool, Symbol}` (file `objects.rs` → `symbols.rs`); rename
`baml_project::build_object_pool` → `build_symbol_pool` and update
its single caller in `baml_cli::generate`.
- Add `Origin { source_file_path, span_start }` to every Symbol
variant; populate from AST spans in `build_symbol_pool`. Extend HIR
`Class`/`Enum`/`TypeAlias` with `span: TextRange` to carry the AST
span through (HIR `Function` already had one).
- Inline FIXTURE_NAMES in `tools_rig` (previously re-exported from
the deleted crate).
- Rig templates + 9 generated rig crates: write to `generated/baml_sdk/`
instead of `generated/baml_client/`; feed `to_source_code` an empty
`SymbolPool` stub; rename `baml_client` → `baml_sdk` in embedded
test.sh / test.ps1 ruff invocations.
- Python test files under `rig_tests/`: blanket `baml_client` →
`baml_sdk` rewrite. These imports will fail at runtime until G6,
which is acceptable per the G0 exit criteria.
- Delete stale `#[cfg(test)] mod tests { … }` blocks in
`baml_codegen_python/src/objects/render/{class,enum,function,type_alias}.rs`
(they referenced `baml_codegen_tests` and are slated for wholesale
replacement in G1–G5).
No G0 snapshot: to_source_code's behavior is unchanged and rig
build.rs now feeds it an empty pool, so there's no meaningful output
tree to freeze. G1 establishes the first snapshot.
…ix and routing's root arm
Every routed leaf with a cross-leaf type reference now emits a `from <dots> import <first_segment>` line under `if typing.TYPE_CHECKING:` at the top of its `__init__.py` / `__init__.pyi`. Dot count = depth + 1, anchoring every import at the `baml_sdk/` root so the package works under any vendored absolute name. The guard defers imports past module load, breaking the recursive-cross-leaf circular import cycle that was latent in the rendered-string-only test coverage. Adds ten unit tests covering each routing case (user/baml/vendor/stream) plus deduplication, one-line-per-segment, same-leaf no-import, and unguarded children re-exports.
…olete rig pre-commit hook
There was a problem hiding this comment.
Actionable comments posted: 1
♻️ Duplicate comments (1)
baml_language/demos/bridge_python_e2e/README.md (1)
113-125:⚠️ Potential issue | 🟡 MinorAdd a language identifier to the fenced code block.
The fenced code block on line 113 is missing a language identifier, which triggers markdownlint MD040. Add
textorplaintextto satisfy the linter.-``` +```text bridge_python_e2e/🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@baml_language/demos/bridge_python_e2e/README.md` around lines 113 - 125, The README's fenced tree code block (the block starting with ``` and the directory tree under bridge_python_e2e/) is missing a language identifier which triggers markdownlint MD040; update the opening fence to include a language such as "text" or "plaintext" (e.g. change ``` to ```text) so the tree snippet is properly marked and the linter warning is resolved.
🧹 Nitpick comments (7)
baml_language/crates/baml_project/src/client_codegen.rs (1)
414-489: Index parent symbols before attaching companions.Both attach passes do a linear search per companion. Now that this walks all compiler2-visible files, that turns pool construction into O(parents × companions) twice over. A small lookup map keyed by
(pkg, ns_path, bare_name)for functions and(parent_key, bare_name)for methods would keep this path linear.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@baml_language/crates/baml_project/src/client_codegen.rs` around lines 414 - 489, The current companion-attachment passes (working with PendingFunction/ParentEntry and PendingMethod) do repeated linear searches (parents.iter_mut().find(...)) which makes pool construction O(parents × companions); replace these linear scans by first building a HashMap lookup keyed by the unique keys used in the find calls (for functions use a key tuple of (pkg, ns_path, bare_name) mapping to ParentEntry or its index, for methods use (parent_key, bare_name) mapping to PendingMethod/its index), then iterate companions and do O(1) lookups to splice companions into the parent's func.companions, and finally iterate the parents to insert into pool or push into class.static_methods/instance_methods as before (update code around pending_functions/pending_methods, parents/companions, method_parents/method_companions and the find calls).baml_language/crates/baml_codegen_python/src/leaf.rs (1)
750-764: Consider a debug assertion for parameter/type length mismatch.The
zipoperation silently truncates ifnames.len() != tys.len(). While callers should always provide matching lengths from the sameFunctiondefinition, adebug_assert!could catch mismatches during development.🔍 Optional defensive check
fn render_typed_params(names: &[String], tys: &[Ty], ctx: &TranslateCtx) -> String { + debug_assert_eq!( + names.len(), + tys.len(), + "param names and types must have matching lengths" + ); let mut s = String::new(); for (i, (n, t)) in names.iter().zip(tys.iter()).enumerate() {🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@baml_language/crates/baml_codegen_python/src/leaf.rs` around lines 750 - 764, The function render_typed_params silently truncates when names.len() != tys.len() due to zip; add a debug assertion at the start of render_typed_params to validate names.len() == tys.len() (include a short message referencing the Function that produced them if available) so mismatches are caught in debug builds; keep the rest of the logic and still use translate_ty(t, ctx) for type rendering.baml_language/demos/bridge_python_e2e/app/baml_sdk/baml/llm/__init__.py (1)
37-40: Unusual type annotation_data: None- is this intentional?The
_datafield is annotated with typeNone, which means it can only ever holdNone. If this is a placeholder for an opaque handle or data structure, consider usingtyping.Anyor a more specific type. The same applies to_handle: Noneon line 147 inStreamAccumulator.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@baml_language/demos/bridge_python_e2e/app/baml_sdk/baml/llm/__init__.py` around lines 37 - 40, The _data field in PromptAst is incorrectly annotated as None; change its type to a more appropriate type (e.g., typing.Any or a concrete type) so the field can hold actual data, and adjust initialization accordingly; likewise update StreamAccumulator._handle (currently annotated None) to typing.Any or the specific handle type used by StreamAccumulator, and update any related logic/type checks to match the new annotation; reference PromptAst._data and StreamAccumulator._handle when making these edits.baml_language/crates/bridge_python/src/runtime.rs (1)
226-236: Consider adding unit tests for the new runtime state helpers.The
store_sdk_root,load_sdk_root, andget_runtimefunctions introduce new behavior that could benefit from Rust unit tests to validate lock acquisition, poisoning behavior, and the not-initialized error path. As per coding guidelines: "Prefer writing Rust unit tests over integration tests where possible."🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@baml_language/crates/bridge_python/src/runtime.rs` around lines 226 - 236, Add Rust unit tests for the new runtime state helpers to validate correct behavior: write tests for store_sdk_root and load_sdk_root that exercise normal lock acquisition (concurrent/serial calls), simulate a poisoned lock to confirm load_sdk_root handles poisoning as expected, and tests for get_runtime that exercise the BridgeError::NotInitialized path and the successful path when bridge_cffi::engine::get_runtime returns Ok. Use the functions store_sdk_root, load_sdk_root, and get_runtime as the test targets and mock or inject bridge_cffi::engine::get_runtime error/ok responses to assert the produced PyErr mapping and returned BamlRuntime contents.baml_language/demos/bridge_python_e2e/app/baml_sdk/baml/llm/__init__.pyi (1)
95-97: Parameterreturn_type: Nonehas unusual typing.The
return_typeparameter is typed asNone, meaning the only valid value isNone. If this is a placeholder for future functionality (e.g., passing a type reference), consider usingtyping.Anyortyping.Optional[type]. IfNoneis the only valid value, the parameter may be unnecessary.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@baml_language/demos/bridge_python_e2e/app/baml_sdk/baml/llm/__init__.pyi` around lines 95 - 97, The parameter typing for PrimitiveClient.render_prompt and PrimitiveClient.render_prompt_async is incorrectly annotated as `return_type: None`; update the signature to a more appropriate type (e.g., `typing.Any` or `typing.Optional[type]`) or remove the parameter if it is unused: locate the two methods on the PrimitiveClient class in baml.llm.__init__.pyi and change the `return_type` annotation to the chosen type (or delete the parameter) and ensure the stub signatures and any callers reflect that change.baml_language/demos/bridge_python_e2e/app/baml_sdk/baml/media/__init__.py (2)
3-3: Remove unused import.The
typingmodule is imported but never used in this file.♻️ Proposed fix
from __future__ import annotations -import typing import pydantic🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@baml_language/demos/bridge_python_e2e/app/baml_sdk/baml/media/__init__.py` at line 3, Remove the unused top-level import of the typing module in baml.media.__init__.py: delete the line importing typing since it is not referenced anywhere in this module (there are no functions, classes, or annotations in __init__.py that use typing).
12-114: Consider extracting a base class or factory to reduce duplication.All four media classes (
Audio,Video,Image) have identical structure with only the FQN prefix differing. This could be refactored to reduce ~100 lines of near-identical code.♻️ Example approach
def _media_class(media_type: str) -> type: """Factory to create media classes with standard BAML wiring.""" class _Media(pydantic.BaseModel): model_config = pydantic.ConfigDict(extra="forbid") _data: None # ... wire methods using media_type ... return _Media Pdf = _media_class("Pdf") Audio = _media_class("Audio") # etc.This is optional since the current explicit approach is more readable and IDE-friendly.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@baml_language/demos/bridge_python_e2e/app/baml_sdk/baml/media/__init__.py` around lines 12 - 114, The four nearly identical classes Pdf, Audio, Video, and Image should be consolidated by creating a factory or base class that wires the same static and instance methods using the media type name instead of repeating code; implement a function (e.g., _media_class or MediaBase subclass) that accepts the media type string and uses _define_static_method and _define_instance_method to attach from_url/from_file/from_base64 and url/file/base64/mime_type (and their _async variants), then replace the explicit Pdf/Audio/Video/Image class blocks with Pdf = _media_class("Pdf"), Audio = _media_class("Audio"), etc., preserving model_config and _data.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@baml_language/crates/baml_project/src/client_codegen.rs`:
- Around line 1-5: Update the module doc comment to reflect the new behavior:
replace the sentence stating it "Walks the HIR item trees for user-defined
files" with a description that build_symbol_pool iterates
compiler2_all_files(db) and therefore processes both user-defined and
stdlib-visible files to populate a codegen-ready SymbolPool for downstream
generators; mention build_symbol_pool, SymbolPool, and compiler2_all_files(db)
by name so readers know where the change occurs.
---
Duplicate comments:
In `@baml_language/demos/bridge_python_e2e/README.md`:
- Around line 113-125: The README's fenced tree code block (the block starting
with ``` and the directory tree under bridge_python_e2e/) is missing a language
identifier which triggers markdownlint MD040; update the opening fence to
include a language such as "text" or "plaintext" (e.g. change ``` to ```text) so
the tree snippet is properly marked and the linter warning is resolved.
---
Nitpick comments:
In `@baml_language/crates/baml_codegen_python/src/leaf.rs`:
- Around line 750-764: The function render_typed_params silently truncates when
names.len() != tys.len() due to zip; add a debug assertion at the start of
render_typed_params to validate names.len() == tys.len() (include a short
message referencing the Function that produced them if available) so mismatches
are caught in debug builds; keep the rest of the logic and still use
translate_ty(t, ctx) for type rendering.
In `@baml_language/crates/baml_project/src/client_codegen.rs`:
- Around line 414-489: The current companion-attachment passes (working with
PendingFunction/ParentEntry and PendingMethod) do repeated linear searches
(parents.iter_mut().find(...)) which makes pool construction O(parents ×
companions); replace these linear scans by first building a HashMap lookup keyed
by the unique keys used in the find calls (for functions use a key tuple of
(pkg, ns_path, bare_name) mapping to ParentEntry or its index, for methods use
(parent_key, bare_name) mapping to PendingMethod/its index), then iterate
companions and do O(1) lookups to splice companions into the parent's
func.companions, and finally iterate the parents to insert into pool or push
into class.static_methods/instance_methods as before (update code around
pending_functions/pending_methods, parents/companions,
method_parents/method_companions and the find calls).
In `@baml_language/crates/bridge_python/src/runtime.rs`:
- Around line 226-236: Add Rust unit tests for the new runtime state helpers to
validate correct behavior: write tests for store_sdk_root and load_sdk_root that
exercise normal lock acquisition (concurrent/serial calls), simulate a poisoned
lock to confirm load_sdk_root handles poisoning as expected, and tests for
get_runtime that exercise the BridgeError::NotInitialized path and the
successful path when bridge_cffi::engine::get_runtime returns Ok. Use the
functions store_sdk_root, load_sdk_root, and get_runtime as the test targets and
mock or inject bridge_cffi::engine::get_runtime error/ok responses to assert the
produced PyErr mapping and returned BamlRuntime contents.
In `@baml_language/demos/bridge_python_e2e/app/baml_sdk/baml/llm/__init__.py`:
- Around line 37-40: The _data field in PromptAst is incorrectly annotated as
None; change its type to a more appropriate type (e.g., typing.Any or a concrete
type) so the field can hold actual data, and adjust initialization accordingly;
likewise update StreamAccumulator._handle (currently annotated None) to
typing.Any or the specific handle type used by StreamAccumulator, and update any
related logic/type checks to match the new annotation; reference PromptAst._data
and StreamAccumulator._handle when making these edits.
In `@baml_language/demos/bridge_python_e2e/app/baml_sdk/baml/llm/__init__.pyi`:
- Around line 95-97: The parameter typing for PrimitiveClient.render_prompt and
PrimitiveClient.render_prompt_async is incorrectly annotated as `return_type:
None`; update the signature to a more appropriate type (e.g., `typing.Any` or
`typing.Optional[type]`) or remove the parameter if it is unused: locate the two
methods on the PrimitiveClient class in baml.llm.__init__.pyi and change the
`return_type` annotation to the chosen type (or delete the parameter) and ensure
the stub signatures and any callers reflect that change.
In `@baml_language/demos/bridge_python_e2e/app/baml_sdk/baml/media/__init__.py`:
- Line 3: Remove the unused top-level import of the typing module in
baml.media.__init__.py: delete the line importing typing since it is not
referenced anywhere in this module (there are no functions, classes, or
annotations in __init__.py that use typing).
- Around line 12-114: The four nearly identical classes Pdf, Audio, Video, and
Image should be consolidated by creating a factory or base class that wires the
same static and instance methods using the media type name instead of repeating
code; implement a function (e.g., _media_class or MediaBase subclass) that
accepts the media type string and uses _define_static_method and
_define_instance_method to attach from_url/from_file/from_base64 and
url/file/base64/mime_type (and their _async variants), then replace the explicit
Pdf/Audio/Video/Image class blocks with Pdf = _media_class("Pdf"), Audio =
_media_class("Audio"), etc., preserving model_config and _data.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: d3d62c46-9379-443c-a86e-02b3112469f8
⛔ Files ignored due to path filters (12)
baml_language/Cargo.lockis excluded by!**/*.lockbaml_language/crates/baml_tests/snapshots/__baml_std__/baml_tests____baml_std____06_codegen.snapis excluded by!**/*.snapbaml_language/crates/baml_tests/snapshots/namespaces_bare_name_rejected/baml_tests__namespaces_bare_name_rejected__06_codegen.snapis excluded by!**/*.snapbaml_language/crates/baml_tests/snapshots/namespaces_basic/baml_tests__namespaces_basic__06_codegen.snapis excluded by!**/*.snapbaml_language/crates/baml_tests/snapshots/namespaces_root_fallback/baml_tests__namespaces_root_fallback__06_codegen.snapis excluded by!**/*.snapbaml_language/crates/baml_tests/snapshots/namespaces_stream_direct_ref/baml_tests__namespaces_stream_direct_ref__06_codegen.snapis excluded by!**/*.snapbaml_language/crates/baml_tests/snapshots/namespaces_type_resolution/baml_tests__namespaces_type_resolution__06_codegen.snapis excluded by!**/*.snapbaml_language/crates/bridge_go/cffi/proto/baml/cffi/v1/baml_inbound.pb.gois excluded by!**/*.pb.gobaml_language/crates/bridge_go/cffi/proto/baml/cffi/v1/baml_outbound.pb.gois excluded by!**/*.pb.gobaml_language/crates/bridge_nodejs/proto.d.ts.mapis excluded by!**/*.mapbaml_language/crates/bridge_nodejs/proto.js.mapis excluded by!**/*.mapbaml_language/crates/bridge_python/uv.lockis excluded by!**/*.lock
📒 Files selected for processing (134)
.github/workflows/cargo-tests.reusable.yaml.pre-commit-config.yamlbaml_language/Cargo.tomlbaml_language/crates/baml_cli/src/generate.rsbaml_language/crates/baml_codegen_python/Cargo.tomlbaml_language/crates/baml_codegen_python/src/_askama/config.pybaml_language/crates/baml_codegen_python/src/_askama/globals.pybaml_language/crates/baml_codegen_python/src/_askama/runtime.pybaml_language/crates/baml_codegen_python/src/_askama/runtime.py.j2baml_language/crates/baml_codegen_python/src/_askama/tracing.pybaml_language/crates/baml_codegen_python/src/docstring.rsbaml_language/crates/baml_codegen_python/src/emit/class.rsbaml_language/crates/baml_codegen_python/src/emit/enum_.rsbaml_language/crates/baml_codegen_python/src/emit/function.rsbaml_language/crates/baml_codegen_python/src/emit/method.rsbaml_language/crates/baml_codegen_python/src/emit/mod.rsbaml_language/crates/baml_codegen_python/src/emit/type_alias.rsbaml_language/crates/baml_codegen_python/src/leaf.rsbaml_language/crates/baml_codegen_python/src/lib.rsbaml_language/crates/baml_codegen_python/src/objects.rsbaml_language/crates/baml_codegen_python/src/objects/render/class.rsbaml_language/crates/baml_codegen_python/src/objects/render/enum.rsbaml_language/crates/baml_codegen_python/src/objects/render/function.rsbaml_language/crates/baml_codegen_python/src/objects/render/type_alias.rsbaml_language/crates/baml_codegen_python/src/routing.rsbaml_language/crates/baml_codegen_python/src/translate_ty.rsbaml_language/crates/baml_codegen_python/src/ty.rsbaml_language/crates/baml_codegen_tests/Cargo.tomlbaml_language/crates/baml_codegen_tests/src/builders.rsbaml_language/crates/baml_codegen_tests/src/lib.rsbaml_language/crates/baml_codegen_types/src/lib.rsbaml_language/crates/baml_codegen_types/src/objects.rsbaml_language/crates/baml_codegen_types/src/symbols.rsbaml_language/crates/baml_codegen_types/src/ty.rsbaml_language/crates/baml_compiler2_emit/src/lib.rsbaml_language/crates/baml_compiler2_hir/src/item_tree.rsbaml_language/crates/baml_project/src/client_codegen.rsbaml_language/crates/baml_project/src/lib.rsbaml_language/crates/baml_tests/tests/builtins.rsbaml_language/crates/baml_tests/tests/classes.rsbaml_language/crates/baml_tests/tests/enums.rsbaml_language/crates/baml_tests/tests/parse_companions.rsbaml_language/crates/baml_tests/tests/typed_outputs.rsbaml_language/crates/bex_engine/src/conversion.rsbaml_language/crates/bex_project/Cargo.tomlbaml_language/crates/bex_project/src/bex.rsbaml_language/crates/bridge_ctypes/types/baml/cffi/v1/baml_outbound.protobaml_language/crates/bridge_nodejs/proto.jsbaml_language/crates/bridge_nodejs/typescript_src/proto.tsbaml_language/crates/bridge_nodejs/typescript_src/proto/baml_cffi.d.tsbaml_language/crates/bridge_nodejs/typescript_src/proto/baml_cffi.jsbaml_language/crates/bridge_python/Cargo.tomlbaml_language/crates/bridge_python/pyproject.tomlbaml_language/crates/bridge_python/python_src/baml/__init__.pybaml_language/crates/bridge_python/python_src/baml/baml_core/__init__.pybaml_language/crates/bridge_python/python_src/baml/baml_core/__init__.pyibaml_language/crates/bridge_python/python_src/baml/baml_core/cffi/__init__.pybaml_language/crates/bridge_python/python_src/baml/baml_core/cffi/v1/__init__.pybaml_language/crates/bridge_python/python_src/baml/baml_core/ctx_manager.pybaml_language/crates/bridge_python/python_src/baml/baml_core/errors.pybaml_language/crates/bridge_python/python_src/baml/baml_core/proto.pybaml_language/crates/bridge_python/python_src/baml/cffi/__init__.pybaml_language/crates/bridge_python/python_src/baml/cffi/v1/__init__.pybaml_language/crates/bridge_python/python_src/baml/cffi/v1/baml_inbound_pb2.pybaml_language/crates/bridge_python/python_src/baml/cffi/v1/baml_outbound_pb2.pybaml_language/crates/bridge_python/python_src/baml/py.typedbaml_language/crates/bridge_python/python_src/baml_py/__init__.pybaml_language/crates/bridge_python/python_src/baml_py/proto.pybaml_language/crates/bridge_python/src/errors.rsbaml_language/crates/bridge_python/src/lib.rsbaml_language/crates/bridge_python/src/runtime.rsbaml_language/crates/bridge_python/tests/conftest.pybaml_language/crates/bridge_python/tests/sim/baml_sdk/__init__.pybaml_language/crates/bridge_python/tests/sim/baml_sdk/baml/__init__.pybaml_language/crates/bridge_python/tests/sim/baml_sdk/baml/_inlinedbaml.pybaml_language/crates/bridge_python/tests/sim/baml_sdk/lorem/__init__.pybaml_language/crates/bridge_python/tests/sim/baml_src/ns_lorem/root.bamlbaml_language/crates/bridge_python/tests/test_collector.pybaml_language/crates/bridge_python/tests/test_engine.pybaml_language/crates/bridge_python/tests/test_sim_phase3.pybaml_language/crates/bridge_python/tests/test_tracing.pybaml_language/crates/sys_ops/src/lib.rsbaml_language/crates/tools_rig/Cargo.tomlbaml_language/crates/tools_rig/src/main.rsbaml_language/demos/bridge_python_e2e/.gitignorebaml_language/demos/bridge_python_e2e/README.mdbaml_language/demos/bridge_python_e2e/app/app.pybaml_language/demos/bridge_python_e2e/app/baml_sdk/__init__.pybaml_language/demos/bridge_python_e2e/app/baml_sdk/__init__.pyibaml_language/demos/bridge_python_e2e/app/baml_sdk/baml/__init__.pybaml_language/demos/bridge_python_e2e/app/baml_sdk/baml/__init__.pyibaml_language/demos/bridge_python_e2e/app/baml_sdk/baml/_inlinedbaml.pybaml_language/demos/bridge_python_e2e/app/baml_sdk/baml/errors/__init__.pybaml_language/demos/bridge_python_e2e/app/baml_sdk/baml/errors/__init__.pyibaml_language/demos/bridge_python_e2e/app/baml_sdk/baml/fs/__init__.pybaml_language/demos/bridge_python_e2e/app/baml_sdk/baml/fs/__init__.pyibaml_language/demos/bridge_python_e2e/app/baml_sdk/baml/http/__init__.pybaml_language/demos/bridge_python_e2e/app/baml_sdk/baml/http/__init__.pyibaml_language/demos/bridge_python_e2e/app/baml_sdk/baml/llm/__init__.pybaml_language/demos/bridge_python_e2e/app/baml_sdk/baml/llm/__init__.pyibaml_language/demos/bridge_python_e2e/app/baml_sdk/baml/media/__init__.pybaml_language/demos/bridge_python_e2e/app/baml_sdk/baml/media/__init__.pyibaml_language/demos/bridge_python_e2e/app/baml_sdk/baml/net/__init__.pybaml_language/demos/bridge_python_e2e/app/baml_sdk/baml/net/__init__.pyibaml_language/demos/bridge_python_e2e/app/baml_sdk/baml/panics/__init__.pybaml_language/demos/bridge_python_e2e/app/baml_sdk/baml/panics/__init__.pyibaml_language/demos/bridge_python_e2e/app/baml_sdk/baml/stream/__init__.pybaml_language/demos/bridge_python_e2e/app/baml_sdk/baml/stream/__init__.pyibaml_language/demos/bridge_python_e2e/app/baml_sdk/lorem/__init__.pybaml_language/demos/bridge_python_e2e/app/baml_sdk/lorem/__init__.pyibaml_language/demos/bridge_python_e2e/app/baml_sdk/py.typedbaml_language/demos/bridge_python_e2e/app/baml_sdk/stream_types/__init__.pybaml_language/demos/bridge_python_e2e/app/baml_sdk/stream_types/__init__.pyibaml_language/demos/bridge_python_e2e/app/baml_sdk/stream_types/baml/__init__.pybaml_language/demos/bridge_python_e2e/app/baml_sdk/stream_types/baml/__init__.pyibaml_language/demos/bridge_python_e2e/app/baml_sdk/stream_types/baml/errors/__init__.pybaml_language/demos/bridge_python_e2e/app/baml_sdk/stream_types/baml/errors/__init__.pyibaml_language/demos/bridge_python_e2e/app/baml_sdk/stream_types/baml/fs/__init__.pybaml_language/demos/bridge_python_e2e/app/baml_sdk/stream_types/baml/fs/__init__.pyibaml_language/demos/bridge_python_e2e/app/baml_sdk/stream_types/baml/http/__init__.pybaml_language/demos/bridge_python_e2e/app/baml_sdk/stream_types/baml/http/__init__.pyibaml_language/demos/bridge_python_e2e/app/baml_sdk/stream_types/baml/llm/__init__.pybaml_language/demos/bridge_python_e2e/app/baml_sdk/stream_types/baml/llm/__init__.pyibaml_language/demos/bridge_python_e2e/app/baml_sdk/stream_types/baml/media/__init__.pybaml_language/demos/bridge_python_e2e/app/baml_sdk/stream_types/baml/media/__init__.pyibaml_language/demos/bridge_python_e2e/app/baml_sdk/stream_types/baml/net/__init__.pybaml_language/demos/bridge_python_e2e/app/baml_sdk/stream_types/baml/net/__init__.pyibaml_language/demos/bridge_python_e2e/app/baml_sdk/stream_types/baml/panics/__init__.pybaml_language/demos/bridge_python_e2e/app/baml_sdk/stream_types/baml/panics/__init__.pyibaml_language/demos/bridge_python_e2e/app/baml_sdk/stream_types/baml/stream/__init__.pybaml_language/demos/bridge_python_e2e/app/baml_sdk/stream_types/baml/stream/__init__.pyibaml_language/demos/bridge_python_e2e/app/baml_sdk/stream_types/lorem/__init__.pybaml_language/demos/bridge_python_e2e/app/baml_sdk/stream_types/lorem/__init__.pyibaml_language/demos/bridge_python_e2e/app/pyproject.toml
💤 Files with no reviewable changes (25)
- .pre-commit-config.yaml
- baml_language/Cargo.toml
- baml_language/crates/bridge_nodejs/proto.js
- baml_language/crates/tools_rig/Cargo.toml
- baml_language/crates/baml_codegen_python/src/ty.rs
- baml_language/crates/baml_codegen_python/src/_askama/tracing.py
- baml_language/crates/baml_codegen_python/src/_askama/runtime.py
- baml_language/crates/baml_codegen_python/src/objects.rs
- baml_language/crates/baml_codegen_tests/src/lib.rs
- baml_language/crates/baml_codegen_python/src/objects/render/type_alias.rs
- baml_language/crates/baml_codegen_python/src/objects/render/class.rs
- baml_language/crates/baml_codegen_python/src/objects/render/enum.rs
- baml_language/crates/baml_codegen_python/src/objects/render/function.rs
- baml_language/crates/baml_codegen_python/src/docstring.rs
- baml_language/crates/baml_codegen_python/src/_askama/globals.py
- baml_language/crates/tools_rig/src/main.rs
- baml_language/crates/baml_codegen_types/src/objects.rs
- baml_language/crates/bridge_python/python_src/baml_py/proto.py
- baml_language/crates/baml_codegen_python/src/_askama/runtime.py.j2
- baml_language/crates/bridge_ctypes/types/baml/cffi/v1/baml_outbound.proto
- baml_language/crates/bridge_python/python_src/baml_py/init.py
- baml_language/crates/bridge_nodejs/typescript_src/proto.ts
- baml_language/crates/baml_codegen_tests/src/builders.rs
- baml_language/crates/baml_codegen_tests/Cargo.toml
- baml_language/crates/baml_codegen_python/src/_askama/config.py
✅ Files skipped from review due to trivial changes (43)
- baml_language/crates/bex_project/Cargo.toml
- baml_language/crates/bridge_python/python_src/baml/cffi/init.py
- baml_language/demos/bridge_python_e2e/app/baml_sdk/stream_types/init.py
- baml_language/crates/bridge_python/Cargo.toml
- baml_language/crates/bridge_python/python_src/baml/baml_core/ctx_manager.py
- baml_language/crates/bridge_python/python_src/baml/init.py
- baml_language/crates/baml_codegen_types/src/lib.rs
- baml_language/crates/bridge_python/tests/sim/baml_sdk/baml/_inlinedbaml.py
- baml_language/crates/bridge_python/tests/sim/baml_src/ns_lorem/root.baml
- baml_language/crates/baml_codegen_python/Cargo.toml
- baml_language/demos/bridge_python_e2e/app/baml_sdk/stream_types/baml/fs/init.pyi
- baml_language/crates/bridge_python/python_src/baml/baml_core/errors.py
- baml_language/crates/bridge_python/src/errors.rs
- baml_language/demos/bridge_python_e2e/app/baml_sdk/stream_types/baml/net/init.py
- baml_language/demos/bridge_python_e2e/app/baml_sdk/baml/stream/init.py
- baml_language/crates/baml_codegen_python/src/emit/enum_.rs
- baml_language/demos/bridge_python_e2e/app/baml_sdk/init.pyi
- baml_language/demos/bridge_python_e2e/app/baml_sdk/baml/net/init.py
- baml_language/demos/bridge_python_e2e/app/baml_sdk/stream_types/baml/errors/init.py
- baml_language/demos/bridge_python_e2e/app/baml_sdk/stream_types/lorem/init.pyi
- baml_language/demos/bridge_python_e2e/app/baml_sdk/stream_types/baml/stream/init.py
- baml_language/demos/bridge_python_e2e/app/pyproject.toml
- baml_language/crates/bridge_python/python_src/baml/cffi/v1/init.py
- baml_language/crates/baml_tests/tests/enums.rs
- baml_language/demos/bridge_python_e2e/app/baml_sdk/baml/media/init.pyi
- baml_language/crates/baml_tests/tests/classes.rs
- baml_language/demos/bridge_python_e2e/app/baml_sdk/stream_types/baml/errors/init.pyi
- baml_language/demos/bridge_python_e2e/app/baml_sdk/stream_types/baml/panics/init.pyi
- baml_language/demos/bridge_python_e2e/app/baml_sdk/stream_types/baml/http/init.py
- baml_language/crates/bridge_python/tests/conftest.py
- baml_language/crates/baml_codegen_python/src/emit/function.rs
- baml_language/demos/bridge_python_e2e/app/baml_sdk/stream_types/baml/media/init.pyi
- baml_language/crates/baml_codegen_python/src/routing.rs
- baml_language/crates/baml_codegen_python/src/emit/class.rs
- baml_language/demos/bridge_python_e2e/app/baml_sdk/baml/stream/init.pyi
- baml_language/demos/bridge_python_e2e/app/baml_sdk/stream_types/baml/panics/init.py
- baml_language/crates/bridge_python/tests/sim/baml_sdk/lorem/init.py
- baml_language/crates/bridge_python/python_src/baml/baml_core/init.pyi
- baml_language/demos/bridge_python_e2e/app/baml_sdk/stream_types/lorem/init.py
- baml_language/demos/bridge_python_e2e/app/baml_sdk/stream_types/baml/fs/init.py
- baml_language/demos/bridge_python_e2e/app/baml_sdk/lorem/init.py
- baml_language/demos/bridge_python_e2e/app/baml_sdk/baml/panics/init.pyi
- baml_language/demos/bridge_python_e2e/app/baml_sdk/stream_types/baml/llm/init.pyi
🚧 Files skipped from review as they are similar to previous changes (31)
- baml_language/demos/bridge_python_e2e/app/baml_sdk/stream_types/baml/net/init.pyi
- baml_language/demos/bridge_python_e2e/.gitignore
- baml_language/demos/bridge_python_e2e/app/baml_sdk/stream_types/init.pyi
- baml_language/crates/bridge_python/src/lib.rs
- baml_language/demos/bridge_python_e2e/app/baml_sdk/baml/_inlinedbaml.py
- baml_language/demos/bridge_python_e2e/app/baml_sdk/stream_types/baml/stream/init.pyi
- baml_language/crates/baml_project/src/lib.rs
- baml_language/demos/bridge_python_e2e/app/baml_sdk/stream_types/baml/llm/init.py
- baml_language/crates/bridge_python/tests/sim/baml_sdk/init.py
- baml_language/crates/baml_compiler2_hir/src/item_tree.rs
- baml_language/demos/bridge_python_e2e/app/baml_sdk/baml/net/init.pyi
- baml_language/crates/bridge_python/pyproject.toml
- baml_language/demos/bridge_python_e2e/app/baml_sdk/stream_types/baml/media/init.py
- baml_language/demos/bridge_python_e2e/app/baml_sdk/baml/fs/init.py
- baml_language/demos/bridge_python_e2e/app/baml_sdk/baml/fs/init.pyi
- baml_language/demos/bridge_python_e2e/app/baml_sdk/stream_types/baml/init.py
- baml_language/crates/sys_ops/src/lib.rs
- baml_language/demos/bridge_python_e2e/app/baml_sdk/stream_types/baml/init.pyi
- baml_language/demos/bridge_python_e2e/app/baml_sdk/baml/http/init.pyi
- baml_language/crates/baml_cli/src/generate.rs
- baml_language/demos/bridge_python_e2e/app/baml_sdk/stream_types/baml/http/init.pyi
- baml_language/crates/baml_codegen_python/src/emit/type_alias.rs
- baml_language/demos/bridge_python_e2e/app/baml_sdk/baml/panics/init.py
- baml_language/demos/bridge_python_e2e/app/baml_sdk/baml/init.pyi
- baml_language/demos/bridge_python_e2e/app/baml_sdk/baml/http/init.py
- baml_language/demos/bridge_python_e2e/app/baml_sdk/baml/errors/init.pyi
- baml_language/demos/bridge_python_e2e/app/app.py
- baml_language/crates/baml_codegen_python/src/translate_ty.rs
- baml_language/demos/bridge_python_e2e/app/baml_sdk/baml/init.py
- baml_language/crates/baml_codegen_python/src/emit/mod.rs
- baml_language/demos/bridge_python_e2e/app/baml_sdk/baml/errors/init.py
| //! Conversion from the compiler2 HIR/TIR to `SymbolPool`. | ||
| //! | ||
| //! Walks the HIR item trees for user-defined files, resolves types via TIR, | ||
| //! and populates a codegen-ready `ObjectPool` suitable for language-specific | ||
| //! and populates a codegen-ready `SymbolPool` suitable for language-specific | ||
| //! code generators (e.g. `baml_codegen_python`). |
There was a problem hiding this comment.
Update the module doc: it no longer only walks user-defined files.
The header comment still says this pass walks user-defined files, but build_symbol_pool now iterates compiler2_all_files(db) and includes stdlib-visible files too. That mismatch will mislead the next person debugging symbol inclusion.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@baml_language/crates/baml_project/src/client_codegen.rs` around lines 1 - 5,
Update the module doc comment to reflect the new behavior: replace the sentence
stating it "Walks the HIR item trees for user-defined files" with a description
that build_symbol_pool iterates compiler2_all_files(db) and therefore processes
both user-defined and stdlib-visible files to populate a codegen-ready
SymbolPool for downstream generators; mention build_symbol_pool, SymbolPool, and
compiler2_all_files(db) by name so readers know where the change occurs.
Make
baml-cli generategenerate a Python client, containing generated Python types that map to underlying BAML types and allows calling BAML functions using generated Python types and function bindings. Supports the following:Also implements inbound serialization (Python host to BAML function) and outbound deserialization (BAML function return to Python host) over the cffi/proto pattern.
We're leaving a lot of comments unresolved in this PR because it represents a large enough shift in functionality that we'd like to get it merged, and then we'll fast follow a lot of the cleanup work. There are three classes of things that we still need to work on: