Merged
27 commits
96fde1b
Phase 1: Remove deprecated BamlValueChecked and BamlValueStreamingState
sxlijin Apr 10, 2026
e02585f
Phase 3: bridge_python sim with namespaced class round-trip
sxlijin Apr 24, 2026
2fc072f
Phase 4: typed Pydantic round-trip via 09d/09e encoding
sxlijin Apr 24, 2026
b704aae
Phase 5: redesign cg::Name to carry (pkg, namespace_path, name)
sxlijin Apr 24, 2026
6421246
Phase G0: teardown + rig wiring for codegen rewrite
sxlijin Apr 24, 2026
d9cbfd5
Phase G1: directory layout + whole-SDK scaffolding
sxlijin Apr 24, 2026
adb0458
Phase G2: emitter-internal Rust types + placeholder rendering
sxlijin Apr 24, 2026
eca8f6b
integ test cleanup
sxlijin Apr 24, 2026
4b53301
Phase G4: real class/enum/type-alias bodies via translate_ty
sxlijin Apr 24, 2026
b69e4a7
Phase G5: factory bindings for free functions + companions
sxlijin Apr 24, 2026
6979fa2
Phase G6: rig tests as end-to-end regression fence
sxlijin Apr 24, 2026
51a54f3
phase G7: rig tests driven by baml_src/, not Rust pools
sxlijin Apr 25, 2026
5c254d1
consolidate type-shape rig crates; fold llm_functions into example_09a
sxlijin Apr 25, 2026
99985ac
12a: collapse FQN spaces — emit always fully qualifies, drop fqn_pref…
sxlijin Apr 27, 2026
1dfa1ea
12b: static and instance method bindings
sxlijin Apr 27, 2026
eac9560
12d: pyi stub generation
sxlijin Apr 27, 2026
5ec0952
13a: bridge_python e2e smoke demo
sxlijin Apr 28, 2026
338ca5a
13b: extend e2e demo with LLM function + __build_request
sxlijin Apr 28, 2026
afbae41
13c: put LLM demo back in ns_lorem/, fix sys_ops client $new lookup
sxlijin Apr 28, 2026
f5aa69e
12f: TYPE_CHECKING-guarded relative imports for cross-leaf refs
sxlijin Apr 28, 2026
a4ca0f9
fix rust unit tests for user. qualification
sxlijin Apr 28, 2026
0e51528
delete rig test empty
sxlijin Apr 28, 2026
413f2c9
clean up stream_return_type
sxlijin Apr 28, 2026
a78c31d
drop dead checkedValue/streamingStateValue branches in proto.ts
sxlijin Apr 28, 2026
2cf40e6
ci: install uv via mise-action for snapshot tests
sxlijin Apr 28, 2026
5fe8071
codegen_python: render root __init__.py via askama template
sxlijin Apr 28, 2026
9940f06
fix CI: stow Cargo.toml, sync bridge_nodejs generated files, drop obs…
sxlijin Apr 28, 2026
13 changes: 13 additions & 0 deletions .github/workflows/cargo-tests.reusable.yaml
Original file line number Diff line number Diff line change
@@ -492,6 +492,12 @@ jobs:
  runs-on: blacksmith-4vcpu-ubuntu-2404
  if: ${{ inputs.code_changed == 'true' || inputs.run_all }}
  timeout-minutes: 30
+ env:
+ # Disable Python + pipx tools: mise 2025.12.12 downloads a broken
+ # freethreaded Python build that's missing a lib/ directory.
+ # Rig test scripts use system Python via uv (the standalone uv binary
+ # mise installs handles its own Python via `uv run`).
+ MISE_DISABLE_TOOLS: python,pipx:uv,pipx:ruff,pipx:maturin
  steps:
  - uses: actions/checkout@v4
  with:
@@ -513,6 +519,13 @@ jobs:
  with:
  tool: cargo-insta

+ - name: "Install mise"
+ uses: jdx/mise-action@v2
+ # Pinning to old version because newer versions time out on the npm backend,
+ # see https://github.com/jdx/mise/discussions/7630
+ with:
+ version: 2025.12.12
+
  - name: "Run snapshot tests"
  run: |
  cargo insta test --workspace --all-features --unreferenced=reject --exclude rig_python_empty --exclude rig_python_literal_types --exclude rig_python_map_types --exclude rig_python_mixed_complex_types --exclude rig_python_semantic_streaming --exclude rig_python_union_types_extended
9 changes: 0 additions & 9 deletions .pre-commit-config.yaml
@@ -37,18 +37,9 @@ repos:
  - repo: local
  hooks:
  # Priority ordering:
- # 0: Code generation (rig) - must run first to generate files
  # 1: Formatters and validators (fmt, stow, bep-readme) - format/validate generated code
  # 2: Linters (clippy) - final checks on formatted code

- - id: rig
- name: rig
- entry: bash -c 'cd baml_language && cargo run -p tools_rig -- --check'
- language: system
- pass_filenames: false
- files: ^baml_language/(rig_tests/|crates/tools_rig/|crates/baml_codegen)
- priority: 1
-
  - id: cargo-fmt
  name: cargo fmt
  entry: bash -c 'cd baml_language && mise run fmt'
79 changes: 9 additions & 70 deletions baml_language/Cargo.lock

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

1 change: 0 additions & 1 deletion baml_language/Cargo.toml
@@ -29,7 +29,6 @@ baml_builtins2 = { path = "crates/baml_builtins2" }
  baml_builtins2_codegen = { path = "crates/baml_builtins2_codegen" }
  baml_builtins_macros = { path = "crates/baml_builtins_macros" }
  baml_codegen_python = { path = "crates/baml_codegen_python" }
- baml_codegen_tests = { path = "crates/baml_codegen_tests" }
  baml_codegen_types = { path = "crates/baml_codegen_types" }
  baml_compiler2_ast = { path = "crates/baml_compiler2_ast" }
  baml_compiler2_hir = { path = "crates/baml_compiler2_hir" }
128 changes: 27 additions & 101 deletions baml_language/crates/baml_cli/src/generate.rs
@@ -1,6 +1,9 @@
  #![allow(clippy::print_stdout, clippy::print_stderr)]

- use std::{collections::HashMap, path::PathBuf};
+ use std::{
+ collections::HashMap,
+ path::{Path, PathBuf},
+ };

  use anyhow::{Context, Result, anyhow};
  use baml_db::{
@@ -23,20 +26,12 @@ pub struct GenerateArgs {
  pub output: Option<PathBuf>,
  }

- /// Whether the default `b` export is sync or async.
- #[derive(Clone, Copy)]
- enum DefaultClientMode {
- Sync,
- Async,
- }
-
  /// A parsed generator definition from a BAML source file.
  struct GeneratorDef {
  name: String,
  output_type: String,
  /// Resolved output directory (absolute).
  output_dir: PathBuf,
- default_client_mode: DefaultClientMode,
  }

  impl GenerateArgs {
@@ -90,8 +85,12 @@ impl GenerateArgs {
  return Ok(crate::ExitCode::Other);
  }

- // Build the codegen ObjectPool from the compiler database.
- let pool = baml_project::build_object_pool(&db);
+ // Build the codegen SymbolPool from the compiler database.
+ let pool = baml_project::build_symbol_pool(&db);
+
+ // Collect user BAML source files keyed by path relative to
+ // `baml_src/` for inlining into `baml_sdk/baml/_inlinedbaml.py`.
+ let user_baml_files = collect_user_baml_files(&db, &source_files, &from);

  let mut total_files = 0;

@@ -103,7 +102,7 @@ impl GenerateArgs {

  let generated = match generator.output_type.as_str() {
  "python/pydantic" | "python/pydantic/v1" => {
- baml_codegen_python::to_source_code(&pool, &from)
+ baml_codegen_python::to_source_code(&pool, &user_baml_files)
  }
  other => {
  eprintln!(
Expand Down Expand Up @@ -133,23 +132,6 @@ impl GenerateArgs {
count += 1;
}

// Write Python-specific bootstrap files.
if generator.output_type.starts_with("python") {
// inlinedbaml.py — embed all BAML source files.
let inlined = generate_inlinedbaml(&db, &source_files, &from);
let inlined_path = output_dir.join("inlinedbaml.py");
std::fs::write(&inlined_path, &inlined)
.with_context(|| format!("Failed to write {}", inlined_path.display()))?;
count += 1;

// __init__.py — re-export client and modules.
let init_py = generate_init_py(generator.default_client_mode);
let init_path = output_dir.join("__init__.py");
std::fs::write(&init_path, &init_py)
.with_context(|| format!("Failed to write {}", init_path.display()))?;
count += 1;
}

println!(
"Generator '{}': wrote {count} files to {}",
generator.name,
@@ -196,31 +178,12 @@ fn discover_generators(db: &ProjectDatabase, baml_src: &std::path::Path) -> Vec<
  // Strip surrounding quotes if present (config values may be quoted strings)
  let raw_output_dir = raw_output_dir.trim_matches('"').trim_matches('\'');

- let output_dir = baml_src.join(raw_output_dir).join("baml_client");
-
- // default_client_mode: "sync" or "async". Default depends on language.
- let default_client_mode = match get_config(config, "default_client_mode")
- .as_deref()
- .map(|s| s.trim_matches('"').trim_matches('\''))
- {
- Some("sync") => DefaultClientMode::Sync,
- Some("async") => DefaultClientMode::Async,
- _ => {
- // Language-specific defaults: Python defaults to sync (recommended),
- // TypeScript to async.
- if output_type.starts_with("python") {
- DefaultClientMode::Sync
- } else {
- DefaultClientMode::Async
- }
- }
- };
+ let output_dir = baml_src.join(raw_output_dir).join("baml_sdk");

  generators.push(GeneratorDef {
  name: generator_item.name.to_string(),
  output_type,
  output_dir,
- default_client_mode,
  });
  }
  }
@@ -236,58 +199,21 @@ fn get_config(items: &[GeneratorConfigItem], key: &str) -> Option<String> {
  .map(|item| item.value.clone())
  }

- /// Generate `inlinedbaml.py` — a Python dict mapping relative file paths to BAML source.
- fn generate_inlinedbaml(
+ /// Collect user BAML source files as `(rel_path, contents)` pairs.
+ /// `rel_path` is relative to `baml_src/` so it matches the keys the
+ /// runtime's `initialize_runtime(...)` expects in the inlined `FILES`
+ /// dict.
+ fn collect_user_baml_files(
  db: &ProjectDatabase,
  source_files: &[baml_db::SourceFile],
- baml_src: &std::path::Path,
- ) -> String {
- use std::fmt::Write;
-
- let mut out = String::from(
- "# This file is generated by baml-cli. Do not edit.\n\n\
- _file_map = {\n",
- );
- for sf in source_files {
- let path = sf.path(db);
- // Make key relative to baml_src (e.g. "sub/foo.baml").
- let rel = path.strip_prefix(baml_src).unwrap_or(&path);
- let key = rel.to_string_lossy();
- let text = sf.text(db);
- let _ = writeln!(out, " {}: {},", quote_py(&key), quote_py(text));
- }
- out.push_str(
- "}\n\n\
- def get_baml_files():\n \
- return dict(_file_map)\n",
- );
- out
- }
-
- /// Quote a string as a Python repr (triple-quoted to handle embedded quotes/newlines).
- fn quote_py(s: &str) -> String {
- // Use triple-double-quotes; escape any embedded triple-double-quotes.
- let escaped = s.replace("\\", "\\\\").replace("\"\"\"", "\\\"\\\"\\\"");
- format!("\"\"\"{}\"\"\"", escaped)
- }
-
- /// Generate a minimal `__init__.py` that re-exports the client and modules.
- fn generate_init_py(mode: DefaultClientMode) -> String {
- let default_line = match mode {
- DefaultClientMode::Sync => "b = sync_b",
- DefaultClientMode::Async => "b = async_b",
- };
- format!(
- r#"# Generated by baml-cli. Do not edit.
-
- from . import types
- from . import stream_types
- from .sync_client import b as sync_b
- from .async_client import b as async_b
-
- {default_line}
-
- __all__ = ["b", "sync_b", "async_b", "types", "stream_types"]
- "#
- )
+ baml_src: &Path,
+ ) -> Vec<(PathBuf, String)> {
+ source_files
+ .iter()
+ .map(|sf| {
+ let path = sf.path(db);
+ let rel = path.strip_prefix(baml_src).unwrap_or(&path).to_path_buf();
+ (rel, sf.text(db).to_string())
+ })
+ .collect()
Comment on lines +211 to +218
⚠️ Potential issue | 🟡 Minor

`strip_prefix` fallback silently uses absolute paths.

If `sf.path(db)` ever returns a path that isn't under `baml_src` (e.g., a synthesized/built-in source, a symlink resolved out-of-tree, or a path that differs only in canonicalization), `strip_prefix(baml_src).unwrap_or(&path)` falls back to the absolute path. That absolute `PathBuf` then becomes a key in the inlined `FILES` dict and is later joined into the output dir, which means either writing outside the expected `baml_sdk/baml/_inlinedbaml.py` layout or producing keys the runtime's `initialize_runtime` won't recognize.

Consider failing fast (or filtering) when a source is not under `baml_src`, since that almost certainly indicates a bug upstream rather than something the codegen should silently absorb.

Suggested change
     source_files
         .iter()
-        .map(|sf| {
+        .filter_map(|sf| {
             let path = sf.path(db);
-            let rel = path.strip_prefix(baml_src).unwrap_or(&path).to_path_buf();
-            (rel, sf.text(db).to_string())
+            let rel = path.strip_prefix(baml_src).ok()?.to_path_buf();
+            Some((rel, sf.text(db).to_string()))
         })
         .collect()
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@baml_language/crates/baml_cli/src/generate.rs` around lines 211 - 218, The
current map over source_files uses sf.path(db) and silently falls back to the
absolute path when strip_prefix(baml_src) fails, which can inject out-of-tree
absolute keys into the generated FILES; change this to fail fast or filter those
entries: in the closure that calls sf.path(db) and rel =
path.strip_prefix(baml_src).unwrap_or(&path) replace the unwrap_or behavior with
either (a) return an Err or panic with a clear message if strip_prefix fails (so
the codegen halts and surfaces the bad source), or (b) filter out such sources
by only mapping entries where strip_prefix(baml_src).is_ok(); update the mapping
that builds the FILES dict (the source_files.iter().map(...) block) accordingly
so only canonical in-tree relative paths are used.
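
The suggested change implements the filtering option. Option (a), failing fast, could look roughly like the sketch below. It is a standalone illustration, not the PR's code: `SourceEntry` and `collect_relative` are hypothetical stand-ins for `baml_db::SourceFile` and `collect_user_baml_files`, and the `String` error type is an assumption (the real function would more likely return `anyhow::Result`).

```rust
use std::path::{Path, PathBuf};

/// Hypothetical stand-in for `baml_db::SourceFile`: a path plus its text.
struct SourceEntry {
    path: PathBuf,
    text: String,
}

/// Fail-fast variant: instead of falling back to the absolute path,
/// return an error naming the first source that is not under `baml_src`.
fn collect_relative(
    sources: &[SourceEntry],
    baml_src: &Path,
) -> Result<Vec<(PathBuf, String)>, String> {
    sources
        .iter()
        .map(|sf| {
            sf.path
                .strip_prefix(baml_src)
                // In-tree source: keep the relative path as the key.
                .map(|rel| (rel.to_path_buf(), sf.text.clone()))
                // Out-of-tree source: surface it rather than absorb it.
                .map_err(|_| {
                    format!(
                        "source file {} is not under {}",
                        sf.path.display(),
                        baml_src.display()
                    )
                })
        })
        // Collecting into `Result<Vec<_>, _>` stops at the first error.
        .collect()
}
```

Compared with `filter_map`, this halts generation instead of silently dropping the file, so an upstream path bug shows up as an error rather than as a missing entry in `FILES`.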

}
3 changes: 1 addition & 2 deletions baml_language/crates/baml_codegen_python/Cargo.toml
@@ -13,7 +13,7 @@ license = { workspace = true }
  [dependencies]
  baml_base = { workspace = true }
  baml_codegen_types = { workspace = true }
- askama = { workspace = true, features = [ "code-in-doc" ] }
+ askama = { workspace = true }

  [lib]
  doctest = false
@@ -22,6 +22,5 @@ doctest = false
  workspace = true

  [dev-dependencies]
- baml_codegen_tests = { workspace = true }
  baml_codegen_types = { workspace = true }
  pretty_assertions = { workspace = true }