fix(llm): preflight compact oversized requests #284

Merged
alexk-dev merged 12 commits into main from fix/llm-context-preflight on Apr 15, 2026

@golemcore1
Collaborator

Summary

This PR fixes a long-chat failure mode where the agent could build an LLM request larger than the selected model can accept and only discover the problem after sending the request to the provider.

Changes included:

  • Add a provider-agnostic ContextTokenEstimator for estimating full LLM request size, including system prompt, conversation view, tools, and tool results.
  • Add request preflight compaction in LlmCallPhase after context assembly and model selection, immediately before the provider call.
  • Emit preflight diagnostics via the canonical ContextAttributes.LLM_REQUEST_PREFLIGHT key.
  • Add REQUEST_PREFLIGHT as a distinct compaction reason for observability.
  • Allow context-overflow recovery to retry after fallback compaction even when LLM summary generation was unavailable.
  • Expand context-overflow error classification for common token/context messages.
  • Avoid treating obvious context-overflow messages such as too_many_tokens as rate-limit retry signals in the LangChain4j adapter.
  • Reuse the shared estimator in AutoCompactionSystem so pre-context estimates are less ad hoc.
  • Add/adjust tests for preflight compaction, fallback overflow recovery, classifier coverage, adapter classification, and architecture compliance.
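To illustrate the preflight idea described in the list above, here is a minimal Java sketch. It is not the PR's code: the class `PreflightSketch`, its method names, the 4-characters-per-token heuristic, and every error pattern other than `too_many_tokens` are assumptions made for illustration. The compaction step simply drops the oldest messages, which stands in for whatever summarization or fallback compaction the real `LlmCallPhase` performs.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

/** Hypothetical sketch; names and heuristics do not mirror the PR's actual classes. */
final class PreflightSketch {

    /** Rough heuristic (assumption): about 4 characters per token, rounded up. */
    static int estimateTokens(String text) {
        return (text.length() + 3) / 4;
    }

    /** Estimates the full request: system prompt, tool schemas, and every message. */
    static int estimateRequest(String systemPrompt, List<String> toolSchemas, List<String> messages) {
        int total = estimateTokens(systemPrompt);
        for (String schema : toolSchemas) {
            total += estimateTokens(schema);
        }
        for (String message : messages) {
            total += estimateTokens(message);
        }
        return total;
    }

    /**
     * Preflight compaction stand-in: drops the oldest messages until the
     * estimate fits the budget. The most recent message is always kept,
     * so the result may still exceed the budget if that message is huge.
     */
    static List<String> compactToFit(String systemPrompt, List<String> toolSchemas,
                                     List<String> messages, int maxInputTokens) {
        List<String> kept = new ArrayList<>(messages);
        while (kept.size() > 1
                && estimateRequest(systemPrompt, toolSchemas, kept) > maxInputTokens) {
            kept.remove(0);
        }
        return kept;
    }

    /**
     * Substring classifier for context-overflow errors. Only "too_many_tokens"
     * comes from the PR description; the other patterns are illustrative guesses.
     */
    static boolean isContextOverflow(String errorMessage) {
        if (errorMessage == null) {
            return false;
        }
        String m = errorMessage.toLowerCase(Locale.ROOT);
        return m.contains("too_many_tokens")
                || m.contains("context length")
                || m.contains("context window");
    }

    public static void main(String[] args) {
        List<String> history = List.of("a".repeat(400), "b".repeat(400), "c".repeat(40));
        List<String> fitted = compactToFit("s".repeat(40), List.of(), history, 130);
        System.out.println("messages kept: " + fitted.size());
        System.out.println("overflow? " + isContextOverflow("too_many_tokens: request too large"));
    }
}
```

Running the estimate just before the provider call, rather than only when context is assembled, means the check sees the same payload the provider would see, including tool schemas and tool results. Classifying overflow separately from rate limiting matters because retrying an oversized request unchanged cannot succeed, whereas retrying after compaction can.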

@golemcore1 force-pushed the fix/llm-context-preflight branch from 89984f6 to 4193fd3 on April 14, 2026 05:15
@golemcore1 force-pushed the fix/llm-context-preflight branch from 4193fd3 to 832f75f on April 14, 2026 05:42
@alexk-dev merged commit d533cd9 into main on Apr 15, 2026
18 checks passed
@alexk-dev deleted the fix/llm-context-preflight branch on April 15, 2026 01:47
