EvalNode — Inline Output Quality Gate for AgentFlow V2 #6251
Replies: 3 comments
Hi! I'm building SwarmSync — a universal AI agent marketplace with AP2 escrow, x402 USDC payouts, and Stripe Connect. If you're building Flowise workflows that do real work, SwarmSync lets you publish them as monetized services. 10% platform commission + 20% referral flywheel for 24 months. REST API + Python SDK integrates with any Flowise workflow: https://swarmsync.ai Happy to answer questions about the integration design.
Hey folks! Been building with Flowise for a while and hit a wall: once a flow is in production, there's no way to catch bad LLM output before it reaches the user. The cloud eval feature is awesome, but it's batch-only and enterprise-only, so self-hosted users are kind of flying blind.
Had an idea for a new AgentFlow node: basically a quality gate you drop between your LLM and the next step. It scores the output and routes to a pass or fail port depending on whether it clears a threshold you set. Wire the fail port back to the LLM node and you get a self-correcting retry loop. Scores also land in $flow.state so downstream nodes can read them.
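To make the routing concrete, here's a rough sketch of the core logic I have in mind. Everything below is hypothetical: `Scorer`, `EvalNodeConfig`, and `runEval` are made-up names for illustration, and the real node would follow whatever conventions AgentFlow V2 nodes already use.

```ts
// Hypothetical sketch only: these names are illustrative, not actual
// Flowise/AgentFlow APIs.

type Scorer = (output: string) => Promise<number>; // normalized score in [0, 1]

interface EvalNodeConfig {
  scorer: Scorer;     // llm-judge | exact | regex | similarity | custom JS
  threshold: number;  // e.g. 0.8: output must score >= this to pass
  scoreKey: string;   // key under which the score lands in $flow.state
}

async function runEval(
  output: string,
  config: EvalNodeConfig,
  state: Record<string, unknown>
): Promise<'pass' | 'fail'> {
  const score = await config.scorer(output);

  // persist the score so downstream nodes can read it from $flow.state
  state[config.scoreKey] = score;

  // route to the pass or fail output port based on the threshold
  return score >= config.threshold ? 'pass' : 'fail';
}
```

Because the score is persisted, a downstream node could do things like give up after N failed retries instead of looping forever.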
For scoring, it'd support LLM-as-judge, exact match, regex, semantic similarity, and a custom JS function (running in the same sandbox as the existing custom function node).
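As a taste of the custom JS option, a scorer could be any small function the sandbox can run. Here's a hypothetical example that passes only if the output is valid JSON with a non-empty `answer` field; the `(output: string) => number` contract is my assumption, not an existing Flowise interface.

```ts
// Hypothetical custom scorer: returns 1 (pass) only if the LLM output is
// valid JSON with a non-empty string "answer" field, otherwise 0 (fail).
// The (output: string) => number contract is assumed for illustration.

function scoreOutput(output: string): number {
  try {
    const parsed = JSON.parse(output);
    return typeof parsed.answer === 'string' && parsed.answer.length > 0 ? 1 : 0;
  } catch {
    return 0; // not parseable JSON at all, so fail
  }
}
```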
Before I start on a PR, I wanted to make sure this isn't already on the internal roadmap somewhere. Happy to reshape it however makes sense for the codebase 🙏