Agent safety and permissions #6



I had a good chat with Jose about agent safety after he shared this article:

https://www.the-independent.com/tech/claude-ai-agent-deletes-startup-anthropic-b2966176.html

It’s a good reminder that we should not rely only on the model “doing the right thing.” If an agent has too much access, one bad decision can cause real damage.

I think this is a good opportunity for ASTRA to be stronger than a raw Claude Code/Cursor-style workflow by adding better controls around the agent.

Some questions worth discussing:

- What should the default permission mode be?
- Should broad autonomous permissions be opt-in?
- Which actions should always need user approval?
- How should we handle destructive commands, production systems, databases, and credentials?
- Can we add deterministic safety checks outside the prompt/model?
- What should we log so users can audit what happened?

A few possible ideas:

- Default to restricted/read-only permissions when possible.
- Require approval for destructive or production-impacting actions.
- Add a guard layer for risky commands.
- Make production credentials and production workspaces harder to touch accidentally.
- Show clearer risk labels for tasks, skills, tools, and connectors.
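
To make the guard-layer idea concrete, here is a rough sketch of a deterministic check that runs outside the model, before any command is executed. Everything in it is an assumption for discussion and not an existing ASTRA API: the `Mode` and `Decision` names, the allow-list, and the regex rules are all hypothetical and deliberately incomplete. It defaults to read-only, asks for approval on destructive or production-looking commands, and emits an audit record for every decision.

```python
import json
import re
import shlex
from dataclasses import dataclass
from datetime import datetime, timezone
from enum import Enum


class Mode(Enum):
    READ_ONLY = "read-only"    # proposed default
    AUTONOMOUS = "autonomous"  # broad permissions, opt-in


class Decision(Enum):
    ALLOW = "allow"
    ASK = "ask"    # pause until the user approves
    DENY = "deny"


@dataclass(frozen=True)
class GuardResult:
    decision: Decision
    reason: str


# Illustrative rules only (hypothetical); a real list would be longer and reviewed.
READ_ONLY_COMMANDS = {"ls", "cat", "grep", "head", "tail", "pwd"}
SHELL_OPERATORS = re.compile(r"[;&|<>`]|\$\(")
DESTRUCTIVE = [
    re.compile(r"\brm\s+-\w*[rf]"),
    re.compile(r"\bgit\s+push\b.*--force"),
    re.compile(r"\bdrop\s+(table|database)\b"),
    re.compile(r"\bterraform\s+destroy\b"),
]
PRODUCTION = re.compile(r"\bprod(uction)?\b")


def check_command(command: str, mode: Mode = Mode.READ_ONLY) -> GuardResult:
    """Classify a shell command without consulting the model."""
    lowered = command.lower()

    if mode is Mode.READ_ONLY:
        try:
            tokens = shlex.split(command)
        except ValueError:
            return GuardResult(Decision.DENY, "could not parse command")
        if not tokens or tokens[0] not in READ_ONLY_COMMANDS:
            return GuardResult(Decision.DENY, "not on the read-only allow-list")
        # Conservative: chaining or redirection could hide a write.
        if SHELL_OPERATORS.search(command):
            return GuardResult(Decision.DENY, "shell operators not allowed in read-only mode")

    if any(pattern.search(lowered) for pattern in DESTRUCTIVE):
        return GuardResult(Decision.ASK, "matches a destructive pattern")
    if PRODUCTION.search(lowered):
        return GuardResult(Decision.ASK, "mentions a production target")
    return GuardResult(Decision.ALLOW, "no rule matched")


def audit_record(command: str, mode: Mode, result: GuardResult) -> str:
    """Serialize one guard decision as a JSON line for later auditing."""
    return json.dumps({
        "time": datetime.now(timezone.utc).isoformat(),
        "mode": mode.value,
        "command": command,
        "decision": result.decision.value,
        "reason": result.reason,
    })


if __name__ == "__main__":
    for cmd, mode in [
        ("ls -la", Mode.READ_ONLY),
        ("rm -rf build", Mode.READ_ONLY),
        ("rm -rf build", Mode.AUTONOMOUS),
        ("cat config/prod.env", Mode.READ_ONLY),
    ]:
        print(audit_record(cmd, mode, check_command(cmd, mode)))
```

Pattern matching like this is easy to bypass and will produce false positives, so I see it as one layer alongside scoped credentials and sandboxing, not a replacement for them. The point is that the decision does not depend on the prompt or on the model behaving well.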

Would love to hear thoughts and ideas from everyone. @garricko @jdposada @irvins @sboosi

