Agent safety and permissions #6

Description

@aandresalvarez

I had a good chat with Jose about agent safety after he shared this article:

https://www.the-independent.com/tech/claude-ai-agent-deletes-startup-anthropic-b2966176.html

It’s a good reminder that we should not rely only on the model “doing the right thing.” If an agent has too much access, one bad decision can cause real damage.

I think this is a good opportunity for ASTRA to be stronger than a raw Claude Code/Cursor-style workflow by adding better controls around the agent.

Some questions worth discussing:

  • What should the default permission mode be?
  • Should broad autonomous permissions be opt-in?
  • Which actions should always need user approval?
  • How should we handle destructive commands, production systems, databases, and credentials?
  • Can we add deterministic safety checks outside the prompt/model?
  • What should we log so users can audit what happened?

A few possible ideas:

  • Default to restricted/read-only permissions when possible.
  • Require approval for destructive or production-impacting actions.
  • Add a guard layer for risky commands.
  • Make production credentials and production workspaces harder to touch accidentally.
  • Show clearer risk labels for tasks, skills, tools, and connectors.
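
To make the "guard layer" and "deterministic safety checks" ideas more concrete, here is a minimal sketch in Python. Everything in it is an assumption for discussion, not ASTRA's actual design: the `Mode` and `Decision` names, the `check_command` function, and the pattern lists are all hypothetical, and a real guard would need a far more careful rule set than a few regexes.

```python
import re
import shlex
from dataclasses import dataclass
from enum import Enum


class Mode(Enum):
    READ_ONLY = "read_only"            # assumed default mode
    WORKSPACE_WRITE = "workspace_write"  # assumed opt-in mode


class Decision(Enum):
    ALLOW = "allow"
    NEEDS_APPROVAL = "needs_approval"
    DENY = "deny"


@dataclass(frozen=True)
class Verdict:
    decision: Decision
    reason: str  # kept so the result can be written to an audit log


# Hypothetical rule lists; illustrative only, not exhaustive.
READ_ONLY_COMMANDS = {"ls", "cat", "grep", "head", "tail", "pwd"}
SHELL_METACHARACTERS = set("|;&><`$")
CREDENTIAL_PATTERNS = [r"\.env\b", r"\.aws/credentials", r"id_rsa"]
DESTRUCTIVE_PATTERNS = [
    r"\brm\b.*\s-\w*r",              # recursive rm
    r"\bdrop\s+(table|database)\b",  # SQL drops
    r"\btruncate\b",
    r"\bgit\s+push\b.*--force",
]
PRODUCTION_PATTERNS = [r"\bprod(uction)?\b"]


def _matches(patterns: list[str], command: str) -> bool:
    return any(re.search(p, command, re.IGNORECASE) for p in patterns)


def check_command(command: str, mode: Mode = Mode.READ_ONLY) -> Verdict:
    """Classify a shell command before the agent is allowed to run it."""
    try:
        tokens = shlex.split(command)
    except ValueError:
        return Verdict(Decision.NEEDS_APPROVAL, "could not parse command")
    if not tokens:
        return Verdict(Decision.DENY, "empty command")

    # Credentials are blocked outright, regardless of mode.
    if _matches(CREDENTIAL_PATTERNS, command):
        return Verdict(Decision.DENY, "touches credential files")
    # Destructive or production-impacting actions always need a human.
    if _matches(DESTRUCTIVE_PATTERNS, command):
        return Verdict(Decision.NEEDS_APPROVAL, "destructive command")
    if _matches(PRODUCTION_PATTERNS, command):
        return Verdict(Decision.NEEDS_APPROVAL, "mentions production")

    if mode is Mode.READ_ONLY:
        # Pipes and redirects can turn a read into a write, so escalate.
        if SHELL_METACHARACTERS & set(command):
            return Verdict(Decision.NEEDS_APPROVAL, "shell metacharacters")
        if tokens[0] not in READ_ONLY_COMMANDS:
            return Verdict(Decision.NEEDS_APPROVAL, "not on read-only allowlist")
    return Verdict(Decision.ALLOW, "passed all checks")
```

The point of the sketch is that the check runs outside the model: the same command always gets the same verdict, whatever the prompt says, and the `reason` field gives us something to write to an audit log. For example, `check_command("ls -la")` is allowed in the default mode, while `check_command("rm -rf build", Mode.WORKSPACE_WRITE)` still needs approval.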

Would love to hear thoughts and ideas from everyone. @garricko @jdposada @irvins @sboosi
