Anthropic Walks Back Policy That Could Have ‘Sabotaged’ AI Researchers Using Claude

Anthropic has apologized for applying its AI safeguards in a way users could not detect, and says it will now make them visible. For certain flagged requests, the company will visibly fall back to older model versions and give explicit reasons for refusal on the API, addressing researchers' concerns that the previous approach could sabotage their work. The change makes LLM guardrails more transparent and helps developers understand model behavior.

Simon Willison·Jun 11, 2026
