Anthropic’s browser agent got hijacked 31.5% of the time before safeguards engaged

Across the frontier labs, the highest prompt injection figures published this spring are Anthropic’s. Point a red-teamer at its newest model in a browser, and the attacker hijacked it 31.5% of the time before safeguards engaged. OpenAI, Google, and Meta never gave security leaders a comparable number to set beside it. That figure looks like a liability. In this comparison, it is the opposite. It's
Source
VentureBeat
Opens original article in a new tab


