Ratings

2 Matching Ratings

Rated Article

Alignment faking in large language models

A paper from Anthropic's Alignment Science team on alignment faking in large language models

anthropic.com 2,000 words

Rated 2024-12-19T19:01:38-0800 - sethherr

The Anthropic Economic Index

Announcement of the new Anthropic Economic Index and description of the new data on AI use in occupations

anthropic.com 2,000 words

Rated 2025-02-10T12:59:35-0800 - sethherr