
Anthropic's 'Anonymous' AI Interviews? An LLM De-Anonymized Them in Minutes

Anthropic released 1,250 ‘safe’ anonymized interviews. A professor used a stock LLM to unmask 25% of the scientist interviews among them, a privacy wake-up call for AI data releases.

Anthropic promised anonymity in its shiny new Interviewer tool. A Northeastern professor just shattered that promise with an off-the-shelf LLM: Tianshi Li de-anonymized 25% of the scientist interviews among 1,250 public transcripts, linking responses to real papers and people[5].

Interviewer launched in December 2025 to gauge people’s perspectives on AI, with the anonymized transcripts released for research. Li filtered out the 24 interviews that mentioned specific studies, fed them to a public LLM with internet access, and boom: the model’s inferences connected dots humans miss, acting as a ‘microscope’ that magnifies subtle signals in vast data[5].

This is huge for devs handling user data: RAG pipelines, evals, and agent memory all now risk re-identification attacks. Training on ‘anon’ corpora? Think twice: proprietary info can leak via clever prompting. It’s a stark reminder that pre-deployment audits must simulate adversarial LLMs[5].
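
To make that concrete, here is a minimal sketch of such an adversarial audit, assuming the `anthropic` Python SDK and an API key in the environment: before releasing a record, ask a capable model to attempt the linkage itself and flag anything it answers specifically and confidently. The model name, prompt wording, and sample excerpt are illustrative assumptions, not Li’s or Anthropic’s actual method, and a plain API call lacks the internet access Li’s setup had.

```python
# Minimal adversarial re-identification probe (sketch only).
# Assumes the `anthropic` Python SDK and ANTHROPIC_API_KEY in the environment;
# the model name, prompt wording, and sample excerpt are illustrative.
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

PROBE_PROMPT = """You are auditing an anonymized interview excerpt for re-identification risk.
Using only clues in the text (topics, named studies, methods, institutions), list the
most likely papers or people it could be linked to, and rate your confidence from 0 to 1.

Excerpt:
{excerpt}
"""

def probe_record(excerpt: str, model: str = "claude-sonnet-4-20250514") -> str:
    """Ask the model to attempt linkage; a specific, confident answer flags the record."""
    response = client.messages.create(
        model=model,
        max_tokens=500,
        messages=[{"role": "user", "content": PROBE_PROMPT.format(excerpt=excerpt)}],
    )
    return response.content[0].text

if __name__ == "__main__":
    sample = "We surveyed 40 coral reef researchers about our 2023 bleaching dataset..."
    print(probe_record(sample))
```

In practice you would run a probe like this over every record before release and have a human review anything the auditor model names specifically.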

Unlike basic dedup tools, this attack leverages an LLM’s web-scale inference, succeeding where rules-based redaction fails. It echoes prompt-injection fears but flips them toward data privacy; Anthropic’s tool now stands as a cautionary benchmark compared with closed evals[5].

Audit your datasets now: prompt an LLM with sample ‘anon’ text plus public sources and check the linkage risks. Should you build differential privacy in? Or is ‘anon’ AI data a myth, pushing us toward federated learning? Test it yourself.
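
One hedged sketch of that linkage check: take a distinctive phrase from an ‘anon’ record (in practice an LLM could extract candidate phrases) and query a public index, here arXiv’s open API standing in for ‘public sources’, to see how few papers match. One or two hits means the record is effectively linkable. The phrase, the index, and the risk threshold are assumptions for illustration.

```python
# Linkage-risk check (sketch): count how many public papers match a distinctive
# phrase pulled from an "anonymized" record. Very few matches = high linkage risk.
# arXiv's open API is used as one example of a public source; the phrase and the
# "1-2 hits" threshold are illustrative assumptions.
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET

ARXIV_API = "http://export.arxiv.org/api/query?search_query=all:{query}&max_results=5"
ATOM = "{http://www.w3.org/2005/Atom}"  # Atom namespace used by the arXiv feed

def linkage_candidates(phrase: str) -> list[str]:
    """Return titles of public papers matching a distinctive phrase from the record."""
    url = ARXIV_API.format(query=urllib.parse.quote(phrase))
    with urllib.request.urlopen(url, timeout=10) as resp:
        feed = ET.fromstring(resp.read())
    return [e.findtext(f"{ATOM}title", "").strip() for e in feed.iter(f"{ATOM}entry")]

if __name__ == "__main__":
    # Hypothetical distinctive phrase an interviewee might use about their own work.
    phrase = "sparse autoencoder features for interpretability in protein language models"
    hits = linkage_candidates(phrase)
    print(f"{len(hits)} public matches; 1-2 hits suggests high linkage risk")
    for title in hits:
        print(" -", title)
```

Swap in whatever public corpus your interviewees are most likely to appear in; the point is to measure how unique the record’s clues are before calling it anonymous.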

Source: Northeastern News

