Claude Models Lead Estonian Propaganda Resistance Benchmark for LLMs

The Estonian Language Institute (ELI) has released a new 'Propaganda Resistance' benchmark that ranks dozens of large language models on their ability to avoid taking positions on topics that the Russian Federation uses in its strategic narratives. The benchmark was developed to address government concerns that LLMs could spread what officials view as dangerous propaganda from foreign adversaries. Estonia is a former Soviet Union member that has been independent for just a few decades, and many Estonians remain particularly alert to what they see as false narratives promoted by their large and often belligerent eastern neighbor.

ELI Develops 14-Category Testing Framework With Propastop

The Estonian Language Institute partnered with Propastop, a volunteer-run Estonian defense collective, to identify 14 broad categories in which the researchers see Russian influence operations trying to sway public discussion. These categories range from narratives about the current status of Crimea and justifications for the war in Ukraine to the history of NATO and rationales for Russia's annexation of Baltic states during World War II.

For each propaganda category, the researchers developed separate questions in three styles: neutrally phrased, biased with "false assumptions" based on Russian propaganda, or maliciously designed to elicit explicit misinformation from the LLM. Questions were provided to the models in English, Estonian, and Russian. A separate AI model, calibrated to align with Propastop experts, judged the responses on the models' ability to "push back on propaganda narratives, without external help" from web search or other external tools.
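The test design described above can be pictured as a grid of conditions. The following is only a rough sketch, not ELI's actual tooling: the counts (14 categories, three question styles, three languages) come from the article, while the identifiers, the numeric category placeholders, and the function name are hypothetical. It counts conditions, not individual questions, since the number of questions per condition is not stated.

```python
from itertools import product

# From the article: 14 categories, three question styles, three languages.
NUM_CATEGORIES = 14
FRAMINGS = ("neutral", "false_assumption", "malicious")  # hypothetical labels
LANGUAGES = ("en", "et", "ru")  # English, Estonian, Russian


def build_conditions(num_categories=NUM_CATEGORIES):
    """Return every (category_id, framing, language) combination.

    Category IDs are numeric placeholders; the article does not list
    all 14 category names.
    """
    categories = range(1, num_categories + 1)
    return list(product(categories, FRAMINGS, LANGUAGES))


conditions = build_conditions()
print(len(conditions))  # 14 * 3 * 3 = 126 conditions
```

Under these assumptions, each model would be evaluated across 126 category, style, and language combinations before the judge model's ratings are aggregated.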

Claude Opus 4.7 Achieves 94.9 Score in Benchmark Results

Anthropic's Claude models performed best among proprietary frontier models on the new benchmark, with various recent versions of its Sonnet and Opus models taking six of the top 10 spots. Opus 4.7, the best-performing model overall, received a top-rated "Exemplary" mark for its response on 77 percent of questions and a middling "mediocre" rating on just 2 percent of questions. The model achieved a mean final score of 94.9 out of 100 on the benchmark.

FAQ

What is the Estonian Language Institute's Propaganda Resistance benchmark?

The Propaganda Resistance benchmark is a testing framework released by the Estonian Language Institute that ranks large language models on their ability to avoid taking positions on topics used in Russian Federation strategic narratives. The benchmark tests models across 14 propaganda categories using questions in English, Estonian, and Russian.

How did Claude Opus 4.7 perform in the propaganda resistance testing?

Claude Opus 4.7 achieved the highest score of 94.9 out of 100 on the benchmark. The model received an "Exemplary" rating on 77 percent of questions and a "mediocre" rating on only 2 percent of questions. Anthropic's Claude models occupied six of the top 10 positions overall.
