<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:content="http://purl.org/rss/1.0/modules/content/">
  <channel>
    <title>Benchmarks on Tenu Tech Brief</title>
    <link>https://cluster-site.onrender.com/tags/benchmarks/</link>
    <description>Recent content in Benchmarks on Tenu Tech Brief</description>
    <generator>Hugo -- 0.146.0</generator>
    <language>en-us</language>
    <lastBuildDate>Wed, 25 Feb 2026 07:59:10 +0000</lastBuildDate>
    <atom:link href="https://cluster-site.onrender.com/tags/benchmarks/index.xml" rel="self" type="application/rss+xml" />
    <item>
      <title>Google Cloud N4 Series Benchmarks: Google Axion vs. Intel Xeon vs. AMD EPYC Performance</title>
      <link>https://cluster-site.onrender.com/posts/google-cloud-n4-series-benchmarks-google-axion-vs.-intel-xeon-vs.-amd-epyc-performance/</link>
      <pubDate>Tue, 24 Feb 2026 15:00:00 +0000</pubDate>
      <guid>https://cluster-site.onrender.com/posts/google-cloud-n4-series-benchmarks-google-axion-vs.-intel-xeon-vs.-amd-epyc-performance/</guid>
      <description>• Google Cloud recently launched its N4A series, powered by its in-house Axion ARM64 processors. • In</description>
    </item>
    <item>
      <title>When AI Benchmarks Plateau: A Systematic Study of Benchmark Saturation</title>
      <link>https://cluster-site.onrender.com/posts/when-ai-benchmarks-plateau-a-systematic-study-of-benchmark-saturation/</link>
      <pubDate>Fri, 20 Feb 2026 05:00:00 +0000</pubDate>
      <guid>https://cluster-site.onrender.com/posts/when-ai-benchmarks-plateau-a-systematic-study-of-benchmark-saturation/</guid>
      <description>• Submitted on 18 Feb 2026 under Computer Science &amp;gt; Artificial Intelligence.</description>
    </item>
    <item>
      <title>New Gemini 3.1 Pro crushes previous benchmarks, outperforms GPT 5.2 reasoning</title>
      <link>https://cluster-site.onrender.com/posts/new-gemini-3.1-pro-crushes-previous-benchmarks-outperforms-gpt-5.2-reasoning/</link>
      <pubDate>Thu, 19 Feb 2026 21:13:24 +0000</pubDate>
      <guid>https://cluster-site.onrender.com/posts/new-gemini-3.1-pro-crushes-previous-benchmarks-outperforms-gpt-5.2-reasoning/</guid>
      <description>• The latest Gemini update sharpens coding support and nearly doubles performance in agentic workflow</description>
    </item>
    <item>
      <title>Reserve Protocol: The Rise of Onchain Market Benchmarks</title>
      <link>https://cluster-site.onrender.com/posts/reserve-protocol-the-rise-of-onchain-market-benchmarks/</link>
      <pubDate>Mon, 09 Feb 2026 17:00:00 +0000</pubDate>
      <guid>https://cluster-site.onrender.com/posts/reserve-protocol-the-rise-of-onchain-market-benchmarks/</guid>
      <description>• In November 2025, CMC20 launched on BNB Chain as Reserve&amp;rsquo;s core broad-market onchain index, allowing holders to gain diversified exposure to the top 20 cryptocurrencies by market</description>
    </item>
    <item>
      <title>Paza: Introducing automatic speech recognition benchmarks and models for low resource languages</title>
      <link>https://cluster-site.onrender.com/posts/paza-introducing-automatic-speech-recognition-benchmarks-and-models-for-low-resource-languages/</link>
      <pubDate>Thu, 05 Feb 2026 05:07:55 +0000</pubDate>
      <guid>https://cluster-site.onrender.com/posts/paza-introducing-automatic-speech-recognition-benchmarks-and-models-for-low-resource-languages/</guid>
      <description>• Microsoft Research releases PazaBench and Paza automatic speech recognition models, advancing speech technology for low-resource languages. • Human-centered pipel</description>
    </item>
    <item>
      <title>Community Evals: Because we&#39;re done trusting black-box leaderboards over the community</title>
      <link>https://cluster-site.onrender.com/posts/community-evals-because-were-done-trusting-black-box-leaderboards-over-the-community/</link>
      <pubDate>Wed, 04 Feb 2026 00:00:00 +0000</pubDate>
      <guid>https://cluster-site.onrender.com/posts/community-evals-because-were-done-trusting-black-box-leaderboards-over-the-community/</guid>
      <description>• Evaluation metrics have saturated (MMLU &amp;gt;91%, GSM8K &amp;gt;94%), yet real‑world tasks still fail. • Inconsistent benchmark scores across papers, model cards, and platforms create no single t</description>
    </item>
    <item>
      <title>Introducing Community Benchmarks on Kaggle</title>
      <link>https://cluster-site.onrender.com/posts/introducing-community-benchmarks-on-kaggle/</link>
      <pubDate>Wed, 14 Jan 2026 14:00:00 +0000</pubDate>
      <guid>https://cluster-site.onrender.com/posts/introducing-community-benchmarks-on-kaggle/</guid>
      <description>• Today&amp;rsquo;s AI models require more than static accuracy scores. • Community Benchmarks, a new capability on Kaggle, enables th</description>
    </item>
  </channel>
</rss>
