<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:content="http://purl.org/rss/1.0/modules/content/">
  <channel>
    <title>LLM on Tenu Tech Brief</title>
    <link>https://cluster-site.onrender.com/tags/llm/</link>
    <description>Recent content in LLM on Tenu Tech Brief</description>
    <generator>Hugo -- 0.146.0</generator>
    <language>en-us</language>
    <lastBuildDate>Thu, 26 Feb 2026 06:03:06 +0000</lastBuildDate>
    <atom:link href="https://cluster-site.onrender.com/tags/llm/index.xml" rel="self" type="application/rss+xml" />
    <item>
      <title>DualPath: Breaking the Storage Bandwidth Bottleneck in Agentic LLM Inference</title>
      <link>https://cluster-site.onrender.com/posts/dualpath-breaking-the-storage-bandwidth-bottleneck-in-agentic-llm-inference/</link>
      <pubDate>Thu, 26 Feb 2026 05:00:00 +0000</pubDate>
      <guid>https://cluster-site.onrender.com/posts/dualpath-breaking-the-storage-bandwidth-bottleneck-in-agentic-llm-inference/</guid>
      <description>• Computer Science &amp;gt; Distributed, Parallel, and Cluster Computing. • Submitted on 25 Feb 2026.</description>
    </item>
    <item>
      <title>MSADM: Large Language Model (LLM) Assisted End-to-End Network Health Management Based on Multi-Scale Semanticization</title>
      <link>https://cluster-site.onrender.com/posts/msadm-large-language-model-llm-assisted-end-to-end-network-health-management-based-on-multi-scale-semanticization/</link>
      <pubDate>Thu, 26 Feb 2026 05:00:00 +0000</pubDate>
      <guid>https://cluster-site.onrender.com/posts/msadm-large-language-model-llm-assisted-end-to-end-network-health-management-based-on-multi-scale-semanticization/</guid>
      <description>• Computer Science &amp;gt; Networking and Internet Architecture. • Submitted on 12 Jun 2024 (v1); last revised 25 Feb 2026 (v3).</description>
    </item>
    <item>
      <title>Multi-Layer Scheduling for MoE-Based LLM Reasoning</title>
      <link>https://cluster-site.onrender.com/posts/multi-layer-scheduling-for-moe-based-llm-reasoning/</link>
      <pubDate>Thu, 26 Feb 2026 05:00:00 +0000</pubDate>
      <guid>https://cluster-site.onrender.com/posts/multi-layer-scheduling-for-moe-based-llm-reasoning/</guid>
      <description>• Computer Science &amp;gt; Distributed, Parallel, and Cluster Computing. • Submitted on 25 Feb 2026.</description>
    </item>
    <item>
      <title>ABD: Default Exception Abduction in Finite First Order Worlds</title>
      <link>https://cluster-site.onrender.com/posts/abd-default-exception-abduction-in-finite-first-order-worlds/</link>
      <pubDate>Tue, 24 Feb 2026 05:00:00 +0000</pubDate>
      <guid>https://cluster-site.onrender.com/posts/abd-default-exception-abduction-in-finite-first-order-worlds/</guid>
      <description>• ABD benchmark tests default‑exception abduction in finite first‑order logical worlds. • Models generate sparse exception formulas to restore satisfiability under abnormality pred</description>
    </item>
    <item>
      <title>BiScale: Energy-Efficient Disaggregated LLM Serving via Phase-Aware Placement and DVFS</title>
      <link>https://cluster-site.onrender.com/posts/biscale-energy-efficient-disaggregated-llm-serving-via-phase-aware-placement-and-dvfs/</link>
      <pubDate>Tue, 24 Feb 2026 05:00:00 +0000</pubDate>
      <guid>https://cluster-site.onrender.com/posts/biscale-energy-efficient-disaggregated-llm-serving-via-phase-aware-placement-and-dvfs/</guid>
      <description>• Prefill/decode disaggregation improves latency-throughput tradeoff for large language model serving. • Energy consumption remains high; autoscaling is too coarse-grained for rapi</description>
    </item>
    <item>
      <title>ConfSpec: Efficient Step-Level Speculative Reasoning via Confidence-Gated Verification</title>
      <link>https://cluster-site.onrender.com/posts/confspec-efficient-step-level-speculative-reasoning-via-confidence-gated-verification/</link>
      <pubDate>Tue, 24 Feb 2026 05:00:00 +0000</pubDate>
      <guid>https://cluster-site.onrender.com/posts/confspec-efficient-step-level-speculative-reasoning-via-confidence-gated-verification/</guid>
      <description>• ConfSpec introduces confidence‑gated cascaded verification for efficient step‑level speculative reasoning. • Small draft models quickly verify reasoning steps, accepting high‑c</description>
    </item>
    <item>
      <title>Early Evidence of Vibe-Proving with Consumer LLMs: A Case Study on Spectral Region Characterization with ChatGPT-5.2 (Thinking)</title>
      <link>https://cluster-site.onrender.com/posts/early-evidence-of-vibe-proving-with-consumer-llms-a-case-study-on-spectral-region-characterization-with-chatgpt-5.2-thinking/</link>
      <pubDate>Tue, 24 Feb 2026 05:00:00 +0000</pubDate>
      <guid>https://cluster-site.onrender.com/posts/early-evidence-of-vibe-proving-with-consumer-llms-a-case-study-on-spectral-region-characterization-with-chatgpt-5.2-thinking/</guid>
      <description>• LLMs increasingly used as scientific copilots, but research-level math evidence limited. • Case study uses ChatGPT-5.2 (Thinking) to resolve Conjecture 20 on spectral region of 4</description>
    </item>
    <item>
      <title>Federated Reasoning Distillation Framework with Model Learnability-Aware Data Allocation</title>
      <link>https://cluster-site.onrender.com/posts/federated-reasoning-distillation-framework-with-model-learnability-aware-data-allocation/</link>
      <pubDate>Tue, 24 Feb 2026 05:00:00 +0000</pubDate>
      <guid>https://cluster-site.onrender.com/posts/federated-reasoning-distillation-framework-with-model-learnability-aware-data-allocation/</guid>
      <description>• Addresses bidirectional model learnability gap in federated LLM-SLM reasoning collaboration. • Introduces LaDa framework with learnability-aware data filter for high-reward sampl</description>
    </item>
    <item>
      <title>Feedback-based Automated Verification in Vibe Coding of CAS Adaptation Built on Constraint Logic</title>
      <link>https://cluster-site.onrender.com/posts/feedback-based-automated-verification-in-vibe-coding-of-cas-adaptation-built-on-constraint-logic/</link>
      <pubDate>Tue, 24 Feb 2026 05:00:00 +0000</pubDate>
      <guid>https://cluster-site.onrender.com/posts/feedback-based-automated-verification-in-vibe-coding-of-cas-adaptation-built-on-constraint-logic/</guid>
      <description>• Leveraged generative LLMs to auto‑generate Adaptation Manager code for CAS systems. • Introduced vibe coding feedback loops to iteratively test and refine generated AMs. • Develo</description>
    </item>
    <item>
      <title>FineRef: Fine-Grained Error Reflection and Correction for Long-Form Generation with Citations</title>
      <link>https://cluster-site.onrender.com/posts/fineref-fine-grained-error-reflection-and-correction-for-long-form-generation-with-citations/</link>
      <pubDate>Tue, 24 Feb 2026 05:00:00 +0000</pubDate>
      <guid>https://cluster-site.onrender.com/posts/fineref-fine-grained-error-reflection-and-correction-for-long-form-generation-with-citations/</guid>
      <description>• FineRef introduces fine-grained error reflection for citation mismatch and irrelevance in long‑form LLM generation. • Two‑stage training: supervised fine‑tuning with attempt‑refl</description>
    </item>
    <item>
      <title>From &#39;Help&#39; to Helpful: A Hierarchical Assessment of LLMs in Mental e-Health Applications</title>
      <link>https://cluster-site.onrender.com/posts/from-help-to-helpful-a-hierarchical-assessment-of-llms-in-mental-e-health-applications/</link>
      <pubDate>Tue, 24 Feb 2026 05:00:00 +0000</pubDate>
      <guid>https://cluster-site.onrender.com/posts/from-help-to-helpful-a-hierarchical-assessment-of-llms-in-mental-e-health-applications/</guid>
      <description>• Evaluated 11 LLMs generating six-word subject lines for German counselling emails. • Used hierarchical assessment: first categorize outputs, then rank within categories. • Nine a</description>
    </item>
    <item>
      <title>LLM-Assisted Replication for Quantitative Social Science</title>
      <link>https://cluster-site.onrender.com/posts/llm-assisted-replication-for-quantitative-social-science/</link>
      <pubDate>Tue, 24 Feb 2026 05:00:00 +0000</pubDate>
      <guid>https://cluster-site.onrender.com/posts/llm-assisted-replication-for-quantitative-social-science/</guid>
      <description>• Replication crisis threatens empirical research credibility, driven by high costs and low incentives for replication. • LLMs accelerate scientific output by automating writing, c</description>
    </item>
    <item>
      <title>Many AI Analysts, One Dataset: Navigating the Agentic Data Science Multiverse</title>
      <link>https://cluster-site.onrender.com/posts/many-ai-analysts-one-dataset-navigating-the-agentic-data-science-multiverse/</link>
      <pubDate>Tue, 24 Feb 2026 05:00:00 +0000</pubDate>
      <guid>https://cluster-site.onrender.com/posts/many-ai-analysts-one-dataset-navigating-the-agentic-data-science-multiverse/</guid>
      <description>• AI analysts replicate many‑analyst diversity at scale using large language models. • LLMs and prompt framing generate distinct analytic pipelines on the same dataset. • An AI aud</description>
    </item>
    <item>
      <title>Prompt Optimization Via Diffusion Language Models</title>
      <link>https://cluster-site.onrender.com/posts/prompt-optimization-via-diffusion-language-models/</link>
      <pubDate>Tue, 24 Feb 2026 05:00:00 +0000</pubDate>
      <guid>https://cluster-site.onrender.com/posts/prompt-optimization-via-diffusion-language-models/</guid>
      <description>• Diffusion-based framework refines system prompts via masked denoising in an iterative manner. • Conditions on interaction traces: user queries, model responses, and optional feed</description>
    </item>
    <item>
      <title>ReportLogic: Evaluating Logical Quality in Deep Research Reports</title>
      <link>https://cluster-site.onrender.com/posts/reportlogic-evaluating-logical-quality-in-deep-research-reports/</link>
      <pubDate>Tue, 24 Feb 2026 05:00:00 +0000</pubDate>
      <guid>https://cluster-site.onrender.com/posts/reportlogic-evaluating-logical-quality-in-deep-research-reports/</guid>
      <description>• LLMs increasingly synthesize research into structured reports, but logical reliability remains unassessed. • ReportLogic benchmark quantifies report‑level logical quality for dee</description>
    </item>
    <item>
      <title>Spilled Energy in Large Language Models</title>
      <link>https://cluster-site.onrender.com/posts/spilled-energy-in-large-language-models/</link>
      <pubDate>Tue, 24 Feb 2026 05:00:00 +0000</pubDate>
      <guid>https://cluster-site.onrender.com/posts/spilled-energy-in-large-language-models/</guid>
      <description>• Reinterprets LLM softmax as Energy-Based Model, enabling energy tracking during decoding. • Introduces training‑free metrics: spilled energy and marginalized energy from logits.</description>
    </item>
    <item>
      <title>The Convergence of Schema-Guided Dialogue Systems and the Model Context Protocol</title>
      <link>https://cluster-site.onrender.com/posts/the-convergence-of-schema-guided-dialogue-systems-and-the-model-context-protocol/</link>
      <pubDate>Tue, 24 Feb 2026 05:00:00 +0000</pubDate>
      <guid>https://cluster-site.onrender.com/posts/the-convergence-of-schema-guided-dialogue-systems-and-the-model-context-protocol/</guid>
      <description>• Schema-Guided Dialogue (SGD) and Model Context Protocol (MCP) converge as unified deterministic LLM-agent frameworks. • Both rely on schemas to encode tool signatures, operationa</description>
    </item>
    <item>
      <title>WANSpec: Leveraging Global Compute Capacity for LLM Inference</title>
      <link>https://cluster-site.onrender.com/posts/wanspec-leveraging-global-compute-capacity-for-llm-inference/</link>
      <pubDate>Tue, 24 Feb 2026 05:00:00 +0000</pubDate>
      <guid>https://cluster-site.onrender.com/posts/wanspec-leveraging-global-compute-capacity-for-llm-inference/</guid>
      <description>• WANSpec leverages under‑utilized global data centers for LLM inference to reduce latency and cost. • Uses speculative decoding by moving draft model to low‑demand GPUs, cutting f</description>
    </item>
    <item>
      <title>This AI can improve your peer review - and make it more polite</title>
      <link>https://cluster-site.onrender.com/posts/this-ai-can-improve-your-peer-review-and-make-it-more-polite/</link>
      <pubDate>Tue, 24 Feb 2026 00:39:27 +0000</pubDate>
      <guid>https://cluster-site.onrender.com/posts/this-ai-can-improve-your-peer-review-and-make-it-more-polite/</guid>
      <description>• AI coach transforms peer reviews into more constructive, less toxic feedback. • Stanford researchers trained LLMs on curated reviews flagged as vague or unprofessional. • The Rev</description>
    </item>
    <item>
      <title>When large language models are reliable for judging empathic communication</title>
      <link>https://cluster-site.onrender.com/posts/when-large-language-models-are-reliable-for-judging-empathic-communication/</link>
      <pubDate>Tue, 24 Feb 2026 00:35:23 +0000</pubDate>
      <guid>https://cluster-site.onrender.com/posts/when-large-language-models-are-reliable-for-judging-empathic-communication/</guid>
      <description>• LLMs generate empathic responses, but reliability of judging empathy remains unclear. • Study compares expert, crowdworker, and LLM annotations across four psychological framewor</description>
    </item>
    <item>
      <title>Beyond a Single Extractor: Re-thinking HTML-to-Text Extraction for LLM Pretraining</title>
      <link>https://cluster-site.onrender.com/posts/beyond-a-single-extractor-re-thinking-html-to-text-extraction-for-llm-pretraining/</link>
      <pubDate>Tue, 24 Feb 2026 00:00:00 +0000</pubDate>
      <guid>https://cluster-site.onrender.com/posts/beyond-a-single-extractor-re-thinking-html-to-text-extraction-for-llm-pretraining/</guid>
      <description>• Beyond a Single Extractor: Re-thinking HTML-to-Text Extraction for LLM Pretraining • Authors: Jeffr</description>
    </item>
    <item>
      <title>How Exposed Endpoints Increase Risk Across LLM Infrastructure</title>
      <link>https://cluster-site.onrender.com/posts/how-exposed-endpoints-increase-risk-across-llm-infrastructure/</link>
      <pubDate>Mon, 23 Feb 2026 11:58:00 +0000</pubDate>
      <guid>https://cluster-site.onrender.com/posts/how-exposed-endpoints-increase-risk-across-llm-infrastructure/</guid>
      <description>• As more organizations run their own Large Language Models (LLMs), they are also deploying more internal services and</description>
    </item>
    <item>
      <title>AI Coach Improves Peer Review Tone</title>
      <link>https://cluster-site.onrender.com/posts/ai-coach-improves-peer-review-tone/</link>
      <pubDate>Mon, 23 Feb 2026 10:25:58 +0000</pubDate>
      <guid>https://cluster-site.onrender.com/posts/ai-coach-improves-peer-review-tone/</guid>
      <description>AI coach provides constructive feedback, turning vague reviews into detailed, actionable suggestions. The tool reduces unprofessional tone, eliminating personal attacks and factual</description>
    </item>
    <item>
      <title>Agentic Unlearning: When LLM Agent Meets Machine Unlearning</title>
      <link>https://cluster-site.onrender.com/posts/agentic-unlearning-when-llm-agent-meets-machine-unlearning/</link>
      <pubDate>Mon, 23 Feb 2026 05:00:00 +0000</pubDate>
      <guid>https://cluster-site.onrender.com/posts/agentic-unlearning-when-llm-agent-meets-machine-unlearning/</guid>
      <description>• Computer Science &amp;gt; Machine Learning. • Submitted on 6 Feb 2026.</description>
    </item>
    <item>
      <title>AI Hallucination from Students&#39; Perspective: A Thematic Analysis</title>
      <link>https://cluster-site.onrender.com/posts/ai-hallucination-from-students-perspective-a-thematic-analysis/</link>
      <pubDate>Mon, 23 Feb 2026 05:00:00 +0000</pubDate>
      <guid>https://cluster-site.onrender.com/posts/ai-hallucination-from-students-perspective-a-thematic-analysis/</guid>
      <description>• Students rely on LLMs, but hallucinations threaten learning accuracy. • Survey of 63 students revealed common hallucination types: fabricated citations, false facts, overconfidence.</description>
    </item>
    <item>
      <title>Assessing LLM Response Quality in the Context of Technology-Facilitated Abuse</title>
      <link>https://cluster-site.onrender.com/posts/assessing-llm-response-quality-in-the-context-of-technology-facilitated-abuse/</link>
      <pubDate>Mon, 23 Feb 2026 05:00:00 +0000</pubDate>
      <guid>https://cluster-site.onrender.com/posts/assessing-llm-response-quality-in-the-context-of-technology-facilitated-abuse/</guid>
      <description>• Computer Science &amp;gt; Human-Computer Interaction. • Submitted on 11 Jan 2026.</description>
    </item>
    <item>
      <title>BioBridge: Bridging Proteins and Language for Enhanced Biological Reasoning with LLMs</title>
      <link>https://cluster-site.onrender.com/posts/biobridge-bridging-proteins-and-language-for-enhanced-biological-reasoning-with-llms/</link>
      <pubDate>Mon, 23 Feb 2026 05:00:00 +0000</pubDate>
      <guid>https://cluster-site.onrender.com/posts/biobridge-bridging-proteins-and-language-for-enhanced-biological-reasoning-with-llms/</guid>
      <description>• BioBridge fuses protein language models with general LLMs to enhance biological reasoning across diverse tasks. • Domain-Incremental Continual Pre‑Training (DICP) injects domain</description>
    </item>
    <item>
      <title>CodeScaler: Scaling Code LLM Training and Test-Time Inference via Execution-Free Reward Models</title>
      <link>https://cluster-site.onrender.com/posts/codescaler-scaling-code-llm-training-and-test-time-inference-via-execution-free-reward-models/</link>
      <pubDate>Mon, 23 Feb 2026 05:00:00 +0000</pubDate>
      <guid>https://cluster-site.onrender.com/posts/codescaler-scaling-code-llm-training-and-test-time-inference-via-execution-free-reward-models/</guid>
      <description>• Computer Science &amp;gt; Machine Learning. • Submitted on 4 Feb 2026.</description>
    </item>
    <item>
      <title>Microsoft removes guide on how to train LLMs on pirated Harry Potter books</title>
      <link>https://cluster-site.onrender.com/posts/microsoft-removes-guide-on-how-to-train-llms-on-pirated-harry-potter-books/</link>
      <pubDate>Fri, 20 Feb 2026 12:11:28 +0000</pubDate>
      <guid>https://cluster-site.onrender.com/posts/microsoft-removes-guide-on-how-to-train-llms-on-pirated-harry-potter-books/</guid>
      <description>• Microsoft removed blog post that promoted using pirated Harry Potter books to train LLMs. • Post was written by senior product manager Pooja Kamath, advocating dataset for genera</description>
    </item>
    <item>
      <title>The On-Device LLM Revolution</title>
      <link>https://cluster-site.onrender.com/posts/the-on-device-llm-revolution/</link>
      <pubDate>Fri, 20 Feb 2026 08:01:56 +0000</pubDate>
      <guid>https://cluster-site.onrender.com/posts/the-on-device-llm-revolution/</guid>
      <description>• Why 3B to 30B models are moving to the edge - and what that means for silicon. • The AI world is experiencing a fundamental shift. • After years of cloud-centric inference domina</description>
    </item>
    <item>
      <title>A Few-Shot LLM Framework for Extreme Day Classification in Electricity Markets</title>
      <link>https://cluster-site.onrender.com/posts/a-few-shot-llm-framework-for-extreme-day-classification-in-electricity-markets/</link>
      <pubDate>Fri, 20 Feb 2026 05:00:00 +0000</pubDate>
      <guid>https://cluster-site.onrender.com/posts/a-few-shot-llm-framework-for-extreme-day-classification-in-electricity-markets/</guid>
      <description>• Computer Science &amp;gt; Machine Learning. • Submitted on 17 Feb 2026.</description>
    </item>
    <item>
      <title>AgentLAB: Benchmarking LLM Agents against Long-Horizon Attacks</title>
      <link>https://cluster-site.onrender.com/posts/agentlab-benchmarking-llm-agents-against-long-horizon-attacks/</link>
      <pubDate>Fri, 20 Feb 2026 05:00:00 +0000</pubDate>
      <guid>https://cluster-site.onrender.com/posts/agentlab-benchmarking-llm-agents-against-long-horizon-attacks/</guid>
      <description>• Computer Science &amp;gt; Artificial Intelligence. • Submitted on 18 Feb 2026.</description>
    </item>
    <item>
      <title>DeepContext: Stateful Real-Time Detection of Multi-Turn Adversarial Intent Drift in LLMs</title>
      <link>https://cluster-site.onrender.com/posts/deepcontext-stateful-real-time-detection-of-multi-turn-adversarial-intent-drift-in-llms/</link>
      <pubDate>Fri, 20 Feb 2026 05:00:00 +0000</pubDate>
      <guid>https://cluster-site.onrender.com/posts/deepcontext-stateful-real-time-detection-of-multi-turn-adversarial-intent-drift-in-llms/</guid>
      <description>• DeepContext introduces stateful monitoring for LLM safety, tracking intent across turns. • Uses RNN to process fine‑tuned turn‑level embeddings, preserving conversation context.</description>
    </item>
    <item>
      <title>Guiding LLM-Based Human Mobility Simulation with Mobility Measures from Shared Data</title>
      <link>https://cluster-site.onrender.com/posts/guiding-llm-based-human-mobility-simulation-with-mobility-measures-from-shared-data/</link>
      <pubDate>Fri, 20 Feb 2026 05:00:00 +0000</pubDate>
      <guid>https://cluster-site.onrender.com/posts/guiding-llm-based-human-mobility-simulation-with-mobility-measures-from-shared-data/</guid>
      <description>• M2LSimu introduces a mobility-measures guided framework for LLM-based human mobility simulation. • It coordinates individual agents using shared data, capturing emergent collecti</description>
    </item>
    <item>
      <title>Mind the GAP: Text Safety Does Not Transfer to Tool-Call Safety in LLM Agents</title>
      <link>https://cluster-site.onrender.com/posts/mind-the-gap-text-safety-does-not-transfer-to-tool-call-safety-in-llm-agents/</link>
      <pubDate>Fri, 20 Feb 2026 05:00:00 +0000</pubDate>
      <guid>https://cluster-site.onrender.com/posts/mind-the-gap-text-safety-does-not-transfer-to-tool-call-safety-in-llm-agents/</guid>
      <description>• Computer Science &amp;gt; Artificial Intelligence. • Submitted on 18 Feb 2026.</description>
    </item>
    <item>
      <title>Simple Baselines are Competitive with Code Evolution</title>
      <link>https://cluster-site.onrender.com/posts/simple-baselines-are-competitive-with-code-evolution/</link>
      <pubDate>Fri, 20 Feb 2026 05:00:00 +0000</pubDate>
      <guid>https://cluster-site.onrender.com/posts/simple-baselines-are-competitive-with-code-evolution/</guid>
      <description>• Code evolution uses LLMs to mutate code, yet lacks baseline comparisons. • Authors test two simple baselines across math bounds, agentic scaffolds, and ML contests. • Baselines m</description>
    </item>
    <item>
      <title>[RFC] TensaLang: A tensor-first language for LLM inference, lowering through MLIR to CPU/CUDA</title>
      <link>https://cluster-site.onrender.com/posts/rfc-tensalang-a-tensor-first-language-for-llm-inference-lowering-through-mlir-to-cpu/cuda/</link>
      <pubDate>Thu, 19 Feb 2026 22:24:46 +0000</pubDate>
      <guid>https://cluster-site.onrender.com/posts/rfc-tensalang-a-tensor-first-language-for-llm-inference-lowering-through-mlir-to-cpu/cuda/</guid>
      <description>• Hello, I&amp;rsquo;ve been working on a project called TensaLang and it&amp;rsquo;s finally at a point worth sharing. • It&amp;rsquo;s a small language + compiler + runtime for writing LLM forward passes dire</description>
    </item>
    <item>
      <title>How your LLM is silently hallucinating company revenue</title>
      <link>https://cluster-site.onrender.com/posts/how-your-llm-is-silently-hallucinating-company-revenue/</link>
      <pubDate>Thu, 19 Feb 2026 21:06:33 +0000</pubDate>
      <guid>https://cluster-site.onrender.com/posts/how-your-llm-is-silently-hallucinating-company-revenue/</guid>
    </item>
    <item>
      <title>The Perplexity Paradox: Why Code Compresses Better Than Math in LLM Prompts</title>
      <link>https://cluster-site.onrender.com/posts/the-perplexity-paradox-why-code-compresses-better-than-math-in-llm-prompts/</link>
      <pubDate>Thu, 19 Feb 2026 05:00:00 +0000</pubDate>
      <guid>https://cluster-site.onrender.com/posts/the-perplexity-paradox-why-code-compresses-better-than-math-in-llm-prompts/</guid>
      <description>• Computer Science &amp;gt; Computation and Language. • Submitted on 21 Jan 2026.</description>
    </item>
    <item>
      <title>Improving LLM Reliability through Hybrid Abstention and Adaptive Detection</title>
      <link>https://cluster-site.onrender.com/posts/improving-llm-reliability-through-hybrid-abstention-and-adaptive-detection/</link>
      <pubDate>Wed, 18 Feb 2026 05:00:00 +0000</pubDate>
      <guid>https://cluster-site.onrender.com/posts/improving-llm-reliability-through-hybrid-abstention-and-adaptive-detection/</guid>
      <description>• Computer Science &amp;gt; Artificial Intelligence. • Submitted on 17 Feb 2026.</description>
    </item>
    <item>
      <title>Protecting Language Models Against Unauthorized Distillation through Trace Rewriting</title>
      <link>https://cluster-site.onrender.com/posts/protecting-language-models-against-unauthorized-distillation-through-trace-rewriting/</link>
      <pubDate>Wed, 18 Feb 2026 05:00:00 +0000</pubDate>
      <guid>https://cluster-site.onrender.com/posts/protecting-language-models-against-unauthorized-distillation-through-trace-rewriting/</guid>
      <description>• Uses trace rewriting to deter unauthorized knowledge distillation from large language models. • Introduces anti-distillation techniques that degrade training usefulness while kee</description>
    </item>
    <item>
      <title>Quantifying construct validity in large language model evaluations</title>
      <link>https://cluster-site.onrender.com/posts/quantifying-construct-validity-in-large-language-model-evaluations/</link>
      <pubDate>Wed, 18 Feb 2026 05:00:00 +0000</pubDate>
      <guid>https://cluster-site.onrender.com/posts/quantifying-construct-validity-in-large-language-model-evaluations/</guid>
      <description>• LLM benchmarks often misrepresent true model capabilities due to contamination and annotator errors. • Construct validity is essential to ensure benchmarks truly measure desired</description>
    </item>
    <item>
      <title>Bruteforcing Accidental Antenna Designs</title>
      <link>https://cluster-site.onrender.com/posts/bruteforcing-accidental-antenna-designs/</link>
      <pubDate>Wed, 18 Feb 2026 03:00:45 +0000</pubDate>
      <guid>https://cluster-site.onrender.com/posts/bruteforcing-accidental-antenna-designs/</guid>
      <description>• Antenna design often seen as black art, but brute-force GPU approach explored. • Janne, novice, used VNA and GPU-based FDTD to simulate and optimize antennas. • Leveraged LLMs to</description>
    </item>
    <item>
      <title>CrowdStrike&#39;s Agentic Security Powered by Human‑AI Feedback Loop</title>
      <link>https://cluster-site.onrender.com/posts/crowdstrikes-agentic-security-powered-by-humanai-feedback-loop/</link>
      <pubDate>Tue, 17 Feb 2026 08:33:08 +0000</pubDate>
      <guid>https://cluster-site.onrender.com/posts/crowdstrikes-agentic-security-powered-by-humanai-feedback-loop/</guid>
      <description>• CrowdStrike&amp;rsquo;s new Agentic Security framework blends human oversight with AI‑driven threat detection. • The system uses a continuous feedback loop where analysts refine AI models</description>
    </item>
    <item>
      <title>BotzoneBench: Scalable LLM Evaluation via Graded AI Anchors</title>
      <link>https://cluster-site.onrender.com/posts/botzonebench-scalable-llm-evaluation-via-graded-ai-anchors/</link>
      <pubDate>Tue, 17 Feb 2026 05:00:00 +0000</pubDate>
      <guid>https://cluster-site.onrender.com/posts/botzonebench-scalable-llm-evaluation-via-graded-ai-anchors/</guid>
      <description>• Computer Science &amp;gt; Artificial Intelligence. • Submitted on 22 Jan 2026.</description>
    </item>
    <item>
      <title>TemporalBench: A Benchmark for Evaluating LLM-Based Agents on Contextual and Event-Informed Time Series Tasks</title>
      <link>https://cluster-site.onrender.com/posts/temporalbench-a-benchmark-for-evaluating-llm-based-agents-on-contextual-and-event-informed-time-series-tasks/</link>
      <pubDate>Tue, 17 Feb 2026 05:00:00 +0000</pubDate>
      <guid>https://cluster-site.onrender.com/posts/temporalbench-a-benchmark-for-evaluating-llm-based-agents-on-contextual-and-event-informed-time-series-tasks/</guid>
      <description>• TemporalBench offers a multi-domain benchmark for temporal reasoning in LLM agents. • Four-tier taxonomy tests historical structure, context-free, contextual, and event-condition</description>
    </item>
    <item>
      <title>Asynchronous Verified Semantic Caching for Tiered LLM Architectures</title>
      <link>https://cluster-site.onrender.com/posts/asynchronous-verified-semantic-caching-for-tiered-llm-architectures/</link>
      <pubDate>Mon, 16 Feb 2026 00:00:00 +0000</pubDate>
      <guid>https://cluster-site.onrender.com/posts/asynchronous-verified-semantic-caching-for-tiered-llm-architectures/</guid>
      <description>• Asynchronous Verified Semantic Caching for Tiered LLM Architectures. Authors: Asmit Kumar Singh, Haozhe Wang, Lax</description>
    </item>
    <item>
      <title>Solving Context Size Issues with Docker Model Runner</title>
      <link>https://cluster-site.onrender.com/posts/solving-context-size-issues-with-docker-model-runner/</link>
      <pubDate>Fri, 13 Feb 2026 13:57:36 +0000</pubDate>
      <guid>https://cluster-site.onrender.com/posts/solving-context-size-issues-with-docker-model-runner/</guid>
      <description>• Context window limits hinder large language model usage. • Context packing combines multiple messages into a single prompt. • Docker Model Runner supports context packing techniques.</description>
    </item>
    <item>
      <title>Automating Inference Optimizations with NVIDIA TensorRT LLM AutoDeploy</title>
      <link>https://cluster-site.onrender.com/posts/automating-inference-optimizations-with-nvidia-tensorrt-llm-autodeploy/</link>
      <pubDate>Mon, 09 Feb 2026 18:30:00 +0000</pubDate>
      <guid>https://cluster-site.onrender.com/posts/automating-inference-optimizations-with-nvidia-tensorrt-llm-autodeploy/</guid>
      <description>• NVIDIA TensorRT LLM enables developers to build high-performance inference engines for large language models (LLMs), but deploying a new architecture traditionally requires signi</description>
    </item>
    <item>
      <title>A one-prompt attack that breaks LLM safety alignment</title>
      <link>https://cluster-site.onrender.com/posts/a-one-prompt-attack-that-breaks-llm-safety-alignment/</link>
      <pubDate>Mon, 09 Feb 2026 17:12:11 +0000</pubDate>
      <guid>https://cluster-site.onrender.com/posts/a-one-prompt-attack-that-breaks-llm-safety-alignment/</guid>
      <description>• Large language models (LLMs) and diffusion models now</description>
    </item>
    <item>
      <title>LLM Inference Benchmarking - Measure What Matters</title>
      <link>https://cluster-site.onrender.com/posts/llm-inference-benchmarking-measure-what-matters/</link>
      <pubDate>Fri, 06 Feb 2026 14:46:06 +0000</pubDate>
      <guid>https://cluster-site.onrender.com/posts/llm-inference-benchmarking-measure-what-matters/</guid>
      <description>• By Piyush Srivastava, Karnik Modi, Stephen Varela, and Rithish Ramesh. • Production-grade LLM inference is a complex systems challenge, requiring deep co-designs - from hardware pri</description>
    </item>
    <item>
      <title>Code smells for AI agents: Q&amp;A with Eno Reyes of Factory</title>
      <link>https://cluster-site.onrender.com/posts/code-smells-for-ai-agents-qa-with-eno-reyes-of-factory/</link>
      <pubDate>Wed, 04 Feb 2026 15:00:00 +0000</pubDate>
      <guid>https://cluster-site.onrender.com/posts/code-smells-for-ai-agents-qa-with-eno-reyes-of-factory/</guid>
      <description>• Factory builds autonomous coding agents for large engineering teams, covering full SDLC. • Their platform includes tools to assess code quality and agent impact. • Factory&#39;s agen</description>
    </item>
    <item>
      <title>Unlocking Agentic RL Training for GPT-OSS: A Practical Retrospective</title>
      <link>https://cluster-site.onrender.com/posts/unlocking-agentic-rl-training-for-gpt-oss-a-practical-retrospective/</link>
      <pubDate>Tue, 27 Jan 2026 01:53:15 +0000</pubDate>
      <guid>https://cluster-site.onrender.com/posts/unlocking-agentic-rl-training-for-gpt-oss-a-practical-retrospective/</guid>
      <description>• Agentic RL extends LLM training beyond single-turn responses to full decision-making via environment interaction. • It collects on‑policy data, optimizing policies across multi‑s</description>
    </item>
    <item>
      <title>Clawdbot with Docker Model Runner, a Private Personal AI Assistant</title>
      <link>https://cluster-site.onrender.com/posts/clawdbot-with-docker-model-runner-a-private-personal-ai-assistant/</link>
      <pubDate>Mon, 26 Jan 2026 20:51:41 +0000</pubDate>
      <guid>https://cluster-site.onrender.com/posts/clawdbot-with-docker-model-runner-a-private-personal-ai-assistant/</guid>
      <description>• Clawdbot + Docker Model Runner enables self-hosted, privacy-first personal AI assistants. • Integrates with Telegram, WhatsApp, Discord, Signal for proactive digital coworker. •</description>
    </item>
    <item>
      <title>The Most Precious Resource</title>
      <link>https://cluster-site.onrender.com/posts/the-most-precious-resource/</link>
      <pubDate>Thu, 22 Jan 2026 16:57:18 +0000</pubDate>
      <guid>https://cluster-site.onrender.com/posts/the-most-precious-resource/</guid>
      <description>• Sequoia invests in Kais Khimji as partner, valuing his work ethic, learning appetite, and EQ. • Kais transitions to founder, launching Blockit, an AI-driven time optimization app</description>
    </item>
    <item>
      <title>The Next Frontier of Runtime Assembly Attacks: Leveraging LLMs to Generate Phishing JavaScript in Real Time</title>
      <link>https://cluster-site.onrender.com/posts/the-next-frontier-of-runtime-assembly-attacks-leveraging-llms-to-generate-phishing-javascript-in-real-time/</link>
      <pubDate>Thu, 22 Jan 2026 11:00:22 +0000</pubDate>
      <guid>https://cluster-site.onrender.com/posts/the-next-frontier-of-runtime-assembly-attacks-leveraging-llms-to-generate-phishing-javascript-in-real-time/</guid>
      <description>• Attackers embed a benign page that calls an LLM API to generate malicious JavaScript in real time. • Prompt engineering bypasses AI safety guardrails, producing polymorphic phish</description>
    </item>
    <item>
      <title>Differential Transformer V2</title>
      <link>https://cluster-site.onrender.com/posts/differential-transformer-v2/</link>
      <pubDate>Tue, 20 Jan 2026 03:20:57 +0000</pubDate>
      <guid>https://cluster-site.onrender.com/posts/differential-transformer-v2/</guid>
      <description>• DiffTransformer V2 doubles query heads, keeps KV heads constant for efficient attention. • Uses differential attention: subtracts paired heads within same GQA group. • Applies si</description>
    </item>
    <item>
      <title>LLM flexibility, Agent Mode improvements, and new agentic experiences in Android Studio Otter 3 Feature Drop</title>
      <link>https://cluster-site.onrender.com/posts/llm-flexibility-agent-mode-improvements-and-new-agentic-experiences-in-android-studio-otter-3-feature-drop/</link>
      <pubDate>Thu, 15 Jan 2026 17:18:00 +0000</pubDate>
      <guid>https://cluster-site.onrender.com/posts/llm-flexibility-agent-mode-improvements-and-new-agentic-experiences-in-android-studio-otter-3-feature-drop/</guid>
      <description>• Posted by Sandhya Mohan, Senior Product Manager, and Trevor Johns, Developer Relations Engineer. • We are excited to announce that Android Studio Otter 3 Feature Drop is now stable!</description>
    </item>
    <item>
      <title>How Reddit Built a LLM Guardrails Platform</title>
      <link>https://cluster-site.onrender.com/posts/how-reddit-built-a-llm-guardrails-platform/</link>
      <pubDate>Mon, 08 Dec 2025 19:21:13 +0000</pubDate>
      <guid>https://cluster-site.onrender.com/posts/how-reddit-built-a-llm-guardrails-platform/</guid>
      <description>• Written by Charan Akiri, with help from Dylan Raithel. • TL;DR: We built a centralized LLM Guardrails Service at Reddit to detect &amp; block malicious &amp; unsafe inputs-including promp</description>
    </item>
    <item>
      <title>Breaking Through the Noise: A Hybrid ML and LLM Framework for Identifying Engaging, Breaking Content on Reddit</title>
      <link>https://cluster-site.onrender.com/posts/breaking-through-the-noise-a-hybrid-ml-and-llm-framework-for-identifying-engaging-breaking-content-on-reddit/</link>
      <pubDate>Tue, 25 Nov 2025 16:26:25 +0000</pubDate>
      <guid>https://cluster-site.onrender.com/posts/breaking-through-the-noise-a-hybrid-ml-and-llm-framework-for-identifying-engaging-breaking-content-on-reddit/</guid>
      <description>• Authors: Andrew Garrett, Md Mansurul Bhuiyan. • With tens of thousands of new posts on Reddit each day, identifying content that is simultaneously timely, newsworthy, and engaging pr</description>
    </item>
    <item>
      <title>SpellVault&#39;s evolution: Beyond LLM apps, towards the agentic future</title>
      <link>https://cluster-site.onrender.com/posts/spellvaults-evolution-beyond-llm-apps-towards-the-agentic-future/</link>
      <pubDate>Fri, 21 Nov 2025 00:00:10 +0000</pubDate>
      <guid>https://cluster-site.onrender.com/posts/spellvaults-evolution-beyond-llm-apps-towards-the-agentic-future/</guid>
      <description>• At Grab, innovation isn&#39;t just about building new features; it&#39;s about evolving our platforms to</description>
    </item>
    <item>
      <title>Level up your Solidity LLM tooling with Slither-MCP</title>
      <link>https://cluster-site.onrender.com/posts/level-up-your-solidity-llm-tooling-with-slither-mcp/</link>
      <pubDate>Sat, 15 Nov 2025 12:00:00 +0000</pubDate>
      <guid>https://cluster-site.onrender.com/posts/level-up-your-solidity-llm-tooling-with-slither-mcp/</guid>
      <description>• We&#39;re releasing Slither-MCP, a new tool that augments LLMs with Slither&#39;s unmatched static analysis engine. • Slither-MCP benefits virtually every use case for LLMs by exposing Sl</description>
    </item>
    <item>
      <title>How we built a custom vision LLM to improve document processing at Grab</title>
      <link>https://cluster-site.onrender.com/posts/how-we-built-a-custom-vision-llm-to-improve-document-processing-at-grab/</link>
      <pubDate>Tue, 04 Nov 2025 00:00:10 +0000</pubDate>
      <guid>https://cluster-site.onrender.com/posts/how-we-built-a-custom-vision-llm-to-improve-document-processing-at-grab/</guid>
      <description>• In the world of digital services, accurate extraction of information from user-submitted docu</description>
    </item>
    <item>
      <title>Bringing AI-Aware Traffic Management to Istio: Gateway API Inference Extension Support</title>
      <link>https://cluster-site.onrender.com/posts/bringing-ai-aware-traffic-management-to-istio-gateway-api-inference-extension-support/</link>
      <pubDate>Mon, 28 Jul 2025 00:00:00 +0000</pubDate>
      <guid>https://cluster-site.onrender.com/posts/bringing-ai-aware-traffic-management-to-istio-gateway-api-inference-extension-support/</guid>
      <description>• Istio now supports Gateway API Inference Extension, enabling model‑aware, LoRA‑aware routing for AI workloads. • AI inference requests can last seconds to minutes, making routing</description>
    </item>
    <item>
      <title>Defending against Prompt Injection with Structured Queries (StruQ) and Preference Optimization (SecAlign)</title>
      <link>https://cluster-site.onrender.com/posts/defending-against-prompt-injection-with-structured-queries-struq-and-preference-optimization-secalign/</link>
      <pubDate>Fri, 11 Apr 2025 10:00:00 +0000</pubDate>
      <guid>https://cluster-site.onrender.com/posts/defending-against-prompt-injection-with-structured-queries-struq-and-preference-optimization-secalign/</guid>
      <description>• LLMs power new apps but prompt injection is top OWASP threat. • Attack injects malicious instructions into untrusted data, overriding trusted prompts. • Real-world examples: Yelp</description>
    </item>
    <item>
      <title>Search Query Understanding with LLMs: From Ideation to Production</title>
      <link>https://cluster-site.onrender.com/posts/search-query-understanding-with-llms-from-ideation-to-production/</link>
      <pubDate>Tue, 04 Feb 2025 00:00:00 +0000</pubDate>
      <guid>https://cluster-site.onrender.com/posts/search-query-understanding-with-llms-from-ideation-to-production/</guid>
      <description>• Yelp integrates LLMs to interpret search queries, improving intent detection for millions of daily searches. • The team tackled spelling correction, segmentation, canonicalizatio</description>
    </item>
    <item>
      <title>Virtual Personas for Language Models via an Anthology of Backstories</title>
      <link>https://cluster-site.onrender.com/posts/virtual-personas-for-language-models-via-an-anthology-of-backstories/</link>
      <pubDate>Tue, 12 Nov 2024 09:00:00 +0000</pubDate>
      <guid>https://cluster-site.onrender.com/posts/virtual-personas-for-language-models-via-an-anthology-of-backstories/</guid>
      <description>• Anthology conditions LLMs with detailed backstories to create consistent virtual personas. • Uses naturalistic life narratives to represent diverse human values and experiences.</description>
    </item>
    <item>
      <title>How to Evaluate Jailbreak Methods: A Case Study with the StrongREJECT Benchmark</title>
      <link>https://cluster-site.onrender.com/posts/how-to-evaluate-jailbreak-methods-a-case-study-with-the-strongreject-benchmark/</link>
      <pubDate>Wed, 28 Aug 2024 15:30:00 +0000</pubDate>
      <guid>https://cluster-site.onrender.com/posts/how-to-evaluate-jailbreak-methods-a-case-study-with-the-strongreject-benchmark/</guid>
      <description>• Researchers tested jailbreak via Scots Gaelic translation, initially replicating 43% success claim. • GPT-4 responded with bomb instructions in Gaelic, but full output differed f</description>
    </item>
    <item>
      <title>Language, Statistics, &amp; Category Theory, Part 1</title>
      <link>https://cluster-site.onrender.com/posts/language-statistics-category-theory-part-1/</link>
      <pubDate>Wed, 07 Jul 2021 20:18:57 +0000</pubDate>
      <guid>https://cluster-site.onrender.com/posts/language-statistics-category-theory-part-1/</guid>
      <description>• Authors propose a new preprint exploring math behind large language models. • Question: how to model transition from probability distributions on text to syntax and semantics. •</description>
    </item>
  </channel>
</rss>
