The 'Intern' in the Machine: Why LLMs Need a Script to Scale

Large language models are amazing, but they can’t do everything by themselves. This truth became clear as we prepared for our internal Company Kickoff. Salesforce had built an agent to help our thousands of sellers learn new product lines, a task that, on paper, was a perfect assignment for an LLM. We could feed it all of our documentation and knowledge, and it could generate a bank of questions and score them with nuanced rubrics. But as the LLM interacted with our team, it began to drift: the phenomenon where an agent’s behavior shifts away from its original purpose as it encounters new data or human inputs. One day, the quiz would be a perfect assessment of core value propositions; the next, after encountering a new prompt or piece of documentation, it would wander into a cul-de-sac.

Article Summaries:

Salesforce’s internal kickoff revealed a common problem with large language models (LLMs): drift. An LLM‑powered agent designed to train sellers began producing irrelevant or overly detailed questions when exposed to new prompts, risking poor service and customer loss. To counter this, Salesforce introduced a hybrid engine that blends the generative power of probabilistic LLMs with deterministic controls. The new system uses Agent Graph to map task flows and Agent Script to enforce programmatic constraints, keeping agents focused on essential business objectives while still allowing flexible reasoning. This approach aims to deliver enterprise‑grade reliability without sacrificing the creativity that LLMs offer.
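
The idea of pairing a probabilistic generator with deterministic controls can be illustrated with a minimal Python sketch. This is not Agent Graph or Agent Script, whose syntax the article summary does not show; every name here (Question, ALLOWED_TOPICS, MAX_QUESTION_CHARS, is_on_script, build_quiz, fake_generator) is hypothetical, and the generator is a hard-coded stand-in for an LLM call. The sketch assumes each candidate question carries a topic label that a fixed rule can check.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Question:
    topic: str
    text: str


# Hypothetical, fixed business constraints: these never change at run time,
# no matter what the generator produces.
ALLOWED_TOPICS = frozenset({"value_proposition", "pricing"})
MAX_QUESTION_CHARS = 200


def is_on_script(question: Question) -> bool:
    """Deterministic check: topic must be allowed and the text must be short."""
    return (
        question.topic in ALLOWED_TOPICS
        and len(question.text) <= MAX_QUESTION_CHARS
    )


def build_quiz(generate: Callable[[], List[Question]], size: int) -> List[Question]:
    """Keep only candidates that pass the deterministic check, up to `size`."""
    accepted = [q for q in generate() if is_on_script(q)]
    return accepted[:size]


def fake_generator() -> List[Question]:
    # Stand-in for a probabilistic LLM call; one candidate has drifted off topic.
    return [
        Question("value_proposition", "What problem does the product solve?"),
        Question("internal_trivia", "Which build server compiled release 3?"),
        Question("pricing", "Which tier includes premium support?"),
    ]


quiz = build_quiz(fake_generator, size=2)
```

In this sketch the generator remains free to propose anything, while the fixed rule decides what reaches the seller, so a drifted candidate such as the "internal_trivia" question is dropped rather than served.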

Sources:

https://www.salesforce.com/news/stories/why-llms-need-script-to-scale/ (Latest source article published: 2026-02-18 16:00 UTC)