Prompt Stability Principle

Prompt Stability Principle (提示稳定性原则) is a design guideline in AI agent architecture asserting that the system prompt should remain fixed and unchanged for the duration of a single conversation session^[001-TODO__Hermes_Agent_·_设计哲学与思维框架.md].

This principle dictates that the core instructions defining the agent's role, capabilities, and constraints must not be modified or rewritten dynamically during an ongoing interaction^[001-TODO__Hermes_Agent_·_设计哲学与思维框架.md].

Core Constraint: Cache Safety

The primary technical motivation for this principle is cache preservation^[001-TODO__Hermes_Agent_·_设计哲学与思维框架.md].

Large Language Model (LLM) APIs often use caching mechanisms (such as KV-cache or semantic caching) to reduce latency and costs for long prompts. If the system prompt changes mid-conversation (e.g., via automatic refinement or dynamic injection of instructions), it invalidates these caches^[001-TODO__Hermes_Agent_·_设计哲学与思维框架.md].

  • Mutation Breaks Caching: Changing the system prompt forces the system to reprocess the entire prompt context, significantly increasing inference time and computational cost.
  • Explicit Action Exception: The system prompt may change only in response to an explicit user action (e.g., the user manually edits the system instructions or switches profiles)^[001-TODO__Hermes_Agent_·_设计哲学与思维框架.md].
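The source does not specify any particular caching implementation, but the invalidation mechanics above can be sketched with a hypothetical prefix cache keyed on a hash of the exact prompt bytes (the function name and keying scheme are illustrative assumptions, not a real API):

```python
import hashlib

def prefix_cache_key(system_prompt: str, messages: list[str]) -> str:
    """Key a hypothetical prefix cache on the exact bytes of the system
    prompt plus the conversation so far. Because the system prompt is the
    first segment of the prefix, any change to it produces a different
    key for every subsequent turn -- a cache miss across the board."""
    h = hashlib.sha256()
    h.update(system_prompt.encode("utf-8"))
    for m in messages:
        h.update(m.encode("utf-8"))
    return h.hexdigest()

stable  = prefix_cache_key("You are Hermes.", ["hi"])
again   = prefix_cache_key("You are Hermes.", ["hi"])
mutated = prefix_cache_key("You are Hermes. Be brief.", ["hi"])

assert stable == again    # identical prefix -> cache hit
assert stable != mutated  # mutated system prompt -> cache miss
```

Real KV-caches operate on token sequences rather than byte hashes, but the failure mode is the same: the cached prefix is only reusable while the system prompt at its head stays byte-for-byte identical.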

Behavioral Consistency

Beyond technical performance, the principle ensures behavioral consistency.

  • Predictability: An agent that follows a stable set of instructions maintains a consistent persona and decision-making logic throughout the session.
  • Context Integrity: It prevents "context drift," where the agent's interpretation of its own goals shifts subtly over time due to accumulated or conflicting instruction modifications.
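One way to enforce both constraints in code is to make the session's system prompt effectively immutable, with the explicit-action exception modeled as starting a fresh session. This is a minimal sketch under that assumption; the class and method names are hypothetical:

```python
from dataclasses import dataclass, field

@dataclass
class Session:
    """A conversation session whose system prompt is fixed at creation
    and never mutated afterwards (hypothetical sketch)."""
    system_prompt: str
    history: list[str] = field(default_factory=list)

    def ask(self, user_msg: str) -> None:
        # Append to history; the system prompt is read but never written.
        self.history.append(user_msg)
        # ... call the model with (self.system_prompt, self.history)

    def with_new_prompt(self, new_prompt: str) -> "Session":
        """Explicit user action: switching prompts starts a fresh session
        (and a fresh cache prefix) rather than mutating this one."""
        return Session(system_prompt=new_prompt)

s1 = Session("You are Hermes.")
s1.ask("hello")
s2 = s1.with_new_prompt("You are a code reviewer.")
assert s1.system_prompt == "You are Hermes."  # original session unchanged
assert s2.history == []                       # new session, fresh context
```

Returning a new `Session` instead of editing the old one makes the cache boundary explicit: everything cached under `s1`'s prefix remains valid for `s1`, and `s2` simply starts cold.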

Applications

This principle is a critical architectural consideration for systems that manage long-running sessions, such as:

  • AI Coding Agents: Where a stable definition of tools and coding standards is required over long refactoring tasks^[001-TODO__Hermes_Agent_·_设计哲学与思维框架.md].
  • Multi-turn Conversations: In systems like Hermes Agent, where session continuity and performance are paramount^[001-TODO__Hermes_Agent_·_设计哲学与思维框架.md].
  • Closed Learning Loop: The concept of an agent learning between sessions or asynchronously, as opposed to altering its instructions during a session.
  • [[Platform-Agnostic Core]]: An architectural pattern often associated with systems that prioritize stability and consistency across different interfaces.
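The closed learning loop above can be sketched as a queue of proposed refinements that are folded into the prompt only at session boundaries, never mid-session. The class below is a hypothetical illustration of that idea, not an implementation from the source:

```python
class LearningLoop:
    """Sketch of between-session learning: refinements proposed during a
    session are queued, and applied only when the next session starts."""

    def __init__(self, base_prompt: str):
        self.base_prompt = base_prompt
        self.pending: list[str] = []

    def propose_refinement(self, note: str) -> None:
        # Mid-session: queue the learning, leave the prompt untouched.
        self.pending.append(note)

    def start_session(self) -> str:
        # Session boundary: fold queued learnings into the base prompt.
        if self.pending:
            self.base_prompt += "\nLearned guidelines:\n" + "\n".join(
                f"- {n}" for n in self.pending
            )
            self.pending.clear()
        return self.base_prompt

loop = LearningLoop("You are Hermes.")
loop.propose_refinement("Prefer concise answers.")
assert loop.base_prompt == "You are Hermes."   # unchanged mid-session
prompt = loop.start_session()
assert "Prefer concise answers." in prompt     # applied at the boundary
```

Deferring updates this way keeps every in-flight session cache-safe and behaviorally consistent, while still allowing the agent to improve over time.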

Sources

  • 001-TODO__Hermes_Agent_·_设计哲学与思维框架.md