Prompt Stability Principle¶
Prompt Stability Principle (提示稳定性原则) is a design guideline in AI agent architecture stating that the system prompt should remain constant throughout a single conversation session^[001-TODO__Hermes_Agent_·_设计哲学与思维框架.md].
This principle dictates that the core instructions defining the agent's role, capabilities, and constraints must not be modified or rewritten dynamically during an ongoing interaction^[001-TODO__Hermes_Agent_·_设计哲学与思维框架.md].
Core Constraint: Cache Safety¶
The primary technical motivation for this principle is cache preservation^[001-TODO__Hermes_Agent_·_设计哲学与思维框架.md].
Large Language Model (LLM) APIs often utilize caching mechanisms (such as KV-cache or semantic caching) to reduce latency and costs for long prompts. If the system prompt changes mid-conversation (e.g., via automatic refinement or dynamic injection of instructions), it invalidates these caches^[001-TODO__Hermes_Agent_·_设计哲学与思维框架.md].
- Mutation Breaks Caching: Changing the system prompt forces the system to reprocess the entire prompt context, significantly increasing inference time and computational cost.
- Explicit Action Exception: The only acceptable time for the system prompt to change is via an explicit user action (e.g., the user manually edits the system instructions or switches profiles)^[001-TODO__Hermes_Agent_·_设计哲学与思维框架.md].
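The cache-invalidation cost described above can be illustrated with a minimal sketch. The `PromptCache` class and string hashing below are hypothetical simplifications (real prefix caches operate on token sequences inside the inference stack), but they show the key property: any byte-level change to the system prompt produces a different cache key, making the previously cached prefix state unreachable.

```python
import hashlib


class PromptCache:
    """Toy prefix cache keyed on the exact system prompt text.

    Hypothetical sketch: real KV-caches match token prefixes, but the
    effect is the same -- a mutated system prompt no longer matches the
    cached prefix, forcing a full reprocess of the prompt context.
    """

    def __init__(self):
        self._store = {}

    def _key(self, system_prompt: str) -> str:
        # The cache key is derived from the prompt bytes themselves.
        return hashlib.sha256(system_prompt.encode("utf-8")).hexdigest()

    def save(self, system_prompt: str, state: str) -> None:
        self._store[self._key(system_prompt)] = state

    def lookup(self, system_prompt: str):
        # Returns None on any mutation, however small.
        return self._store.get(self._key(system_prompt))


cache = PromptCache()
cache.save("You are Hermes.", "precomputed-prefix-state")

# Stable prompt: cache hit, no reprocessing needed.
assert cache.lookup("You are Hermes.") == "precomputed-prefix-state"

# One-character mutation: cache miss, entire prefix must be recomputed.
assert cache.lookup("You are Hermes!") is None
```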
Behavioral Consistency¶
Beyond technical performance, the principle ensures behavioral consistency.
- Predictability: An agent that follows a stable set of instructions maintains a consistent persona and decision-making logic throughout the session.
- Context Integrity: It prevents "context drift," where the agent's interpretation of its own goals shifts subtly over time due to accumulated or conflicting instruction modifications.
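One way to enforce both properties at the code level is to make the session's prompt configuration immutable, with the explicit-user-action exception modeled as creating a fresh session rather than mutating the running one. The `SessionConfig` type and `apply_user_edit` helper below are illustrative assumptions, not part of any described implementation:

```python
from dataclasses import dataclass, FrozenInstanceError


@dataclass(frozen=True)
class SessionConfig:
    """System prompt fixed for the life of a session (illustrative)."""
    system_prompt: str


def apply_user_edit(config: SessionConfig, new_prompt: str) -> SessionConfig:
    """Explicit user action: returns a NEW config rather than mutating
    the running one, mirroring the principle's sole exception."""
    return SessionConfig(system_prompt=new_prompt)


session = SessionConfig(system_prompt="You are a coding agent.")

# Dynamic mid-session mutation is rejected by the type itself.
try:
    session.system_prompt = "refined prompt"
except FrozenInstanceError:
    pass

# An explicit user edit yields a new session; the old one is untouched.
new_session = apply_user_edit(session, "You are a code reviewer.")
assert session.system_prompt == "You are a coding agent."
assert new_session.system_prompt == "You are a code reviewer."
```

Keeping the prompt immutable by construction means "context drift" cannot be introduced accidentally by any code path inside the session loop.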
Applications¶
This principle is a critical architectural consideration for systems that manage long-running sessions, such as:
- AI Coding Agents: Where a stable definition of tools and coding standards is required over long refactoring tasks^[001-TODO__Hermes_Agent_·_设计哲学与思维框架.md].
- Multi-turn Conversations: In systems like Hermes Agent, where session continuity and performance are paramount^[001-TODO__Hermes_Agent_·_设计哲学与思维框架.md].
Related Concepts¶
- Closed Learning Loop: The concept of an agent learning between sessions or asynchronously, as opposed to altering its instructions during a session.
- [[Platform-Agnostic Core]]: An architectural pattern often associated with systems that prioritize stability and consistency across different interfaces.
Sources¶
001-TODO__Hermes_Agent_·_设计哲学与思维框架.md