2026-06-01

Hermes Skills Load on Demand and Patch Themselves Mid-Run

Tags: hermes, tool-search, skills, token-optimization, self-healing

Nous Research announced Tool Search for Hermes Agent on May 29. The feature replaces the up-front dump of every tool schema with a system that loads only the tools the agent needs for a given turn. Separately, community members are documenting cases where skills detect stale instructions and patch themselves during a run. Together, these changes make agent capabilities dynamic: loaded on demand and versioned by reality.

Tool Search: From 41% to 3% Overhead

The previous design placed every tool's JSON schema into every turn's system prompt. With 50 tools registered, tool definitions consumed roughly 20,000 tokens before the agent started any work. For a typical session, tool schemas ate 41% of the context window.

Tool Search replaces the complete schema dump with three bridge tools that the agent calls when it needs specific capabilities:

tool_search - queries the tool registry by name or description
tool_select - loads a specific tool's full schema on demand
tool_clear - unloads schemas when no longer needed

Core tools (terminal, memory, file operations) remain always-loaded. Everything else is fetched when the agent determines the task requires it. The agent already knows its task domain from the user's request, so a request that needs a specific capability triggers a call such as tool_search("patch"), which returns the matching tools. The agent then calls tool_select to load the schema it wants, uses the tool, and either keeps the schema or releases it with tool_clear.
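The search, select, and clear flow can be sketched as a small in-memory registry. This is a minimal illustration under stated assumptions, not Hermes code: the ToolRegistry class, its fields, the prompt_schemas helper, and the example tool names web_fetch and image_gen are all hypothetical. Only the three bridge tool names and the idea of always-loaded core tools come from the announcement.

```python
from dataclasses import dataclass, field


@dataclass
class ToolRegistry:
    """Hypothetical registry; names and shapes are illustrative, not Hermes internals."""

    schemas: dict[str, dict]   # every registered tool: name -> full JSON schema
    core: set[str]             # tools that stay loaded on every turn
    loaded: set[str] = field(default_factory=set)

    def __post_init__(self) -> None:
        # Core tools are part of the active set from the start.
        self.loaded |= self.core

    def tool_search(self, query: str) -> list[str]:
        """Return names of tools whose name or description mentions the query."""
        q = query.lower()
        return sorted(
            name
            for name, schema in self.schemas.items()
            if q in name.lower() or q in schema.get("description", "").lower()
        )

    def tool_select(self, name: str) -> dict:
        """Load one tool's full schema into the active set and return it."""
        schema = self.schemas[name]
        self.loaded.add(name)
        return schema

    def tool_clear(self, name: str) -> None:
        """Unload a schema; core tools are never unloaded."""
        if name not in self.core:
            self.loaded.discard(name)

    def prompt_schemas(self) -> list[dict]:
        """Only the schemas that would be sent with the next turn."""
        return [self.schemas[n] for n in sorted(self.loaded)]


registry = ToolRegistry(
    schemas={
        "terminal": {"description": "Run a shell command"},
        "web_fetch": {"description": "Download a web page"},
        "image_gen": {"description": "Generate an image from a prompt"},
    },
    core={"terminal"},
)
matches = registry.tool_search("web")   # search by name or description
registry.tool_select(matches[0])        # load only the schema that is needed
```

In this sketch the prompt carries two schemas after the select call instead of three, and calling registry.tool_clear("web_fetch") returns it to the core set alone.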

Metric	Before	After
Tool schemas (50 tools)	~20,000 tokens	~1,500 tokens
Context overhead	41%	3%
Token savings per turn	n/a	~18,500 tokens (92.5% reduction)
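The figures in the table can be cross-checked with a few lines of arithmetic. The sketch below assumes the 41% figure is measured against a fixed context window and that the same window applies after the change; the window size is inferred from the table rather than stated in the announcement.

```python
before_tokens = 20_000   # schema tokens with 50 tools, from the table
after_tokens = 1_500     # schema tokens with Tool Search, from the table

saved = before_tokens - after_tokens          # tokens saved per turn
reduction = saved / before_tokens             # fraction of schema tokens removed

# Assumption: 41% overhead implies a window of about before_tokens / 0.41.
implied_window = before_tokens / 0.41
after_share = after_tokens / implied_window   # overhead after the change

print(saved, f"{reduction:.1%}", f"{after_share:.1%}")
# prints: 18500 92.5% 3.1%
```

Under that assumption the implied window is roughly 48,800 tokens, and the post-change overhead of about 3.1% agrees with the reported 3%.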

The feature shipped in the v0.15 release wave, which brought 1,302 commits, 747 merged PRs, and 321 contributors. Tool Search itself was highlighted by @InfomlyLab, who published before/after measurements showing context overhead dropping from 41% to 3% in a 16-tool scenario.

"The old design dumped every tool schema into the prompt on every turn. Fifty tools. Twenty thousand tokens. Before a single line of work. Three bridge tools replaced the entire schema dump: tool_search, tool_select, tool_clear."

@InfomlyLab, May 30, 2026

Skills That Patch Themselves

A separate discussion thread, started by @aelshimy_ and expanded by @Prashtwtz, described a pattern where skills detect their own staleness and self-correct during execution.

"Hermes doesn't need to 'know' a skill is stale in abstract. The signal comes when the agent loads it, hits a mismatch/failure, and patches the skill as part of the run. Curator handles the slower cleanup layer. Skills get versioned by reality."

@Prashtwtz, June 1, 2026

The sequence works as follows:

1. The agent loads a skill based on task context
2. The skill's instructions reference a command, path, or behavior that no longer exists
3. The agent runs the command, gets a failure or unexpected output
4. The agent rewrites the skill's instructions to match current reality
5. The skill persists with corrected instructions
6. Future runs use the corrected version

This is different from periodic audits or central version control. The skill's correctness is determined by execution, not by policy. If the command works, the skill is current. If it does not, the skill gets patched by the agent that discovered the mismatch.

The mechanism depends on two existing Hermes features: skills as persisted markdown files in ~/.hermes/skills/, and the agent's ability to write files with patch or write_file. When a skill fails, the agent has both the context to understand why and the tools to fix it.
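The load, fail, patch, persist sequence can be sketched in a few lines. This is an illustration under stated assumptions rather than how Hermes implements it: run_skill_with_self_patch, the run and find_fix callables, the deploy.md skill, and the oldcli/newcli commands are all hypothetical, and a temporary directory stands in for ~/.hermes/skills/.

```python
import tempfile
from pathlib import Path
from typing import Callable, Optional


def run_skill_with_self_patch(
    skill_file: Path,
    command: str,
    run: Callable[[str], bool],
    find_fix: Callable[[str], Optional[str]],
) -> str:
    """Run a command named in a skill; if it fails, rewrite the skill in place.

    `run` and `find_fix` are hypothetical stand-ins for the agent executing a
    command and working out a replacement. Returns the command that succeeded.
    """
    if run(command):
        return command                       # skill matches reality, nothing to patch
    fixed = find_fix(command)
    if fixed is None or not run(fixed):
        raise RuntimeError(f"no working replacement for {command!r}")
    # Persist the correction so future runs load the fixed instructions.
    text = skill_file.read_text(encoding="utf-8")
    skill_file.write_text(text.replace(command, fixed), encoding="utf-8")
    return fixed


# Demo: a temporary directory stands in for ~/.hermes/skills/.
skills_dir = Path(tempfile.mkdtemp())
skill = skills_dir / "deploy.md"
skill.write_text("# Deploy\n\nRun `oldcli deploy` to publish.\n", encoding="utf-8")

working = {"newcli deploy"}                  # pretend only the new command exists
used = run_skill_with_self_patch(
    skill,
    "oldcli deploy",
    run=lambda cmd: cmd in working,
    find_fix=lambda cmd: cmd.replace("oldcli", "newcli"),
)
```

After the call, the skill file on disk names the working command, so the next load of the same skill succeeds on the first attempt. A real agent would also need to guard against patching a skill in response to a transient failure, which this sketch does not attempt.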

@0xJeff noted a related point in a six-tip thread on Hermes optimization: "Tools and Skills matter more than the underlying model." The dynamic loading and self-healing patterns amplify this. If skills are load-on-demand and self-maintaining, the skill layer absorbs complexity that would otherwise live in the system prompt or require manual upkeep.

Skill Bundles: Load a Workflow in One Command

The v0.15 release also introduced skill bundles. A bundle is a named group of skills that load together with a single slash command. A user can create a "writing" bundle with humanizer, design-md, and excalidraw, then activate the entire workflow with /writing.

Bundles are configured in config.yaml:

skill_bundles:
  writing:
    - humanizer
    - design-md
    - excalidraw
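As a rough illustration of how a slash command could resolve to a bundle, the sketch below mirrors the configuration above as a plain Python dict. The skills_for_command function and its lookup rules are assumptions made for the example; the actual resolution logic in Hermes is not described in the release notes.

```python
# The config above, written as a dict so the example needs no YAML parser.
config = {
    "skill_bundles": {
        "writing": ["humanizer", "design-md", "excalidraw"],
    }
}


def skills_for_command(command: str, config: dict) -> list[str]:
    """Map a slash command such as "/writing" to its bundle's skill names.

    Hypothetical helper: returns an empty list for input that is not a
    slash command or that names no configured bundle.
    """
    if not command.startswith("/"):
        return []
    bundle = command[1:].strip()
    return list(config.get("skill_bundles", {}).get(bundle, []))


writing_skills = skills_for_command("/writing", config)
```

Here writing_skills holds the three skill names in the order they appear in the config, and an unknown command such as /unknown resolves to an empty list rather than raising an error.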

When combined with Tool Search, bundles give the agent a fast-start capability: load a targeted skill set for a known task without the overhead of loading every registered skill's instructions.

What Changed

The combination of Tool Search, self-healing skills, and skill bundles changes how agent capabilities are defined and maintained. Before, tool schemas were loaded eagerly and skills were updated manually, either through curated releases or by the user running hermes skills update. Now, tools and skills load only when relevant, and skills patch themselves when they encounter real-world drift.

These mechanisms, together with the Curator's slower cleanup pass, operate on different timelines:

Mechanism	Triggers	Frequency
Tool Search	Agent identifies needed tool	Per-turn, as needed
Skill self-patch	Command or behavior mismatch	On failure, during run
Curator cleanup	Scheduled or manual sweep	Hours to days
Skill bundles	User slash command	On session start

Tool Search handles the immediate problem of token bloat by making tool schemas conditional. Self-patching handles the slower problem of documentation drift by correcting a skill at the moment a run exposes a mismatch. Bundles handle the organizational problem of grouping related capabilities. None of them requires the user to remember what changed in the last release.

[^1]: Nous Research. "Hermes Agent now has Tool Search, so your agent only loads what it needs." X. May 29, 2026.
[^2]: Infomly Lab. "Hermes Agent killed the per-turn token tax." X. May 30, 2026.
[^3]: Prashanth. "Hermes doesn't need to 'know' a skill is stale in abstract." X. June 1, 2026.
[^4]: 0xJeff. "Tools and Skills matter more than the underlying model." X. June 1, 2026.