Lesson 13 | Knowledge Base: RAG & Long-Term Document Management

20 MIN READ | UPDATED: 2026-05-07

title: "Lesson 13 | Knowledge Base: RAG & Long-Term Document Management"
summary: "Integrate Hermes Agent with vector databases and local document indexing to build an AI-driven knowledge management system for individuals or teams. Dive deep into the practical application of Retrieval-Augmented Generation."
sortOrder: 130
status: "published"

🎯 Learning Objectives

  • Gain a deep understanding of the core principles of RAG (Retrieval-Augmented Generation) and its crucial role in overcoming the knowledge limitations of Large Language Models (LLMs).
  • Master how to configure and use "Context Files" in Hermes Agent to achieve simple and direct knowledge injection.
  • Learn to leverage Hermes Agent's built-in capabilities to integrate with the local vector database ChromaDB, building, managing, and querying scalable long-term knowledge bases.
  • Through hands-on practice, enhance Hermes Agent's ability to process domain-specific knowledge, reduce "hallucinations," and provide more accurate and professional answers.

📖 Core Concepts Explained

13.1 RAG (Retrieval-Augmented Generation) Overview

Large Language Models (LLMs) excel at generating text, but they are not without limitations. Their main challenges include:

  1. Knowledge Cutoff: An LLM's knowledge is limited to the time point of its training data and cannot access the latest information.
  2. Factual Errors and "Hallucinations": LLMs sometimes generate information that sounds plausible but is actually fabricated.
  3. Lack of Domain-Specific Knowledge: LLMs often lack knowledge specific to particular industries, internal company information, or personal professional domains.
  4. Poor Explainability: Answers generated by LLMs usually cannot be traced back to specific sources, making it difficult to verify their authenticity.

To address these issues, RAG (Retrieval-Augmented Generation) emerged. The core idea of RAG is that before an LLM generates a response, it first retrieves relevant information from an external, trustworthy knowledge base. This retrieved information is then provided to the LLM as additional context, thereby enhancing its generation capabilities.

The basic workflow of RAG can be summarized in the following steps:

  1. User Query: The user poses a question or instruction to the Agent.
  2. Retrieval: The Agent analyzes the user query and uses it to retrieve the most relevant document chunks from an external knowledge base. This knowledge base can be a structured database, an unstructured document collection, or even a vector database.
  3. Augmentation: The retrieved document chunks are combined with the original user query to form an augmented Prompt that includes background knowledge.
  4. Generation: The augmented Prompt is sent to the LLM, which then generates the final response based on this rich contextual information.
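The four steps above can be sketched in a few lines of Python. This is a minimal illustration, not Hermes Agent's implementation: the sample chunks, the keyword-overlap scoring, and the `generate` placeholder (which stands in for a real LLM call) are all assumptions made for the example.

```python
import re

# Toy knowledge base (assumed sample data); real systems store chunks of your documents.
KNOWLEDGE_BASE = [
    "RAG retrieves relevant text from an external knowledge base before generation.",
    "ChromaDB is an open-source vector database.",
    "Cron scheduling runs tasks at fixed times.",
]


def tokenize(text: str) -> set[str]:
    """Lowercase the text and return its set of word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query: str, chunks: list[str], top_k: int = 2) -> list[str]:
    """Step 2: rank chunks by the number of words they share with the query."""
    query_words = tokenize(query)
    scored = [(len(query_words & tokenize(chunk)), chunk) for chunk in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    # Keep at most top_k chunks and drop chunks with no overlap at all.
    return [chunk for score, chunk in scored[:top_k] if score > 0]


def augment(query: str, context: list[str]) -> str:
    """Step 3: combine the retrieved chunks and the query into one prompt."""
    context_block = "\n".join(f"- {chunk}" for chunk in context)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context_block}\n\nQuestion: {query}"
    )


def generate(prompt: str) -> str:
    """Step 4: placeholder for the LLM call; it only reports the prompt size."""
    return f"[LLM would answer here; prompt has {len(prompt)} characters]"


query = "What is a vector database?"  # Step 1: the user query
prompt = augment(query, retrieve(query, KNOWLEDGE_BASE))
print(prompt)
print(generate(prompt))
```

Keyword overlap is used here only to keep the sketch dependency-free; production RAG systems usually rank chunks by embedding similarity instead.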

Through RAG, Hermes Agent can:

  • Access Latest Information: As long as the knowledge base is kept up to date, the Agent can access the latest data.
  • Improve Factual Accuracy: Reduce LLM "hallucinations" by providing reliable source information.
  • Handle Domain-Specific Questions: Incorporate internal company documents, professional manuals, etc., into the knowledge base, making the Agent a domain expert.
  • Enhance Explainability: The Agent can cite its retrieved source documents in its answers, increasing transparency.
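One way source citation can work is to label each retrieved chunk with its origin before it reaches the LLM. The sketch below assumes chunks arrive as `(source, text)` pairs; the function name `build_cited_prompt` and the prompt wording are hypothetical, not part of Hermes Agent.

```python
def build_cited_prompt(query: str, chunks: list[tuple[str, str]]) -> str:
    """Number each retrieved chunk and tag it with its source file."""
    lines = [
        f"[{index}] ({source}) {text}"
        for index, (source, text) in enumerate(chunks, start=1)
    ]
    instructions = "Answer the question and cite the sources you used as [n]."
    return f"{instructions}\n\nSources:\n" + "\n".join(lines) + f"\n\nQuestion: {query}"


cited_prompt = build_cited_prompt(
    "What is RAG?",
    [("doc_rag_intro.txt", "RAG combines information retrieval and text generation.")],
)
print(cited_prompt)
```

Because every chunk carries a number and a file name, a reader can trace a citation such as [1] in the answer back to the document it came from.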

Hermes Agent is designed to integrate with external knowledge sources for RAG. It supports both simple file-based context injection and integration with vector databases, which together lay the foundation for building knowledge management systems.

Below is a simplified Mermaid flowchart of the RAG process:

graph TD
    A[User Query] --> B{Agent Receives Query};
    B --> C{Query Analysis};
    C --> D[Retrieve Relevant Document Chunks];
    D --> E{"External Knowledge Base<br/>(Local Files / Vector Database)"};
    E --> F[Retrieval Results];
    F --> G{Construct Augmented Prompt};
    G --> H[Send to LLM];
    H --> I[LLM Generates Response];
    I --> J[Agent Returns Final Answer];

13.2 Hermes Agent's Built-in RAG Mechanisms: Context Files and Knowledge Base

Hermes Agent provides two main built-in mechanisms to implement RAG: context_files and knowledge_base. They are suitable for different scenarios and requirements.

13.2.1 Context Files: Direct Prompt Injection

context_files is a lightweight RAG method for cases where a small amount of critical information must be injected directly into the LLM's Prompt on every call. When you add file paths to the context_files configuration, Hermes Agent reads those files and places their content before or after the user's message in the Prompt, as additional context for each interaction with the LLM.

Working Principle:

  1. File Loading: When Hermes Agent starts or its configuration is updated, it loads all files specified in context_files.
  2. Content Injection: For each generation request, the Agent concatenates the content of these files (in full, or in part when file size or the LLM's context window requires it) directly into the user's input Prompt.
  3. LLM Processing: The LLM receives a complete Prompt containing the user query and additional file content, and generates a response based on this information.
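The loading and concatenation steps above can be approximated as follows. This is a sketch under stated assumptions: the `build_prompt` function, the `--- filename ---` separators, and the character budget `max_chars` (a crude stand-in for a token limit) are hypothetical and do not describe Hermes Agent's actual prompt layout.

```python
import tempfile
from pathlib import Path


def build_prompt(user_message: str, context_paths: list[Path], max_chars: int = 4000) -> str:
    """Prepend file contents to the user message, truncated to a character budget."""
    sections = []
    remaining = max_chars
    for path in context_paths:
        if remaining <= 0:
            break  # budget exhausted: later files are skipped entirely
        text = path.read_text(encoding="utf-8")[:remaining]
        remaining -= len(text)
        sections.append(f"--- {path.name} ---\n{text}")
    context = "\n\n".join(sections)
    return f"{context}\n\nUser: {user_message}"


# Demo with a throwaway file so the example runs anywhere.
with tempfile.TemporaryDirectory() as tmp:
    overview = Path(tmp) / "project_overview.md"
    overview.write_text("# Project Name: Hermes Agent Tutorial Series\n", encoding="utf-8")
    demo_prompt = build_prompt("What is the project called?", [overview])

print(demo_prompt)
```

The budget check makes the main limitation of this approach visible: once the files exceed what fits in the context window, content is silently cut off.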

Applicable Scenarios:

  • Providing fixed, general background information, such as company bylaws, product overviews, or personal resumes.
  • When the LLM needs to consistently remember a small amount of critical data within a specific conversation.
  • Scenarios where real-time updates are not critical, and file content does not change frequently.

Advantages:

  • Simple configuration, no additional tools or databases required.
  • Direct and reliable effect for small files.

Disadvantages:

  • Limited by the LLM's context window size. If files are too large, they may not be fully loaded, or token costs may become excessive.
  • Not suitable for large-scale document collections, as all content needs to be loaded every time, leading to inefficiency.
  • Does not support advanced semantic search and retrieval; only simple text concatenation is performed.

Configuration Example:

Suppose you have a Markdown file named project_overview.md containing important information about your ongoing project.

<!-- project_overview.md -->
# Project Name: Hermes Agent Tutorial Series

## Project Goals
*   Provide comprehensive and in-depth learning resources for Hermes Agent users.
*   Cover all core functionalities from basic installation to advanced applications.
*   Help users quickly get started and efficiently utilize Hermes Agent through practical demonstrations.

## Key Milestones
*   Phase One: Basic Features and Core Concepts (Completed)
*   Phase Two: Advanced Features and Ecosystem Integration (In Progress)
*   Phase Three: Production Deployment and Future Outlook (Planned)

## Team Members
*   Technical Content Author: [Your Name/Team]
*   Review and Proofreading: [Review Team]

You can configure it as a context file using the hermes config set command:

hermes config set context_files project_overview.md

Or directly edit the config.yaml file:

# ~/.config/hermes-agent/config.yaml (partial content)
# ...
context_files:
  - project_overview.md
# ...

Once configured, Hermes Agent will include the content of project_overview.md in the Prompt during every interaction with the LLM.

13.2.2 Knowledge Base and Vector Database: Scalable RAG

For large-scale, multi-format document collections requiring efficient retrieval, Hermes Agent offers the knowledge_base feature integrated with vector databases. Currently, Hermes Agent natively supports ChromaDB, a lightweight, easy-to-deploy open-source vector database.

Working Principle:

  1. Document Ingestion: Users add documents to the knowledge base using the hermes docs add command. Hermes Agent will:
    • Read the document content.
    • Chunk the document into smaller, manageable fragments.
    • Use an embedding model to convert each text chunk into a high-dimensional vector (embedding).
    • Store these vectors along with their original text and metadata in the configured vector database (e.g., ChromaDB).
  2. Retrieval: When a user asks the Agent a question, the Agent also converts the user query into a vector. It then performs a similarity search in the vector database to find vectors of document chunks most similar to the query vector.
  3. Augmentation & Generation: The most relevant retrieved document chunks are extracted and provided as context information to the LLM, which then generates the final response based on this information.
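To make the ingestion and similarity-search steps concrete, here is a dependency-free sketch. It is not ChromaDB and not Hermes Agent's code: `ToyVectorStore` is a hypothetical class, and the word-count "embedding" is a deliberately simple stand-in for a learned embedding model.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: a word-count vector. Real systems use a learned embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine of the angle between two sparse word-count vectors."""
    dot = sum(a[word] * b[word] for word in a.keys() & b.keys())
    norm_a = math.sqrt(sum(value * value for value in a.values()))
    norm_b = math.sqrt(sum(value * value for value in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class ToyVectorStore:
    """Stores (vector, text, metadata) records and ranks them against a query."""

    def __init__(self) -> None:
        self.records: list[tuple[Counter, str, dict]] = []

    def add(self, text: str, metadata: dict) -> None:
        # Ingestion: embed the chunk and keep the original text and metadata with it.
        self.records.append((embed(text), text, metadata))

    def query(self, question: str, top_k: int = 1) -> list[tuple[str, dict]]:
        # Retrieval: embed the question and return the most similar chunks.
        question_vector = embed(question)
        ranked = sorted(
            self.records,
            key=lambda record: cosine_similarity(question_vector, record[0]),
            reverse=True,
        )
        return [(text, metadata) for _, text, metadata in ranked[:top_k]]


store = ToyVectorStore()
store.add("Set DATABASE_URL in the .env file to configure the database.",
          {"source": "doc_project_setup.md"})
store.add("RAG combines information retrieval and text generation.",
          {"source": "doc_rag_intro.txt"})
print(store.query("How do I configure the database?"))
```

A real embedding model also matches paraphrases that share no words with the query, which word counts cannot do; the storage and ranking structure, however, stays the same.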

Applicable Scenarios:

  • Managing large volumes of internal company documents, technical manuals, research reports, meeting minutes, etc.
  • Scenarios requiring complex, semantic queries.
  • When the Agent needs access to the latest and continuously updated knowledge.
  • Applications with high demands for accuracy and traceability.

Advantages:

  • Scalability: Can handle massive amounts of documents, not limited by the LLM's context window.
  • Efficient Retrieval: Vector search can quickly find semantically most relevant document fragments.
  • Supports Multiple Document Formats: Typically can process various formats like PDF, Markdown, TXT, DOCX, etc.
  • Reduces Hallucinations: Significantly lowers the risk of LLMs generating hallucinations by providing clear evidence sources.

Disadvantages:

  • Requires an additional vector database component (though ChromaDB is easy to deploy).
  • The document ingestion process requires computational resources and time (for chunking and embedding).
  • Requires careful selection of embedding models and chunking strategies.
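Chunking strategy is one of the choices mentioned above. A common baseline is fixed-size chunks with overlap, so that a sentence cut at a boundary still appears whole in a neighboring chunk. The function below is an illustrative sketch; its name, its defaults, and its use of characters rather than tokens are assumptions, and the source does not state which strategy Hermes Agent uses.

```python
def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into chunks of at most chunk_size characters that overlap by `overlap`."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the window already reached the end of the text
    return chunks


print(chunk_text("abcdefghij", chunk_size=4, overlap=2))
```

Larger chunks keep more context together but make retrieval less precise; more overlap reduces the chance of splitting a key sentence at the cost of storing more vectors.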

Configuration Example:

First, you need to install the ChromaDB client library:

pip install chromadb

Then, configure Hermes Agent to use ChromaDB as the knowledge base provider and specify the storage path for the knowledge base:

hermes config set knowledge_base_provider chromadb
hermes config set knowledge_base_path ./my_hermes_kb

Here, ./my_hermes_kb is a local directory where ChromaDB will store its data.

Managing Knowledge Base Documents:

  • Add Documents:
    hermes docs add my_document.pdf
    hermes docs add ./my_project_docs/
    
    You can add single files or entire directories. Hermes Agent will automatically handle document loading, chunking, and embedding.
  • List Added Documents:
    hermes docs list
    
    This will display all indexed documents in the knowledge base along with their metadata.
  • Remove Documents:
    hermes docs remove my_document.pdf
    
    Removes the specified document and all its associated vector embeddings.
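Conceptually, the add, list, and remove operations maintain a mapping from each document to its chunks, so that removing a document also drops everything derived from it. The class below is a hypothetical in-memory model of that bookkeeping, written only to illustrate the idea; it is not how `hermes docs` or ChromaDB store data.

```python
import hashlib


class DocumentIndex:
    """Toy index that maps each document path to an ID and its chunks."""

    def __init__(self, chunk_size: int = 200) -> None:
        self.chunk_size = chunk_size
        self.documents: dict[str, dict] = {}

    def add(self, path: str, text: str) -> str:
        """Chunk the text and store it under the path; re-adding replaces old chunks."""
        doc_id = hashlib.sha256(path.encode("utf-8")).hexdigest()[:8]
        chunks = [text[i:i + self.chunk_size] for i in range(0, len(text), self.chunk_size)]
        self.documents[path] = {"id": doc_id, "chunks": chunks}
        return doc_id

    def list_documents(self) -> list[tuple[str, str, int]]:
        """Return (id, path, chunk count) for every indexed document."""
        return [(doc["id"], path, len(doc["chunks"])) for path, doc in self.documents.items()]

    def remove(self, path: str) -> bool:
        """Delete the document and all of its chunks; report whether it existed."""
        return self.documents.pop(path, None) is not None


index = DocumentIndex(chunk_size=200)
index.add("my_document.pdf", "x" * 450)  # 450 characters -> 3 chunks
print(index.list_documents())
print(index.remove("my_document.pdf"))
```

Keying the index by path is what makes removal clean: there is exactly one place to look up every chunk that belongs to a document.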

How the Agent Uses the Knowledge Base:

Once knowledge_base_provider is configured and documents are added, Hermes Agent decides for each user query whether to retrieve information from the knowledge base. When it does, it runs the retrieval step automatically and passes the relevant document chunks to the LLM as context. You do not need to instruct the Agent to "query the knowledge base" in the Prompt; retrieval is triggered by its internal logic and your configuration.
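The source does not describe the Agent's internal decision logic. One common heuristic in RAG systems, shown here purely as an assumption, is to inject only chunks whose similarity score clears a threshold, and to skip augmentation when none do. The function name and the default values are hypothetical.

```python
def select_context(
    scored_chunks: list[tuple[float, str]],
    min_score: float = 0.3,
    top_k: int = 3,
) -> list[str]:
    """Keep the best-scoring chunks above min_score; an empty result means 'skip retrieval'."""
    relevant = [(score, text) for score, text in scored_chunks if score >= min_score]
    relevant.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in relevant[:top_k]]


candidates = [(0.1, "unrelated chunk"), (0.8, "best match"), (0.5, "partial match")]
print(select_context(candidates))
```

With this kind of gate, a question unrelated to the knowledge base yields no context, and the LLM answers from its own knowledge instead of being distracted by weak matches.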

By combining the directness of context_files with the scalability of knowledge_base, Hermes Agent provides you with a powerful and flexible knowledge management system capable of effectively addressing various information retrieval and generation challenges.


💻 Hands-on Demonstration

In this section, we will demonstrate how to build and manage knowledge in Hermes Agent using context_files and knowledge_base through two practical scenarios.

Scenario One: Enhancing Agent's Perception of Specific Information using Context Files

In this scenario, we will create a Markdown file about the Hermes Agent tutorial series plan and configure it as context_files. Then, we will verify if the Agent can use this information to answer questions.

Steps:

  1. Create Context File: In your Hermes Agent working directory (or any convenient location), create a file named tutorial_plan.md and populate it with the following content:

    # Hermes Agent Tutorial Series Plan
    
    ## Phase One: Basic Introduction (Completed)
    *   Lesson 01 | Introduction to Hermes Agent: Architecture Overview & Rapid Installation
    *   Lesson 02 | Model Switching & Provider Configuration in Practice
    *   Lesson 03 | Deep Dive into the Skills System: Enabling Agent Self-Evolution
    *   Lesson 04 | Memory & User Persona: Persistent Memory Across Sessions
    *   Lesson 05 | Message Gateway: Chat Anytime via Telegram/Discord
    
    ## Phase Two: Advanced Features & Ecosystem Integration (In Progress)
    *   Lesson 06 | MCP Tool Ecosystem: A Bridge to the External World
    *   Lesson 07 | Cron Scheduling & Automated Task Orchestration
    *   Lesson 08 | Context Files & Workspace Awareness
    *   Lesson 09 | Multi-Turn Conversations & Complex Task Decomposition
    *   Lesson 10 | Hermes Agent's Security Model & Access Control
    *   Lesson 11 | Custom Persona: Building Domain-Specific Agents
    *   Lesson 12 | Agent Collaboration: Multi-Agent Communication & Task Delegation
    *   Lesson 13 | Knowledge Base: RAG & Long-Term Document Management (This Lesson)
    
    ## Phase Three: Production Deployment & Future Outlook (Planned)
    *   Lesson 18 | Production Deployment: From $5 VPS to GPU Clusters
    *   Lesson 19 | Community Ecosystem & Skill Marketplace Contribution Guide
    *   Lesson 20 | Future Outlook: From Personal Assistant to Autonomous Organization
    
  2. Configure Hermes Agent to Use Context Files: Use the hermes config set command to add tutorial_plan.md to the context_files list. If not configured before, it will create a new list. If other files already exist, it will append to the list.

    hermes config set context_files tutorial_plan.md
    

    (If you want to add multiple files, you can separate them with commas, e.g., hermes config set context_files file1.md,file2.txt)

  3. Interact with the Agent and Verify: Start Hermes Agent and ask questions about the tutorial series plan.

    hermes
    

    User Input:

    Tell me which lessons are included in Phase Two of the Hermes Agent tutorial series?
    

    Expected Output (Example):

    Agent: Phase Two of the Hermes Agent tutorial series (Advanced Features & Ecosystem Integration) includes the following lessons:
    
    *   Lesson 06 | MCP Tool Ecosystem: A Bridge to the External World
    *   Lesson 07 | Cron Scheduling & Automated Task Orchestration
    *   Lesson 08 | Context Files & Workspace Awareness
    *   Lesson 09 | Multi-Turn Conversations & Complex Task Decomposition
    *   Lesson 10 | Hermes Agent's Security Model & Access Control
    *   Lesson 11 | Custom Persona: Building Domain-Specific Agents
    *   Lesson 12 | Agent Collaboration: Multi-Agent Communication & Task Delegation
    *   Lesson 13 | Knowledge Base: RAG & Long-Term Document Management
    
    We are currently learning Lesson 13!
    

    Analysis: The Agent was able to accurately extract information from the tutorial_plan.md file and answer the question, indicating that context_files has successfully taken effect.

Scenario Two: Building and Querying a ChromaDB Knowledge Base

This scenario will demonstrate how to configure Hermes Agent to use ChromaDB as a knowledge base, add multiple documents for indexing, and then query them through the Agent.

Steps:

  1. Install ChromaDB Client Library: Ensure chromadb is installed in your Python environment.

    pip install chromadb
    
  2. Configure Hermes Agent to Use ChromaDB Knowledge Base: We will specify chromadb as the knowledge base provider and set a local directory to store ChromaDB's data.

    hermes config set knowledge_base_provider chromadb
    hermes config set knowledge_base_path ./hermes_knowledge_base
    

    This will create a folder named hermes_knowledge_base in the current directory, used to store ChromaDB's data.

  3. Create Sample Documents: Create several documents on different topics to simulate a small knowledge base.

    • Create doc_hermes_features.md:
      # Hermes Agent Core Features
      
      Hermes Agent is a self-evolving AI agent from NousResearch, possessing the following core features:
      *   **Built-in Learning Loop**: Creates and optimizes Skills from experience.
      *   **Cross-Session Memory**: Achieves persistent memory and user personas through the Memory system.
      *   **Multi-Model Support**: Supports 200+ LLM models via OpenRouter.
      *   **Message Gateway**: Supports Telegram/Discord for chatting anytime, anywhere.
      *   **MCP Integration**: Connects to external tools and systems via MCP (Model Context Protocol).
      *   **Cron Scheduling**: Supports scheduled tasks and automation.
      
    • Create doc_rag_intro.txt:
      RAG (Retrieval-Augmented Generation) is a technique that combines information retrieval and text generation. Its primary goal is to enhance the generative capabilities of Large Language Models (LLMs) by retrieving relevant information from an external knowledge base, thereby addressing issues such as LLM knowledge cutoffs, hallucinations, and lack of domain-specific knowledge. The core RAG process includes querying, retrieval, augmentation, and generation.
      
    • Create doc_project_setup.md:
      # Project Startup Guide
      
      ## 1. Environment Setup
      *   Install Python 3.9+
      *   Clone the project repository: `git clone https://github.com/your-org/your-project.git`
      *   Install dependencies: `pip install -r requirements.txt`
      
      ## 2. Configure Database
      *   Create a `.env` file
      *   Set `DATABASE_URL=postgresql://user:password@host:port/dbname`
      
      ## 3. Run Application
      *   Start the backend service: `python app.py`
      *   Start the frontend: `npm start`
      
  4. Add Documents to Hermes Agent Knowledge Base: Use the hermes docs add command to add these documents to the ChromaDB knowledge base.

    hermes docs add doc_hermes_features.md
    hermes docs add doc_rag_intro.txt
    hermes docs add doc_project_setup.md
    

    You will see output similar to "Processing document..." and "Document added successfully."

  5. View Indexed Documents (Optional): You can list all documents currently indexed in the knowledge base.

    hermes docs list
    

    Expected Output (Example):

    Indexed Documents:
    - ID: 0123...4567, Path: doc_hermes_features.md, Chunks: 2
    - ID: 89ab...cdef, Path: doc_rag_intro.txt, Chunks: 1
    - ID: fedc...ba98, Path: doc_project_setup.md, Chunks: 3
    

    This indicates that the documents have been successfully chunked and indexed.

  6. Interact with the Agent and Query the Knowledge Base: Start Hermes Agent and ask cross-document or domain-specific questions.

    hermes
    

    User Input:

    What is RAG? What are the core features of Hermes Agent?
    

    Expected Output (Example):

    Agent: RAG (Retrieval-Augmented Generation) is a technique that combines information retrieval and text generation, aiming to enhance the generative capabilities of Large Language Models (LLMs) by retrieving relevant information from an external knowledge base. This addresses issues such as LLM knowledge cutoffs, hallucinations, and lack of domain-specific knowledge. Its core process includes querying, retrieval, augmentation, and generation.
    
    Hermes Agent, as a self-evolving AI agent, possesses the following core features:
    *   Built-in Learning Loop: Creates and optimizes Skills from experience.
    *   Cross-Session Memory: Achieves persistent memory and user personas through the Memory system.
    *   Multi-Model Support: Supports 200+ LLM models via OpenRouter.
    *   Message Gateway: Supports Telegram/Discord for chatting anytime, anywhere.
    *   MCP Integration: Connects to external tools and systems via MCP.
    *   Cron Scheduling: Supports scheduled tasks and automation.
    

    Analysis: The Agent successfully retrieved relevant information from both doc_rag_intro.txt and doc_hermes_features.md and integrated it into the answer. This demonstrates Hermes Agent's powerful capability to implement RAG with ChromaDB.

    User Input:

    How should I configure the project database?
    

    Expected Output (Example):

    Agent: According to the Project Startup Guide, the steps to configure the database are as follows:
    1.  Create a `.env` file.
    2.  In the `.env` file, set `DATABASE_URL=postgresql://user:password@host:port/dbname`.
    

    Analysis: The Agent accurately retrieved the specific steps for configuring the database from doc_project_setup.md.

Through these two hands-on demonstrations, you should now have a grasp of Hermes Agent's two main knowledge base construction methods and be able to choose the appropriate strategy to enhance your AI agent based on your actual needs.


🔧 Commands and Tools Involved

| Command/Tool | Description | Example |
| --- | --- | --- |
| hermes config set | Used to set various configuration parameters for Hermes Agent, including context_files, knowledge_base_provider, and knowledge_base_path. | hermes config set context_files my_doc.md<br>hermes config set knowledge_base_provider chromadb |
| hermes docs add | Adds local files or directories to Hermes Agent's knowledge base for chunking, embedding, and indexing. | hermes docs add project_notes.txt<br>hermes docs add ./manuals/ |
| hermes docs list | Lists all documents currently indexed in the knowledge base. | hermes docs list |
| hermes docs remove | Removes the specified document and all its associated embeddings from the knowledge base. | hermes docs remove old_policy.pdf |
| hermes | Launches Hermes Agent's CLI interactive interface to converse with the Agent. The Agent will automatically utilize the knowledge base for RAG based on its configuration. | hermes |
| pip install chromadb | Python package manager command used to install the ChromaDB client library, which is the vector database dependency for Hermes Agent's knowledge base feature. | pip install chromadb |
| Markdown / TXT / PDF | Various document formats. Hermes Agent's knowledge_base supports ingestion and indexing of multiple document formats. | |
| .env file | Environment variable file. Although not a direct Hermes Agent command, it is commonly used when configuring databases or other external services, and is presented here as an example within doc_project_setup.md. | |

📝 Key Takeaways from This Lesson

  • RAG Core Value: RAG (Retrieval-Augmented Generation) is a key technology for addressing LLM knowledge cutoffs, hallucinations, and lack of domain-specific knowledge, enhancing LLM generation capabilities through external knowledge retrieval.
  • Context Files: Hermes Agent provides the context_files configuration, allowing a small amount of critical document content to be directly injected into the LLM's Prompt, suitable for fixed and infrequently updated background information.
  • Knowledge Base (ChromaDB): For large-scale, multi-format documents requiring efficient retrieval, Hermes Agent integrates ChromaDB as a vector database to achieve scalable long-term knowledge base management.
  • Knowledge Base Management: The hermes docs add command allows for easy ingestion of files or directories into the ChromaDB knowledge base, while hermes docs list and hermes docs remove are used to manage indexed documents.
  • Intelligent Retrieval and Generation: Once the knowledge base is configured and populated, Hermes Agent intelligently determines and automatically utilizes these knowledge sources for retrieval during conversations, providing relevant information to the LLM to generate more accurate and professional answers.
