Start RAG-Enabled Chat (Chat with Dataset)

Overview

Flow ID: rag-enabled-chat
Category: Chat Interactions
Estimated Duration: Setup < 15 seconds
User Role: All Users
Complexity: Moderate

Purpose: “Talk to your Data.” Enables Retrieval-Augmented Generation (RAG) by connecting a specific Dataset (Corpus) to the chat. This forces the AI to look up “Trusted Answers” and source material from your uploaded documents before generating a response, greatly reducing hallucinations.


Trigger

What initiates this flow:

  • User manually initiates

Specific trigger: Selecting a Dataset from the “Context Source” or “Active Corpus” dropdown in the chat header.


Prerequisites

Before starting, users must have:

  • At least one Active Dataset (processed and embedded)
  • A Chat Model that supports context injection (most do)

User Intent Analysis

Primary Intent

Get factual answers based strictly (or primarily) on internal company documents, manuals, or specific knowledge bases, rather than the AI’s general training data.

Secondary Intents

  • Finding source citations (“Where does it say this?”)
  • Comparing documents
  • Summarizing specific internal files

Step-by-Step Flow

Main Path (Happy Path)

Step 1: Create New Chat

  • User Action: Click New Chat.
  • System Response: Empty chat window.

Step 2: Select Dataset

  • User Action: Locate “Active Dataset” dropdown (usually says “None” or “General Chat”).
  • User Action: Select target corpus (e.g., “Q4 Financials” or “HR Policy”).
  • System Response:
    • UI indicator changes (e.g., “Context: HR Policy”).
    • System Prompt may update behind the scenes to “Answer using the provided context…”.
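Behind the scenes, selecting a dataset typically swaps in a stricter system prompt. A minimal sketch, assuming a `buildSystemPrompt` helper and dataset shape that are illustrative only, not the actual `rag-engine.js` API:

```javascript
// Illustrative only: how selecting a dataset might rewrite the system prompt.
function buildSystemPrompt(dataset) {
  if (!dataset) {
    return "You are a helpful assistant.";
  }
  return [
    "You are a helpful assistant.",
    `Answer ONLY using the provided context from the "${dataset.name}" dataset.`,
    "If the context does not contain the answer, say you cannot find it.",
  ].join("\n");
}

// Example: switching from general chat to the HR Policy corpus.
const generalPrompt = buildSystemPrompt(null);
const ragPrompt = buildSystemPrompt({ name: "HR Policy" });
```

The key design point is that the prompt switch is automatic: the user only picks a corpus, and the constraint on the model changes with it.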

Step 3: Ask Question

  • User Action: Ask specific question (e.g., “What is the holiday rollover policy?”).
  • System Response:
    1. Retrieval: The system searches the vector database for chunks matching the query.
    2. Injection: The top-ranked chunks are inserted into the prompt.
    3. Generation: The AI answers based on those chunks.
  • Visual Cues: A “Searching dataset…” indicator may appear briefly.
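The retrieve → inject → generate stages can be sketched end to end. The in-memory index and naive term-overlap scoring below are toy stand-ins for the real vector database and embedding similarity:

```javascript
// Toy stand-in for an embedded corpus; a real system stores vectors, not text.
const index = [
  { file: "HR_Manual.pdf", text: "Unused holiday days roll over until March 31." },
  { file: "HR_Manual.pdf", text: "Sick leave requires a doctor's note after 3 days." },
  { file: "Q4_Financials.pdf", text: "Q4 revenue grew 12% year over year." },
];

// 1. Retrieval: score each chunk by term overlap with the query (toy scoring).
function retrieve(query, topK = 2) {
  const terms = query.toLowerCase().split(/\W+/).filter(Boolean);
  return index
    .map(chunk => ({
      ...chunk,
      score: terms.filter(t => chunk.text.toLowerCase().includes(t)).length,
    }))
    .filter(c => c.score > 0)
    .sort((a, b) => b.score - a.score)
    .slice(0, topK);
}

// 2. Injection: paste the top chunks into the prompt sent to the model.
function buildPrompt(query, chunks) {
  const context = chunks
    .map((c, i) => `[${i + 1}] (${c.file}) ${c.text}`)
    .join("\n");
  return `Context:\n${context}\n\nQuestion: ${query}\nAnswer using only the context above.`;
}

// 3. Generation happens inside the model; here we only assemble its input.
const hits = retrieve("What is the holiday rollover policy?");
const prompt = buildPrompt("What is the holiday rollover policy?", hits);
```

The essential property is that generation never sees the whole corpus, only the few chunks retrieval selected, which is why retrieval quality dominates answer quality.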

Step 4: Verify Sources

  • User Action: Hover over citations [1] or check “Sources” dropdown.
  • System Response: Application shows filename and specific text chunk used.
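A sketch of how retrieved chunks might be mapped to the citation pills the UI shows; the chunk shape and the `formatCitations` name are assumptions, not the actual `context-selector.js` API:

```javascript
// Illustrative: turn retrieved chunks into citation entries for the UI.
function formatCitations(chunks) {
  return chunks.map((c, i) => ({
    label: `[${i + 1}]`,          // the inline marker shown in the answer
    file: c.file,                 // filename shown on hover / in "Sources"
    preview: c.text.slice(0, 60), // the specific text chunk that was used
  }));
}

const citations = formatCitations([
  { file: "HR_Manual.pdf", text: "Unused holiday days roll over until March 31." },
]);
```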

Error States & Recovery

Error 1: No Relevant Content Found

Cause: The query has no semantic match in the dataset.
User Experience: The AI says “I cannot find any information about that in the dataset.” (This refusal is the desired behavior; the alternative would be a hallucinated answer.)
Recovery: Rephrase the query with different keywords, or check that the correct dataset is active.
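One common way to produce this refusal is a minimum-relevance threshold on retrieval scores. A hedged sketch, where the `MIN_SCORE` value and the 0–1 score scale are assumptions:

```javascript
// Illustrative guard: refuse rather than let the model guess when the best
// retrieval score is below a relevance threshold.
const MIN_SCORE = 0.35; // assumed threshold on a 0–1 similarity scale

function answerOrRefuse(hits) {
  const best = hits.length ? Math.max(...hits.map(h => h.score)) : 0;
  if (best < MIN_SCORE) {
    return "I cannot find any information about that in the dataset.";
  }
  return null; // proceed to generation with the retrieved chunks
}
```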

Error 2: Dataset Not Active/Ready

Cause: The dataset is still processing, or embedding failed.
User Experience: The dataset is greyed out in the dropdown, or an “Index not ready” message is shown.
Recovery: Go to Monitor Job Progress to confirm processing has finished.


Pain Points & Friction

  1. “Why didn’t it find it?”: Semantic search isn’t magic; keyword mismatches happen.
    • Mitigation: Educate users on hybrid search or better keyword choices.
  2. Context Window Limit: If too many relevant chunks are retrieved, they may exceed the model’s token limit.
    • Mitigation: Auto-truncate the least relevant chunks.
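The auto-truncation mitigation can be sketched as a greedy token-budget fit. The 4-characters-per-token estimate below is a rough heuristic, not the model's real tokenizer:

```javascript
// Illustrative: keep chunks in relevance order until the token budget is spent,
// dropping the least relevant chunks that would overflow the context window.
function fitToBudget(chunks, maxTokens) {
  const estimateTokens = text => Math.ceil(text.length / 4); // rough heuristic
  const kept = [];
  let used = 0;
  for (const chunk of [...chunks].sort((a, b) => b.score - a.score)) {
    const cost = estimateTokens(chunk.text);
    if (used + cost > maxTokens) continue; // skip chunks that don't fit
    kept.push(chunk);
    used += cost;
  }
  return kept;
}

const kept = fitToBudget(
  [
    { text: "a".repeat(40), score: 0.9 },
    { text: "b".repeat(400), score: 0.5 },
    { text: "c".repeat(40), score: 0.7 },
  ],
  30 // token budget
);
```

Sorting by score before fitting means that when something must be dropped, it is always the weakest match, which keeps the answer anchored to the best evidence.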

Design Considerations

  • Source Credibility: Always visually link the answer to the source file (e.g., clickable pill [HR_Manual.pdf]).
  • Toggle Ease: Easy switch between “Chat with Data” and “Chat with AI” (General knowledge).


Technical References

  • src/engines/rag-engine.js
  • src/components/chat/context-selector.js
