Configure Advanced Chunk Settings

Overview

Flow ID: configure-advanced-chunks
Category: Blockify Processing
Estimated Duration: 2-5 minutes
User Role: Power Users / Admins
Complexity: Advanced

Purpose: Allows users to fine-tune exactly how documents are split (“chunked”) before processing. This flow features a real-time preview that visualizes chunk boundaries, helping users optimize for specific document structures (e.g., technical manuals vs. narrative text).


Trigger

What initiates this flow:

  • User manually initiates

Specific trigger: Clicking the “Show Advanced Settings” or “Preview Chunks” toggle within the job creation workflow.


Prerequisites

Before starting, users must have:

  • At least one file uploaded to the job
  • Basic understanding of chunk size (tokens) and overlap

User Intent Analysis

Primary Intent

Verify and optimize how the system splits text to ensure important context isn’t lost at cut-off points.

Secondary Intents

  • Debugging poor search results (often caused by bad chunking)
  • Adjusting settings for specific file types (e.g., small chunks for FAQs, large for articles)

Step-by-Step Flow

Main Path (Happy Path)

Step 1: Open Advanced Panel

  • User Action: Click the “Show Advanced Settings” or “Preview Chunks” toggle
  • System Response: Panel expands, showing a preview area and setting sliders.
  • UI Elements Visible:
    • File Tabs (one for each uploaded file)
    • Chunk Size Slider / Input
    • Overlap Slider / Input
    • Text Preview Window

Step 2: Select File to Preview

  • User Action: Click on a specific file tab (e.g., manual.pdf)
  • System Response: Preview window loads the text of that file.
  • Visual Cues: Colored highlights indicating separate chunks (e.g., alternating blue/green backgrounds).

Step 3: Adjust Chunk Size

  • User Action: Drag Chunk Size slider (e.g., from 512 to 1024)
  • System Response:
    • Chunks in the preview resize almost immediately.
    • Total number of chunks updates.
  • Feedback: Users can see whether sentences are split mid-way or whether paragraphs fit within a single chunk.
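
To make the effect of the Chunk Size slider concrete, the following Python sketch splits a list of tokens into fixed-size windows and compares the chunk count at 512 and 1024 tokens. It is an illustration only: whitespace splitting stands in for the real tokenizer, and chunk_tokens is a hypothetical helper, not the code in src/utils/chunking-preview.js.

```python
from typing import List


def chunk_tokens(tokens: List[str], chunk_size: int, overlap: int = 0) -> List[List[str]]:
    """Split tokens into windows of at most chunk_size that share `overlap` tokens."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be smaller than chunk size")

    step = chunk_size - overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # this window already reaches the end of the document
    return chunks


# Stand-in document: 2,000 whitespace-separated "tokens" (an assumption for the demo).
tokens = ("word " * 2000).split()

count_512 = len(chunk_tokens(tokens, chunk_size=512))
count_1024 = len(chunk_tokens(tokens, chunk_size=1024))
print(count_512, count_1024)  # 4 2
```

Doubling the chunk size roughly halves the chunk count, which is the change the preview counter reflects as the slider moves.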

Step 4: Adjust Overlap

  • User Action: Drag Overlap slider (e.g., 0 to 50 tokens)
  • System Response: The shared text between chunks increases/decreases.
  • Visual Cues: Overlapping regions may be shaded darker or flagged with boundary markers.
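
One way a preview could locate the regions to shade is to compute each chunk's token offsets and then intersect neighbouring chunks. The sketch below assumes a simple sliding window; chunk_spans and overlap_regions are hypothetical names used for illustration, and the actual preview logic may differ.

```python
from typing import List, Tuple

Span = Tuple[int, int]  # (start, end) token offsets, end exclusive


def chunk_spans(n_tokens: int, chunk_size: int, overlap: int) -> List[Span]:
    """Return the token range covered by each chunk."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be smaller than chunk size")
    step = chunk_size - overlap
    spans: List[Span] = []
    start = 0
    while start < n_tokens:
        end = min(start + chunk_size, n_tokens)
        spans.append((start, end))
        if end == n_tokens:
            break  # the last chunk reaches the end of the document
        start += step
    return spans


def overlap_regions(spans: List[Span]) -> List[Span]:
    """Return the token ranges shared by each pair of adjacent chunks."""
    return [
        (nxt[0], prev[1])
        for prev, nxt in zip(spans, spans[1:])
        if nxt[0] < prev[1]  # no shared range when overlap is 0
    ]


spans = chunk_spans(n_tokens=1200, chunk_size=512, overlap=50)
shared = overlap_regions(spans)
print(spans)   # [(0, 512), (462, 974), (924, 1200)]
print(shared)  # [(462, 512), (924, 974)]
```

With the overlap set to 0 the list of shared ranges is empty, so nothing would be shaded.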

Step 5: Confirm Settings

  • User Action: Collapse panel or proceed with job
  • System Response: Settings are applied to ALL files in the current job (unless per-file settings are supported).

Error States & Recovery

Error 1: Preview Not Loading

Cause: Text extraction pending or failed
User Experience: The “Loading preview…” message hangs, or the preview stays blank
Recovery: Wait for extraction to complete; if stuck, re-upload file.

Error 2: Settings Too Extreme

Cause: Overlap ≥ chunk size, or chunk size < 50 tokens
User Experience: Validation error, e.g., “Overlap must be smaller than chunk size”
Recovery: The system either auto-corrects the value or blocks the invalid range.
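
The two recovery strategies (block or auto-correct) can be sketched as follows. This is a minimal illustration under stated assumptions: the function names are hypothetical, the chunk-size message wording is invented for the example, and only the overlap message and the 50-token minimum come from the error description above.

```python
from typing import List, Tuple

MIN_CHUNK_SIZE = 50  # minimum taken from the error description above


def validate_settings(chunk_size: int, overlap: int) -> List[str]:
    """Return validation messages; an empty list means the settings are valid."""
    errors = []
    if chunk_size < MIN_CHUNK_SIZE:
        # Hypothetical wording; the article does not quote this message.
        errors.append(f"Chunk size must be at least {MIN_CHUNK_SIZE} tokens")
    if overlap >= chunk_size:
        errors.append("Overlap must be smaller than chunk size")
    return errors


def auto_correct(chunk_size: int, overlap: int) -> Tuple[int, int]:
    """Clamp both values into the valid range instead of rejecting them."""
    size = max(chunk_size, MIN_CHUNK_SIZE)
    fixed_overlap = min(max(overlap, 0), size - 1)
    return size, fixed_overlap


print(validate_settings(512, 600))  # ['Overlap must be smaller than chunk size']
print(auto_correct(512, 600))       # (512, 511)
```

Blocking surfaces the message to the user, while clamping keeps the sliders in a valid state without interrupting them; which behaviour applies depends on the product.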


Pain Points & Friction

  1. Global vs. Local Settings: Users often want different settings for different files in the same job, but settings usually apply globally.
    • Workaround: Create separate jobs for different file types.
  2. Technical Complexity: “Tokens” are abstract to non-technical users.
    • Improvement: Show approximate word count (e.g., “512 tokens ≈ 380 words”).
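
A label like the one suggested above could be generated with a fixed conversion ratio. The sketch below assumes roughly 0.75 words per token, a common rule of thumb for English text that varies by tokenizer and language; approx_words and slider_label are hypothetical helpers.

```python
WORDS_PER_TOKEN = 0.75  # assumed average for English; not a property of any specific tokenizer


def approx_words(tokens: int, round_to: int = 10) -> int:
    """Estimate the word count for a token budget, rounded for display."""
    words = tokens * WORDS_PER_TOKEN
    return int(round(words / round_to) * round_to)


def slider_label(tokens: int) -> str:
    """Build the helper text shown next to the chunk size slider."""
    return f"{tokens} tokens ≈ {approx_words(tokens)} words"


print(slider_label(512))   # 512 tokens ≈ 380 words
print(slider_label(1024))  # 1024 tokens ≈ 770 words
```

Rounding to the nearest ten keeps the label from implying more precision than the estimate has.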

Design Considerations

  • Color Coding: Use distinct, accessible colors to differentiate adjacent chunks.
  • Performance: Preview recomputation should be debounced so that dragging a slider does not cause lag.
  • Persistence: Remember last-used settings for convenience.
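
The debouncing idea from the Performance note can be sketched as below: every slider event restarts a short timer, and the preview is recomputed only once the events stop. The example is written in Python for illustration; the Debouncer class is hypothetical, and the components listed under Technical References would typically rely on the browser's own timers instead.

```python
import threading
import time
from typing import Any, Callable, Optional


class Debouncer:
    """Run `callback` only after `wait` seconds pass with no new calls."""

    def __init__(self, callback: Callable[..., Any], wait: float) -> None:
        self._callback = callback
        self._wait = wait
        self._timer: Optional[threading.Timer] = None
        self._lock = threading.Lock()

    def __call__(self, *args: Any, **kwargs: Any) -> None:
        with self._lock:
            if self._timer is not None:
                self._timer.cancel()  # drop the pending call; only the latest value matters
            self._timer = threading.Timer(self._wait, self._callback, args, kwargs)
            self._timer.daemon = True
            self._timer.start()


# Simulate a slider being dragged quickly through several values.
recomputed = []  # records each chunk size the preview was actually recomputed for
on_slider_change = Debouncer(recomputed.append, wait=0.05)

for size in (512, 640, 768, 896, 1024):
    on_slider_change(size)

time.sleep(0.5)  # give the final timer time to fire
print(recomputed)
```

Only the final slider position should trigger a recomputation, so intermediate values never reach the chunking code.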


Technical References

  • src/components/blockify-corpus/advanced-settings.js
  • src/utils/chunking-preview.js
