Create New Blockify Job
Overview
Flow ID: create-blockify-job
Category: Blockify Processing
Estimated Duration: 3-5 minutes (setup only) + Processing Time
User Role: All Users
Complexity: Moderate
Purpose: The primary workflow for processing documents into AI-structured “IdeaBlocks” (Question/Answer pairs). The job uses a Large Language Model (the Blockify Model) to parse content intelligently, which makes the output significantly more effective for RAG (Retrieval-Augmented Generation) than simple mechanical chunking.
Trigger
What initiates this flow:
- User manually initiates
Specific trigger: User clicks “New Job” or “Blockify” from the main navigation to process new documents.
Prerequisites
Before starting, users must have:
- At least one Embedding Model uploaded and active
- At least one Blockify Model (LLM) uploaded and active
- Document files (PDF, DOCX, TXT, etc.) ready for upload
- Sufficient disk space for processing
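The prerequisites above can be expressed as a small pre-flight check. This is a minimal sketch, not the application's actual code: the function name `checkPrerequisites` and every field name on its argument (`embeddingModels`, `blockifyModels`, `files`, `freeDiskBytes`, `requiredDiskBytes`) are assumptions made for illustration.

```javascript
// Hypothetical pre-flight check. All field names are illustrative
// assumptions; the real application state is not shown in this document.
function checkPrerequisites(state) {
  const problems = [];
  if (!state.embeddingModels.some((m) => m.active)) {
    problems.push("No active Embedding Model");
  }
  if (!state.blockifyModels.some((m) => m.active)) {
    problems.push("No active Blockify Model");
  }
  if (state.files.length === 0) {
    problems.push("No document files selected");
  }
  if (state.freeDiskBytes < state.requiredDiskBytes) {
    problems.push("Insufficient disk space");
  }
  // The job can be configured only when no problem was recorded.
  return { ok: problems.length === 0, problems };
}

// Example: an Embedding Model is active, but no Blockify Model is installed.
const preflight = checkPrerequisites({
  embeddingModels: [{ id: "embed-1", active: true }],
  blockifyModels: [],
  files: ["handbook.pdf"],
  freeDiskBytes: 5e9,
  requiredDiskBytes: 1e9,
});
console.log(preflight.ok, preflight.problems);
```

Returning the full list of problems, rather than stopping at the first one, would let the UI show every missing prerequisite at once.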
User Intent Analysis
Primary Intent
Transform raw documents into a structured, searchable knowledge base (Dataset) composed of high-quality Question/Answer pairs.
Secondary Intents
- Add new knowledge to an existing dataset
- Create a completely new dataset from a batch of files
- Test specific prompt strategies on documents
Step-by-Step Flow
Main Path (Happy Path)
Step 1: Access Job Creation
- User Action: Click Blockify or New Job in sidebar
- System Response: Job configuration screen appears
- UI Elements Visible:
  - Job Mode Selector (Blockify vs. Chunking)
  - Dataset Selection
  - File Upload Area
Step 2: Select Job Mode
- User Action: Ensure Blockify mode is selected (usually default)
- System Response: Blockify-specific settings (LLM selection, Prompts) are visible
- Visual Cues: “Blockify” tab highlighted
Step 3: Configure Target Dataset
- User Action: Choose to Create New Dataset or Add to Existing
- Reference: See Select Target Dataset
- Decision Point:
  - New: Requires Name + Embedding Model
  - Existing: Locked to dataset’s Embedding Model
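The decision point in Step 3 could be sketched as follows. This is an illustration under stated assumptions, not the application's implementation: `resolveTargetDataset`, the `mode` values, and the record fields (`datasetId`, `embeddingModelId`) are hypothetical names.

```javascript
// Hypothetical resolver for the "New vs. Existing" decision.
// Field names and mode strings are assumptions for illustration only.
function resolveTargetDataset(choice, existingDatasets) {
  if (choice.mode === "new") {
    const name = (choice.name || "").trim();
    if (!name) throw new Error("A new dataset requires a name");
    if (!choice.embeddingModelId) {
      throw new Error("A new dataset requires an Embedding Model");
    }
    return { name, embeddingModelId: choice.embeddingModelId, isNew: true };
  }
  if (choice.mode === "existing") {
    const dataset = existingDatasets.find((d) => d.id === choice.datasetId);
    if (!dataset) throw new Error("Dataset not found: " + choice.datasetId);
    // The Embedding Model is taken from the dataset, never from the user's
    // choice, which mirrors the "locked" behaviour described above.
    return {
      id: dataset.id,
      name: dataset.name,
      embeddingModelId: dataset.embeddingModelId,
      isNew: false,
    };
  }
  throw new Error("Unknown dataset mode: " + choice.mode);
}

const datasets = [{ id: "ds-1", name: "Policies", embeddingModelId: "embed-1" }];
console.log(resolveTargetDataset({ mode: "existing", datasetId: "ds-1" }, datasets));
```

Locking is the safer choice for an existing dataset presumably because vectors produced by different Embedding Models are not comparable within one index.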
Step 4: Upload Documents
- User Action: Drag & drop files or use file picker
- Reference: See Upload Files
- System Response: Files upload and text extraction begins immediately
- Validation: Ensure green checkmarks appear for all files
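The validation in Step 4 amounts to checking that every uploaded file finished text extraction. A rough sketch, assuming a hypothetical per-file `status` field in which `"extracted"` corresponds to the green checkmark and any other value (for example `"error"`) does not:

```javascript
// Hypothetical upload check. The status strings are assumptions standing in
// for the green checkmark ("extracted") and the error icon ("error").
function partitionUploads(files) {
  const ready = files.filter((f) => f.status === "extracted");
  const blocked = files.filter((f) => f.status !== "extracted");
  return {
    ready,
    blocked,
    // An empty upload list is not considered ready.
    allReady: files.length > 0 && blocked.length === 0,
  };
}

const uploads = partitionUploads([
  { name: "handbook.pdf", status: "extracted" },
  { name: "scan.pdf", status: "error" },
]);
console.log(uploads.allReady, uploads.blocked.map((f) => f.name));
```

Files that land in `blocked` are the ones a user would remove or convert before starting the job.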
Step 5: Configure Blockify Settings
- User Action:
  - Select Blockify Model (LLM to use for processing)
  - (Optional) Customize System Prompt for specific extraction style
- Default Behavior: Uses default “General Knowledge” prompt
Step 6: Configure Chunking (Optional)
- User Action: Adjust chunk size/overlap if needed
- Reference: See Configure Basic Chunks or Advanced Settings
- Best Practice: Default settings usually work best for Blockify
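To make "chunk size" and "overlap" concrete, here is a minimal sliding-window sketch. It assumes both values are measured in characters, which this document does not state; the application may count tokens or split on sentence boundaries instead. `chunkText` is a hypothetical name.

```javascript
// Illustrative character-based chunker with overlap. This is an assumption
// about what the two settings mean, not the application's chunking code.
function chunkText(text, chunkSize, overlap) {
  if (!Number.isInteger(chunkSize) || chunkSize <= 0) {
    throw new RangeError("chunkSize must be a positive integer");
  }
  if (!Number.isInteger(overlap) || overlap < 0 || overlap >= chunkSize) {
    throw new RangeError("overlap must be an integer in [0, chunkSize)");
  }
  const chunks = [];
  const step = chunkSize - overlap; // how far the window advances each time
  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + chunkSize));
    // Stop once the window has reached the end of the text.
    if (start + chunkSize >= text.length) break;
  }
  return chunks;
}

console.log(chunkText("abcdefghij", 4, 1)); // [ 'abcd', 'defg', 'ghij' ]
```

Each chunk repeats the last `overlap` characters of the previous one, so a sentence cut at a boundary still appears whole in at least one chunk when the overlap is large enough.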
Step 7: Start Job
- User Action: Click Start Job button
- System Response:
  - Job enters “Processing” state
  - Progress bar appears
  - Navigation may redirect to Job Details or stay on Jobs List
Final Step: Job Processing
- Success Indicator: Job appears in “Active Jobs” list with moving progress bar
- User Requirement: DO NOT CLOSE THE APPLICATION while the job is running
Error States & Recovery
Error 1: Missing Models
Cause: No Embedding or Blockify models installed
User Experience: “No models available” error in dropdowns; specific steps blocked
Recovery: Go to Settings > Models and upload required models.
Error 2: Text Extraction Failed
Cause: Encrypted PDF or Scanned Image PDF
User Experience: File uploads but shows error icon
Recovery: Remove file; use OCR software to convert to text-based PDF or text file, then re-upload.
Error 3: Insufficient Memory/Resources
Cause: Too many parallel jobs or very large files
User Experience: Job starts but stalls or crashes app
Recovery: Cancel job; retry with fewer files or smaller batch sizes.
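The recovery for Error 3 ("retry with fewer files or smaller batch sizes") can be illustrated with a simple batching helper. This is a sketch only: `splitIntoBatches` is a hypothetical name, and how the application actually submits batches is not described here.

```javascript
// Illustrative helper: split a list of files into smaller groups so each
// retry submits fewer files at once.
function splitIntoBatches(items, batchSize) {
  if (!Number.isInteger(batchSize) || batchSize < 1) {
    throw new RangeError("batchSize must be a positive integer");
  }
  const batches = [];
  for (let i = 0; i < items.length; i += batchSize) {
    batches.push(items.slice(i, i + batchSize));
  }
  return batches;
}

const retryBatches = splitIntoBatches(["a.pdf", "b.pdf", "c.docx", "d.txt", "e.pdf"], 2);
console.log(retryBatches); // [ [ 'a.pdf', 'b.pdf' ], [ 'c.docx', 'd.txt' ], [ 'e.pdf' ] ]
```

Running one batch as its own job and waiting for it to finish before starting the next keeps the number of parallel jobs low.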
Pain Points & Friction
- Long Processing Times: Blockify involves extensive LLM inference, which makes it significantly slower than simple chunking.
  - Mitigation: Progress bars and accurate status estimates.
- App Must Stay Open: Users often forget and close the app, which pauses or kills the job.
  - Mitigation: Warning modal on exit if jobs are running.
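The exit-warning mitigation could be driven by a check like the one below. It is a sketch under assumptions: the job record shape and the function name `exitWarning` are hypothetical, and only the “Processing” state name comes from this document. Wiring the result into an actual modal depends on the application framework, which is not specified here.

```javascript
// Hypothetical guard for the exit warning. Returns a message to show in the
// modal, or null when the application can close without warning.
function exitWarning(jobs) {
  const active = jobs.filter((job) => job.status === "Processing");
  if (active.length === 0) return null;
  const noun = active.length === 1 ? "job is" : "jobs are";
  return (
    active.length + " " + noun +
    " still running. Closing the application will interrupt processing."
  );
}

console.log(exitWarning([
  { id: "job-1", status: "Processing" },
  { id: "job-2", status: "Completed" },
]));
```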
Design Considerations
- Defaults First: Pre-select the most capable active model and “Create New Dataset” to reduce friction.
- Visual Feedback: Show “Ready” states clearly before permitting the “Start” action.
- Education: Tooltips explaining why “Blockify” is better than “Chunking” (AI structure vs. mechanical split).
Related Flows
- Create Chunking Job - Faster, non-AI alternative
- Monitor Job Progress - Tracking the running job
- View Dataset Details - Seeing results
Technical References
- src/components/blockify-corpus/new-job-screen.js
- src/engines/blockify.js
- src/constants/job-types.js