Upload Large Language Model (LLM)
Overview
Flow ID: llm-model-upload
Category: Model Management
Estimated Duration: 2-10 minutes (depends on file size)
User Role: Admin / Power User
Complexity: Moderate
Purpose: Import a GGUF-formatted (.gguf) model file into the system for use in Chat, Blockify, or other AI tasks. The system runs these models locally.
Trigger
What initiates this flow:
- User manually initiates
Specific trigger: Clicking Upload New Model in the Chat Models settings tab.
Prerequisites
Before starting, users must have:
- A valid model file in GGUF format (e.g., llama-3-8b-instruct.Q4_K_M.gguf).
- Sufficient disk space (models range from 1 GB to 20 GB+).
- Knowledge of the model’s parameters (optional but helpful).
User Intent Analysis
Primary Intent
Enable the application to use a specific, likely newer or more specialized, AI model.
Secondary Intents
- Testing a fine-tuned model
- Running an uncensored model
- Upgrading to a larger parameter size for better reasoning
Step-by-Step Flow
Main Path (Happy Path)
Step 1: Open Settings
- User Action: Navigate to Settings > Chat AI Models.
Step 2: Initiate Upload
- User Action: Click Upload New.
- System Response: File picker dialog opens.
Step 3: Select File
- User Action: Navigate to and select the .gguf file.
- System Response:
- File validation begins (checking the file extension).
- Upload progress bar appears.
- Important: The app creates a copy of the file in its internal models directory (see the sketch below).
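A minimal sketch of what Step 3's validation and copy could look like on the Node side. The helper name importModel, the MODELS_DIR location, and the progress callback are illustrative assumptions, not the actual API of src/handlers/model-manager.js.

```typescript
// Sketch only: extension check + streamed copy into the app's models directory.
// importModel, MODELS_DIR, and onProgress are assumed names, not the real handler API.
import { createReadStream, createWriteStream } from "node:fs";
import { stat, mkdir } from "node:fs/promises";
import { pipeline } from "node:stream/promises";
import path from "node:path";

const MODELS_DIR = path.resolve("models"); // assumed internal models directory

export async function importModel(
  sourcePath: string,
  onProgress: (percent: number) => void,
): Promise<string> {
  // Step 3a: validate the extension before doing any work (see Error 1 below).
  if (path.extname(sourcePath).toLowerCase() !== ".gguf") {
    throw new Error("Invalid file format. Only .gguf is supported.");
  }

  const { size: totalBytes } = await stat(sourcePath);
  await mkdir(MODELS_DIR, { recursive: true });
  const destPath = path.join(MODELS_DIR, path.basename(sourcePath));

  // Step 3b: copy the file, reporting progress as bytes flow through.
  let copiedBytes = 0;
  const source = createReadStream(sourcePath);
  source.on("data", (chunk: Buffer) => {
    copiedBytes += chunk.length;
    onProgress(Math.round((copiedBytes / totalBytes) * 100));
  });
  await pipeline(source, createWriteStream(destPath));

  return destPath;
}
```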
Step 4: Rename (Optional)
- User Action: Optionally rename the model (e.g., “Llama 3 Instruct”); the system uses the filename by default.
- System Response: Saves the model’s metadata (an illustrative record shape is sketched below).
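The metadata saved in Step 4 could be as simple as a small record persisted alongside the copied file. The shape below is an assumption for illustration, not the app's actual schema.

```typescript
// Illustrative shape of the per-model metadata saved at Step 4 (assumed schema).
interface ModelMetadata {
  displayName: string;   // user-supplied name; defaults to the filename
  fileName: string;      // e.g. "llama-3-8b-instruct.Q4_K_M.gguf"
  sizeBytes: number;
  quantization?: string; // e.g. "Q4_K_M", parsed from the filename if present
  addedAt: string;       // ISO timestamp
  status: "Ready" | "Copying" | "Error";
}
```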
Step 5: Completion
- System Response:
- Progress hits 100%.
- Model appears in the Available Models list with “Ready” status.
Error States & Recovery
Error 1: Invalid Format
Cause: User tries to upload an unsupported format such as .bin, PyTorch (.pt/.pth), or .safetensors.
User Experience: “Invalid file format. Only .gguf is supported.”
Recovery: User must convert model or download GGUF version (e.g., from HuggingFace).
Error 2: Disk Full
Cause: Not enough space for the 5GB+ file.
User Experience: Upload fails partway through with an “IO Error” or “Disk Full” message.
Recovery: Clear disk space and retry (a pre-flight free-space check is sketched below).
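One way to avoid the mid-copy "Disk Full" failure is to compare the file size against available space before starting the copy. This is a sketch of an assumed approach, not the app's current behavior; it requires Node 18.15+ for fs.statfs, and the 5% headroom factor is arbitrary.

```typescript
// Pre-flight free-space check (assumed approach; requires Node >= 18.15 for statfs).
import { stat, statfs } from "node:fs/promises";

export async function assertEnoughDiskSpace(
  sourcePath: string,
  modelsDir: string,
): Promise<void> {
  const { size: requiredBytes } = await stat(sourcePath);
  const fsStats = await statfs(modelsDir);
  const availableBytes = fsStats.bavail * fsStats.bsize;

  // Leave some headroom so the copy does not fill the disk completely.
  if (availableBytes < requiredBytes * 1.05) {
    throw new Error(
      `Disk Full: need ~${Math.ceil(requiredBytes / 1e9)} GB free in the models directory.`,
    );
  }
}
```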
Pain Points & Friction
- “Which GGUF should I pick?”: HuggingFace repos often list 10+ quantization variants (Q2, Q4, Q8).
- Mitigation: Helper text recommending “Q4_K_M” or “Q5_K_M” as the best balance of size and quality.
- Slow Copy: Copying a 10 GB file takes time.
- Mitigation: Show an accurate progress bar during the copy.
Design Considerations
- Verification: Check the file’s sha256 hash, if possible, to detect corruption (advanced feature).
- Quantization Info: Automatically parse the filename to detect the quantization level (e.g., “Q4”) and display it (both are sketched below).
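Both considerations are straightforward to prototype: the quantization tag can usually be pulled out of the filename with a regular expression, and an integrity hash can be streamed after the copy completes. The helper names and the regex below are assumptions, not the app's actual implementation.

```typescript
// Sketches for the two design considerations above (names and regex are assumptions).
import { createReadStream } from "node:fs";
import { createHash } from "node:crypto";

// Detect a quantization tag such as "Q4_K_M", "Q5_K_S", or "Q8_0" in a GGUF filename.
export function detectQuantization(fileName: string): string | undefined {
  const match = fileName.match(/\bQ\d(?:_[A-Z0-9]+)*\b/i);
  return match ? match[0].toUpperCase() : undefined;
}

// Stream a sha256 hash of the copied file so it can be compared against a published checksum.
export function sha256OfFile(filePath: string): Promise<string> {
  return new Promise((resolve, reject) => {
    const hash = createHash("sha256");
    createReadStream(filePath)
      .on("data", (chunk) => hash.update(chunk))
      .on("error", reject)
      .on("end", () => resolve(hash.digest("hex")));
  });
}

// Example: detectQuantization("llama-3-8b-instruct.Q4_K_M.gguf") -> "Q4_K_M"
```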
Related Flows
- Select LLM Model - Using it after upload
- Delete Model
Technical References
- src/handlers/model-manager.js
- src/constants/file-types.js