Upload Large Language Model (LLM)

Overview

Flow ID: llm-model-upload
Category: Model Management
Estimated Duration: 2-10 minutes (depends on file size)
User Role: Admin / Power User
Complexity: Moderate

Purpose: Import a GGUF-formatted model file (.gguf) into the system for use in Chat, Blockify, or other AI tasks. The system runs these models locally.


Trigger

What initiates this flow:

  • User manually initiates

Specific trigger: Clicking Upload New Model in the Chat Models settings tab.


Prerequisites

Before starting, users must have:

  • A valid model file in GGUF format (e.g., llama-3-8b-instruct.Q4_K_M.gguf).
  • Sufficient disk space (models range from 1 GB to 20 GB+).
  • Familiarity with the model’s parameters and quantization level (optional, but helpful when naming and choosing models).

User Intent Analysis

Primary Intent

Enable the application to use a specific AI model, often a newer or more specialized one.

Secondary Intents

  • Testing a fine-tuned model
  • Running an uncensored model
  • Upgrading to a larger parameter size for better reasoning

Step-by-Step Flow

Main Path (Happy Path)

Step 1: Open Settings

  • User Action: Navigate to Settings > Chat AI Models.

Step 2: Initiate Upload

  • User Action: Click Upload New.
  • System Response: File picker dialog opens.

Step 3: Select File

  • User Action: Navigate to and select the .gguf file.
  • System Response:
    • File validation begins (checking extension).
    • Upload progress bar appears.
    • Important: The app creates a copy in its internal models directory.
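The extension check in Step 3 can be sketched as follows. This is a minimal illustration, not the app’s actual code: the function and constant names (`isValidModelFile`, `SUPPORTED_MODEL_EXTENSIONS`) are hypothetical, though a list like this would plausibly live in a file such as src/constants/file-types.js.

```javascript
// Hypothetical constant; the real list would live in the app's file-type config.
const SUPPORTED_MODEL_EXTENSIONS = [".gguf"];

// Returns true only when the filename ends in a supported model extension.
function isValidModelFile(filename) {
  const dot = filename.lastIndexOf(".");
  if (dot === -1) return false; // no extension at all
  return SUPPORTED_MODEL_EXTENSIONS.includes(filename.slice(dot).toLowerCase());
}
```

For example, `isValidModelFile("llama-3-8b-instruct.Q4_K_M.gguf")` passes, while a `.safetensors` file is rejected and triggers the Invalid Format error described below.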

Step 4: Rename (Optional)

  • User Action: The system uses the filename as the display name by default; the user can optionally rename it (e.g., “Llama 3 Instruct”).
  • System Response: Saves metadata.

Step 5: Completion

  • System Response:
    • Progress hits 100%.
    • Model appears in the Available Models list with “Ready” status.

Error States & Recovery

Error 1: Invalid Format

Cause: User tries to upload a non-GGUF file such as .bin, .pt, or .safetensors.
User Experience: “Invalid file format. Only .gguf is supported.”
Recovery: User must convert the model or download a GGUF version (e.g., from HuggingFace).

Error 2: Disk Full

Cause: Not enough space for the 5GB+ file.
User Experience: Upload fails halfway with “IO Error” or “Disk Full”.
Recovery: Clear disk space and retry.
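One way to reduce mid-copy “Disk Full” failures is a pre-flight space check before the copy starts. The sketch below is an assumption, not the app’s actual behavior: the function name, the 512 MB safety margin, and the idea of checking up front are all illustrative. The free-byte count would come from a platform call such as Node’s `fs.statfs` on the models directory.

```javascript
// Hypothetical safety margin: leave some headroom after the copy completes.
const SAFETY_MARGIN_BYTES = 512 * 1024 * 1024; // 512 MB

// freeBytes would come from querying the models directory's filesystem
// (e.g., fs.statfs in Node 18.15+). Pure function for easy testing.
function hasEnoughSpace(modelSizeBytes, freeBytes, margin = SAFETY_MARGIN_BYTES) {
  return freeBytes >= modelSizeBytes + margin;
}
```

Failing fast here lets the UI show a clear “not enough disk space” message instead of an opaque IO error halfway through a multi-gigabyte copy.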


Pain Points & Friction

  1. “Which GGUF should I pick?”: HuggingFace repos have 10+ quantization versions (Q2, Q4, Q8).
    • Mitigation: Helper text recommending “Q4_K_M” or “Q5_K_M” as the best balance of file size and output quality.
  2. Slow Copy: Copying 10GB takes time.
    • Mitigation: Accurate progress bar.

Design Considerations

  • Verification: Check the file’s SHA-256 hash if possible to ensure it was not corrupted in transit (advanced feature).
  • Quantization Info: Automatically parse the filename to detect quantization level (e.g., “Q4”) and display it.
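Quantization tags follow a predictable pattern in GGUF filenames (e.g., Q4_K_M, Q5_K_S, Q8_0), so detecting them from the filename can be done with a regular expression. This is a sketch of the idea; the helper name is hypothetical.

```javascript
// Extracts a quantization tag like "Q4_K_M" or "Q8_0" from a GGUF filename.
// Returns null when no tag is present. Illustrative helper, not the app's API.
function detectQuantization(filename) {
  const match = filename.match(/\bQ\d+(_[A-Z0-9]+)*\b/i);
  return match ? match[0].toUpperCase() : null;
}
```

For example, `detectQuantization("llama-3-8b-instruct.Q4_K_M.gguf")` yields `"Q4_K_M"`, which the UI could display alongside the model name.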


Technical References

  • src/handlers/model-manager.js
  • src/constants/file-types.js
