Upload Large Language Model (LLM)
Overview
Flow ID: llm-model-upload
Category: Model Management
Estimated Duration: 2-10 minutes (depends on file size)
User Role: Admin / Power User
Complexity: Moderate
Purpose: Import a GGUF-formatted (.gguf) model file into the system for use in Chat, Blockify, or other AI tasks. The system runs these models locally.
Trigger
What initiates this flow:
- User manually initiates
Specific trigger: Clicking Upload New Model in the Chat Models settings tab.
Prerequisites
Before starting, users must have:
- A valid model file in GGUF format (e.g., llama-3-8b-instruct.Q4_K_M.gguf).
- Sufficient disk space (models range from 1 GB to 20 GB+).
- Knowledge of the model’s parameters (optional but helpful).
User Intent Analysis
Primary Intent
Enable the application to use a specific, likely newer or more specialized, AI model.
Secondary Intents
- Testing a fine-tuned model
- Running an uncensored model
- Upgrading to a larger parameter size for better reasoning
Step-by-Step Flow
Main Path (Happy Path)
Step 1: Open Settings
- User Action: Navigate to Settings > Chat AI Models.
Step 2: Initiate Upload
- User Action: Click Upload New.
- System Response: File picker dialog opens.
Step 3: Select File
- User Action: Navigate to and select the .gguf file.
- System Response:
- File validation begins (checking the file extension).
- Upload progress bar appears.
- Important: The app creates a copy of the file in its internal models directory (see the sketch below).
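A minimal sketch of what Step 3's validation and copy could look like on the Node side. The helper name importModel, the MODELS_DIR location, and the progress callback are illustrative assumptions, not the actual API of src/handlers/model-manager.js.

```typescript
// Sketch only: extension check + streamed copy into the app's models directory.
// importModel, MODELS_DIR, and onProgress are assumed names, not the real handler API.
import { createReadStream, createWriteStream } from "node:fs";
import { stat, mkdir } from "node:fs/promises";
import { pipeline } from "node:stream/promises";
import path from "node:path";

const MODELS_DIR = path.resolve("models"); // assumed internal models directory

export async function importModel(
  sourcePath: string,
  onProgress: (percent: number) => void,
): Promise<string> {
  // Step 3a: validate the extension before doing any work (see Error 1 below).
  if (path.extname(sourcePath).toLowerCase() !== ".gguf") {
    throw new Error("Invalid file format. Only .gguf is supported.");
  }

  const { size: totalBytes } = await stat(sourcePath);
  await mkdir(MODELS_DIR, { recursive: true });
  const destPath = path.join(MODELS_DIR, path.basename(sourcePath));

  // Step 3b: copy the file, reporting progress as bytes flow through.
  let copiedBytes = 0;
  const source = createReadStream(sourcePath);
  source.on("data", (chunk: Buffer) => {
    copiedBytes += chunk.length;
    onProgress(Math.round((copiedBytes / totalBytes) * 100));
  });
  await pipeline(source, createWriteStream(destPath));

  return destPath;
}
```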
Step 4: Rename (Optional)
- User Action: Optionally rename the model (e.g., “Llama 3 Instruct”); the system uses the filename by default.
- System Response: Saves the model’s metadata (an illustrative record shape is sketched below).
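The metadata saved in Step 4 could be as simple as a small record persisted alongside the copied file. The shape below is an assumption for illustration, not the app's actual schema.

```typescript
// Illustrative shape of the per-model metadata saved at Step 4 (assumed schema).
interface ModelMetadata {
  displayName: string;   // user-supplied name; defaults to the filename
  fileName: string;      // e.g. "llama-3-8b-instruct.Q4_K_M.gguf"
  sizeBytes: number;
  quantization?: string; // e.g. "Q4_K_M", parsed from the filename if present
  addedAt: string;       // ISO timestamp
  status: "Ready" | "Copying" | "Error";
}
```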
Step 5: Completion
- System Response:
- Progress hits 100%.
- Model appears in the Available Models list with “Ready” status.
Error States & Recovery
Error 1: Invalid Format
Cause: User tries to upload an unsupported format such as .bin, PyTorch (.pt/.pth), or .safetensors.
User Experience: “Invalid file format. Only .gguf is supported.”
Recovery: User must convert model or download GGUF version (e.g., from HuggingFace).
Error 2: Disk Full
Cause: Not enough space for the 5GB+ file.
User Experience: Upload fails partway through with an “IO Error” or “Disk Full” message.
Recovery: Clear disk space and retry (a pre-flight free-space check is sketched below).
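One way to avoid the mid-copy "Disk Full" failure is to compare the file size against available space before starting the copy. This is a sketch of an assumed approach, not the app's current behavior; it requires Node 18.15+ for fs.statfs, and the 5% headroom factor is arbitrary.

```typescript
// Pre-flight free-space check (assumed approach; requires Node >= 18.15 for statfs).
import { stat, statfs } from "node:fs/promises";

export async function assertEnoughDiskSpace(
  sourcePath: string,
  modelsDir: string,
): Promise<void> {
  const { size: requiredBytes } = await stat(sourcePath);
  const fsStats = await statfs(modelsDir);
  const availableBytes = fsStats.bavail * fsStats.bsize;

  // Leave some headroom so the copy does not fill the disk completely.
  if (availableBytes < requiredBytes * 1.05) {
    throw new Error(
      `Disk Full: need ~${Math.ceil(requiredBytes / 1e9)} GB free in the models directory.`,
    );
  }
}
```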
Pain Points & Friction
- “Which GGUF should I pick?”: HuggingFace repos often list 10+ quantization variants (Q2, Q4, Q8).
- Mitigation: Helper text recommending “Q4_K_M” or “Q5_K_M” as the best balance of size and quality.
- Slow Copy: Copying a 10 GB file takes time.
- Mitigation: Show an accurate progress bar during the copy.
Design Considerations
- Verification: Check the file’s sha256 hash, if possible, to detect corruption (advanced feature).
- Quantization Info: Automatically parse the filename to detect the quantization level (e.g., “Q4”) and display it (both are sketched below).
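Both considerations are straightforward to prototype: the quantization tag can usually be pulled out of the filename with a regular expression, and an integrity hash can be streamed after the copy completes. The helper names and the regex below are assumptions, not the app's actual implementation.

```typescript
// Sketches for the two design considerations above (names and regex are assumptions).
import { createReadStream } from "node:fs";
import { createHash } from "node:crypto";

// Detect a quantization tag such as "Q4_K_M", "Q5_K_S", or "Q8_0" in a GGUF filename.
export function detectQuantization(fileName: string): string | undefined {
  const match = fileName.match(/\bQ\d(?:_[A-Z0-9]+)*\b/i);
  return match ? match[0].toUpperCase() : undefined;
}

// Stream a sha256 hash of the copied file so it can be compared against a published checksum.
export function sha256OfFile(filePath: string): Promise<string> {
  return new Promise((resolve, reject) => {
    const hash = createHash("sha256");
    createReadStream(filePath)
      .on("data", (chunk) => hash.update(chunk))
      .on("error", reject)
      .on("end", () => resolve(hash.digest("hex")));
  });
}

// Example: detectQuantization("llama-3-8b-instruct.Q4_K_M.gguf") -> "Q4_K_M"
```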
Related Flows
- Select LLM Model - Using it after upload
- Delete Model
Technical References
- src/handlers/model-manager.js
- src/constants/file-types.js