From fa2c918ac72af49320368e24b19e69d24820bbacfa97ccc8580d89e56cc50e14 Mon Sep 17 00:00:00 2001
From: mstoeck3
Date: Sun, 18 Jan 2026 22:28:43 +0100
Subject: [PATCH] todo list

---
 README.md | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/README.md b/README.md
index f65efa6..2fd6408 100644
--- a/README.md
+++ b/README.md
@@ -2,6 +2,18 @@
 
 Utilities for managing Ollama LLM models, including automated installation from HuggingFace.
 
+## TODO
+
+| **Use Case** | **Best Model** | **VRAM** | **Speed** | **Why** |
+|--------------|----------------|----------|-----------|---------|
+| **IDE Autocomplete** | Qwen2.5-Coder-1.5B (Q8) | 2.5GB | 120-150 t/s | Latency critical, FIM optimized |
+| **Quick Drafting** | Yi-Coder-9B (Q5_K_M) | 7-8GB | 50-80 t/s | Best speed/quality balance |
+| **Large Code Analysis** | Qwen2.5-Coder-14B (Q4_K_M) | 14-16GB | 30-40 t/s | SOTA repo-level, 128K context |
+| **Reverse Engineering** | DeepCoder-14B (Q5_K_M) | 11-12GB | 30-50 t/s | Strongest reasoning, RL-trained |
+
+gemma3-12b-it-qat
+gemma3-4b-it-qat
+
 ## Web Interface
 
 Start the web interface: