todo list
12
README.md
@@ -2,6 +2,18 @@
Utilities for managing Ollama LLM models, including automated installation from HuggingFace.

## TODO

| **Use Case** | **Best Model** | **VRAM** | **Speed** | **Why** |
|--------------|----------------|----------|-----------|---------|
| **IDE Autocomplete** | Qwen2.5-Coder-1.5B (Q8) | 2.5GB | 120-150 t/s | Latency critical, FIM optimized |
| **Quick Drafting** | Yi-Coder-9B (Q5_K_M) | 7-8GB | 50-80 t/s | Best speed/quality balance |
| **Large Code Analysis** | Qwen2.5-Coder-14B (Q4_K_M) | 14-16GB | 30-40 t/s | SOTA repo-level, 128K context |
| **Reverse Engineering** | DeepCoder-14B (Q5_K_M) | 11-12GB | 30-50 t/s | Strongest reasoning, RL-trained |

gemma3-12b-it-qat
gemma3-4b-it-qat

## Web Interface

Start the web interface: