Available Models
Performance Assessment
VRAM Test
Measure a model's VRAM usage and CPU offloading. The test loads the model with a minimal prompt and reports the VRAM it actually consumes.
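The reported numbers can be read as a split between GPU and CPU memory. The sketch below is illustrative only and is not the tool's own code. It assumes the backend is Ollama and that the two inputs correspond to the `size` and `size_vram` fields that Ollama's `/api/ps` endpoint returns for a loaded model. The helper name `summarize_offload` is hypothetical.

```python
def summarize_offload(size_bytes: int, size_vram_bytes: int) -> dict:
    """Split a loaded model's memory footprint into GPU and CPU parts.

    Assumption: size_bytes is the model's total memory footprint and
    size_vram_bytes is the portion resident in VRAM (as in the `size`
    and `size_vram` fields of Ollama's /api/ps response).
    """
    if size_bytes < 0 or not 0 <= size_vram_bytes <= size_bytes:
        raise ValueError("expected 0 <= size_vram_bytes <= size_bytes")

    # Whatever is not in VRAM has been offloaded to system memory.
    cpu_bytes = size_bytes - size_vram_bytes
    gpu_fraction = size_vram_bytes / size_bytes if size_bytes else 0.0

    gib = 2**30
    return {
        "vram_gib": size_vram_bytes / gib,
        "cpu_gib": cpu_bytes / gib,
        "gpu_percent": round(100 * gpu_fraction, 1),
        "offloaded_to_cpu": cpu_bytes > 0,
    }
```

For example, a model with an 8 GiB footprint of which 6 GiB sits in VRAM would be reported as 75% on the GPU, with 2 GiB offloaded to the CPU.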
Context Optimizer
Find the optimal context size (num_ctx) for a model given the available VRAM. The optimizer tests different context sizes iteratively.
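One plausible way to run such an iterative search is a binary search over candidate sizes. This is a minimal sketch under stated assumptions and not necessarily the strategy the optimizer uses. It assumes "optimal" means the largest num_ctx that still fits in VRAM, and that fitting is monotone (if a size fits, every smaller size fits too). The `fits` callback, the function name `find_max_context`, and the default bounds are all hypothetical. In practice `fits` would load the model at the given num_ctx and check that nothing was offloaded to the CPU.

```python
from typing import Callable, Optional


def find_max_context(
    fits: Callable[[int], bool],
    low: int = 2048,
    high: int = 131072,
    step: int = 1024,
) -> Optional[int]:
    """Return the largest multiple of `step` in [low, high] for which
    fits(num_ctx) is True, or None if no candidate fits.

    Assumption: fits() is monotone, so binary search is valid.
    """
    lo_k = -(-low // step)   # smallest multiple index >= low (ceiling division)
    hi_k = high // step      # largest multiple index <= high
    best = None

    while lo_k <= hi_k:
        mid_k = (lo_k + hi_k) // 2
        num_ctx = mid_k * step
        if fits(num_ctx):
            best = num_ctx       # fits: remember it and try larger sizes
            lo_k = mid_k + 1
        else:
            hi_k = mid_k - 1     # does not fit: try smaller sizes
    return best
```

With the default bounds, the search calls `fits` at most seven times, which matters because each probe reloads the model.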