Available Models

Performance Assessment

VRAM Test

Measure a model's VRAM usage and check whether part of it is offloaded to the CPU. The test loads the model with a minimal prompt and reports the VRAM it actually consumes.
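
As a rough illustration of what such a report might contain, the sketch below derives the CPU-offloaded share from two numbers: the model's total loaded size and the portion resident in VRAM. The `VramReport` class and its field names are hypothetical and are not taken from this tool; how the two byte counts are obtained depends on the backend and is left out here.

```python
from dataclasses import dataclass

GIB = 1024 ** 3


@dataclass
class VramReport:
    """Hypothetical summary of one VRAM test run (names are illustrative)."""

    total_bytes: int  # total memory footprint of the loaded model
    vram_bytes: int   # portion of that footprint resident in GPU memory

    @property
    def cpu_bytes(self) -> int:
        # Whatever is not in VRAM is assumed to have been offloaded to the CPU.
        return max(self.total_bytes - self.vram_bytes, 0)

    @property
    def gpu_fraction(self) -> float:
        # Share of the model held in VRAM, clamped to the range [0, 1].
        if self.total_bytes <= 0:
            return 0.0
        return min(self.vram_bytes / self.total_bytes, 1.0)

    @property
    def is_offloaded(self) -> bool:
        return self.cpu_bytes > 0

    def describe(self) -> str:
        return (
            f"VRAM: {self.vram_bytes / GIB:.2f} GiB, "
            f"CPU: {self.cpu_bytes / GIB:.2f} GiB, "
            f"GPU share: {self.gpu_fraction:.0%}"
        )
```

For example, `VramReport(total_bytes=8 * GIB, vram_bytes=6 * GIB)` describes a model with a quarter of its footprint offloaded to the CPU.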

Context Optimizer

Find the optimal context size (num_ctx) for a model based on available VRAM. The optimizer tests a series of context sizes iteratively.
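
The search strategy is not specified here; one plausible approach is a binary search over candidate context sizes. The sketch below assumes that "optimal" means the largest `num_ctx` that still fits in VRAM, and that fitting is monotonic (if a size fits, every smaller size fits too). The `fits` callback is a hypothetical stand-in for whatever actually loads the model at a given context size and checks memory; `find_max_context` and its defaults are likewise illustrative.

```python
from typing import Callable, Optional


def find_max_context(
    fits: Callable[[int], bool],
    low: int = 2048,
    high: int = 131072,
    step: int = 1024,
) -> Optional[int]:
    """Return the largest multiple of `step` in [low, high] for which
    fits(num_ctx) is True, or None if no candidate fits.

    Assumes fits() is monotonic: once a size fails, all larger sizes fail.
    """
    lo_i = -(-low // step)  # smallest multiple index >= low (ceiling division)
    hi_i = high // step     # largest multiple index <= high
    best: Optional[int] = None

    while lo_i <= hi_i:
        mid = (lo_i + hi_i) // 2
        num_ctx = mid * step
        if fits(num_ctx):
            best = num_ctx   # candidate works; try something larger
            lo_i = mid + 1
        else:
            hi_i = mid - 1   # too large; try something smaller
    return best
```

With a stand-in probe such as `lambda n: n <= 20000`, the function returns `19456`, the largest multiple of 1024 below the limit. Binary search keeps the number of model loads logarithmic in the size of the range, which matters because each probe may take several seconds.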