Open-weight models benchmarked on Infersec infrastructure, with quality metrics, throughput, and latency data from real hardware runs.
barozp
A 20% expert-pruned variant of Qwen3.6-35B-A3B, produced with the REAP method. It has 28B total parameters with 3B active, providing strong performance at reduced compute cost.
unsloth
GGUF quantization of OpenAI's gpt-oss-20b, provided by Unsloth. A 20B-parameter text generation model compatible with llama.cpp.
LiquidAI
LFM2.5-8B-A1B is a multilingual text generation model from Liquid AI, available in GGUF format for llama.cpp. Supports English, Arabic, Chinese, French, German, Japanese, Korean, and Spanish.