kapynAI / Models

glm-5.2-sm120 — GLM-5.2-NVFP4-REAP-469B serving on SM120 (4× RTX PRO 6000 Blackwell) — one-command vLLM launch recipe, 250K context, Dee

GLM-5.2-SM120 is an open-source project for serving GLM-5.2-NVFP4-REAP-469B, a 469-billion-parameter model, on four RTX PRO 6000 Blackwell GPUs (SM120). It provides a one-command vLLM launch recipe with a 250K context window, and it combines DeepSeek Sparse Attention with Mixture-of-Experts speculative decoding for efficient inference.

GitHub·Jun 18, 2026
