Cyber Security

SGlang CVE-2026-5760 (CVSS 9.8) Enables RCE with Malicious GGUF Model Files

IRavie LakshmananApril 20, 2026Open Source / Server Security

A critical security vulnerability has been disclosed SGlang that, if successfully deployed, could lead to remote code execution on vulnerable systems.

Vulnerability, followed by CVE-2026-5760it holds a CVSS score of 9.8 out of 10.0. It has been described as a command injection issue that leads to the execution of arbitrary code.

SGlang is a high-performance, open-source framework for large-scale modeling languages ​​and multi-object models. The official GitHub project has been forked over 5,500 times and starred 26,100 times.

According to the CERT Coordination Center (CERT/CC), the vulnerability affects the “/v1/rerank” endpoint, which allows an attacker to gain arbitrary code execution in the context of the SGLang service using a specially crafted GPT-Generated Unified Format (GGUF) file.

“An attacker exploits this vulnerability by creating a malicious GPT Generated Unified Format (GGUF) template file with a crafted tokenizer.chat_template parameter that contains Jinja2’s server-side template injection (SSTI) with a key phrase to open a path to the vulnerable code,” CERT/CC said in an advisory issued today.

“The victim then downloads and uploads the model to SGLang, and when the request reaches the “/v1/rerank” endpoint, a malicious template is served, using the attacker’s unspecified Python code on the server. This chain of events enables the attacker to achieve remote code execution (RCE) on the SGLang server.”

According to security researcher Stuart Beck, who discovered and reported the bug, the root problem stems from the use of jinja2.Environment() without sandboxing instead of ImmutableSandboxedEnvironment. This, in turn, makes the malicious model run arbitrary Python code on the inference server.

The whole sequence of actions is as follows:

  • Attacker creates GGUF template file with malicious tokenizer.chat_template containing Jinja2 SSTI payload
  • The template includes the Qwen3 reranker trigger phrase to open the vulnerable code path in “entrypoints/openai/serving_rerank.py”
  • The victim downloads and uploads the model to SGlang from sources such as Hugging Face
  • When a request arrives at the “/v1/rerank” endpoint, SGlang reads the dialog_template and returns it as jinja2.Environment()
  • SSTI’s payload uses arbitrary Python code on the server

It’s worth noting that CVE-2026-5760 falls under the same vulnerability category as CVE-2024-34359 (aka Llama Drama, CVSS score: 9.7), a now-patched critical bug in the llama_cpp_python package for Python that could have resulted in a fix. A similar attack surface was also fixed in vLLM late last year (CVE-2025-61620, CVSS score: 6.5).

“To mitigate this vulnerability, it is recommended to use ImmutableSandboxedEnvironment instead of jinja2.Environment() to provide dialog templates,” CERT/CC said. “This will prevent unauthorized use of Python code on the server. No response or patch was found during the compilation process.”

Related Articles

Leave a Reply

Your email address will not be published. Required fields are marked *

Back to top button