Description
vLLM is an inference and serving engine for large language models (LLMs). In versions prior to 0.22.0, the security check in vLLM's activation function loading is implemented as an assert statement. Python removes assert statements when it runs in optimized mode (python -O or PYTHONOPTIMIZE=1), so the check is skipped in that mode, and an unauthenticated attacker who publishes a malicious Hugging Face model can achieve arbitrary code execution on a server that loads it. This vulnerability is fixed in 0.22.0.
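The general pattern behind this class of bug can be illustrated with a minimal sketch. It is not vLLM's code: the allowlist, the function names, and the helper `assert_survives` are all hypothetical and exist only to show that an assert-based check disappears under `python -O`, while an explicit `raise` does not.

```python
import os
import subprocess
import sys

# Hypothetical allowlist for illustration; not vLLM's actual list.
ALLOWED_ACTIVATIONS = {"gelu", "relu", "silu"}


def check_with_assert(name: str) -> str:
    # Fragile: this line is compiled away under -O / PYTHONOPTIMIZE=1,
    # so any name would pass through unchecked in optimized mode.
    assert name in ALLOWED_ACTIVATIONS, f"unsupported activation: {name}"
    return name


def check_with_raise(name: str) -> str:
    # Robust: an ordinary if/raise is never removed by the optimizer.
    if name not in ALLOWED_ACTIVATIONS:
        raise ValueError(f"unsupported activation: {name}")
    return name


def assert_survives(optimized: bool) -> bool:
    """Run `assert False` in a child interpreter; True if the assert fired."""
    # Drop PYTHONOPTIMIZE so only the -O flag controls the child's mode.
    env = {k: v for k, v in os.environ.items() if k != "PYTHONOPTIMIZE"}
    cmd = [sys.executable]
    if optimized:
        cmd.append("-O")
    cmd += ["-c", "assert False"]
    result = subprocess.run(cmd, capture_output=True, env=env)
    # A failed assert exits non-zero; a stripped assert exits with 0.
    return result.returncode != 0
```

Calling `assert_survives(optimized=False)` and `assert_survives(optimized=True)` shows the difference between the two modes. Validation that must hold in production is usually better written in the style of `check_with_raise`.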
Problem types
CWE-94: Improper Control of Generation of Code ('Code Injection')
Product status
References
github.com/...t/vllm/security/advisories/GHSA-q8gq-377p-jq3r
github.com/...ommit/b3c7ffcab82c2439726f8cb213800f6f38c023d3
huntr.com/bounties/dcb05b04-e625-41e7-adbc-bbae0cc2d64c