Description
vLLM is an inference and serving engine for large language models (LLMs). In versions from 0.5.5 up to, but not including, 0.18.0, audio is downmixed to mono through Librosa, whose to_mono defaults to numpy.mean (an unweighted average across channels), whereas the international standard ITU-R BS.775-4 specifies a weighted downmixing algorithm. This discrepancy means the audio heard by humans (e.g., through headphones or regular speakers) can differ from the audio processed by AI models that run inference via Librosa (such as vllm, transformer). This issue has been patched in version 0.18.0.
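The gap between the two downmixing approaches can be illustrated with a small NumPy sketch. It is not the vLLM patch and does not call Librosa; mean_downmix only mirrors the unweighted numpy.mean behavior described above. The channel order (L, R, C, LS, RS), the helper names, and the weights in ASSUMED_WEIGHTS are assumptions chosen for illustration in the style of a weighted surround-to-mono downmix; consult ITU-R BS.775-4 for the normative coefficients.

```python
import numpy as np

# Assumed channel order for a 5-channel clip: L, R, C, LS, RS.
# These weights are illustrative assumptions, not taken from the vLLM fix;
# check ITU-R BS.775-4 for the normative values.
ASSUMED_WEIGHTS = np.array([0.7071, 0.7071, 1.0, 0.5, 0.5])


def mean_downmix(audio: np.ndarray) -> np.ndarray:
    """Unweighted average over channels; audio has shape (channels, samples)."""
    return np.mean(audio, axis=0)


def weighted_downmix(audio: np.ndarray, weights: np.ndarray = ASSUMED_WEIGHTS) -> np.ndarray:
    """Weighted sum over channels using one weight per channel."""
    weights = np.asarray(weights, dtype=audio.dtype)
    if weights.shape[0] != audio.shape[0]:
        raise ValueError("need exactly one weight per channel")
    return weights @ audio


# A clip whose only content is in the center channel (index 2).
audio = np.zeros((5, 4))
audio[2] = 1.0

mean_mono = mean_downmix(audio)          # every sample is 0.2
weighted_mono = weighted_downmix(audio)  # every sample is 1.0

print(mean_mono)
print(weighted_mono)
```

With these assumed weights, content carried only by the center channel reaches the model at one fifth of the level a weighted downmix would produce, which is the kind of human-versus-model mismatch the description refers to.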
Problem types
CWE-20: Improper Input Validation
Product status
References
github.com/...t/vllm/security/advisories/GHSA-6c4r-fmh3-7rh8
github.com/vllm-project/vllm/pull/37058
github.com/...ommit/c7f98b4d0a63b32ed939e2b6dfaa8a626e9b46c4
github.com/vllm-project/vllm/releases/tag/v0.18.0