Description
Pdfminer.six is a community-maintained fork of the original PDFMiner, a tool for extracting information from PDF documents. In versions prior to 20251107, processing a malicious PDF file can cause pdfminer.six to execute arbitrary code from a malicious pickle file. The `CMapDB._load_data()` function in pdfminer.six uses `pickle.loads()` to deserialize pickle files. These pickle files are supposed to be part of the pdfminer.six distribution, stored in the `cmap/` directory, but a malicious PDF can specify an alternative directory and filename, as long as the filename ends in `.pickle.gz`. A malicious gzip-compressed pickle file can then contain code that executes automatically when the PDF is processed. Version 20251107 fixes the issue.
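The directory escape described above can be guarded against with a path containment check. The following is a minimal sketch of that general technique, not the actual fix shipped in version 20251107; the helper name `resolve_trusted_cmap_path` and its parameters are hypothetical and are not part of the pdfminer.six API.

```python
import os


def resolve_trusted_cmap_path(name: str, trusted_dir: str) -> str:
    """Hypothetical helper: map a CMap name to a file inside trusted_dir.

    Raises ValueError if the resulting path would fall outside trusted_dir.
    """
    filename = f"{name}.pickle.gz"
    base = os.path.realpath(trusted_dir)
    # os.path.join discards `base` when `filename` is absolute, and ".."
    # segments can climb out of it, so resolve the path before checking it.
    candidate = os.path.realpath(os.path.join(base, filename))
    # commonpath equals `base` only when `candidate` is located under it.
    if os.path.commonpath([base, candidate]) != base:
        raise ValueError(f"CMap name escapes the trusted directory: {name!r}")
    return candidate
```

With this check, a name such as `../evil` or `/tmp/evil` is rejected before any file is opened, while a plain name such as `H` resolves to a file under the trusted directory. Containment alone only limits which files can be reached; it does not make `pickle.loads()` safe on a file an attacker was able to place inside that directory.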
Problem types
CWE-502: Deserialization of Untrusted Data
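As a general illustration of this weakness class, and not a reproduction of the pdfminer.six issue, the harmless sketch below shows that `pickle.loads()` calls whatever callable the serialized data names. The class `Demo` is hypothetical, and the callable used here is the built-in `sorted`; an attacker-controlled pickle could name a far more dangerous one.

```python
import pickle


class Demo:
    def __reduce__(self):
        # Tells pickle to rebuild this object by calling sorted([3, 1, 2]).
        # The call happens inside pickle.loads(), with no further action
        # from the code that loads the data.
        return (sorted, ([3, 1, 2],))


payload = pickle.dumps(Demo())
result = pickle.loads(payload)  # runs sorted([3, 1, 2]) during loading
print(result)  # [1, 2, 3], not a Demo instance
```

Because the callable and its arguments come from the serialized bytes, data from an untrusted source should not be passed to `pickle.loads()`.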
Product status
References
github.com/...er.six/security/advisories/GHSA-wf5f-4jwr-ppcp
lists.debian.org/debian-lts-announce/2025/11/msg00017.html
github.com/...ommit/b808ee05dd7f0c8ea8ec34bdf394d40e63501086
github.com/pdfminer/pdfminer.six/releases/tag/20251107