CVE-2025-46570 DetailDescriptionvLLM is an inference and serving engine for large language models (LLMs). Prior to version 0.9.0, when a new prompt is processed, if the PageAttention mechanism finds a matching prefix chunk, the prefill process speeds up, which is reflected in the TTFT (Time to First Token). These timing differences caused by matching chunks are significant enough to be recognized and exploited. This issue has been patched in version 0.9.0. Metrics
NVD enrichment efforts reference publicly available information to associate
vector strings. CVSS information contributed by other sources is also
displayed.
CVSS 4.0 Severity and Vector Strings:
References to Advisories, Solutions, and ToolsBy selecting these links, you will be leaving NIST webspace. We have provided these links to other web sites because they may have information that would be of interest to you. No inferences should be drawn on account of other sites being referenced, or not, from this page. There may be other web sites that are more appropriate for your purpose. NIST does not necessarily endorse the views expressed, or concur with the facts presented on these sites. Further, NIST does not endorse any commercial products that may be mentioned on these sites. Please address comments about this page to [email protected].
Weakness Enumeration
Known Affected Software Configurations Switch to CPE 2.2CPEs loading, please wait.
Denotes Vulnerable Software Quick InfoCVE Dictionary Entry:CVE-2025-46570 NVD Published Date: 05/29/2025 NVD Last Modified: 06/24/2025 Source: GitHub, Inc. |