NLTK (Natural Language Toolkit) is a suite of open source Python modules, data sets, and tutorials supporting research and development in Natural Language Processing. Prior to 3.10.0-rc1, nltk.data.load() in NLTK is vulnerable to path traversal via URL-encoded path separators and traversal segments when using the nltk: URL scheme. The unsafe-path regex check is performed before url2pathname() decodes the %xx sequences (a classic decode-after-check / TOCTOU-style flaw), allowing an attacker to bypass the protection documented in NLTK's SECURITY.md and read arbitrary files from the filesystem. While literal traversal strings such as ../../../etc/passwd are correctly blocked, encoded variants such as %2fetc%2fpasswd, %2e%2e%2f..., and ..%2f..%2f slip past the regex and are subsequently decoded into a real filesystem path. This vulnerability is fixed in 3.10.0-rc1.
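
The ordering flaw described above can be illustrated with a minimal sketch. Everything here is hypothetical: the `UNSAFE` pattern and both helper functions are illustrative and are not NLTK's actual code, and `urllib.parse.unquote` stands in for the percent-decoding that `url2pathname()` performs.

```python
import re
from urllib.parse import unquote

# Hypothetical unsafe-path pattern (NOT NLTK's real regex): rejects a leading
# "/" and any ".." path segment in the string it is given.
UNSAFE = re.compile(r"^/|(^|/)\.\.(/|$)")


def is_unsafe(path: str) -> bool:
    """Return True if the string contains an absolute path or a '..' segment."""
    return UNSAFE.search(path) is not None


def check_then_decode(resource: str) -> str:
    """Flawed order: validate the raw string, then percent-decode it."""
    if is_unsafe(resource):
        raise ValueError("unsafe path")
    # Decoding happens after the check, so %2f and %2e were never inspected.
    return unquote(resource)


def decode_then_check(resource: str) -> str:
    """Safer order: percent-decode first, then validate the decoded path."""
    decoded = unquote(resource)
    if is_unsafe(decoded):
        raise ValueError("unsafe path")
    return decoded
```

In this sketch, `check_then_decode("%2fetc%2fpasswd")` returns `/etc/passwd`, while `decode_then_check` raises `ValueError` for the same input. The literal string `../../../etc/passwd` is rejected by both, which matches the behaviour the description reports.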

Advisories
Source | ID | Title
GitHub GHSA | GHSA-p4gq-832x-fm9v | Natural Language Toolkit (NLTK): URL-Encoded Path Traversal in nltk.data.load() Allows Arbitrary Local File Read
Fixes

Solution

No solution is listed by the vendor in this entry. The description above states that the vulnerability is fixed in 3.10.0-rc1.


Workaround

No workaround given by the vendor.

History

Mon, 22 Jun 2026 21:30:00 +0000

Type | Values Removed | Values Added
First Time appeared | | Nltk; Nltk nltk
Vendors & Products | | Nltk; Nltk nltk

Mon, 22 Jun 2026 18:45:00 +0000

Type | Values Removed | Values Added
Description | | Same text as the description at the top of this entry.
Title | | NLTK: URL-Encoded Path Traversal in nltk.data.load() Allows Arbitrary Local File Read
Weaknesses | | CWE-22
References | |
Metrics | | cvssV3_1: score 7.5, vector CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:N/A:N
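
The score and vector recorded above are related by the CVSS v3.1 base-score formula. The following is a minimal sketch of that calculation, assuming Scope: Unchanged only (as in this vector); the function names are illustrative, and the metric weights are those given in the CVSS v3.1 specification.

```python
import math

# CVSS v3.1 metric weights for Scope: Unchanged.
WEIGHTS = {
    "AV": {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2},
    "AC": {"L": 0.77, "H": 0.44},
    "PR": {"N": 0.85, "L": 0.62, "H": 0.27},
    "UI": {"N": 0.85, "R": 0.62},
    "C": {"H": 0.56, "L": 0.22, "N": 0.0},
    "I": {"H": 0.56, "L": 0.22, "N": 0.0},
    "A": {"H": 0.56, "L": 0.22, "N": 0.0},
}


def roundup(value: float) -> float:
    """Round up to one decimal place using the integer method from the spec."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score(vector: str) -> float:
    """Compute the base score for a CVSS:3.1 vector with Scope: Unchanged."""
    parts = vector.split("/")
    if parts[0] != "CVSS:3.1":
        raise ValueError("expected a CVSS:3.1 vector")
    metrics = dict(part.split(":") for part in parts[1:])
    if metrics.get("S") != "U":
        raise ValueError("this sketch only handles Scope: Unchanged")
    w = {name: WEIGHTS[name][metrics[name]] for name in WEIGHTS}
    # Impact sub-score: combines confidentiality, integrity, availability.
    iss = 1 - (1 - w["C"]) * (1 - w["I"]) * (1 - w["A"])
    impact = 6.42 * iss
    # Exploitability sub-score: how easy the attack is to carry out.
    exploitability = 8.22 * w["AV"] * w["AC"] * w["PR"] * w["UI"]
    if impact <= 0:
        return 0.0
    return roundup(min(impact + exploitability, 10))
```

For the vector in this entry, the impact term is 6.42 × 0.56 ≈ 3.60 and the exploitability term is about 3.89, so the sum of roughly 7.48 rounds up to the recorded 7.5.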



MITRE

Status: PUBLISHED

Assigner: GitHub_M

Published:

Updated: 2026-06-22T21:18:59.775Z

Reserved: 2026-06-12T17:46:37.293Z

Link: CVE-2026-54293

Vulnrichment

No data.

NVD

No data.

Red Hat

No data.

OpenCVE Enrichment

Updated: 2026-06-22T21:15:04Z

Weaknesses