NLTK (Natural Language Toolkit) is a suite of open source Python modules, data sets, and tutorials supporting research and development in Natural Language Processing. Prior to 3.10.0-rc1, nltk.data.load() in NLTK is vulnerable to path traversal via URL-encoded path separators and traversal segments when using the nltk: URL scheme. The unsafe-path regex check is performed before url2pathname() decodes the %xx sequences (a classic decode-after-check / TOCTOU-style flaw), allowing an attacker to bypass the protection documented in NLTK's SECURITY.md and read arbitrary files from the filesystem. While literal traversal strings such as ../../../etc/passwd are correctly blocked, encoded variants such as %2fetc%2fpasswd, %2e%2e%2f..., and ..%2f..%2f slip past the regex and are subsequently decoded into a real filesystem path. This vulnerability is fixed in 3.10.0-rc1.
Advisories
| Source | ID | Title |
|---|---|---|
Github GHSA |
GHSA-p4gq-832x-fm9v | Natural Language Toolkit (NLTK): URL-Encoded Path Traversal in nltk.data.load() Allows Arbitrary Local File Read |
Fixes
Solution
No solution given by the vendor.
Workaround
No workaround given by the vendor.
References
History
Mon, 22 Jun 2026 21:30:00 +0000
| Type | Values Removed | Values Added |
|---|---|---|
| First Time appeared |
Nltk
Nltk nltk |
|
| Vendors & Products |
Nltk
Nltk nltk |
Mon, 22 Jun 2026 18:45:00 +0000
| Type | Values Removed | Values Added |
|---|---|---|
| Description | NLTK (Natural Language Toolkit) is a suite of open source Python modules, data sets, and tutorials supporting research and development in Natural Language Processing. Prior to 3.10.0-rc1, nltk.data.load() in NLTK is vulnerable to path traversal via URL-encoded path separators and traversal segments when using the nltk: URL scheme. The unsafe-path regex check is performed before url2pathname() decodes the %xx sequences (a classic decode-after-check / TOCTOU-style flaw), allowing an attacker to bypass the protection documented in NLTK's SECURITY.md and read arbitrary files from the filesystem. While literal traversal strings such as ../../../etc/passwd are correctly blocked, encoded variants such as %2fetc%2fpasswd, %2e%2e%2f..., and ..%2f..%2f slip past the regex and are subsequently decoded into a real filesystem path. This vulnerability is fixed in 3.10.0-rc1. | |
| Title | NLTK: URL-Encoded Path Traversal in nltk.data.load() Allows Arbitrary Local File Read | |
| Weaknesses | CWE-22 | |
| References |
| |
| Metrics |
cvssV3_1
|
Projects
Sign in to view the affected projects.
Status: PUBLISHED
Assigner: GitHub_M
Published:
Updated: 2026-06-22T21:18:59.775Z
Reserved: 2026-06-12T17:46:37.293Z
Link: CVE-2026-54293
No data.
No data.
No data.
OpenCVE Enrichment
Updated: 2026-06-22T21:15:04Z
Weaknesses
Github GHSA