Commit 2e760e3

ppkarwasz and jkowalleck committed
Subpath: simplify parsing
This PR:

- Moves the check for `.` and `..` segments after UTF-8 decoding. This ensures that overlong encodings of `.` (like `%C0%AE`, `%E0%80%AE`, `%F0%80%80%AE`) are also discarded.
- Requires the parser to throw an error if a (percent-encoded) solidus `/` is encountered in any path segment.

Signed-off-by: Piotr P. Karwasz <piotr@github.copernik.eu>
Co-authored-by: Jan Kowalleck <jan.kowalleck@gmail.com>
1 parent 4d2de7e commit 2e760e3

1 file changed

Lines changed: 3 additions & 4 deletions

docs/how-to-parse.md

@@ -11,12 +11,11 @@ To parse a `purl` string in its components:
 - Split the `purl` string once from right on '#'
 
 - The left side is the `remainder`
-- Strip the right side from leading and trailing '/'
-- Split this on '/'
-- Discard any empty string segment from that split
+- Split the right side on `/`
 - Percent-decode each segment
-- Discard any '.' or '..' segment from that split
 - UTF-8-decode each segment if needed in your programming language
+- Discard any segment that is empty, or equal to `.` or `..`
+- Report an error if any segment contains a slash `/`
 - Join segments back with a '/'
 - This is the `subpath`

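The updated steps can be sketched in Python as follows. This is only an illustration under stated assumptions, not a reference implementation: the function name `parse_subpath` is hypothetical, and the input is assumed to be the right side of the `#` split. Note that Python's strict UTF-8 decoder rejects overlong sequences such as `%C0%AE` with a `UnicodeDecodeError`, so in this sketch they surface as an error rather than being silently discarded; a language with a lenient decoder would reach the `.` / `..` check instead.

```python
from urllib.parse import unquote_to_bytes


def parse_subpath(raw: str) -> str:
    """Normalize the right side of the '#' split into a subpath.

    Hypothetical helper that follows the updated steps in order.
    """
    segments = []
    # Split the right side on '/'; no prior stripping is needed because
    # empty segments are discarded below.
    for raw_segment in raw.split("/"):
        # Percent-decode to bytes, then UTF-8-decode to text.
        segment = unquote_to_bytes(raw_segment).decode("utf-8")
        # Discard segments that are empty or equal to '.' or '..'.
        # This runs after decoding, so '%2E' and '%2E%2E' are caught too.
        if segment in ("", ".", ".."):
            continue
        # A '/' can only be present here if it was percent-encoded (%2F).
        if "/" in segment:
            raise ValueError(f"subpath segment contains '/': {raw_segment!r}")
        segments.append(segment)
    # Join the surviving segments back with '/'.
    return "/".join(segments)
```

For example, `parse_subpath("/a//b/./../c/")` yields `a/b/c`, while `parse_subpath("a%2Fb")` raises `ValueError`.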