Commit 212c1e7

Update spec with Unicode escape sequences
Refined Unicode escape handling to support `\u{XXXXXX}` format, updated examples, ABNF grammar, and syntax highlighting rules. Enhanced test cases to validate new syntax and error scenarios. Drop \f and \b.
1 parent 6457a22 commit 212c1e7

4 files changed, with 19 additions and 17 deletions

.vitepress/maml.json

Lines changed: 1 addition & 1 deletion
@@ -131,7 +131,7 @@
 "stringcontent": {
 "patterns": [
 {
-"match": "\\\\(?:[\"/\\\\bfnrt]|u\\h{4})",
+"match": "\\\\(?:[\"/\\\\nrt]|u\\{\\h{1,6}\\})",
 "name": "constant.character.escape.maml"
 },
 {

spec/v0.1.md

Lines changed: 13 additions & 14 deletions
@@ -123,24 +123,23 @@ the control characters other than tab (U+0000 to U+0008, U+000A to U+001F,
 U+007F).
 
 ```maml
-"String with a \"nested\" string, \t tab, 😁 emoji, and \u0022 sequence"
+"String with a \"nested\" string, \t tab, 😁 emoji, and \u{10FFFF} sequence"
 ```
 
 For convenience, some popular characters have a compact escape sequence.
 
-```
-\b - backspace (U+0008)
-\t - tab (U+0009)
-\n - linefeed (U+000A)
-\f - form feed (U+000C)
-\r - carriage return (U+000D)
-\" - quote (U+0022)
-\\ - backslash (U+005C)
-```
+| Escape | Name            | Unicode  |
+|--------|-----------------|----------|
+| `\t`   | tab             | U+0009   |
+| `\n`   | linefeed        | U+000A   |
+| `\r`   | carriage return | U+000D   |
+| `\"`   | quote           | U+0022   |
+| `\\`   | backslash       | U+005C   |
+
 
-A Unicode character may be escaped with the `\uXXXX` form.
-The escape codes must be valid Unicode [scalar
-values](https://unicode.org/glossary/#unicode_scalar_value).
+A Unicode character may be escaped with the `\u{XXXXXX}` form. The hexadecimal
+number can be from 1 to 6 digits long. The escape codes must be valid Unicode
+[scalar values](https://unicode.org/glossary/#unicode_scalar_value).
 
 All other escape sequences not listed above are reserved; if they are used,
 MAML should produce an error. All strings must contain only valid UTF-8
@@ -202,7 +201,7 @@ All escape sequences are preserved as is.
 
 ```maml
 """
-There is no escaping, so \n, \u0022, etc.,
+There is no escaping, so \n, \u{0022}, etc.,
 are interpreted as-is without modification.
 """
 ```
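
As a rough illustration of the escape rules in the updated spec text above, the sketch below decodes a single escape sequence in JavaScript. `decodeEscape` and `SIMPLE_ESCAPES` are hypothetical names and are not part of the MAML reference parser. The sketch follows the prose table, so `\/` (which the ABNF grammar also accepts) is treated as reserved here. It rejects surrogates (U+D800 to U+DFFF) and anything above U+10FFFF, since those are not Unicode scalar values.

```javascript
// Hypothetical helper, not taken from the MAML reference implementation.
// Compact escapes listed in the spec table after this commit (no \b, no \f).
const SIMPLE_ESCAPES = new Map([
  ['t', '\t'],
  ['n', '\n'],
  ['r', '\r'],
  ['"', '"'],
  ['\\', '\\'],
]);

// Decodes the escape that starts at src[i] (which must be a backslash).
// Returns the decoded character and the index just past the escape.
function decodeEscape(src, i) {
  if (src[i] !== '\\') {
    throw new SyntaxError(`Expected a backslash at index ${i}`);
  }
  const c = src[i + 1];
  if (SIMPLE_ESCAPES.has(c)) {
    return { char: SIMPLE_ESCAPES.get(c), next: i + 2 };
  }
  if (c === 'u') {
    // Sticky regex: must match exactly at index i.
    // Shape: backslash, "u", "{", 1 to 6 hex digits, "}".
    const re = /\\u\{([0-9A-Fa-f]{1,6})\}/y;
    re.lastIndex = i;
    const m = re.exec(src);
    if (m === null) {
      throw new SyntaxError(`Malformed \\u{...} escape at index ${i}`);
    }
    const cp = parseInt(m[1], 16);
    // Scalar values exclude the surrogate range and end at U+10FFFF.
    if (cp > 0x10ffff || (cp >= 0xd800 && cp <= 0xdfff)) {
      throw new SyntaxError(`\\u{${m[1]}} is not a Unicode scalar value`);
    }
    return { char: String.fromCodePoint(cp), next: re.lastIndex };
  }
  // Everything else is reserved and should be an error.
  throw new SyntaxError(`Reserved escape sequence at index ${i}`);
}
```

For example, `decodeEscape('\\u{1F601}', 0).char` yields `😁`, while `decodeEscape('\\u0000', 0)` throws because the braces are missing.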

spec/v0.1/maml.abnf

Lines changed: 1 addition & 1 deletion
@@ -39,7 +39,7 @@ integer = "0" / ( onenine *DIGIT )
 quote = %x22
 
 char = %x20-21 / %x23-5B / %x5D-10FFFF
-char =/ %x5C ( %x5C / quote / "/" / "b" / "f" / "n" / "r" / "t" / "u" "{" 1*6HEXDIG "}" )
 
 comment = "#" *non-eol
 
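
To make the updated `char` rule easier to experiment with, here is a minimal sketch that translates its two alternatives into a JavaScript regular expression. `isValidStringBody`, `UNESCAPED`, `ESCAPED`, and `STRING_BODY` are hypothetical names, and the sketch assumes the escape letters are meant to be lowercase only. It checks the text between the quotes, and it checks syntax only: an escape such as `\u{D800}` passes here even though the spec prose additionally requires a scalar value.

```javascript
// Hypothetical validator; a sketch of the ABNF rule, not the reference parser.

// char = %x20-21 / %x23-5B / %x5D-10FFFF  (anything but quote and backslash)
const UNESCAPED = '[\\u{20}-\\u{21}\\u{23}-\\u{5B}\\u{5D}-\\u{10FFFF}]';

// char =/ %x5C ( %x5C / quote / "/" / "n" / "r" / "t" / "u" "{" 1*6HEXDIG "}" )
const ESCAPED = '\\\\(?:[\\\\"/nrt]|u\\{[0-9A-Fa-f]{1,6}\\})';

// The "u" flag makes the regex operate on code points rather than UTF-16 units.
const STRING_BODY = new RegExp(`^(?:${UNESCAPED}|${ESCAPED})*$`, 'u');

// Returns true when every character of `body` matches the `char` rule.
function isValidStringBody(body) {
  return STRING_BODY.test(body);
}
```

The cases mirror the updated test file: `isValidStringBody('\\u{10FFFF}')` is accepted, while `'\\u0000'`, `'\\u{G}'`, and `'\\u{1234567}'` are rejected.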

spec/v0.1/maml.test.js

Lines changed: 4 additions & 1 deletion
@@ -112,10 +112,13 @@ describe('MAML v0.1', () => {
 parse(`"a"`)
 parse(`"\\n"`)
 parse(`"\\""`)
+parse(`"\\u{10FFFF}"`)
 
 expect(() => parse(`"\\`)).toThrow()
 expect(() => parse(`"\n"`)).toThrow()
-expect(() => parse(`"\\uGGGG"`)).toThrow()
+expect(() => parse(`"\\u0000"`)).toThrow()
+expect(() => parse(`"\\u{G}"`)).toThrow()
+expect(() => parse(`"\\u{1234567}"`)).toThrow()
 })
 
 test('multiline string', () => {