Based on a discussion with @mike-lischke in db#1216.
I've found also a handful of possible issues where the grammar doesn't accept valid MySQL syntax.
I tested everything against real MySQL servers and checked the official MySQL docs to make sure I'm not crazy.
All of these work fine in actual MySQL but fail to parse with the grammar.
I've linked each one with MySQL documentation links to show they're legit.
Let me know if I'm missing something, or if I need to look into these edge cases more closely.
Thanks in advance.
1. DEFAULT keyword gets blocked in SET CHARSET
The charsetName rule has a version check that blocks DEFAULT in MySQL 8.0.11+:
charsetName
: textOrIdentifier
| BINARY_SYMBOL
| {this.isServerVersionLt80011()}? DEFAULT_SYMBOL // ← This blocks DEFAULT in 8.0.11+
;
So this valid SQL fails to parse:
SET CHARACTER SET DEFAULT;
SET CHARSET DEFAULT;
According to MySQL docs, DEFAULT works in all versions with no restrictions:
mysql> SET CHARSET DEFAULT;
Query OK, 0 rows affected (0.00 sec)
Works fine!
2. GEOMCOLLECTION alias is missing
The grammar only knows about GEOMETRYCOLLECTION but not the shorter GEOMCOLLECTION alias that MySQL accepts:
-- ❌ This fails
CREATE TABLE t (geo GEOMCOLLECTION);
-- ✅ This works
CREATE TABLE t (geo GEOMETRYCOLLECTION);
MySQL treats GEOMCOLLECTION as an alias for GEOMETRYCOLLECTION.
It's mentioned in the docs:
I tested it on MySQL 8.0:
mysql> CREATE TABLE test (g GEOMCOLLECTION);
Query OK, 0 rows affected (0.02 sec)
mysql> SHOW CREATE TABLE test\G
Create Table: CREATE TABLE `test` (
`g` geometrycollection DEFAULT NULL
) ENGINE=InnoDB
3. BIT literals like b'1' don't work
You can't use BIT literal syntax in the grammar:
-- ❌ This fails
CREATE TABLE t (
flags BIT(8) DEFAULT b'11110000'
);
The lexer sees b'1' as identifier b + string '1' instead of a single BIT literal.
It's mentioned in the docs:
I tested it on MySQL 8.0:
mysql> CREATE TABLE test (b BIT(8) DEFAULT b'11110000');
Query OK, 0 rows affected (0.02 sec)
mysql> SHOW CREATE TABLE test\G
Create Table: CREATE TABLE `test` (
`b` bit(8) DEFAULT b'11110000'
) ENGINE=InnoDB
4. JSON_TABLE reserved keyword blocks table names
JSON_TABLE became a reserved keyword in MySQL 8.0.14+ but the grammar blocks it from being used as a table/column name:
-- ❌ This fails
CREATE TABLE json_table (
id INT,
data JSON
);
According to MySQL docs, reserved keywords can be used as identifiers when properly quoted or in unambiguous contexts:
I tested it on MySQL 8.0:
mysql> CREATE TABLE json_table (id INT);
Query OK, 0 rows affected (0.02 sec)
mysql> SHOW TABLES;
+------------------+
| Tables_in_test |
+------------------+
| json_table |
+------------------+
5. Window function keywords blocked as identifiers
Window function keywords are treated as reserved in the grammar, but they're actually NON-RESERVED in MySQL 8.0+.
According to MySQL docs, these are all NON-RESERVED keywords introduced for window functions in MySQL 8.0.2+:
- MySQL 8.0 Keywords: https://dev.mysql.com/doc/refman/8.0/en/keywords.html
The docs explicitly list these as nonreserved (all added in 8.0.2):
- CUME_DIST (nonreserved since 8.0.2)
- DENSE_RANK (nonreserved since 8.0.2)
- FIRST_VALUE (nonreserved since 8.0.2)
- LAG (nonreserved since 8.0.2)
- LAST_VALUE (nonreserved since 8.0.2)
- LEAD (nonreserved since 8.0.2)
- NTH_VALUE (nonreserved since 8.0.2)
- NTILE (nonreserved since 8.0.2)
- PERCENT_RANK (nonreserved since 8.0.2)
- RANK (nonreserved since 8.0.2)
- ROW_NUMBER (nonreserved since 8.0.2)
- MySQL 8.0 Window Functions: https://dev.mysql.com/doc/refman/8.0/en/window-functions.html
Tested on MySQL 8.0:
mysql> CREATE TABLE t_keywords (
-> lag varchar(256),
-> lead varchar(256),
-> rank varchar(256),
-> row_number varchar(256),
-> first_value varchar(256)
-> );
Query OK, 0 rows affected (0.02 sec)
mysql> SHOW CREATE TABLE t_keywords\G
*************************** 1. row ***************************
Table: t_keywords
Create Table: CREATE TABLE t_keywords (
lag varchar(256) DEFAULT NULL,
lead varchar(256) DEFAULT NULL,
rank varchar(256) DEFAULT NULL,
row_number varchar(256) DEFAULT NULL,
first_value varchar(256) DEFAULT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4
6. BACK_TICK_QUOTED_ID doesn't handle escaped backticks correctly
The lexer rule for backtick-quoted identifiers doesn't properly support the standard MySQL escape sequence for backticks:
BACK_TICK_QUOTED_ID
: BACK_TICK (({this.isBackTickQuotedId()}? '\\')? .)*? BACK_TICK
;
This rule only handles backslash escaping (`), but MySQL's primary escape method is doubling the backtick (``), which is not supported.
Failing SQL:
-- ❌ This fails (doubled backtick escape - standard MySQL method)
CREATE TABLE t (```c1``` INT);
Working SQL:
-- ✅ This works (backslash escape - non-standard)
CREATE TABLE t (`\`c1\`` INT);
Why it's wrong:
According to MySQL docs, both escape methods are valid, but doubled backticks are the documented standard:
Tested on MySQL 8.0:
mysql> CREATE TABLE t (```c1``` INT);
Query OK, 0 rows affected (0.02 sec)
mysql> SHOW CREATE TABLE t\G
*************************** 1. row ***************************
Table: t
Create Table: CREATE TABLE `t` (
``c1`` int DEFAULT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4
7. VECTOR data type is missing (MySQL 9.0+)
The grammar doesn't support the VECTOR data type introduced in MySQL
9.0 for AI/ML embeddings:
-- ❌ This fails
CREATE TABLE vectors (
id INT PRIMARY KEY,
embedding VECTOR(2048)
);
The VECTOR(n) type is used to store vector embeddings for similarity
search and AI/ML workloads. It was introduced in MySQL 9.0.0.
According to MySQL docs:
Tested on MySQL 9.0:
mysql> CREATE TABLE test (
-> id INT PRIMARY KEY,
-> embedding VECTOR(128)
-> );
Query OK, 0 rows affected (0.02 sec)
mysql> SHOW CREATE TABLE test\G
*************************** 1. row ***************************
Table: test
Create Table: CREATE TABLE `test` (
`id` int NOT NULL,
`embedding` vector(128) DEFAULT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4
Based on a discussion with @mike-lischke in db#1216.
I've found also a handful of possible issues where the grammar doesn't accept valid MySQL syntax.
I tested everything against real MySQL servers and checked the official MySQL docs to make sure I'm not crazy.
All of these work fine in actual MySQL but fail to parse with the grammar.
I've linked each one with MySQL documentation links to show they're legit.
Let me know if I'm missing something, or if I need to look into these edge cases more closely.
Thanks in advance.
1. DEFAULT keyword gets blocked in SET CHARSET
The
charsetNamerule has a version check that blocksDEFAULTin MySQL 8.0.11+:So this valid SQL fails to parse:
According to MySQL docs,
DEFAULTworks in all versions with no restrictions:MySQL 8.0 docs: https://dev.mysql.com/doc/refman/8.0/en/set-character-set.html
Says the syntax is:
SET {CHARSET | CHARACTER SET} {'charset_name' | DEFAULT}No mention of any version restrictions.
MySQL 8.4 docs: https://dev.mysql.com/doc/refman/8.4/en/set-character-set.html (same syntax)
MySQL 9.0 docs: https://dev.mysql.com/doc/refman/9.0/en/set-character-set.html (same syntax)
MySQL 8.0.11 changelog: https://dev.mysql.com/doc/relnotes/mysql/8.0/en/news-8-0-11.html
No mention of any breaking changes to SET CHARSET.
Works fine!
2. GEOMCOLLECTION alias is missing
The grammar only knows about
GEOMETRYCOLLECTIONbut not the shorterGEOMCOLLECTIONalias that MySQL accepts:MySQL treats
GEOMCOLLECTIONas an alias forGEOMETRYCOLLECTION.It's mentioned in the docs:
I tested it on MySQL 8.0:
3. BIT literals like b'1' don't work
You can't use BIT literal syntax in the grammar:
The lexer sees
b'1'as identifierb+ string'1'instead of a single BIT literal.It's mentioned in the docs:
I tested it on MySQL 8.0:
4. JSON_TABLE reserved keyword blocks table names
JSON_TABLE became a reserved keyword in MySQL 8.0.14+ but the grammar blocks it from being used as a table/column name:
According to MySQL docs, reserved keywords can be used as identifiers when properly quoted or in unambiguous contexts:
I tested it on MySQL 8.0:
5. Window function keywords blocked as identifiers
Window function keywords are treated as reserved in the grammar, but they're actually NON-RESERVED in MySQL 8.0+.
According to MySQL docs, these are all NON-RESERVED keywords introduced for window functions in MySQL 8.0.2+:
The docs explicitly list these as nonreserved (all added in 8.0.2):
Tested on MySQL 8.0:
mysql> CREATE TABLE t_keywords (
-> lag varchar(256),
-> lead varchar(256),
-> rank varchar(256),
-> row_number varchar(256),
-> first_value varchar(256)
-> );
Query OK, 0 rows affected (0.02 sec)
mysql> SHOW CREATE TABLE t_keywords\G
*************************** 1. row ***************************
Table: t_keywords
Create Table: CREATE TABLE
t_keywords(lagvarchar(256) DEFAULT NULL,leadvarchar(256) DEFAULT NULL,rankvarchar(256) DEFAULT NULL,row_numbervarchar(256) DEFAULT NULL,first_valuevarchar(256) DEFAULT NULL) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4
6. BACK_TICK_QUOTED_ID doesn't handle escaped backticks correctly
The lexer rule for backtick-quoted identifiers doesn't properly support the standard MySQL escape sequence for backticks:
This rule only handles backslash escaping (`), but MySQL's primary escape method is doubling the backtick (``), which is not supported.
Failing SQL:
Working SQL:
Why it's wrong:
According to MySQL docs, both escape methods are valid, but doubled backticks are the documented standard:
Says: "Within a quoted identifier, to include a backtick character, double it."
Example:
SELECTidFROMtable ``` creates identifieridfrom table `table`Tested on MySQL 8.0:
7. VECTOR data type is missing (MySQL 9.0+)
The grammar doesn't support the
VECTORdata type introduced in MySQL9.0 for AI/ML embeddings:
The VECTOR(n) type is used to store vector embeddings for similarity
search and AI/ML workloads. It was introduced in MySQL 9.0.0.
According to MySQL docs:
MySQL 9.0 VECTOR Data Type:
https://dev.mysql.com/doc/refman/9.0/en/vector.html
Syntax: VECTOR(dimensions)
Example: CREATE TABLE items (v VECTOR(3));
MySQL 9.0 What's New:
https://dev.mysql.com/doc/relnotes/mysql/9.0/en/news-9-0-0.html
Lists VECTOR as a new data type for storing fixed-length arrays of
floating-point values.
Tested on MySQL 9.0: