Skip to content

Potential issues in the Oracle MySQL grammar #89

@vsantonastaso

Description

@vsantonastaso

Based on a discussion with @mike-lischke in db#1216.

I've found also a handful of possible issues where the grammar doesn't accept valid MySQL syntax.
I tested everything against real MySQL servers and checked the official MySQL docs to make sure I'm not crazy.

All of these work fine in actual MySQL but fail to parse with the grammar.
I've linked each one with MySQL documentation links to show they're legit.
Let me know if I'm missing something, or if I need to look into these edge cases more closely.
Thanks in advance.

1. DEFAULT keyword gets blocked in SET CHARSET

The charsetName rule has a version check that blocks DEFAULT in MySQL 8.0.11+:

charsetName
    : textOrIdentifier
    | BINARY_SYMBOL
    | {this.isServerVersionLt80011()}? DEFAULT_SYMBOL  // ← This blocks DEFAULT in 8.0.11+
    ;

So this valid SQL fails to parse:

SET CHARACTER SET DEFAULT;
SET CHARSET DEFAULT;

According to MySQL docs, DEFAULT works in all versions with no restrictions:

mysql> SET CHARSET DEFAULT;
Query OK, 0 rows affected (0.00 sec)

Works fine!

2. GEOMCOLLECTION alias is missing

The grammar only knows about GEOMETRYCOLLECTION but not the shorter GEOMCOLLECTION alias that MySQL accepts:

-- ❌ This fails
CREATE TABLE t (geo GEOMCOLLECTION);

-- ✅ This works
CREATE TABLE t (geo GEOMETRYCOLLECTION);

MySQL treats GEOMCOLLECTION as an alias for GEOMETRYCOLLECTION.
It's mentioned in the docs:

I tested it on MySQL 8.0:

mysql> CREATE TABLE test (g GEOMCOLLECTION);
Query OK, 0 rows affected (0.02 sec)

mysql> SHOW CREATE TABLE test\G
Create Table: CREATE TABLE `test` (
  `g` geometrycollection DEFAULT NULL
) ENGINE=InnoDB

3. BIT literals like b'1' don't work

You can't use BIT literal syntax in the grammar:

-- ❌ This fails
CREATE TABLE t (
    flags BIT(8) DEFAULT b'11110000'
);

The lexer sees b'1' as identifier b + string '1' instead of a single BIT literal.

It's mentioned in the docs:

I tested it on MySQL 8.0:

mysql> CREATE TABLE test (b BIT(8) DEFAULT b'11110000');
Query OK, 0 rows affected (0.02 sec)

mysql> SHOW CREATE TABLE test\G
Create Table: CREATE TABLE `test` (
  `b` bit(8) DEFAULT b'11110000'
) ENGINE=InnoDB

4. JSON_TABLE reserved keyword blocks table names

JSON_TABLE became a reserved keyword in MySQL 8.0.14+ but the grammar blocks it from being used as a table/column name:

-- ❌ This fails
CREATE TABLE json_table (
    id INT,
    data JSON
);

According to MySQL docs, reserved keywords can be used as identifiers when properly quoted or in unambiguous contexts:

I tested it on MySQL 8.0:

mysql> CREATE TABLE json_table (id INT);
Query OK, 0 rows affected (0.02 sec)

mysql> SHOW TABLES;
+------------------+
| Tables_in_test   |
+------------------+
| json_table       |
+------------------+

5. Window function keywords blocked as identifiers

Window function keywords are treated as reserved in the grammar, but they're actually NON-RESERVED in MySQL 8.0+.

According to MySQL docs, these are all NON-RESERVED keywords introduced for window functions in MySQL 8.0.2+:

  • MySQL 8.0 Keywords: https://dev.mysql.com/doc/refman/8.0/en/keywords.html
    The docs explicitly list these as nonreserved (all added in 8.0.2):
    • CUME_DIST (nonreserved since 8.0.2)
    • DENSE_RANK (nonreserved since 8.0.2)
    • FIRST_VALUE (nonreserved since 8.0.2)
    • LAG (nonreserved since 8.0.2)
    • LAST_VALUE (nonreserved since 8.0.2)
    • LEAD (nonreserved since 8.0.2)
    • NTH_VALUE (nonreserved since 8.0.2)
    • NTILE (nonreserved since 8.0.2)
    • PERCENT_RANK (nonreserved since 8.0.2)
    • RANK (nonreserved since 8.0.2)
    • ROW_NUMBER (nonreserved since 8.0.2)
  • MySQL 8.0 Window Functions: https://dev.mysql.com/doc/refman/8.0/en/window-functions.html

Tested on MySQL 8.0:
mysql> CREATE TABLE t_keywords (
-> lag varchar(256),
-> lead varchar(256),
-> rank varchar(256),
-> row_number varchar(256),
-> first_value varchar(256)
-> );
Query OK, 0 rows affected (0.02 sec)

mysql> SHOW CREATE TABLE t_keywords\G
*************************** 1. row ***************************
Table: t_keywords
Create Table: CREATE TABLE t_keywords (
lag varchar(256) DEFAULT NULL,
lead varchar(256) DEFAULT NULL,
rank varchar(256) DEFAULT NULL,
row_number varchar(256) DEFAULT NULL,
first_value varchar(256) DEFAULT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4

6. BACK_TICK_QUOTED_ID doesn't handle escaped backticks correctly

The lexer rule for backtick-quoted identifiers doesn't properly support the standard MySQL escape sequence for backticks:

BACK_TICK_QUOTED_ID
     : BACK_TICK (({this.isBackTickQuotedId()}? '\\')? .)*? BACK_TICK
     ;

This rule only handles backslash escaping (`), but MySQL's primary escape method is doubling the backtick (``), which is not supported.

Failing SQL:

-- ❌ This fails (doubled backtick escape - standard MySQL method)
CREATE TABLE t (```c1``` INT);

Working SQL:

-- ✅ This works (backslash escape - non-standard)
  CREATE TABLE t (`\`c1\`` INT);

Why it's wrong:

According to MySQL docs, both escape methods are valid, but doubled backticks are the documented standard:

Tested on MySQL 8.0:

  mysql> CREATE TABLE t (```c1``` INT);
  Query OK, 0 rows affected (0.02 sec)

  mysql> SHOW CREATE TABLE t\G
  *************************** 1. row ***************************
         Table: t
  Create Table: CREATE TABLE `t` (
    ``c1`` int DEFAULT NULL
  ) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4

7. VECTOR data type is missing (MySQL 9.0+)

The grammar doesn't support the VECTOR data type introduced in MySQL
9.0 for AI/ML embeddings:

-- ❌ This fails
CREATE TABLE vectors (
    id INT PRIMARY KEY,
    embedding VECTOR(2048)
);

The VECTOR(n) type is used to store vector embeddings for similarity
search and AI/ML workloads. It was introduced in MySQL 9.0.0.

According to MySQL docs:

Tested on MySQL 9.0:

mysql> CREATE TABLE test (
    ->   id INT PRIMARY KEY,
    ->   embedding VECTOR(128)
    -> );
Query OK, 0 rows affected (0.02 sec)

mysql> SHOW CREATE TABLE test\G
*************************** 1. row ***************************
       Table: test
Create Table: CREATE TABLE `test` (
  `id` int NOT NULL,
  `embedding` vector(128) DEFAULT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions