Bug: StreamParser fails when language tags are split across chunk boundaries
The StreamParser throws Unexpected "-nl" when a language tag like @nl-nl happens to be split across two HTTP response chunks (e.g. @nl in one chunk, -nl in the next).
Parsing the same data as a complete string with Parser.parse() works fine.
Reproduction
The issue occurs when streaming large SPARQL CONSTRUCT responses (~12 MB+) through StreamParser via fetch-sparql-endpoint. The exact split point varies between runs because HTTP chunking is non-deterministic, making the error intermittent.
import {StreamParser, Parser} from 'n3';
// Non-streaming: always works
const parser = new Parser({format: 'text/turtle'});
const quads = parser.parse(largeTurtleString); // OK
// Streaming: intermittently fails with 'Unexpected "-nl"'
const streamParser = new StreamParser({format: 'text/turtle'});
responseStream.pipe(streamParser); // ERROR at random positions
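Because the failure depends on where the HTTP chunks happen to split, a deterministic split may be easier to debug. The sketch below is a hypothetical helper (`splitInsideTag` is not part of n3) that cuts a Turtle string in two so that the boundary falls inside a language tag. It assumes the cause is as described above; I have not confirmed that this exact split triggers the error.

```javascript
// Hypothetical helper (not part of n3): split `text` into two chunks so that
// the boundary falls inside the first occurrence of `tag`, `offset`
// characters into the tag.
function splitInsideTag(text, tag = '@nl-nl', offset = 3) {
  const start = text.indexOf(tag);
  if (start === -1) throw new Error(`Tag ${tag} not found in input`);
  const cut = start + offset;
  return [text.slice(0, cut), text.slice(cut)];
}

// Example data is made up for illustration.
const turtle = '<http://example.org/s> <http://example.org/p> "hallo"@nl-nl .\n';
const [head, tail] = splitInsideTag(turtle);
// head now ends with '@nl' and tail starts with '-nl'
```

Writing the two parts to a StreamParser one after the other (streamParser.write(head); streamParser.write(tail); streamParser.end();) should then show whether the error reproduces without involving HTTP at all.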
Environment
- n3 v2.0.3
- Node.js v24
- Data contains literals with @nl-nl language tags
- The SPARQL server (QLever) returns N-Triples content with Content-Type: text/turtle, so fetch-sparql-endpoint passes text/turtle as the format to StreamParser. N-Triples is valid Turtle, but the Turtle tokenizer may handle chunk boundaries differently.
Expected behaviour
The StreamParser should buffer incomplete tokens across chunk boundaries, recognising that @nl at the end of a chunk is an incomplete language tag when followed by -nl in the next chunk.
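To make the expected buffering concrete, here is a minimal sketch of a scanner that holds back a language tag whenever it touches the end of the current buffer. This is not n3's lexer: `LangTagScanner` and both regular expressions are hypothetical, and the scanner is deliberately simplified (it treats every @ as the start of a tag and ignores strings, IRIs and directives such as @prefix).

```javascript
// Hypothetical sketch, not n3's actual lexer.
const TAG_CHARS = /@[a-zA-Z0-9-]*/y;              // any run that could still grow into a tag
const LANGTAG = /@[a-zA-Z]+(?:-[a-zA-Z0-9]+)*/y;  // a complete language tag

class LangTagScanner {
  constructor() {
    this.buffer = '';
    this.tags = [];
  }

  // Feed one chunk; pass final = true for the last chunk of the stream.
  push(chunk, final = false) {
    this.buffer += chunk;
    let pos = 0;
    while (true) {
      const at = this.buffer.indexOf('@', pos);
      if (at === -1) { pos = this.buffer.length; break; }
      TAG_CHARS.lastIndex = at;
      const runEnd = at + TAG_CHARS.exec(this.buffer)[0].length;
      // The candidate reaches the end of the buffer: it may continue in the
      // next chunk, so keep it buffered instead of emitting it now.
      if (runEnd === this.buffer.length && !final) { pos = at; break; }
      LANGTAG.lastIndex = at;
      const match = LANGTAG.exec(this.buffer);
      if (match) this.tags.push(match[0]);
      pos = match ? at + match[0].length : at + 1;
    }
    this.buffer = this.buffer.slice(pos);
  }
}

const scanner = new LangTagScanner();
scanner.push('"hallo"@nl');      // tag touches the chunk end: held back
scanner.push('-nl .\n', true);   // next chunk completes the tag
console.log(scanner.tags);       // [ '@nl-nl' ]
```

The key design choice is that the end of a chunk is never treated as the end of a token unless the stream itself has ended.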
Observed behaviour
The parser treats the chunk boundary as a token boundary, yielding @nl as a complete language tag, then encounters -nl as the start of the next token and throws:
Unexpected "-nl" on line 64895.
The line number and occurrence vary between runs depending on how HTTP chunks are split.
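The observed behaviour matches what a tokenizer would produce if it scanned each chunk in isolation. The snippet below is only an illustration of that failure mode, using a hypothetical `tokenizeChunk` function and a toy token grammar; it makes no claim about how n3's lexer is actually implemented.

```javascript
// Hypothetical per-chunk tokenizer (not n3's lexer): each chunk is scanned on
// its own, so the end of a chunk is implicitly treated as the end of a token.
// Toy grammar: quoted string | language tag | any other run of non-whitespace.
const TOKEN = /"[^"]*"|@[a-zA-Z]+(?:-[a-zA-Z0-9]+)*|\S+/g;

function tokenizeChunk(chunk) {
  return chunk.match(TOKEN) ?? [];
}

const firstChunkTokens = tokenizeChunk('"hallo"@nl');
const secondChunkTokens = tokenizeChunk('-nl .');
// firstChunkTokens ends with the truncated tag '@nl';
// secondChunkTokens starts with the stray '-nl' that a parser would reject.
```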