Return more-specific error when input might be application/json-seq
JSON Text Sequences, aka JSON-SEQ, aka application/json-seq, are defined in
https://datatracker.ietf.org/doc/html/rfc7464. Per the RFC, the format is:

   any number of JSON texts, each encoded in UTF-8 [RFC3629],
   each preceded by one ASCII RS character, and each followed by a line
   feed (LF).
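
As a rough illustration of the framing quoted above, the sketch below wraps one JSON text as RS + text + LF. The helper name `json_seq_frame` and the macro `JSON_SEQ_RS` are hypothetical and are not part of jq; the caller is assumed to pass a NUL-terminated, UTF-8 encoded JSON text.

```c
#include <stddef.h>
#include <string.h>

/* ASCII Record Separator, the per-record prefix named by RFC 7464. */
#define JSON_SEQ_RS 0x1e

/*
 * Hypothetical helper (not jq code): frame one JSON text as
 * RS + text + LF, NUL-terminating the result for convenience.
 * Returns the number of framed bytes written (excluding the NUL),
 * or 0 if the output buffer is too small.
 */
size_t json_seq_frame(const char *json_text, char *out, size_t out_cap) {
    size_t n = strlen(json_text);
    if (out_cap < n + 3) {          /* RS + text + LF + NUL */
        return 0;
    }
    out[0] = JSON_SEQ_RS;
    memcpy(out + 1, json_text, n);
    out[n + 1] = '\n';
    out[n + 2] = '\0';
    return n + 2;
}
```

Framing the 10-byte text `{"a": 123}` this way should yield 12 bytes: the RS byte, the text, and the trailing line feed.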

jq supports this format but requires the --seq parameter to be used in order to
correctly parse it. If the option is omitted, then an ambiguous and confusing
error message is printed. The RFC is designed to avoid this ambiguity:

   Since RS is an ASCII control character, it may only
   appear in JSON strings in escaped form (see [RFC7159]), and since RS
   may not appear in JSON texts in any other form, RS unambiguously
   delimits the start of any element in the sequence.  RS is sufficient
   to unambiguously delimit all top-level JSON value types other than
   numbers.

This change adds ASCII RS character (0x1e) detection when --seq is omitted, and
prints a useful error message recommending to retry with the option.
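
The detection idea can be sketched outside of jq as follows. This is a minimal standalone approximation, not the parser change itself: the function name `json_seq_hint`, its whitespace-skipping behavior, and its hint text are assumptions made for illustration. It reports a hint when the first non-whitespace byte of a buffer is RS (0x1e).

```c
#include <stddef.h>

/*
 * Hypothetical helper (not jq code): return a hint string if the first
 * non-whitespace byte of buf is the ASCII Record Separator, else NULL.
 * RS cannot begin a plain JSON text, so seeing it there suggests the
 * input is an application/json-seq stream.
 */
const char *json_seq_hint(const char *buf, size_t len) {
    for (size_t i = 0; i < len; i++) {
        unsigned char c = (unsigned char)buf[i];
        if (c == ' ' || c == '\t' || c == '\n' || c == '\r') {
            continue;               /* skip JSON insignificant whitespace */
        }
        if (c == 0x1e) {
            return "input may be application/json-seq; try --seq";
        }
        return NULL;                /* some other first byte: no hint */
    }
    return NULL;                    /* empty or whitespace-only input */
}
```

A caller could consult such a check before reporting a generic parse error, which mirrors the intent of this change without claiming to match how the jq parser is structured.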

Fixes #3156.
LPardue committed Oct 17, 2024
1 parent 562d5c5 commit a89cfa6
Showing 2 changed files with 6 additions and 0 deletions.
2 changes: 2 additions & 0 deletions src/jv_parse.c
@@ -514,6 +514,8 @@ static pfunc check_literal(struct jv_parser* p) {
case 'f': pattern = "false"; plen = 5; v = jv_false(); break;
case '\'':
return "Invalid string literal; expected \", but got '";
case 0x1e:
return "Record Separator (RS) detected, this might be application/json-seq. Try using the --seq option.";
case 'n':
// if it starts with 'n', it could be a literal "nan"
if (p->tokenbuf[1] == 'u') {
4 changes: 4 additions & 0 deletions tests/jq.test
@@ -2191,6 +2191,10 @@ try fromjson catch .
"{'a': 123}"
"Invalid string literal; expected \", but got ' at line 1, column 5 (while parsing '{'a': 123}')"

try fromjson catch .
"\u001e{\"a\": 123}"
"Record Separator (RS) detected, this might be application/json-seq. Try using the --seq option. at line 1, column 2 (while parsing '\u001e{\"a\": 123}')"

# ltrimstr/1 rtrimstr/1 don't leak on invalid input #2977

try ltrimstr(1) catch "x", try rtrimstr(1) catch "x" | "ok"