History log of /openbmc/qemu/qobject/json-lexer.c (Results 1 – 25 of 41)
Revision Date Author Comments
Revision tags: v9.2.0, v9.1.2, v9.1.1, v9.1.0
# 886c0453 22-May-2023 Richard Henderson <richard.henderson@linaro.org>

Merge tag 'pull-qapi-2023-05-17-v2' of https://repo.or.cz/qemu/armbru into staging

QAPI patches for 2023-05-17

# -----BEGIN PGP SIGNATURE-----
#
# iQJGBAABCAAwFiEENUvIs9frKmtoZ05fOHC0AOuRhlMFAmRrTcgSHGFybWJydUBy
# ZWRoYXQuY29tAAoJEDhwtADrkYZTMycP/3sP6/U4kwOKMGGcB+n2pHJeioQS4xgF
# 94NCW+KpewxApP0XzIC2nDGjUe/rPcUfQmBNUumvYbqHO91tq91wFwkllBv2UR0q
# 6qfRji+e8+9H9hMDeVzzSNjlZZg/tSdIJlhkJDw1u4/3fpjfAmzVx6DO3wepSQ9Q
# m5Af/+uhVZWyUXMZqcKr2Zq8qur6ZFEBNpXpPvT60Tvy2heuQ+vcoE3tl2ZRQbmj
# b/jhtCu+NPjgOHtg9Gr2BPXqQiZBR4vFA7WBsB8wCf2xxULfTwHJvFz/e0vx5fUC
# q0Fsyybf4USo2PRMsRFv2v4dEuVGHb3E1RIJY4NTAxQMqqm4zfOyK0BzOGNDkxCn
# owNP4vKly0e/CfYDY74FHaPId295xyeo6S4Cj5ib9W23AAWUNt6f6vbjlDOLCLON
# c7yXP/aJwhTb2w1t0mLTmsKum3DpLlrudPudTylVlmYfwchkvUGsWYbaxu6H6XWk
# 49Ox/QPVwqG6elXNn3kTY4QqTAppXhE7QcPbioX9WOThVPf6aJCLdZSHEHu4HXkZ
# 4FRu73Z2wcPNB789xOrQoXs24GdKmWXQ6K01KC4v7WNJQBXccec52yGxvktQRZBm
# GL3zYdOOJEL+Y/8JrXTIo26M8HP/4kxV2VqB6KOuaGygMsW9w9jbG+ygLyjqUDQg
# 3APV3hdmVOht
# =6anf
# -----END PGP SIGNATURE-----
# gpg: Signature made Mon 22 May 2023 04:11:04 AM PDT
# gpg: using RSA key 354BC8B3D7EB2A6B68674E5F3870B400EB918653
# gpg: issuer "armbru@redhat.com"
# gpg: Good signature from "Markus Armbruster <armbru@redhat.com>" [undefined]
# gpg: aka "Markus Armbruster <armbru@pond.sub.org>" [undefined]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 354B C8B3 D7EB 2A6B 6867 4E5F 3870 B400 EB91 8653

* tag 'pull-qapi-2023-05-17-v2' of https://repo.or.cz/qemu/armbru:
docs/interop: Delete qmp-intro.txt
docs/interop/qmp-spec: Update error description for parsing errors
docs/interop: Convert qmp-spec.txt to rST
qapi: Improve error message for description following section

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>


# d5657258 15-May-2023 Peter Maydell <peter.maydell@linaro.org>

docs/interop: Convert qmp-spec.txt to rST

Convert the qmp-spec.txt document to restructuredText.
Notable points about the conversion:
* numbers at the start of section headings are removed, to match
the style of the rest of the manual
* cross-references to other sections or documents are hyperlinked
* various formatting tweaks (notably the examples, which need the
-> and <- prefixes so the QMP code-block lexer will accept them)
* English prose fixed in a few places

Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <20230515162245.3964307-2-peter.maydell@linaro.org>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
[.. code-block:: dumbed down to :: to work around CI failure]


Revision tags: v8.0.0, v7.2.0, v7.0.0, v6.2.0, v6.1.0, v5.2.0, v5.0.0, v4.2.0, v4.0.0, v4.0.0-rc1
# 199f8d94 26-Mar-2019 Peter Maydell <peter.maydell@linaro.org>

Merge remote-tracking branch 'remotes/armbru/tags/pull-misc-2019-03-26' into staging

Miscellaneous patches for 2019-03-26

# gpg: Signature made Tue 26 Mar 2019 07:10:23 GMT
# gpg: using RSA key 3870B400EB918653
# gpg: Good signature from "Markus Armbruster <armbru@redhat.com>" [full]
# gpg: aka "Markus Armbruster <armbru@pond.sub.org>" [full]
# Primary key fingerprint: 354B C8B3 D7EB 2A6B 6867 4E5F 3870 B400 EB91 8653

* remotes/armbru/tags/pull-misc-2019-03-26:
qapi/qmp-dispatch: fix return value in do_qmp_dispatch
json: Fix off-by-one assert check in next_state()
xen-block: Replace qdict_put_obj() by qdict_put() where appropriate
util/error: Remove an unnecessary NULL check

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>


# 19e8ff48 21-Mar-2019 Liam Merwick <liam.merwick@oracle.com>

json: Fix off-by-one assert check in next_state()

The assert checking if the value of lexer->state in next_state(),
which is used as an index to the 'json_lexer' array, incorrectly
checks for an index value less than or equal to ARRAY_SIZE(json_lexer).
Fix assert so that it just checks for an index less than the array size.

Signed-off-by: Liam Merwick <liam.merwick@oracle.com>
Message-Id: <1553169472-25325-1-git-send-email-liam.merwick@oracle.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Li Qiang <liq3ea@gmail.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
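
The off-by-one described above can be illustrated with a minimal sketch. It is not the QEMU code: `demo_table`, `state_index_ok()` and `demo_next_state()` are hypothetical names, and the three-state table merely stands in for the real 'json_lexer' array.

```c
#include <assert.h>
#include <stddef.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical three-state table standing in for the 'json_lexer' array. */
enum { DEMO_START, DEMO_IN_WORD, DEMO_DONE };

static const unsigned char demo_table[3][2] = {
    [DEMO_START]   = { DEMO_START, DEMO_IN_WORD },
    [DEMO_IN_WORD] = { DEMO_DONE,  DEMO_IN_WORD },
    [DEMO_DONE]    = { DEMO_DONE,  DEMO_DONE },
};

/* Valid indices are 0 .. ARRAY_SIZE - 1, so the comparison must be strict. */
static int state_index_ok(size_t state)
{
    return state < ARRAY_SIZE(demo_table);
}

static unsigned char demo_next_state(size_t state, int is_letter)
{
    assert(state_index_ok(state));      /* '<', not '<=' */
    return demo_table[state][is_letter ? 1 : 0];
}
```

With `<=`, an index equal to the array size would pass the assertion and then read one row past the end of the table.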


Revision tags: v4.0.0-rc0, v3.1.0, v3.1.0-rc5, v3.1.0-rc4, v3.1.0-rc3, v3.1.0-rc2, v3.1.0-rc1, v3.1.0-rc0, libfdt-20181002
# f69d20fa 25-Sep-2018 Peter Maydell <peter.maydell@linaro.org>

Merge remote-tracking branch 'remotes/armbru/tags/pull-qobject-2018-09-24' into staging

QObject patches for 2018-09-24

# gpg: Signature made Mon 24 Sep 2018 17:09:58 BST
# gpg: using RSA key 3870B400EB918653
# gpg: Good signature from "Markus Armbruster <armbru@redhat.com>"
# gpg: aka "Markus Armbruster <armbru@pond.sub.org>"
# Primary key fingerprint: 354B C8B3 D7EB 2A6B 6867 4E5F 3870 B400 EB91 8653

* remotes/armbru/tags/pull-qobject-2018-09-24:
json: Eliminate lexer state IN_WHITESPACE, pseudo-token JSON_SKIP
json: Eliminate lexer state IN_ERROR
json: Nicer recovery from lexical errors
json: Make lexer's "character consumed" logic less confusing
json: Clean up how lexer consumes "end of input"
json: Fix lexer for lookahead character beyond '\x7F'

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>


Revision tags: ppc-for-3.1-20180925, ppc-for-3.1-20180907
# 1e960b46 31-Aug-2018 Markus Armbruster <armbru@redhat.com>

json: Eliminate lexer state IN_WHITESPACE, pseudo-token JSON_SKIP

The lexer ignores whitespace like this:

             on whitespace      on non-ws   spontaneously
    IN_START --> IN_WHITESPACE --> JSON_SKIP --> IN_START
                    ^    |
                     \__/  on whitespace

This accumulates a whitespace token in state IN_WHITESPACE, only to
throw it away on the transition via JSON_SKIP to the start state.
Wasteful. Go from IN_START to IN_START on whitespace directly,
dropping the whitespace character.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <20180831075841.13363-7-armbru@redhat.com>
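
A rough sketch of the simplified behavior, under the assumption that whitespace in the start state is a self-loop that drops the character. The names `demo_state`, `demo_is_ws()` and `demo_first_token()` are hypothetical and do not appear in the QEMU sources.

```c
#include <stddef.h>

enum demo_state { DEMO_IN_START, DEMO_IN_WORD };

static int demo_is_ws(char ch)
{
    return ch == ' ' || ch == '\t' || ch == '\r' || ch == '\n';
}

/*
 * Copy the first token of @input into @out (capacity @cap, always
 * NUL-terminated when cap > 0) and return the number of bytes stored.
 */
static size_t demo_first_token(const char *input, char *out, size_t cap)
{
    enum demo_state state = DEMO_IN_START;
    size_t len = 0;

    for (; *input; input++) {
        if (state == DEMO_IN_START) {
            if (demo_is_ws(*input)) {
                continue;       /* IN_START -> IN_START, character dropped */
            }
            state = DEMO_IN_WORD;
        } else if (demo_is_ws(*input)) {
            break;              /* whitespace lookahead ends the token */
        }
        if (len + 1 < cap) {
            out[len++] = *input;
        }
    }
    if (cap > 0) {
        out[len] = '\0';
    }
    return len;
}
```

No whitespace token is ever accumulated here, which is the saving the commit message describes.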


# 2ce4ee64 31-Aug-2018 Markus Armbruster <armbru@redhat.com>

json: Eliminate lexer state IN_ERROR

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <20180831075841.13363-6-armbru@redhat.com>


# 0f07a5d5 31-Aug-2018 Markus Armbruster <armbru@redhat.com>

json: Nicer recovery from lexical errors

When the lexer chokes on an input character, it consumes the
character, emits a JSON error token, and enters its start state. This
can lead to suboptimal error recovery. For instance, input

0123 ,

produces the tokens

JSON_ERROR 01
JSON_INTEGER 23
JSON_COMMA ,

Make the lexer skip characters after a lexical error until a
structural character ('[', ']', '{', '}', ':', ','), an ASCII control
character, or '\xFE', or '\xFF'.

Note that we must not skip ASCII control characters, '\xFE', '\xFF',
because those are documented to force the JSON parser into known-good
state, by docs/interop/qmp-spec.txt.

The lexer now produces

JSON_ERROR 01
JSON_COMMA ,

Update qmp-test for the nicer error recovery: QMP now reports just one
error for input %p instead of two. Also drop the newline after %p; it
was needed to tease out the second error.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <20180831075841.13363-5-armbru@redhat.com>
[Conflict with commit ebb4d82d888 resolved]
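
The recovery rule above can be sketched as follows. This is an illustration only: `demo_stops_recovery()` and `demo_resume_index()` are hypothetical helpers, and "ASCII control character" is assumed to mean \x00..\x1F.

```c
#include <stddef.h>

/* Nonzero if error recovery must stop before @ch instead of skipping it. */
static int demo_stops_recovery(unsigned char ch)
{
    switch (ch) {
    case '[': case ']': case '{': case '}': case ':': case ',':
        return 1;               /* structural character */
    case 0xFE: case 0xFF:
        return 1;               /* bytes documented to reset the parser */
    default:
        return ch < 0x20;       /* assumed range of ASCII control characters */
    }
}

/*
 * Given the index @bad of the offending character in @input (length @len,
 * bad < len), return the index at which normal lexing resumes.
 */
static size_t demo_resume_index(const char *input, size_t len, size_t bad)
{
    size_t i = bad + 1;         /* the offending character is consumed */

    while (i < len && !demo_stops_recovery((unsigned char)input[i])) {
        i++;                    /* skipped, never turned into a token */
    }
    return i;
}
```

For the input `0123 ,` from the commit message, the lexer chokes at index 1 and this sketch resumes at the comma, so `23` no longer shows up as a separate JSON_INTEGER token.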


# c0ee3afa 31-Aug-2018 Markus Armbruster <armbru@redhat.com>

json: Make lexer's "character consumed" logic less confusing

The lexer uses macro TERMINAL_NEEDED_LOOKAHEAD() to decide whether a
state transition consumes the input character. It returns true when
the state transition is defined with the TERMINAL() macro. To detect
that, it checks whether input '\0' would have resulted in the same
state transition, and the new state is not IN_ERROR.

Why does that even work? For all states, the new state on input '\0'
is either IN_ERROR or defined with TERMINAL(). If the state
transition equals the one we'd get for input '\0', it goes to IN_ERROR
or to the argument of TERMINAL(). We never use TERMINAL(IN_ERROR),
because it makes no sense. Thus, if it doesn't go to IN_ERROR, it
must be defined with TERMINAL().

Since this isn't quite confusing enough, we negate the result to get
@char_consumed, and ignore it when @flush is true.

Instead of deriving the lookahead bit from the state transition, make
it explicit. This is easier to understand, and a bit more flexible,
too.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <20180831075841.13363-4-armbru@redhat.com>
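
One way to make the lookahead bit explicit is to store it as a flag in the transition value itself. The sketch below assumes that encoding; `DEMO_LOOKAHEAD`, `demo_transition()` and the tiny integer-only state set are hypothetical and much smaller than the real lexer.

```c
#include <stdint.h>

enum { DEMO_ERROR = 0, DEMO_START, DEMO_IN_DIGITS, DEMO_INTEGER };

/* Hypothetical flag: set when the transition does NOT consume the input. */
#define DEMO_LOOKAHEAD 0x80

/* Transition function for a toy lexer that only recognizes digit runs. */
static uint8_t demo_transition(uint8_t state, char ch)
{
    int digit = ch >= '0' && ch <= '9';

    switch (state) {
    case DEMO_START:
        return digit ? DEMO_IN_DIGITS : DEMO_ERROR;
    case DEMO_IN_DIGITS:
        /* A non-digit ends the token but stays in the input stream. */
        return digit ? DEMO_IN_DIGITS : (DEMO_INTEGER | DEMO_LOOKAHEAD);
    default:
        return DEMO_ERROR;
    }
}

static int demo_char_consumed(uint8_t transition)
{
    return !(transition & DEMO_LOOKAHEAD);
}

static uint8_t demo_new_state(uint8_t transition)
{
    return (uint8_t)(transition & 0x7F);    /* strip the lookahead flag */
}
```

The caller reads "consumed or not" directly from the flag instead of inferring it from what input '\0' would have done.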


# 852dfa76 31-Aug-2018 Markus Armbruster <armbru@redhat.com>

json: Clean up how lexer consumes "end of input"

When the lexer isn't in its start state at the end of input, it's
working on a token. To flush it out, it needs to transit to its start
state on "end of input" lookahead.

There are two ways to the start state, depending on the current state:

* If the lexer is in a TERMINAL(JSON_FOO) state, it can emit a
JSON_FOO token.

* Else, it can go to IN_ERROR state, and emit a JSON_ERROR token.

There are complications, however:

* The transition to IN_ERROR state consumes the input character and
adds it to the JSON_ERROR token. The latter is inappropriate for
the "end of input" character, so we suppress that. See also recent
commit a2ec6be72b8 "json: Fix lexer to include the bad character in
JSON_ERROR token".

* The transition to a TERMINAL(JSON_FOO) state doesn't consume the
input character. In that case, the lexer normally loops until it is
consumed. We have to suppress that for the "end of input" input
character. If we didn't, the lexer would consume it by entering
IN_ERROR state, emitting a bogus JSON_ERROR token. We fixed that in
commit bd3924a33a6.

However, simply breaking the loop this way assumes that the lexer
needs exactly one state transition to reach its start state. That
assumption is correct now, but it's unclean, and I'll soon break it.
Clean up: instead of breaking the loop after one iteration, break it
after it reached the start state.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <20180831075841.13363-3-armbru@redhat.com>


# 2a96042a 31-Aug-2018 Markus Armbruster <armbru@redhat.com>

json: Fix lexer for lookahead character beyond '\x7F'

The lexer fails to end a valid token when the lookahead character is
beyond '\x7F'. For instance, input

true\xC2\xA2

produces the tokens

JSON_ERROR true\xC2
JSON_ERROR \xA2

This should be

JSON_KEYWORD true
JSON_ERROR \xC2
JSON_ERROR \xA2

instead.

The culprit is

#define TERMINAL(state) [0 ... 0x7F] = (state)

It leaves [0x80..0xFF] zero, i.e. IN_ERROR. Has always been broken.
Fix it to initialize the complete array.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <20180831075841.13363-2-armbru@redhat.com>
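
The effect of the partial initializer can be reproduced in isolation. The sketch below relies on the GNU C range-designator extension (accepted by GCC and Clang), as the quoted macro does. `DEMO_TERMINAL_FIXED` shows one plausible way to cover the complete array and is an assumption, not necessarily the exact change that was committed; all `demo_` names are hypothetical.

```c
enum { DEMO_IN_ERROR = 0, DEMO_JSON_KEYWORD = 7 };

/* As quoted in the commit message: only 7-bit input is covered. */
#define DEMO_TERMINAL_BROKEN(state) [0 ... 0x7F] = (state)

/* Assumed fix for illustration: cover every possible byte value. */
#define DEMO_TERMINAL_FIXED(state)  [0 ... 0xFF] = (state)

/* One row of a transition table, indexed by the lookahead byte. */
static const unsigned char demo_broken_row[256] = {
    DEMO_TERMINAL_BROKEN(DEMO_JSON_KEYWORD)
};

static const unsigned char demo_fixed_row[256] = {
    DEMO_TERMINAL_FIXED(DEMO_JSON_KEYWORD)
};
```

In the broken row, the entry for lookahead byte 0xC2 is left at zero, i.e. the error state, which matches the `true\xC2\xA2` example above.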


# cc9821fa 25-Aug-2018 Peter Maydell <peter.maydell@linaro.org>

Merge remote-tracking branch 'remotes/armbru/tags/pull-qobject-2018-08-24' into staging

QObject patches for 2018-08-24

# gpg: Signature made Fri 24 Aug 2018 20:28:53 BST
# gpg: using RSA key 3870B400EB918653
# gpg: Good signature from "Markus Armbruster <armbru@redhat.com>"
# gpg: aka "Markus Armbruster <armbru@pond.sub.org>"
# Primary key fingerprint: 354B C8B3 D7EB 2A6B 6867 4E5F 3870 B400 EB91 8653

* remotes/armbru/tags/pull-qobject-2018-08-24: (58 commits)
json: Update references to RFC 7159 to RFC 8259
json: Support %% in JSON strings when interpolating
json: Improve safety of qobject_from_jsonf_nofail() & friends
json: Keep interpolation state in JSONParserContext
tests/drive_del-test: Fix harmless JSON interpolation bug
json: Clean up headers
qobject: Drop superfluous includes of qemu-common.h
json: Make JSONToken opaque outside json-parser.c
json: Unbox tokens queue in JSONMessageParser
json: Streamline json_message_process_token()
json: Enforce token count and size limits more tightly
qjson: Have qobject_from_json() & friends reject empty and blank
json: Assert json_parser_parse() consumes all tokens on success
json: Fix streamer not to ignore trailing unterminated structures
json: Fix latent parser aborts at end of input
qjson: Fix qobject_from_json() & friends for multiple values
json: Improve names of lexer states related to numbers
json: Replace %I64d, %I64u by %PRId64, %PRIu64
json: Leave rejecting invalid interpolation to parser
json: Pass lexical errors and limit violations to callback
...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>


# 86cdf9ec 23-Aug-2018 Markus Armbruster <armbru@redhat.com>

json: Clean up headers

The JSON parser has three public headers, json-lexer.h, json-parser.h,
json-streamer.h. They all contain stuff that is of no interest
outside qobject/json-*.c.

Collect the public interface in include/qapi/qmp/json-parser.h, and
everything else in qobject/json-parser-int.h.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <20180823164025.12553-54-armbru@redhat.com>


# 812ce33e 23-Aug-2018 Markus Armbruster <armbru@redhat.com>

qobject: Drop superfluous includes of qemu-common.h

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <20180823164025.12553-53-armbru@redhat.com>


# f9277915 23-Aug-2018 Markus Armbruster <armbru@redhat.com>

json: Fix streamer not to ignore trailing unterminated structures

json_message_process_token() accumulates tokens until it has the
sequence of tokens that comprise a single JSON value (it counts curly
braces and square brackets to decide). It feeds those token sequences
to json_parser_parse(). If a non-empty sequence of tokens remains at
the end of the parse, it's silently ignored. check-qjson.c cases
unterminated_array(), unterminated_array_comma(), unterminated_dict(),
unterminated_dict_comma() demonstrate this bug.

Fix as follows. Introduce a JSON_END_OF_INPUT token. When the
streamer receives it, it feeds the accumulated tokens to
json_parser_parse().

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <20180823164025.12553-46-armbru@redhat.com>
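
The bracket-counting scheme and the end-of-input token can be modelled with a few counters. This is a simplified sketch, not the streamer's actual code: `demo_streamer`, `demo_process_token()` and the token enum are hypothetical, and "handing tokens to the parser" is reduced to incrementing a counter.

```c
#include <stddef.h>

enum demo_token {
    DEMO_LCURLY, DEMO_RCURLY, DEMO_LSQUARE, DEMO_RSQUARE,
    DEMO_OTHER, DEMO_END_OF_INPUT
};

struct demo_streamer {
    int depth;          /* open braces plus open brackets */
    size_t pending;     /* tokens accumulated for the current value */
    size_t balanced;    /* sequences handed over once brackets balance */
    size_t incomplete;  /* unterminated sequences handed over at the end */
};

static void demo_process_token(struct demo_streamer *s, enum demo_token t)
{
    if (t == DEMO_END_OF_INPUT) {
        if (s->pending > 0) {
            s->incomplete++;    /* hand over instead of silently dropping */
            s->pending = 0;
            s->depth = 0;
        }
        return;
    }

    s->pending++;
    if (t == DEMO_LCURLY || t == DEMO_LSQUARE) {
        s->depth++;
    } else if (t == DEMO_RCURLY || t == DEMO_RSQUARE) {
        s->depth--;
    }
    if (s->depth <= 0) {        /* balanced (or stray closer): parse now */
        s->balanced++;
        s->pending = 0;
        s->depth = 0;
    }
}
```

Without the end-of-input case, the tokens of an unterminated `[ 1` would stay in `pending` forever and no error would ever be reported.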


# 4d400661 23-Aug-2018 Markus Armbruster <armbru@redhat.com>

json: Improve names of lexer states related to numbers

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <20180823164025.12553-43-armbru@redhat.com>


# f7617d45 23-Aug-2018 Markus Armbruster <armbru@redhat.com>

json: Leave rejecting invalid interpolation to parser

Both lexer and parser reject invalid interpolation specifications.
The parser's check is useless.

The lexer ends the token right after the first bad character. This
tends to lead to suboptimal error reporting. For instance, input

[ %04d ]

produces the tokens

JSON_LSQUARE [
JSON_ERROR %0
JSON_INTEGER 4
JSON_KEYWORD d
JSON_RSQUARE ]

The parser then yields an error, an object and two more errors:

error: Invalid JSON syntax
object: 4
error: JSON parse error, invalid keyword
error: JSON parse error, expecting value

Dumb down the lexer to accept [A-Za-z0-9]*. The parser's check is now
used. Emit a proper error there.

The lexer now produces

JSON_LSQUARE [
JSON_INTERP %04d
JSON_RSQUARE ]

and the parser reports just

JSON parse error, invalid interpolation '%04d'

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <20180823164025.12553-41-armbru@redhat.com>
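
The division of labor described above, a permissive lexer followed by a strict parser check, might look roughly like this. Both functions are hypothetical, and the list of accepted specifications is an illustrative subset chosen for the example, not the parser's real list.

```c
#include <stddef.h>
#include <string.h>

static int demo_is_alnum_ascii(char c)
{
    return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z') ||
           (c >= '0' && c <= '9');
}

/* Lexer side: length of a token of the form '%' [A-Za-z0-9]*, else 0. */
static size_t demo_lex_interp(const char *s)
{
    size_t n;

    if (s[0] != '%') {
        return 0;
    }
    n = 1;
    while (demo_is_alnum_ascii(s[n])) {
        n++;
    }
    return n;
}

/* Parser side: accept only known specifications (illustrative subset). */
static int demo_interp_is_valid(const char *tok, size_t len)
{
    static const char *const known[] = { "%d", "%s", "%p" };
    size_t i;

    for (i = 0; i < sizeof(known) / sizeof(known[0]); i++) {
        if (strlen(known[i]) == len && memcmp(tok, known[i], len) == 0) {
            return 1;
        }
    }
    return 0;
}
```

Because the lexer keeps `%04d` together as one token, the parser can name the whole bad specification in a single error message.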


# 84a56f38 23-Aug-2018 Markus Armbruster <armbru@redhat.com>

json: Pass lexical errors and limit violations to callback

The callback to consume JSON values takes QObject *json, Error *err.
If both are null, the callback is supposed to make up an error by
itself. This sucks.

qjson.c's consume_json() neglects to do so, which makes
qobject_from_json() return null instead of failing. I consider that a bug.

The culprit is json_message_process_token(): it passes two null
pointers when it runs into a lexical error or a limit violation. Fix
it to pass a proper Error object then. Update the callbacks:

* monitor.c's handle_qmp_command(): the code to make up an error is
now dead, drop it.

* qga/main.c's process_event(): lumps the "both null" case together
with the "not a JSON object" case. The former is now gone. The
error message "Invalid JSON syntax" is misleading for the latter.
Improve it to "Input must be a JSON object".

* qobject/qjson.c's consume_json(): no update; check-qjson
demonstrates qobject_from_json() now sets an error on lexical
errors, but still doesn't on some other errors.

* tests/libqtest.c's qmp_response(): the Error object is now reliable,
so use it to improve the error message.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <20180823164025.12553-40-armbru@redhat.com>
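
The callback contract after this change can be sketched as follows: a lexical error reaches the consumer with a non-null error object, so the "both null" branch is never taken. All types and names here (`demo_value`, `demo_error`, `demo_emit_fn`, `demo_consume()`) are stand-ins invented for the example, not QObject or Error, and the message text is made up.

```c
#include <stddef.h>

struct demo_value { int placeholder; };
struct demo_error { const char *message; };

typedef void demo_emit_fn(void *opaque, struct demo_value *json,
                          struct demo_error *err);

/* Streamer side: report a lexical error with a real error object. */
static void demo_emit_lexical_error(demo_emit_fn *emit, void *opaque)
{
    static struct demo_error err = { "demo: lexical error" };

    emit(opaque, NULL, &err);
}

struct demo_counts { int values, errors, neither; };

/* Consumer side: counts what it receives; 'neither' should stay at zero. */
static void demo_consume(void *opaque, struct demo_value *json,
                         struct demo_error *err)
{
    struct demo_counts *c = opaque;

    if (err) {
        c->errors++;
    } else if (json) {
        c->values++;
    } else {
        c->neither++;   /* the case callbacks used to paper over */
    }
}
```

A consumer written this way needs no code to invent an error of its own.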


# 2cbd15aa 23-Aug-2018 Markus Armbruster <armbru@redhat.com>

json: Treat unwanted interpolation as lexical error

The JSON parser optionally supports interpolation. The lexer
recognizes interpolation tokens unconditionally. The parser rejects
them when interpolation is disabled, in parse_interpolation().
However, it neglects to set an error then, which can make
json_parser_parse() fail without setting an error.

Move the check for unwanted interpolation from the parser's
parse_interpolation() into the lexer's finite state machine. When
interpolation is disabled, '%' is now handled like any other
unexpected character.

The next commit will improve how such lexical errors are handled.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <20180823164025.12553-39-armbru@redhat.com>


# 61030280 23-Aug-2018 Markus Armbruster <armbru@redhat.com>

json: Rename token JSON_ESCAPE & friends to JSON_INTERP

The JSON parser optionally supports interpolation. The code calls it
"escape". Awkward, because it uses the same term for escape sequences
within strings. The latter usage is consistent with RFC 8259 "The
JavaScript Object Notation (JSON) Data Interchange Format" and ISO C.
Call the former "interpolation" instead.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <20180823164025.12553-38-armbru@redhat.com>


# 037f2440 23-Aug-2018 Markus Armbruster <armbru@redhat.com>

json: Have lexer call streamer directly

json_lexer_init() takes the function to process a token as an
argument. It's always json_message_process_token(). Makes the code
harder to understand for no actual gain. Drop the indirection.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <20180823164025.12553-34-armbru@redhat.com>


# 7c1e1d54 23-Aug-2018 Marc-André Lureau <marcandre.lureau@redhat.com>

json: remove useless return value from lexer/parser

The lexer always returns 0 when char feeding. Furthermore, none of the
callers care about the return value.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20180326150916.9602-10-marcandre.lureau@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20180823164025.12553-32-armbru@redhat.com>


# b2da4a4d 23-Aug-2018 Markus Armbruster <armbru@redhat.com>

json: Leave rejecting invalid escape sequences to parser

Both lexer and parser reject invalid escape sequences in strings. The
parser's check is useless.

The lexer ends the token right after the first non-well-formed byte.
This tends to lead to suboptimal error reporting. For instance, input

{"abc\@ijk": 1}

produces the tokens

JSON_LCURLY {
JSON_ERROR "abc\@
JSON_KEYWORD ijk
JSON_ERROR ": 1}\n

The parser then reports three errors

Invalid JSON syntax
JSON parse error, invalid keyword 'ijk'
Invalid JSON syntax

before it recovers at the newline.

Drop the lexer's escape sequence checking, and make it accept the same
characters after backslash it accepts elsewhere in strings. It now
produces

JSON_LCURLY {
JSON_STRING "abc\@ijk"
JSON_COLON :
JSON_INTEGER 1
JSON_RCURLY }

and the parser reports just

JSON parse error, invalid escape sequence in string

While there, fix parse_string()'s inaccurate function comment.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <20180823164025.12553-27-armbru@redhat.com>


# 4b1c0cd7 23-Aug-2018 Markus Armbruster <armbru@redhat.com>

json: Accept overlong \xC0\x80 as U+0000 ("modified UTF-8")

Since the JSON grammar doesn't accept U+0000 anywhere, this merely
exchanges one kind of parse error for another. It's purely for
consistency with qobject_to_json(), which accepts \xC0\x80 (see commit
e2ec3f97680).

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <20180823164025.12553-26-armbru@redhat.com>


# de930f45 23-Aug-2018 Markus Armbruster <armbru@redhat.com>

json: Leave rejecting invalid UTF-8 to parser

Both the lexer and the parser (attempt to) validate UTF-8 in JSON
strings.

The lexer rejects bytes that can't occur in valid UTF-8: \xC0..\xC1,
\xF5..\xFF. This rejects some, but not all invalid UTF-8. It also
rejects ASCII control characters \x00..\x1F, in accordance with RFC
8259 (see recent commit "json: Reject unescaped control characters").

When the lexer rejects, it ends the token right after the first bad
byte. Good when the bad byte is a newline. Not so good when it's
something like an overlong sequence in the middle of a string. For
instance, input

{"abc\xC0\xAFijk": 1}\n

produces the tokens

JSON_LCURLY {
JSON_ERROR "abc\xC0
JSON_ERROR \xAF
JSON_KEYWORD ijk
JSON_ERROR ": 1}\n

The parser then reports four errors

Invalid JSON syntax
Invalid JSON syntax
JSON parse error, invalid keyword 'ijk'
Invalid JSON syntax

before it recovers at the newline.

The commit before previous made the parser reject invalid UTF-8
sequences. Since then, anything the lexer rejects, the parser would
reject as well. Thus, the lexer's rejecting is unnecessary for
correctness, and harmful for error reporting.

However, we want to keep rejecting ASCII control characters in the
lexer, because that produces the behavior we want for unclosed
strings.

We also need to keep rejecting \xFF in the lexer, because we
documented that as a way to reset the JSON parser
(docs/interop/qmp-spec.txt section 2.6 QGA Synchronization), which
means we can't change how we recover from this error now. I wish we
hadn't done that.

I think we should treat \xFE the same as \xFF.

Change the lexer to accept \xC0..\xC1 and \xF5..\xFD. It now rejects
only \x00..\x1F and \xFE..\xFF. Error reporting for invalid UTF-8 in
strings is much improved, except for \xFE and \xFF. For the example
above, the lexer now produces

JSON_LCURLY {
JSON_STRING "abc\xC0\xAFijk"
JSON_COLON :
JSON_INTEGER 1
JSON_RCURLY }

and the parser reports just

JSON parse error, invalid UTF-8 sequence in string

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <20180823164025.12553-25-armbru@redhat.com>
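
The byte ranges the lexer still rejects inside strings, \x00..\x1F and \xFE..\xFF, can be written down as a small predicate. `demo_lexer_rejects_in_string()` and `demo_first_rejected()` are hypothetical helpers for illustration and do not validate UTF-8, which the commit leaves to the parser.

```c
#include <stddef.h>

/* Nonzero if the lexer, per the ranges above, rejects @byte in a string. */
static int demo_lexer_rejects_in_string(unsigned char byte)
{
    return byte <= 0x1F || byte >= 0xFE;
}

/* Index of the first rejected byte in @s, or @len if every byte passes. */
static size_t demo_first_rejected(const unsigned char *s, size_t len)
{
    size_t i;

    for (i = 0; i < len; i++) {
        if (demo_lexer_rejects_in_string(s[i])) {
            break;
        }
    }
    return i;
}
```

The overlong sequence \xC0\xAF from the example passes this check untouched, so the string stays one token and the parser can report a single UTF-8 error.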
