src/cpp-common: add bt2_common::parseJson() functions (listener mode)
This patch adds the bt2_common::parseJson() functions in
`parse-json.hpp`.

Those functions wrap the file-internal
`bt2_common::internal::JsonParser` class of which an instance can parse
a single JSON value, calling specific methods of a JSON event listener
as it processes. Internally, `bt2_common::internal::JsonParser` uses a
string scanner (`bt2_common::StrScanner`).
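To illustrate what "listener mode" means here, the following is a minimal sketch of a parser which calls listener methods as it processes the text, passing the offset of each value. It is not the actual `bt2_common::internal::JsonParser`: `parseSketch()`, `parseSketchVal()`, `skipWs()`, `tryChar()`, and `SketchRecorder` are hypothetical names, the sketch only understands `null`, unsigned integers, and arrays, and it reports a plain offset instead of a `bt2_common::TextLoc`.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>

// Skips spaces and newlines starting at `at`.
inline void skipWs(const std::string& text, std::size_t& at)
{
    while (at < text.size() && (text[at] == ' ' || text[at] == '\n')) {
        ++at;
    }
}

// Consumes `c` if it's the character at `at`.
inline bool tryChar(const std::string& text, std::size_t& at, const char c)
{
    if (at < text.size() && text[at] == c) {
        ++at;
        return true;
    }

    return false;
}

// Parses one value at `at`, reporting what it finds to `listener`.
template <typename ListenerT>
void parseSketchVal(const std::string& text, std::size_t& at, ListenerT& listener)
{
    assert(at <= text.size());
    skipWs(text, at);

    const std::size_t offset = at;

    if (text.compare(at, 4, "null") == 0) {
        at += 4;
        listener.onNull(offset);
    } else if (tryChar(text, at, '[')) {
        listener.onArrayBegin(offset);
        skipWs(text, at);

        if (!tryChar(text, at, ']')) {
            do {
                parseSketchVal(text, at, listener);
                skipWs(text, at);
            } while (tryChar(text, at, ','));

            if (!tryChar(text, at, ']')) {
                throw std::runtime_error("expecting `,` or `]` at offset " +
                                         std::to_string(at));
            }
        }

        // `at - 1` is the offset of the closing `]`.
        listener.onArrayEnd(at - 1);
    } else {
        unsigned long long val = 0;

        // No overflow check here, to keep the sketch short.
        while (at < text.size() && text[at] >= '0' && text[at] <= '9') {
            val = val * 10 + static_cast<unsigned long long>(text[at] - '0');
            ++at;
        }

        if (at == offset) {
            throw std::runtime_error("expecting a value at offset " +
                                     std::to_string(at));
        }

        listener.onScalarVal(val, offset);
    }
}

// Parses the single value at the beginning of `text` (text following
// that value isn't checked in this sketch).
template <typename ListenerT>
void parseSketch(const std::string& text, ListenerT& listener)
{
    std::size_t at = 0;

    parseSketchVal(text, at, listener);
}

// Example listener: records each event with its offset.
struct SketchRecorder
{
    std::string events;

    void add(const std::string& what, const std::size_t offset)
    {
        events += what + "@" + std::to_string(offset) + " ";
    }

    void onNull(const std::size_t offset) { add("null", offset); }
    void onArrayBegin(const std::size_t offset) { add("[", offset); }
    void onArrayEnd(const std::size_t offset) { add("]", offset); }

    void onScalarVal(const unsigned long long val, const std::size_t offset)
    {
        add(std::to_string(val), offset);
    }
};
```

The listener never sees a tree: it only receives events in document order, so the user decides what to build, if anything, from them.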

In searching for a simple JSON parsing solution, I could not find, as of
this date, any project which satisfies the following requirements out of
the box:

* Is well-known, well documented, and well tested.

* Has an MIT-compatible license.

* Parses both unsigned and signed 64-bit integers (range
  -9,223,372,036,854,775,808 to 18,446,744,073,709,551,615).

* Provides an exact text location (offset, line number, column number)
  on parsing error (through logging and the message of an error cause).

* Provides an exact text location (offset, line number, column number)
  for each parsed value.
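As an illustration of the last two requirements, here's a minimal sketch of how an offset, a line number, and a column number can be derived from a byte offset within the parsed text. `SketchTextLoc` and `locate()` are hypothetical names for this example, not the actual `bt2_common::TextLoc` API, and the sketch assumes zero-based numbering with columns counted in bytes.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical location type for this sketch (not `bt2_common::TextLoc`).
struct SketchTextLoc
{
    std::size_t offset; // bytes from the beginning of the text
    std::size_t lineNo; // zero-based line number
    std::size_t colNo;  // zero-based column number, in bytes
};

// Returns the location of the byte at `offset` within `text`.
inline SketchTextLoc locate(const std::string& text, const std::size_t offset)
{
    assert(offset <= text.size());

    SketchTextLoc loc {offset, 0, 0};

    for (std::size_t i = 0; i < offset; ++i) {
        if (text[i] == '\n') {
            // New line: the column restarts.
            ++loc.lineNo;
            loc.colNo = 0;
        } else {
            ++loc.colNo;
        }
    }

    return loc;
}
```

A real scanner would rather update the line and column numbers incrementally as it consumes characters instead of rescanning from the beginning for each value.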

I believe the text locations are essential as this JSON parser will be
used to decode CTF2‑SPECRC‑4.0 [1] auxiliary and metadata streams:
because Babeltrace 2 will be a reference implementation of CTF 2, it
makes sense to make an effort to pinpoint the exact location of
syntactic and semantic errors.

More specifically:

* JSON for Modern C++ (by Niels Lohmann) [2] doesn't support text
  location access, although there's a pending pull request (draft as of
  this date) to add such support [3].

* The exceptions of JsonCpp [4] don't contain a text location, only a
  message.

* SimpleJSON [5] doesn't offer text location access and seems to be an
  archived project.

* RapidJSON [6] doesn't offer text location access.

* yajl [7] could offer some form of text location access (offset, at
  least) with yajl_get_bytes_consumed(), remembering the last offset on
  our side, although I don't know how nicely it would play with
  whitespace.

  That being said, regarding integers, the `yajl_callbacks`
  structure [8] only contains a `yajl_integer` function pointer which
  receives a `long long` value (no direct 64-bit unsigned integer
  support). It's possible to set the `yajl_number` callback for any
  number, but the `yajl_double` callback gets disabled in that case, and
  the callback receives a string which needs further parsing on our
  side: this is pretty much what `bt2_common::StrScanner` implements
  anyway.
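The further parsing mentioned above, that is, turning the text of a JSON number into an unsigned 64-bit integer, a signed 64-bit integer, or a real number, could look like the following sketch. It relies on the standard `std::strtoull()`, `std::strtoll()`, and `std::strtod()` functions. `SketchNum` and `parseNum()` are hypothetical names, the input is assumed to already match the JSON number grammar, `long long` is assumed to be 64 bits wide, and this is not necessarily how `bt2_common::StrScanner` does it.

```cpp
#include <cassert>
#include <cerrno>
#include <cstdint>
#include <cstdlib>
#include <stdexcept>
#include <string>

// Hypothetical result type for this sketch: only the member matching
// `kind` is meaningful.
struct SketchNum
{
    enum class Kind
    {
        UInt,
        SInt,
        Real,
    };

    Kind kind;
    std::uint64_t uInt;
    std::int64_t sInt;
    double real;
};

// Converts `str`, assumed to be a valid JSON number, to a `SketchNum`.
//
// Throws `std::out_of_range` if an integer doesn't fit in 64 bits
// (assuming a 64-bit `long long`).
inline SketchNum parseNum(const std::string& str)
{
    assert(!str.empty());

    SketchNum num {SketchNum::Kind::Real, 0, 0, 0.};

    if (str.find_first_of(".eE") != std::string::npos) {
        // Fraction or exponent part: real number.
        num.real = std::strtod(str.c_str(), nullptr);
        return num;
    }

    errno = 0;

    if (str[0] == '-') {
        // Negative integer: signed range.
        num.kind = SketchNum::Kind::SInt;
        num.sInt = std::strtoll(str.c_str(), nullptr, 10);
    } else {
        // Nonnegative integer: unsigned range.
        num.kind = SketchNum::Kind::UInt;
        num.uInt = std::strtoull(str.c_str(), nullptr, 10);
    }

    if (errno == ERANGE) {
        throw std::out_of_range("integer out of 64-bit range: " + str);
    }

    return num;
}
```

Choosing the signed conversion only for negative numbers is what makes the whole range from -9,223,372,036,854,775,808 to 18,446,744,073,709,551,615 available.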

At this point I stopped searching as I already had a working and tested
string scanner and, as you can see, without comments, `parse-json.hpp`
is only 231 lines of effective code and satisfies all the requirements
above.

You can test bt2_common::parseJson() with a simple program like this:

    #include <iostream>
    #include <string>

    #include "parse-json.hpp"

    struct Printer
    {
        void onNull(const bt2_common::TextLoc&)
        {
            std::cout << "null\n";
        }

        template <typename ValT>
        void onScalarVal(const ValT& val, const bt2_common::TextLoc&)
        {
            std::cout << val << '\n';
        }

        void onArrayBegin(const bt2_common::TextLoc&)
        {
            std::cout << "[\n";
        }

        void onArrayEnd(const bt2_common::TextLoc&)
        {
            std::cout << "]\n";
        }

        void onObjBegin(const bt2_common::TextLoc&)
        {
            std::cout << "{\n";
        }

        void onObjKey(const std::string& key,
                      const bt2_common::TextLoc&)
        {
            std::cout << key << ": ";
        }

        void onObjEnd(const bt2_common::TextLoc&)
        {
            std::cout << "}\n";
        }
    };

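    int main(const int argc, const char * const * const argv)
    {
        // Require the JSON text as the first command-line argument.
        if (argc < 2) {
            std::cerr << "Usage: " << argv[0] << " JSON\n";
            return 1;
        }

        Printer printer;

        bt2_common::parseJson(argv[1], printer);
    }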

Then:

    $ ./test-parse-json 23
    $ ./test-parse-json '"\u03c9 represents angular velocity"'
    $ ./test-parse-json '{"salut": [23, true, 42.4e-9, {"meow": null}]}'
    $ ./test-parse-json 18446744073709551615
    $ ./test-parse-json -9223372036854775808

Also try some parsing errors:

    $ ./test-parse-json '{"salut": [false, 42.4e-9, "meow": null}]}'
    $ ./test-parse-json 18446744073709551616
    $ ./test-parse-json -9223372036854775809
    $ ./test-parse-json '"invalid \u8dkf codepoint"'

[1]: https://diamon.org/ctf/files/CTF2-SPECRC-4.0.html
[2]: https://github.com/nlohmann/json
[3]: nlohmann/json#3165
[4]: https://github.com/open-source-parsers/jsoncpp
[5]: https://github.com/nbsdx/SimpleJSON
[6]: https://rapidjson.org/
[7]: https://github.com/lloyd/yajl
[8]: https://lloyd.github.io/yajl/yajl-2.1.0/structyajl__callbacks.html

Signed-off-by: Philippe Proulx <eeppeliteloop@gmail.com>
Change-Id: Id32c2b64723ca50b044369c424fe046c0a183cce
Reviewed-on: https://review.lttng.org/c/babeltrace/+/7411
eepp authored and simark committed Aug 3, 2022
1 parent f6feeb9 commit 685f5b4
Showing 2 changed files with 543 additions and 1 deletion.
4 changes: 3 additions & 1 deletion src/cpp-common/Makefile.am
@@ -16,6 +16,8 @@ libcppcommon_la_SOURCES = \
 text-loc.hpp text-loc.cpp \
 text-loc-str.hpp text-loc-str.cpp \
 uuid-view.hpp \
-str-scanner.hpp str-scanner.cpp
+text-loc.hpp text-loc.cpp \
+str-scanner.hpp str-scanner.cpp \
+parse-json.hpp

 EXTRA_DIST = bt2