Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unify json serialization by using rapidjson #2960

Open
wants to merge 7 commits into
base: master
Choose a base branch
from

Conversation

aarranz
Copy link
Contributor

@aarranz aarranz commented Jul 20, 2017

Current implementation of the Context Broker has a mixture of code for serializing JSON structures. I'm proposing to change it to use rapidjson.

Current pull request code is confirmed to fix bug #2955 and it's passing all the unit test (including a new test for stressing bug #2955). I'm currently passing the valgrind tests (at least 0000_https_support/https and 0000_ipv6_support/ipv4_ipv6_both are failing) so probably this pull request has to be updated.

Anyway, it also simplifies current context broker code by removing the need to deal with JSON commas, different function for serializing JSON, ...

It probably fix (or can easily be extended to fix) bugs #2938, #2425, #2959 as well as other bugs regarding JSON serialization.

@fgalan
Copy link
Member

fgalan commented Jul 20, 2017

Impressive work... +/- 6000 changed lines ;)

One quick question, before analyzing the work and providing more in-deep feedback. You say:

I'm currently passing the valgrind tests [...]

Do you mean valgrind suite (i.e. make valgrind) or functional test (i.e. make ft) ?

@aarranz
Copy link
Contributor Author

aarranz commented Jul 20, 2017

I mean make valgrind and it's taking a lot of time (not finished yet, currently running test 440). I know that there are functional tests not passed when using the make valgrind command. I will run make ft after this first round using make valgrind.

These are the reported errors until now:

Test 023/927: 0000_https_support/https .............................................................................................................. OK (15.10 seconds) (no leak but ftest error 11)
Test 024/927: 0000_ipv6_support/ipv4_ipv6_both ...................................................................................................... OK (76.73 seconds) (no leak but ftest error 1)
Test 144/927: 0000_statistics_operation/statistics_with_semaphores .................................................................................. OK (55.44 seconds) (no leak but ftest error 1)
Test 369/927: 0876_attribute_dates/filter_attr_datecreated_sub ...................................................................................... OK (91.25 seconds) (no leak but ftest error 1)
Test 370/927: 0876_attribute_dates/filter_attr_datemodified_sub ..................................................................................... OK (82.35 seconds) (no leak but ftest error 1)

@fgalan
Copy link
Member

fgalan commented Jul 20, 2017

The valgrind regression is time intensive. As figure, it could take around 2-3 hours 5-6 hours in our reference CI platform.

For the future, do make ft first. The valgring regression is typically done after all the functional test (and unit test) are ok so the only thing that is pending is to check about memory leaks and memory access errors.

@fgalan
Copy link
Member

fgalan commented Jul 25, 2017

It probably fix (or can easily be extended to fix) bugs #2938, #2425, #2959 as well as other bugs regarding JSON serialization.

I think the reference to #2959 is a typo. Actually #2959 is a fix in the valgrind test framework...

@fgalan
Copy link
Member

fgalan commented Jul 26, 2017

Finally, we have had some time to have a look to this PR a bit closer.

First, thanks for your contribution! In some sense, it addresses the same old technical debt topic related with improving responses payload rendering described in #1298. Moreover, the PR has adopted a smarter approach, removing some structures (iostreams) that raised concern in some developers about performance in #1298. In addition, the removal of many internal returning of std::string between render methods (in favor of JsonHelper/writer reference passing) is also a good point.

Let’s consider the two different families in the NGSI API, i.e. NGSIv1 and NGSIv2.

Regarding NGSIv2, I understand this PR doesn’t change anything from a functional point of view. In other words, all the NGSIv2 .test should pass regression ok, even concerning Content-Length in response payload. However, a rendering-intensive performance test is required, in order to see that performance is not decreasing (or even better, is increasing :)

Regarding NGSIv1, it is a bit more complicated… The work in this PR collides with the payload simplification work already started at haderding/remove_ngsiv1_indent branch (yes, branch name has a typo ;). Although probably the work done in the present PR would reach at the end a similar result to the one in that branch, we have paid an extreme care to backward compatibility in it, including the “NGSIv1 indent paranoid” CLI flag, that we don’t want to lose. In fact, most probably current PR is not passing NGSIv1 .tests as they are now. Have a look to #2760 for more detail on haderding/remove_ngsiv1_indent work.

Thus, our proposal is to reduce the scope of this PR (maybe in a new fresh PR, leaving the present one to be closed), limiting to NGSIv2 renders by the moment. Once you have done that, let’s ensure that: 1) all .test (both NGSIv1 and NGSIv2) pass without changes, 2) the result of an NGSIv2 rendering-intensive performance test is positive (that test have to be carefully designed), 3) there is not conflict with haderding/remove_ngsiv1_indent work.

In a second iteration, once haderding/remove_ngsiv1_indent gets consolidated into master, the same approach (i.e. using rapidjson to render NGSIv1 responses) could be done. But that’s a separate piece of work for the future.

In addition to the above, the PR has some code style issues (see contribution guidelines for more detail). @kzangeli could elaborate on that… he knows more on style than me :)

One final note… the PR is HUGE. It is very difficult to review it in deep. In fact, we are not sure of having capture all the valuable feedback or missing potential problems with it :) Reducing the scope to just NGSIv2 could be also a good idea to make the review task easier. Moreover, maybe is a good idea to start with a first little PR addressing the NGSIv2 (covering just a small subset of render methods, corresponding to just one NGSIv2 operation) and progressing step by step in successive easy to review PRs.

@@ -0,0 +1,239 @@
/*
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This file is already part of radidjson (typically, /usr/local/include/rapidjson/internal/dtoa.h). Maybe it could be used from the library instead of including it as Orion library?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have included this file to change how rapidjson render numbers (I am working now on creating a PR in rapidjson). I think is a bug to use a different syntax when rendering float and integer numbers. Take into account that JSON only uses one type for numbers (numbers are always floating numbers, there is not a different syntax for integer numbers).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The rendering problem is exposed when rendering an integer number. E.g. if you use the Double method provided by rapidjson to render a 4 number it is rendered as: 4.0. This syntax is required by some languages (like C/C++) to mark it as a floating number, but in JSON it is not required, so it would be better written as 4, and is what is usually done by other libraries (in fact, rapidjson is the only library I know that serialises numbers using this C/C++ pattern)

}


void JsonHelper::Null() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First of all, great job!

A minor detail, this line here (and many lines that follow), doesn't follow the Orion style guide (see https://github.com/telefonicaid/fiware-orion/blob/master/doc/manuals/contribution_guidelines.md).

As you will see in that doc, and throughout the code, we put brackets ({}) on a line of its own, all functions have a function header and functions/methods are camelCase, starting with lowercase (null, int, etc may give problems with the compiler for being reserved words ... - if so, an exception-decision needs to be made, and added to the style guide).

Please read the style guide and fix these minor details.
[ I wont add a github-comment for each occurrence here, but instead do a re-check after this has been fixed ]

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, I will fix those problems.

if (indent == 0) {
writer.Bool(value);
} else {
pwriter.Bool(value);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure it is possible, but it would be great to replace all these if (indent == 0) { writer } else { pwriter } for a single if upon creation of the writer/pwriter object. Polymorhphism ... (this wont change during the rendering of one JSON object - it's either pretty or not). If this can be done, the code would gain a lot in readability and number of source code lines.

Copy link
Member

@kzangeli kzangeli Jul 26, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, a doubt about writer/pwriter.
This code now seems to be prepared to either do "pretty printing" or "no whitespace".
How/where is the mode selected?

It would be great (personal opinion) to have a URI parameter like ?output=formated or something similar to let the client select. This is something we've been thinking about for a long time ...

However, not to be included in this very PR, it is very big already.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also hate those if sentences. The problem is that polymorhphism cannot be used in this case due rapidjson providing the Writer and PrettyWriter classes using class templates. Moreover, currently, rapidjson is not marking the required methods as virtual (and it is very difficult to use that scheme using class templates, due how class templates are implemented in C++).

Regarding supporting the ?output=formated seems easy to implement. But I also agree with you regarding implementing it after cleaning and merging this PR.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@aarranz remember we have a pending talk. Don't continue coding on this PR without talking with me first ;)

@fgalan
Copy link
Member

fgalan commented Dec 3, 2018

This PR cannot be considered stale. It shows how rapidjson can be used to render JSON payload and, in that sense, it is useful.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants