Memory consumption improvements (less append-reallocs) #5
Conversation
@StarpTech:
Hi @iv-menshenin, thanks for the PR. Could you attach a before/after comparison produced with https://pkg.go.dev/golang.org/x/perf/cmd/benchstat? What I immediately notice is a significant drop in performance for smaller datasets. If that can't be fixed, it is unlikely that we will merge this.
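For reference, the usual benchstat workflow looks roughly like this (a sketch; the exact package paths and bench names depend on the repo):

```shell
# On the base branch: run benchmarks several times for statistical stability
go test -bench=. -benchmem -count=10 ./... > old.txt

# Switch to the PR branch, then run the same benchmarks again
go test -bench=. -benchmem -count=10 ./... > new.txt

# Compare the two runs; benchstat reports deltas with significance levels
benchstat old.txt new.txt
```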
I admit my mistake: the only thing I did was push the changes from fastjson to your repo, and I didn't notice some of the differences.
Also, I didn't want to re-immerse myself in the problem or do extensive testing. However, your request to attach a benchstat report prompted me to take action. After analyzing the results (completely unsatisfying), I realized that my case was very narrow, and I made the following refinements:
Tests clearly show that memory consumption is reduced by 30%, and sometimes 40%.
…d Object.getKV with linked list
```diff
- preAllocatedCacheSize = 409   // 32kb class size
- macAllocatedCacheSize = 1024
+ preAllocatedCacheSize = 341   // 32kb class size
+ maxAllocatedCacheSize = 10922 // 1MB
```
As you can see, I have worked on the constants: I calculated the 32KB starting class more accurately and increased the new-block limit to 1MB. At times this produces more allocations in the tests, but since memory was being over-allocated before, only operations that run before the buffer pool fills up are affected. I also added a linked list to object parsing, which further reduced memory consumption for huge JSON objects (those with many keys); previously I had only optimized arrays, which was my narrow case.
Quite right. What do you mean by “right report”?
As I mentioned in another PR, when processing large files we incur a large number of memory reallocations as slices repeatedly reach capacity.
In this PR I modified the benchmarks so that they give a more realistic estimate of memory use, relying only on the data generated in each particular test (I got rid of the shared pool).
I achieved the reduction in memory consumption by eliminating slice reallocations altogether, replacing them with a chain of linked lists.
Now for the proof of performance.
Here are the results of the tests
BEFORE
AFTER
As you can see, the number of memory allocations hasn't changed much, but the overall memory consumption (B/op) has decreased by almost a factor of three. This is especially evident in the tests with huge files.
I added a 20-megabyte file to the tests to make the effect more dramatic.