Memory consumption improvements (less append-reallocs) #5
Conversation
@StarpTech:
Hi @iv-menshenin, thanks for the PR. Could you attach a before/after comparison produced with https://pkg.go.dev/golang.org/x/perf/cmd/benchstat? What I immediately notice is a significant drop in performance for smaller datasets. If that can't be fixed, it is unlikely that we will merge this.
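For reference, the usual benchstat workflow looks roughly like this (a sketch; the exact package paths and bench names depend on the repo):

```shell
# On the base branch: run benchmarks several times for statistical stability
go test -bench=. -benchmem -count=10 ./... > old.txt

# Switch to the PR branch, then run the same benchmarks again
go test -bench=. -benchmem -count=10 ./... > new.txt

# Compare the two runs; benchstat reports deltas with significance levels
benchstat old.txt new.txt
```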
I admit my mistake: the only thing I did was push the changes from fastjson to your repo, and I didn't notice some of the differences.
Also, I didn't want to re-immerse myself in the problem or do extensive testing. However, your request to attach a benchstat report prompted me to take action. After analyzing the results (completely unsatisfying), I realized that my case was very narrow, and I made the following refinements:
Tests clearly show that memory consumption is reduced by 30%, and sometimes 40%.
…d Object.getKV with linked list
```diff
- preAllocatedCacheSize = 409   // 32kb class size
- macAllocatedCacheSize = 1024
+ preAllocatedCacheSize = 341   // 32kb class size
+ maxAllocatedCacheSize = 10922 // 1MB
```
As you can see, I have worked on the constants: I calculated the 32KB starting class more accurately and increased the new-block limit to 1MB. At times this produces more allocations in the tests, but since memory was being over-allocated before, only operations that run before the buffer pool fills up are affected. I also added a linked list to object parsing, which further reduced memory consumption for huge JSON objects (those with many keys); previously I had only optimized arrays, which was my narrow case.
Quite right. What do you mean by “right report”?
As I mentioned in another PR, when processing large files we incur a large number of memory reallocations as slices repeatedly reach capacity.
In this PR I modified the benchmarks so that they give a more realistic estimate of memory use, relying only on the data generated in each particular test (I got rid of the shared pool).
I achieved the reduction in memory consumption by eliminating slice reallocations altogether, replacing them with a chain of linked lists.
Now for the proof of performance.
Here are the results of the tests
BEFORE
AFTER
As you can see, the number of memory allocations hasn't changed much, but the overall memory consumption (B/op) has decreased by almost a factor of three. This is especially evident in the tests with huge files.
I added a 20-megabyte file to the tests to make the effect more dramatic.