-
Notifications
You must be signed in to change notification settings - Fork 99
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Wasm buffer manager support #4523
base: master
Are you sure you want to change the base?
Conversation
34bacfa
to
39aa152
Compare
The answer appears to be that |
1160c49
to
3fb91cb
Compare
Benchmark ResultMaster commit hash:
|
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## master #4523 +/- ##
=======================================
Coverage 87.24% 87.24%
=======================================
Files 1356 1356
Lines 56755 56754 -1
Branches 7078 7078
=======================================
+ Hits 49515 49517 +2
+ Misses 7068 7065 -3
Partials 172 172 ☔ View full report in Codecov by Sentry. 🚨 Try these New Features:
|
3fb91cb
to
aaca54a
Compare
Benchmark ResultMaster commit hash:
|
@@ -1,4 +1,5 @@ | |||
-DATASET CSV tck | |||
-BUFFER_POOL_SIZE 268435456 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this due to recursive joins?
} | ||
#endif | ||
|
||
#if defined(_WIN32) && !BM_MALLOC |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The macro here is becoming more a bit hard to follow now. I wonder if we should choose to duplicate the code a bit to separate them more clearly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With the second #if/#else
removed is it sufficiently clear?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep 👍
} catch (AccessViolation& exc) { | ||
// If we encounter an acess violation within the VM region, | ||
// the page was decomitted by another thread | ||
// and is no longer valid memory | ||
#if BM_MALLOC |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How can this be true when #if defined(_WIN32) && !BM_MALLOC
is true?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I added those at two different times.
I'm not sure if accessing freed memory may cause an access violation on Windows, in which case optimisticRead won't be safe even when using malloc unless we handle those access violations (at least, those that occur within the frame).
The CI is passing with it disabled at the moment, which makes me wonder if it's not covered by tests, or if it in practice just isn't happening (maybe it doesn't actually happen at all, or it's just that freed memory that's been re-used wouldn't cause the access violation).
I think I'll just remove this block for now and leave the access violation handling disabled with BM_MALLOC
.
aaca54a
to
ba93738
Compare
} | ||
#endif | ||
|
||
#if defined(_WIN32) && !BM_MALLOC |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep 👍
#if BM_MALLOC | ||
uint8_t* getPage() const { return page.get(); } | ||
uint8_t* allocatePage(uint64_t pageSize) { | ||
page = std::make_unique<uint8_t[]>(pageSize); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should check allocation failure here to avoid seg faults.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Interestingly in webassembly the tests are compiled with the default ABORTING_MALLOC=1, which aborts the webassembly process if it runs out of memory instead of returning 0, but outside of the tests we're compiling with ALLOW_MEMORY_GROWTH
, which has the default of having malloc return 0 on failure.
So I think we should set ABORTING_MALLOC=0
for the tests just so it's handled the same (not that we have any tests where we expect this yet).
ba93738
to
41a859b
Compare
Pages are stored in the page state structure; optimistic reads work as normal, except we need to handle the page pointer possibly being null
Should have been lowered before; with the higher limit most tests never do buffer manager evictions
41a859b
to
6d30abf
Compare
Benchmark ResultMaster commit hash:
|
This adds a simple alternative implementation to the BufferManager that allocates pages using malloc (via unique_ptr) and stores the page pointer in the PageState object.
Performance
There is a small performance penalty, pinning pages appears to be ~2-5x slower with this method, however even when doing a lot of new pins (i.e. where it's also allocating; there should be little difference for pins of pages which are already in memory) the overall cost is minimal.
E.g. summing the float edge property in the LDBC datagen-9_0-fb dataset took ~700ms using the normal buffer manager and ~900ms using this variant (obviously not a webassembly build; I overrode the macro used to enable/disable this feature).
Interestingly, it was actually faster when comparing the following query on the much larger graph500-30 dataset:
kuzu kugraph500-30 -d512 --noprogressbar --read_only <<< 'MATCH (v:N)-[e:E]->(:N) RETURN SUM(v.ID);'
, where the original runtime was ~4s, and improved to ~2.8s with this change (also on a machine with 128 threads; the difference may be that this approach scales better or has less of a setup cost, even though the allocation cost is higher; I'll look into it and try and see what the difference is).Outstanding Webassembly bugs
Unfortunately it appears that this is not the issue with the web assembly tests which were running out of memory. They are still running out of memory with this change, so I have left them disabled; as far as I can tell the huge memory allocations in those tests are unrelated to the BufferManager/MemoryManager.
Other changes of note
I also lowered the default buffer pool size for testing back to 64MB. This should have been lowered before, but I guess was missed after it was raised when I was adding ColumnChunk memory to the tracked limit. With the higher limit most tests never do buffer manager evictions (including some of the tests which are disabled on webassembly due to this memory issue).