Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ChunkedList: store pointers to next/last elements instead of indices … #48

Conversation

michaelsippel
Copy link
Member

…to save overhead of address calculation and allow more efficient packing of ItemAccess

  • this also reintroduces the chunk-size as template parameter instead of storing it as member
  • initialize AtomicList with chunk_capacity, not allocation size
  • AtomicList: construct Item/ItemControlBlock from memory::Block
  • ChunkedList: construct Chunk from memory::Block
  • reorder ChunkedList::Item struct for more efficient alignment
  • encode has_item of ItemAccess into one 8-byte pointer using bitmagic

@michaelsippel
Copy link
Member Author

Here I'm still getting warnings which I dont understand

warning: 'short unsigned int __atomic_fetch_sub_2(volatile void*, short unsigned int, int)' writing 2 bytes into a region of size 0 overflows the destination [-Wstringop-overflow=]
  645 |       { return __atomic_fetch_sub(&_M_i, __i, int(__m)); }

@psychocoderHPC
Copy link
Member

Here I'm still getting warnings which I dont understand

warning: 'short unsigned int __atomic_fetch_sub_2(volatile void*, short unsigned int, int)' writing 2 bytes into a region of size 0 overflows the destination [-Wstringop-overflow=]
  645 |       { return __atomic_fetch_sub(&_M_i, __i, int(__m)); }

Could it be a hint that some data are unaligned?

@psychocoderHPC
Copy link
Member

@michaelsippel Is this PR a replacement for #45?

Copy link
Member

@psychocoderHPC psychocoderHPC left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO we still have the double free issue I tried to solve with #45 and we have the problem that more than one thread can append a new chunk.

#include <limits>
#include <memory>
#include <optional>
#include <redGrapes/util/trace.hpp>
#include <redGrapes/memory/allocator.hpp>
#include <redGrapes/memory/bump_allocator.hpp>
//#include <redGrapes/memory/bump_allocator.hpp>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

?? why is this commented out?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

commented because its not needed. Allocator is a template parameter. will be removed

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please remove the line.

}

~Chunk()
{
for( unsigned i = 0; i < last_idx; ++i )
items()[i].~Item();
for( Item * item = first_item; item != last_item; item++ )
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

last_item is pointing by default to the element before the first item. If we never allocate memory but create a chunked_list does it mean we loop over non existing memory?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe for( Item * item = first_item; item <= last_item ; item++ )?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True, currently this bug doesnt manifest because a chunk is only allocated if at least one element exists , but when we switch to the algorithm where the thread with the last slot allocates the new chunk, it can happen that a chunk remains empty and thus creating an endless loop.

typename memory::AtomicList< Chunk, Allocator >::MutBackwardIterator chunk;

/* this pointer packs the address of the current element
* and the `has_element` bit in its MSB.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

MSB ?? What is this?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

MSB = most significant bit

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please add this to the first usage

@@ -441,7 +438,7 @@ struct ChunkedList
void release_chunk( typename memory::AtomicList< Chunk, Allocator >::MutBackwardIterator chunk )
{
if( chunk->item_count.fetch_sub(1) == 0 )
chunks.erase( chunk );
chunks.erase( chunk );
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If a thread is entering this line and at the same moment another thread is pushing an item and is increasing the counter and sees that no item is free and then releases the memory again we will have a double free.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in this version we increment item_count only after we are safe to have gotten a slot. Only if the increment of item_count happens first and then decrements this is relevant

}
}

release_chunk(chunk);
}

auto prev_chunk = chunks.allocate_item();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How do we prevent that more than one thread is creating new memory?

@michaelsippel
Copy link
Member Author

michaelsippel commented Dec 12, 2023

Yes you are right, this PR does not include the fix for the double allocation / memory leak yet , #45 is still neccesary, either one needs to be rebased

@psychocoderHPC
Copy link
Member

psychocoderHPC commented Dec 12, 2023

I opened a PR against your PR to solve the double free and alloc issue: michaelsippel#1
My local tests works well, please have a look.
As I wrote before it is not necessary that #45 get merged if this PR is solving the big issue with the double alloc and free.

…to save overhead of address calculation and allow more efficient packing of ItemAccess

* this also reintroduces the chunk-size as template parameter instead of storing it as member
* initialize AtomicList with `chunk_capacity`, not allocation size
* AtomicList: construct Item/ItemControlBlock from `memory::Block`
* ChunkedList: construct Chunk from `memory::Block`
* reorder ChunkedList::Item struct for more efficient alignment
* encode `has_item` of ItemAccess into one 8-byte pointer using bitmagic
Equally to
ComputationalRadiationPhysics#45 this
PR should solve the possible double free and double alloc.
@michaelsippel michaelsippel mentioned this pull request Dec 13, 2023
3 tasks
@psychocoderHPC psychocoderHPC merged commit 794431a into ComputationalRadiationPhysics:dev Dec 13, 2023
1 check passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants