Assertion Failure: assert (n < gc_heap::n_heaps) #106898

Open
mrsharm opened this issue Aug 23, 2024 · 3 comments

@mrsharm
Member

mrsharm commented Aug 23, 2024

Description

The following is the call stack for the assertion failure assert (n < gc_heap::n_heaps), hit in the Linux DATAS run of the finalization scenario with STRESS_DYNAMIC_HEAP_COUNT enabled, using the Reliability Framework.

E.g., for dump_11995:

0:274> k
# Child-SP          RetAddr               Call Site
00 00007fb4`fcf8f2b0 00007ff8`7eb46e30     libc_so!wait4+0x57
01 00007fb4`fcf8f2e0 00007ff8`7eb480d8     libcoreclr!PROCCreateCrashDump+0x410 [/home/vwazho/runtime/src/coreclr/pal/src/thread/process.cpp @ 2308] 
02 00007fb4`fcf8f350 00007ff8`7eb4548d     libcoreclr!PROCCreateCrashDumpIfEnabled+0xad8 [/home/vwazho/runtime/src/coreclr/pal/src/thread/process.cpp @ 15732480] 
03 00007fb4`fcf8f3f0 00007ff8`7eb4538b (T) libcoreclr!PROCAbort+0x2d [/home/vwazho/runtime/src/coreclr/pal/src/thread/process.cpp @ 2559] 
04 00007fb4`fcf8f410 00007ff8`7e98aea0 (T) libcoreclr!RaiseFailFastException+0x66 [/home/vwazho/runtime/src/coreclr/pal/src/thread/process.cpp @ 1276] 
05 00007fb4`fcf8f430 00007ff8`7e98aca5     libcoreclr!FailFastOnAssert+0x1b [/home/vwazho/runtime/src/coreclr/utilcode/debug.cpp @ 63] 
06 00007fb4`fcf8f440 00007ff8`7e98af24     libcoreclr!_DbgBreakCheck+0x2d5 [/home/vwazho/runtime/src/coreclr/utilcode/debug.cpp @ 15732480] 
07 00007fb4`fcf904f0 00007ff8`7e98b1b3     libcoreclr!_DbgBreakCheckNoThrow+0x84 [/home/vwazho/runtime/src/coreclr/utilcode/debug.cpp @ 15732480] 
08 00007fb4`fcf90570 00007ff8`7e8357f0     libcoreclr!DbgAssertDialog+0x73 [/home/vwazho/runtime/src/coreclr/utilcode/debug.cpp @ 15732480] 
09 (Inline Function) --------`--------     libcoreclr!SVR::GCHeap::GetHeap+0x5d [/home/vwazho/runtime/src/coreclr/gc/gc.cpp @ 50773] 
0a 00007fb4`fcf905b0 00007ff8`7e835705     libcoreclr!SVR::gc_heap::wait_for_gc_done+0x90 [/usr/include/x86_64-linux-gnu/bits/stdint-uintn.h @ 14647] 
0b 00007fb4`fcf905f0 00007ff8`7e850758     libcoreclr!SVR::WaitLongerNoInstru+0x55 [/home/vwazho/runtime/src/coreclr/gc/gc.cpp @ 1322] 
0c (Inline Function) --------`--------     libcoreclr!SVR::enter_spin_lock_noinstru+0x1d7 [/home/vwazho/runtime/src/coreclr/gc/gc.cpp @ 1490] 
0d (Inline Function) --------`--------     libcoreclr!SVR::enter_spin_lock+0x1d7 [/usr/include/x86_64-linux-gnu/bits/stdint-uintn.h @ 1514] 
0e 00007fb4`fcf90620 00007ff8`7e851fa4     libcoreclr!SVR::GCHeap::GarbageCollectGeneration+0x218 [/usr/include/x86_64-linux-gnu/bits/stdint-uintn.h @ 50508] 
0f 00007fb4`fcf90680 00007ff8`7e8530ec     libcoreclr!SVR::gc_heap::trigger_gc_for_alloc+0x74 [/home/vwazho/runtime/src/coreclr/gc/gc.cpp @ 15732480] 
10 00007fb4`fcf906b0 00007ff8`7e896061     libcoreclr!SVR::gc_heap::try_allocate_more_space+0x32c [/home/vwazho/runtime/src/coreclr/gc/gc.cpp @ 15732480] 
11 (Inline Function) --------`--------     libcoreclr!SVR::gc_heap::allocate_more_space+0x2c [/home/vwazho/runtime/src/coreclr/gc/gc.cpp @ 19413] 
12 (Inline Function) --------`--------     libcoreclr!SVR::gc_heap::allocate+0x69 [/usr/include/x86_64-linux-gnu/bits/stdint-uintn.h @ 19487] 
13 00007fb4`fcf90700 00007ff8`7e6fe2cf     libcoreclr!SVR::GCHeap::Alloc+0x261 [/usr/include/x86_64-linux-gnu/bits/stdint-uintn.h @ 49515] 
14 00007fb4`fcf90760 00007ff8`7e6fe02b     libcoreclr!Alloc+0x11f [/home/vwazho/runtime/src/coreclr/vm/gchelpers.cpp @ 227] 

Initial Investigation

The stack points to the following assertion in GetHeap:

GCHeap* GCHeap::GetHeap (int n)
{
    assert (n < gc_heap::n_heaps);
    return gc_heap::g_heaps[n]->vm_heap;
}

However, looking at “n” and “gc_heap::n_heaps” in the dump, we see that n = 9 and gc_heap::n_heaps = 14, so the condition holds and the values alone don’t explain why the assert fired. Looking deeper at the caller, wait_for_gc_done:

    while (gc_heap::gc_started)
    {
#ifdef MULTIPLE_HEAPS
        wait_heap = GCHeap::GetHeap(heap_select::select_heap(NULL))->pGenGCHeap;
        dprintf(2, ("waiting for the gc_done_event on heap %d", wait_heap->heap_number));
#endif // MULTIPLE_HEAPS

#ifdef _PREFAST_
        PREFIX_ASSUME(wait_heap != NULL);
#endif // _PREFAST_

        dwWaitResult = wait_heap->gc_done_event.Wait(timeOut, FALSE);
    }
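One possible reading of the mismatch between the fired assert and the values seen in the dump is a check-then-use window: the index is chosen while one heap count is visible, and the assert compares it against a different one. The sketch below replays such an interleaving by hand in a single thread. It is an illustration under assumptions, not the runtime's code: `n_heaps_model`, `select_index`, `index_is_valid`, and `replay_interleaving` are hypothetical names, and the intermediate heap count of 8 is made up (only 9 and 14 come from the dump).

```cpp
#include <atomic>
#include <cassert>

// Hypothetical stand-in for the global heap count (gc_heap::n_heaps in the issue).
static std::atomic<int> n_heaps_model{14};

// Hypothetical stand-in for heap selection: picks an index that is valid
// for whatever count is visible at the moment of the call.
int select_index(int proc_no)
{
    int count = n_heaps_model.load(std::memory_order_relaxed);
    return proc_no % count;
}

// Same shape as the check in GetHeap, but returns the result instead of asserting.
bool index_is_valid(int n)
{
    return n < n_heaps_model.load(std::memory_order_relaxed);
}

struct Replay
{
    int  index;           // index chosen by the waiting thread
    bool valid_at_check;  // result of the check when it actually ran
    bool valid_in_dump;   // result of the same check with the later count
};

// Replays one assumed interleaving of a waiter and a heap count change.
Replay replay_interleaving()
{
    n_heaps_model.store(14);
    int n = select_index(9);            // waiter selects heap 9 out of 14
    n_heaps_model.store(8);             // count shrinks (assumed value)
    bool at_check = index_is_valid(n);  // 9 < 8 is false: the check fails here
    n_heaps_model.store(14);            // count grows back before the dump is taken
    bool in_dump = index_is_valid(n);   // 9 < 14 is true: the dump looks consistent
    return {n, at_check, in_dump};
}
```

Calling `replay_interleaving()` yields an index that fails the check at the moment it runs yet looks valid against the later count, which is the kind of picture the dump shows. Whether the real failure followed this exact ordering cannot be confirmed from the dump alone.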

We are trying to get the heap number while the heap count is being changed:

e.g., thread 8 (there are 14 such stacks, one for each of the n = 14 heaps currently in use)

   8  Id: 2edb.2ee3 Suspend: 0 Teb: 00000000`00000000 Unfrozen
 # Child-SP          RetAddr               Call Site
00 00007ff8`6ebe2ad0 00007ff8`7ecef558     libc_so!_nptl_death_event+0xd6
01 00007ff8`6ebe2b10 00007ff8`7eac4c86     libc_so!pthread_cond_wait+0x1e8
02 00007ff8`6ebe2bd0 00007ff8`7e867378     libcoreclr!GCEvent::Impl::Wait+0xb6 [/home/vwazho/runtime/src/coreclr/gc/unix/events.cpp @ 15732480] 
03 00007ff8`6ebe2c20 00007ff8`7e83c2d2     libcoreclr!SVR::t_join::join+0x198 [/home/vwazho/runtime/src/coreclr/gc/gc.cpp @ 803] 
04 00007ff8`6ebe2c70 00007ff8`7e83a8a3     libcoreclr!SVR::gc_heap::change_heap_count+0xa02 [/home/vwazho/runtime/src/coreclr/gc/gc.cpp @ 26144] 
05 00007ff8`6ebe2d90 00007ff8`7e83a796 (T) libcoreclr!SVR::gc_heap::gc_thread_function+0x103 [/home/vwazho/runtime/src/coreclr/gc/gc.cpp @ 7131] 
06 00007ff8`6ebe2dd0 00007ff8`7e6fcb53     libcoreclr!SVR::gc_heap::gc_thread_stub+0x31 [/home/vwazho/runtime/src/coreclr/gc/gc.cpp @ 37266] 
07 (Inline Function) --------`--------     libcoreclr!<unnamed-namespace>::CreateNonSuspendableThread::$_1::operator()+0x4f [/home/vwazho/runtime/src/coreclr/vm/gcenv.ee.cpp @ 1525] 
08 00007ff8`6ebe2df0 00007ff8`7eb4b142     libcoreclr!<unnamed-namespace>::CreateNonSuspendableThread::$_1::__invoke+0x53 [/home/vwazho/runtime/src/coreclr/vm/gcenv.ee.cpp @ 1510] 
09 00007ff8`6ebe2e20 00007ff8`7ecf0134     libcoreclr!CorUnix::CPalThread::ThreadEntry+0x3c2 [/home/vwazho/runtime/src/coreclr/pal/src/thread/thread.cpp @ 1744] 
0a 00007ff8`6ebe2ee0 00007ff8`7ed707dc     libc_so!pthread_condattr_setpshared+0x4d4

This implies that there could be a race between changing the heap count and getting the heap to wait on.
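A common way to narrow a window of this kind is to read the shared count once and derive both the index and the bounds check from that single snapshot. The sketch below shows the idea only; it is not taken from the runtime or from any related PR, and `heap_count`, `Selection`, `select_with_snapshot`, and `selection_is_consistent` are hypothetical names.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical heap count that another thread may change at any time.
static std::atomic<int> heap_count{14};

struct Selection
{
    int index;       // selected heap index
    int count_seen;  // heap count the index was derived from
};

// Read the count once and derive the index from that same value, so the
// pair (index, count_seen) is always self-consistent.
Selection select_with_snapshot(int proc_no)
{
    int count = heap_count.load(std::memory_order_acquire);
    return {proc_no % count, count};
}

// Validate against the snapshot rather than re-reading the live count.
bool selection_is_consistent(const Selection& s)
{
    return s.index >= 0 && s.index < s.count_seen;
}
```

A snapshot only keeps the check self-consistent. Whether the selected heap is still usable after the count changes is a separate question that this sketch does not address.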

@dotnet-policy-service (bot) added the untriaged label (New issue has not been triaged by the area owner) on Aug 23, 2024
Contributor

Tagging subscribers to this area: @dotnet/gc
See info in area-owners.md if you want to be subscribed.

@mangod9 removed the untriaged label (New issue has not been triaged by the area owner) on Aug 26, 2024
@mangod9 added this to the 9.0.0 milestone on Aug 26, 2024
@mangod9 modified the milestones: 9.0.0, 10.0.0 on Sep 5, 2024
@mrsharm
Member Author

mrsharm commented Sep 5, 2024

Related PR: #107073

@mangod9
Member

mangod9 commented Dec 12, 2024

Is this issue OK to close, assuming it's fixed by #107073? It doesn't look like it was ported to 9, correct?
