Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Memory leak when rendering with DX11 in C++ #7615

Closed
AlexHughesMGP opened this issue May 23, 2024 · 16 comments
Closed

Memory leak when rendering with DX11 in C++ #7615

AlexHughesMGP opened this issue May 23, 2024 · 16 comments
Labels

Comments

@AlexHughesMGP
Copy link

Version/Branch of Dear ImGui:

Version 1.89.6, Branch: master

Back-ends:

imgui_impl_win32.cpp + imgui_impl_dx11.cpp

Compiler, OS:

Windows 10 + MSVC 14.39.33519

Full config/build information:

No response

Details:

My Issue/Question:

I am using Dear ImGui with the win32 and dx11 backends in a fairly complicated control panel application, and it appears to be causing a memory leak. The code I'm using to render is straight from the demo:
image
Every frame, imgui_new_frame() is called, then our UI code, then imgui_dx11_render(). However, at a regular interval 32,903 bytes are allocated and never deallocated. In one test it was every ~8 seconds, and in another every ~700 milliseconds:
image
image
Line 215 is ImGui_ImplDX11_RenderDrawData(ImGui::GetDrawData());. I'm not able to post the full code because there are thousands of lines and it's confidential, so I understand that it may be difficult to help, but it would be helpful to know:

  1. Are there any common/obvious/possible mistakes to make between imgui_new_frame and imgui_dx11_render that could cause this?
  2. Is it possible/likely that the call stack provided by VS2022 is wrong or misleading, and that something else is at fault?
  3. Is there anything I can try to get more information on what's going on to help diagnose the issue?

It's very strange because the code works exactly like the example, and I doubt that Dear ImGui or DX11 have significant memory leaks like this. The fact that it happens at a consistent interval but not every frame also seems odd.

Screenshots/Video:

No response

Minimal, Complete and Verifiable Example code:

// Here's some code anyone can copy and paste to reproduce your issue
void imgui_new_frame() {
    if (g_ResizeWidth != 0 && g_ResizeHeight != 0)
    {
        CleanupRenderTarget();
        g_pSwapChain->ResizeBuffers(0, g_ResizeWidth, g_ResizeHeight, DXGI_FORMAT_UNKNOWN, 0);
        g_ResizeWidth = g_ResizeHeight = 0;
        CreateRenderTarget();
    }
    // Start the Dear ImGui frame
    ImGui_ImplDX11_NewFrame();
    ImGui_ImplWin32_NewFrame();
    ImGui::NewFrame();
}

void imgui_dx11_render() {
    ImVec4 clear_color = ImVec4(0.0f, 0.0f, 0.00f, 1.00f);
    // Rendering
    ImGui::Render();
    const float clear_color_with_alpha[4] = { clear_color.x * clear_color.w, clear_color.y * clear_color.w, clear_color.z * clear_color.w, clear_color.w };
    g_pd3dDeviceContext->OMSetRenderTargets(1, &g_mainRenderTargetView, nullptr);
    g_pd3dDeviceContext->ClearRenderTargetView(g_mainRenderTargetView, clear_color_with_alpha);
    ImGui_ImplDX11_RenderDrawData(ImGui::GetDrawData());
    g_pSwapChain->Present(1, 0);
}

imgui_new_frame();
state.update(target_seconds_per_frame);    
imgui_dx11_render();
@AlexHughesMGP
Copy link
Author

I've just noticed that the issue either only happens when the PC is locked/screen is off, or at least it gets much worse in that situation.

@AlexHughesMGP
Copy link
Author

Definitely seems to be a result of the screen being locked. I tried disabling rendering when g_pSwapChain->Present(1, 0); returns DXGI_STATUS_OCCLUDED, and it drastically reduced the issue. Left shows before the change, right after, and the 5min gap between snapshots is where the screen was locked:
image
Now to figure out how to detect when it's no longer occluded. I tried checking GetClipBox, but it doesn't return NULLREGION while locked for some reason.

@ocornut
Copy link
Owner

ocornut commented May 23, 2024

Thank you for the ongoing investigation :)
Could you also confirm that this happens in the unmodified/vanilla example_win32_dx11/main.cpp example?

Might or not be related, but linking to #2496 and #3907.

@ocornut
Copy link
Owner

ocornut commented May 23, 2024

Now to figure out how to detect when it's no longer occluded.

Calling Present() with DXGI_PRESENT_TEST seems to be doing the job.
https://learn.microsoft.com/en-us/windows/win32/direct3ddxgi/dxgi-present

I logged WM_xxx messages and noticed that whereas minimization #2496 #3907 sends a WM_SIZE with wParam == SIZE_MINIMIZED, locking screen doesn't send any message.

[00117] WndProc 0320 wParam E3006FC7 lParam 00000001 WM_DWMCOLORIZATIONCOLORCHANGED
[00118] WndProc 0020 wParam 000C0AF6 lParam 02000002 WM_SETCURSOR
[00118] WndProc 00A0 wParam 00000002 lParam 012A06D9 WM_NCMOUSEMOVE
[00119] WndProc 0020 wParam 000C0AF6 lParam 02000002 WM_SETCURSOR
[00119] WndProc 00A0 wParam 00000002 lParam 012A06D9 WM_NCMOUSEMOVE
[00122] WndProc 0101 wParam 0000002E lParam A1530001 WM_KEYUP
[00125] WndProc 0086 wParam 00000000 lParam 00000000 WM_NCACTIVATE
[00125] WndProc 0006 wParam 00000000 lParam 00000000 WM_NCACTIVATE
[00125] WndProc 001C wParam 00000000 lParam 00000000 WM_ACTIVATEAPP
[00125] WndProc 0008 wParam 00000000 lParam 00000000 WM_KILLFOCUS
[00125] WndProc 0281 wParam 00000000 lParam C000000F WM_IME_SETCONTEXT
[00125] WndProc 0282 wParam 00000001 lParam 00000000 WM_IME_NOTIFY
[00151] WndProc 02A2 wParam 00000000 lParam 00000000 WM_NCMOUSELEAVE <-- locking around that time (maybe before, maybe after)
[13497] WndProc 001E wParam 00000000 lParam 00000000 WM_TIMECHANGE <-- notice many frames elapsed, as we went unthrolled during the lock (same problem as when minimized)
[13498] WndProc 0088 wParam 00000020 lParam 00000000 WM_SYNCPAINT
[13498] WndProc 0085 wParam 00000001 lParam 00000000 WM_NCPAINT
[13498] WndProc 0014 wParam 910126A5 lParam 00000000 WM_ERASEBKGND
[13498] WndProc 0020 wParam 000C0AF6 lParam 02000002 WM_SETCURSOR
[13498] WndProc 00A0 wParam 00000002 lParam 012A06D9 WM_NCMOUSEMOVE
[13498] WndProc 000F wParam 00000000 lParam 00000000 WM_PAINT
[13500] WndProc 0020 wParam 000C0AF6 lParam 02000002 WM_SETCURSOR
[13500] WndProc 00A0 wParam 00000002 lParam 012A06D9 WM_NCMOUSEMOVE
[13503] WndProc 0046 wParam 00000000 lParam F535F2A0 WM_WINDOWPOSCHANGING
[13503] WndProc 001C wParam 00000001 lParam 0000292C WM_ACTIVATEAPP
[13503] WndProc 0086 wParam 00000001 lParam 00000000 WM_NCACTIVATE
[13503] WndProc 0006 wParam 00000001 lParam 00000000 WM_ACTIVATE
[13503] WndProc 0281 wParam 00000001 lParam C000000F WM_IME_SETCONTEXT
[13503] WndProc 0282 wParam 00000002 lParam 00000000 WM_IME_NOTIFY
[13503] WndProc 003D wParam FFFFFFFF lParam FFFFFFFC WM_GETOBJECT
[13503] WndProc 003D wParam 00000000 lParam FFFFFFF4 WM_GETOBJECT
[13503] WndProc 0007 wParam 00000000 lParam 00000000 WM_SETFOCUS
[13503] WndProc 0020 wParam 000C0AF6 lParam 02000002 WM_SETCURSOR
[13503] WndProc 00A0 wParam 00000002 lParam 012A06D9 WM_NCMOUSEMOVE
[13557] WndProc 0020 wParam 000C0AF6 lParam 02000002 WM_SETCURSOR
[13557] WndProc 00A0 wParam 00000002 lParam 012906D9 WM_NCMOUSEMOVE
[....]

My conclusion is that we need to handle Present() rather than implement #3907 because that one will only solve minimization and not lock.

@ocornut
Copy link
Owner

ocornut commented May 23, 2024

As mentioned in #2496 (comment) one of my issue with solving this based on Present() return value of main viewport, is that it would either freeze applications where main viewport is minimized and others aren't, either leave them unthrolled in that situation (which is already the case technically, but rarely noticed as io.ConfigViewportsNoDefaultParent is rarely set. However based on the premise that there is ALREADY an issue in those examples in that situation, I am not against swapping it for another issue first, while we solve the other cases.

@AlexHughesMGP
Copy link
Author

It does occur in the win32 dx11 example:
image
The big jump in heap size is where it was locked. The call stack again points to ImGui_ImplDX11_RenderDrawData():
image

I'll test the snippet here based on the Present() return value. I'm not familiar with how multiple viewports work so I'm not sure what an appropriate solution for the library as a whole is, but if that works it should solve the problem in my case. Thanks!

@ocornut
Copy link
Owner

ocornut commented May 23, 2024

Here's my current solution could you test it on your end?

static bool                     g_pSwapChainOccluded = false;

After polling WM message loop:

        // Handle window being occluded or screen locked
        if (g_pSwapChainOccluded && g_pSwapChain->Present(0, DXGI_PRESENT_TEST) == DXGI_STATUS_OCCLUDED)
        {
            ::Sleep(10);
            continue;
        }
        g_pSwapChainOccluded = false;

Present:

HRESULT hr = g_pSwapChain->Present(1, 0);   // Present with vsync
g_pSwapChainOccluded = (hr == DXGI_STATUS_OCCLUDED);

@AlexHughesMGP
Copy link
Author

My previous attempt to fix didn't seem to work, unfortunately:
image
I'll try yours now, thanks!

@ocornut
Copy link
Owner

ocornut commented May 23, 2024

Your attempt as pictured in screenshot will let the rest of the code run at max speed, burning a core and essentially making no difference.

@AlexHughesMGP
Copy link
Author

Your solution seems to have worked, this is my result when trying with the example:
image

@AlexHughesMGP
Copy link
Author

Seems to work in my case too:
image
There are still a couple of instances of the 32,903 byte object, but nowhere near as many:
image
Thanks for the help!

ocornut added a commit that referenced this issue May 23, 2024
@ocornut
Copy link
Owner

ocornut commented May 23, 2024

I have pushed ec1d2be which implement this for the DX9, DX10, DX11 and DX12 examples

DX9 and DX12 are both slightly different, and Present() does wait for vsync even on fail so it didn't exhibit a CPU core burn, but adding better support for it means we are not looping when screen locked which is desirable.

If you want to dig further into remaining issues please do but I'd prefer if you do it (or confirm it) in vanilla example. I don't know if there are ways to break into those allocs.

@AlexHughesMGP
Copy link
Author

I have pushed ec1d2be which implement this for the DX9, DX10, DX11 and DX12 examples

DX9 and DX12 are both slightly different, and Present() does wait for vsync even on fail so it didn't exhibit a CPU core burn, but adding better support for it means we are not looping when screen locked which is desirable.

If you want to dig further into remaining issues please do but I'd prefer if you do it (or confirm it) in vanilla example. I don't know if there are ways to break into those allocs.

I suspect that the 32,903 byte objects are some kind of DX11 or win32 data structure which is required, because there are 2 or 3 present from the start when running normally. My guess is that they were being re-created without proper cleanup when trying to render with an occluded window, and that's no longer happening, so all is well! Thanks again.

@RT2Code
Copy link

RT2Code commented May 30, 2024

@ocornut You may already know it, but just in case: I saw on Twitter that you were starting to use the dxgi flip model, but it will never return DXGI_STATUS_OCCLUDED. So if you use DXGI_SWAP_EFFECT_FLIP_DISCARD in the backends, this issue will come back.

@ocornut
Copy link
Owner

ocornut commented May 30, 2024

@RT2Code Thanks for the headup. Do you know how are you supposed to handle this?

I have confirmed that when using FLIP_DISCARD i don't get DXGI_STATUS_OCCLUDED when minimized, however Present still honor vsync so it doesn't burn a CPU. It however ideally should go idle, and perhaps we should combine this with #3907 together...

When locking the computer, I do get a DXGI_STATUS_OCCLUDED signal however.

(I've been investigating DXGI_SWAP_EFFECT_FLIP_DISCARD for #7607 but I think it add a little extra pressure to make our example apps dpi aware, and probably I am going to get back to #3761 for that first, but only after I finish multi-select for 1.91.....)

@RT2Code
Copy link

RT2Code commented May 30, 2024

There's an alternative API to get occluded notifications, but I couldn't get it to work either. I only get 2 notifications at the start of the app, but then nothing. That being said, I'm not sure at all how to use it correctly, I couldn't find any decent documentation about it, so here's the gist of it if you wanna try by yourself:

#include <d3d11_1.h>

#define WM_OCCLUSION_EVENT (WM_USER + 1)

//Data
[...]
static IDXGIFactory2* g_pFactory = nullptr;
[...]

int main(int, char**)
{
[...]
    // Setup Platform/Renderer backends
    ImGui_ImplWin32_Init(hwnd);
    ImGui_ImplDX11_Init(g_pd3dDevice, g_pd3dDeviceContext);

    DWORD occlusionStatusCookie = 0;
    g_pFactory->RegisterOcclusionStatusWindow(hwnd, WM_OCCLUSION_EVENT , &occlusionStatusCookie);
[...]
    g_pFactory->UnregisterOcclusionStatus(occlusionStatusCookie);
    
     // Cleanup
    ImGui_ImplDX11_Shutdown();
    ImGui_ImplWin32_Shutdown();
    ImGui::DestroyContext();
[...]
}

[...]

bool CreateDeviceD3D(HWND hWnd)
{
[...]
    IDXGIDevice * pDXGIDevice;
    res = g_pd3dDevice->QueryInterface(__uuidof(IDXGIDevice), (void**)&pDXGIDevice);
    if (res != S_OK)
        return false;

    IDXGIAdapter * pDXGIAdapter;
    res = pDXGIDevice->GetParent(__uuidof(IDXGIAdapter), (void**)&pDXGIAdapter);
    if (res != S_OK)
        return false;

    res = pDXGIAdapter->GetParent(__uuidof(IDXGIFactory), (void**)&g_pFactory);
    if (res != S_OK)
        return false;
[...]
}

LRESULT WINAPI WndProc(HWND hWnd, UINT msg, WPARAM wParam, LPARAM lParam)
{
[...]
    switch (msg)
    {
    case WM_OCCLUSION_EVENT:
    {
        g_SwapChainOccluded = !g_SwapChainOccluded; // Not sure how we're supposed to tell when it's occluded or not? I guess it alternate, but I'm not sure about that at all.
        break;
    }
[...]
}

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

3 participants