Add "previous" member to MVKPerformanceTracker structure #2183

SRSaunders · 2024-03-13T14:06:06Z

Addresses #2177.

This small PR adds a previous member to the MVKPerformanceTracker struct and saves the former latest value before updating latest within updateActivityPerformance(). It also adds previous to MVKLogInfo.

Note that I did not change the MVK_PRIVATE_API_VERSION.

I still have a question about how to determine version information at runtime. Now that vkGetVersionStringsMVK() is deprecated, what is the offical means of determining MoltenVK's version or the MVKPerformanceStatistics version at runtime? This is to handle cases when dynamic linking is used and one needs to determine if the new field is present.

Is the only mechanism to use the VK_SUCCESS or VK_INCOMPLETE return code from vkGetPerformanceStatisticsMVK()? This approach seems limiting since all you get is a success or failure indication and not version information. For instance if you always want to use latest, and optionally use previous (if available), how would you go about determing that at runtime?

…ng "latest"

billhollings

LGTM. Thanks for submitting.

Thanks for pointing out about MVK_PRIVATE_API_VERSION. We increment MVK_PRIVATE_API_VERSION once per MoltenVK release cycle (which is the same as Vulkan SDK cycle), if the private ABI changes. We have just cut MoltenVK v1.2.8 for the upcoming SDK. This PR will be in the next future SDK, and does modify the ABI, so you should increment the value of MVK_PRIVATE_API_VERSION to 41.

Regarding accessing the MoltenVK version at runtime, it is encoded in VkPhysicalDeviceProperties::driverVersion and VkLayerProperties::implementationVersion. The encoding is defined by the MVK_MAKE_VERSION() macro. It's monotonically numeric, so you can compare different versions, or use MVK_MAKE_VERSION() to create a version number to compare against.

If needed, similar to MVK_MAKE_VERSION() and MVK_VERSION_STRING, we could potentially add macros to extract three integers, or create a string, from a passed-in encoded version number.

MVK_PRIVATE_API_VERSION is not runtime derivable, but it is connected to the MoltenVK version as indicted above, so you can derive the runtime version of MVK_PRIVATE_API_VERSION, and the layout of any private ABI structures, by comparing the corresponding encoded MoltenVK version, as I describe above.

I'm not sure how we can directly pass MVK_PRIVATE_API_VERSION back from a runtime, except to create a Vulkan extension to do so.

SRSaunders · 2024-03-14T05:22:32Z

Thanks for the hints re VkPhysicalDeviceProperties::driverVersion and VkLayerProperties::implementationVersion for retrieving the MoltenVK version at runtime.

However, after thinking about this problem a bit more I will be reverting back to the simple approach of guarding the new code at compile time using MVK_MAKE_VERSION(1, 2, 9), and checking for VK_SUCCESS returned from vkGetPerformanceStatisticsMVK() at runtime. My logic is since the perf structure is not a linear list, the runtime and compile time ABI versions (i.e. struct sizes) have to match exactly to get the correct offsets into the returned memory blob. Even if you are dynamically linking to a later version of MoltenVK (i.e. with previous defined), the struct sizes must still match exactly for the data to make sense. Rather than carry around multiple version copies of the performance struct in my code (with different versioned offsets), I plan to simply require a match. This is the downside of using a nested struct vs. a linear list, but that's a more general issue than I want to tackle here.

FYI, I looked at mvkCopyGrowingStruct() to confirm how it behaves, and unfortunately I think I have found an error present in the code:

The current logic is: return (*pCopySize == origSize) ? VK_SUCCESS : VK_INCOMPLETE;
Shouldn't it instead be: return (sizeof(S) == origSize) ? VK_SUCCESS : VK_INCOMPLETE;

The original logic might be okay for linearly growing lists, but is not okay for nested structures like performance data. The only concern I have is this function is also used for MVKConfiguration and MVKPhysicalDeviceMetalFeatures which I think are both linear. While the current behaviour is different from the function's doc statement (Returns VK_SUCCESS if the original value of *pCopySize is the same as the actual size of the struct, or VK_INCOMPLETE otherwise), I don't want to break something unintentionally. Please advise.

If you agree with this change, I will submit commits for this as well as bumping MVK_PRIVATE_API_VERSION.

…r sizes match

SRSaunders · 2024-03-17T15:38:45Z

I decided to make the proposed changes above and @billhollings can review and comment as needed.

FYI - I have one more thing I would potentially like to add to this - a new MVKQueuePerformance entry that tracks async encoding start delay if the thread is already busy encoding the previous frame. I need to research this a bit more before confirming and will come back with an answer asap. So no rush on the merge.

…it times

SRSaunders · 2024-03-19T04:43:51Z

I have completed my research and have submitted an additional commit that proposes and implements two new performance counters in MVKQueuePerformance for measuring asynchronous queue submit wait times: waitSubmitCommandBuffers, and waitPresentSwapchains. These new performance counters are useful when MVK_CONFIG_SYNCHRONOUS_QUEUE_SUBMITS is disabled and queue submits are asynchronous, since there are delays between the initial calls to vkQueueSubmit()/vkQueuePresentKHR() and the actual start of operations.

For instance, waitSubmitCommandBuffers allows you to determine the delay time before the start of command buffer encoding if the thread is already busy encoding the previous frame. And waitPresentSwapchains allows you to predict the delay between calling vkQueuePresentKHR() and the actual swapchain image showing up on the screen - e.g. waitPresentSwapchains + presentSwapchains.

Here are screen grabs of the embedded Optick profiler running inside RBDoom3BFG with MoltenVK. You can see the Submit_Wait times (corresponds to waitSubmitCommandBuffers), Acquire_Wait times (corresponds to retrieveCAMetalDrawable), and Present_Wait times (corresponds to waitPresentSwapchains) and their relationship to command buffer encoding and execution on the GPU.

I have now completed my work on this PR and it is ready for review.

billhollings

Thanks for updating. Some documentation suggestions, and a refactoring request.

MoltenVK/MoltenVK/Vulkan/mvk_api.mm

MoltenVK/MoltenVK/GPUObjects/MVKQueue.mm

MoltenVK/MoltenVK/GPUObjects/MVKQueue.h

MoltenVK/MoltenVK/API/mvk_private_api.h

…sentSwapchains perf counters

…ents of execute() function

billhollings

LGTM now. Thanks for all the updates.

Add "previous" to MVKPerformanceTracker and save value before capturi…

11bd581

…ng "latest"

billhollings requested changes Mar 13, 2024

View reviewed changes

SRSaunders added 2 commits March 17, 2024 11:34

Fix mvkCopyGrowingStruct() to return success only if struct and buffe…

3ec155a

…r sizes match

Increment MVK_PRIVATE_API_VERSION to 41

08c1ad7

Add two new counters in MVKQueuePerformance for async queue submit wa…

10810f4

…it times

billhollings requested changes Mar 19, 2024

View reviewed changes

SRSaunders added 2 commits March 19, 2024 23:19

Update documentation for the new waitSubmitCommandBuffers and waitPre…

1d3fe52

…sentSwapchains perf counters

Capture perf start times in MVKQueueSubmission constructors vs. argum…

0cf9f7f

…ents of execute() function

billhollings approved these changes Mar 20, 2024

View reviewed changes

billhollings merged commit 73f2b8c into KhronosGroup:main Mar 20, 2024
5 checks passed

SRSaunders mentioned this pull request Mar 23, 2024

macOS: Show MoltenVK submit, image acquire, and metal encode on Optick trace RobertBeckebans/RBDOOM-3-BFG#871

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add "previous" member to MVKPerformanceTracker structure #2183

Add "previous" member to MVKPerformanceTracker structure #2183

SRSaunders commented Mar 13, 2024 •

edited

Loading

billhollings left a comment •

edited

Loading

SRSaunders commented Mar 14, 2024 •

edited

Loading

SRSaunders commented Mar 17, 2024 •

edited

Loading

SRSaunders commented Mar 19, 2024

billhollings left a comment

billhollings left a comment

Add "previous" member to MVKPerformanceTracker structure #2183

Add "previous" member to MVKPerformanceTracker structure #2183

Conversation

SRSaunders commented Mar 13, 2024 • edited Loading

billhollings left a comment • edited Loading

Choose a reason for hiding this comment

SRSaunders commented Mar 14, 2024 • edited Loading

SRSaunders commented Mar 17, 2024 • edited Loading

SRSaunders commented Mar 19, 2024

billhollings left a comment

Choose a reason for hiding this comment

billhollings left a comment

Choose a reason for hiding this comment

SRSaunders commented Mar 13, 2024 •

edited

Loading

billhollings left a comment •

edited

Loading

SRSaunders commented Mar 14, 2024 •

edited

Loading

SRSaunders commented Mar 17, 2024 •

edited

Loading