Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

For NPU Sample, add flexibility in device creation options and fix Generic ML Device logic #624

Open
wants to merge 7 commits into
base: master
Choose a base branch
from

Conversation

Zu-shi
Copy link
Contributor

@Zu-shi Zu-shi commented Aug 13, 2024

Expand NPU sample's capabilities for creating devices based on attributes.

More specifically, allow options to filter based on allowed, unallowed, and required attributes. Then add some flags for the most commonly needed options for this sample.

The logic for how each interacts with CreateAdapterList is a bit unconventional since CreateAdapterList ANDs the passed in attributes.

Tested locally on Intel NPU.

…neric ML Device logic

Expand NPU sample's capabilities for creating devices based on attributes.

More specifically, allow options to filter based on allowed, unallowed, and required attributes. Then add some flags for the most commonly needed options for this sample.

The logic for how each interacts with CreateAdapterList is a bit unconventional since CreateAdapterList ANDs the passed in attributes.

Tested locally on Intel NPU.
…atch

For NPU Sample, add flexibility in device creation options and fix Generic ML Device logic
{
size_t adapterNameSize = 0;
THROW_IF_FAILED(adapter->GetPropertySize(DXCoreAdapterProperty::DriverDescription, &adapterNameSize));
char* adapterName = (char*)malloc(adapterNameSize);
Copy link
Contributor

@fdwr fdwr Aug 14, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where is the memory released?
Better to just use std::string and almost never touch malloc.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

bool TryGetProperty(DXCoreAdapterProperty prop, std::string& outputValue)
{
if (m_adapter->IsPropertySupported(prop))
{
size_t propSize;
THROW_IF_FAILED(m_adapter->GetPropertySize(prop, &propSize));
outputValue.resize(propSize);
THROW_IF_FAILED(m_adapter->GetProperty(prop, propSize, outputValue.data()));
// Trim any trailing nul characters.
while (!outputValue.empty() && outputValue.back() == '\0')
{
outputValue.pop_back();
}
return true;
}
return false;
}

Copy link
Contributor

@jstoecker jstoecker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this sample should be kept as simple as possible and only show logic to unambiguously select an NPU. I don't think we need to illustrate the flexibility and numerous ways adapters can be filtered (this isn't a DXCore sample); I think this flexibility actually makes the sample more confusing, because we're not guiding users on the precise attributes required to leverage NPU.

In short, I suggest having something like GENERIC_ML || (CORE_COMPUTE && !GRAPHICS).

// filter out adapters with disallowed attributes
for (auto& disallowedGuid : dxGuidDisallowedAttributes)
{
if (currentGpuAdapter->IsAttributeSupported(disallowedGuid)) { isAdapterValid = false; }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any reason to continue checking all the remaining attributes if any of them is false? One can early-out and just break.

                for (auto& disallowedGuid : dxGuidDisallowedAttributes)
                {
                    if (currentGpuAdapter->IsAttributeSupported(disallowedGuid))
                    {
                        isAdapterValid = false;
                        break;
                    }
                }

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Btw, if using C++20 (I'm unsure the minimum C++ version used in this sample), you could use:

isAdapterValid = std::ranges::all_of(dxGuidRequiredAttributes, [&](const& guid) {return currentGpuAdapter->IsAttributeSupported(guid)})
isAdapterValid &= std::ranges::all_of(dxGuidDisallowedAttributes, [&](const& guid) {return !currentGpuAdapter->IsAttributeSupported(guid)})

THROW_IF_FAILED(factory->CreateAdapterList(ARRAYSIZE(dxGUIDs), dxGUIDs, IID_PPV_ARGS(&adapterList)));
for (uint32_t i = 0, adapterCount = adapterList->GetAdapterCount(); i < adapterCount; i++)
// If there's any required attributes, save time by only passing in required attributes instead.
std::vector<GUID> iteratingAdapterList = dxGuidRequireAllAttributes.empty() ? dxGuidAllowedAttributes : dxGuidRequireAllAttributes;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
std::vector<GUID> iteratingAdapterList = dxGuidRequireAllAttributes.empty() ? dxGuidAllowedAttributes : dxGuidRequireAllAttributes;
const std::vector<GUID>& iteratingAdapterList = dxGuidRequireAllAttributes.empty() ? dxGuidAllowedAttributes : dxGuidRequireAllAttributes;

No need to make a copy of it, since it's not being mutated.

OR if using C++20, std::span.

@hez2010
Copy link

hez2010 commented Aug 29, 2024

Can this run on NPU of Snapdragon X series as well?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants