Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GetEdmModel Performance Issue #2743

Open
caviyacht opened this issue Jan 4, 2023 · 7 comments
Open

GetEdmModel Performance Issue #2743

caviyacht opened this issue Jan 4, 2023 · 7 comments
Assignees

Comments

@caviyacht
Copy link

Very slow performance when calling GetEdmModel(), specifically in the ApplyNavigationSourceConventions() method. This was not a problem when using v6.0.0-alpha1 (pre-recursive complex support).

Assemblies affected

Microsoft.AspetNetCore.OData, v7.6.3

Reproduce steps

I'm not sure how to reproduce the issue exactly, but this seems to be happening because of the nesting logic that was added. The same size that is being used is:

  • Navigation Sources: 53
  • Structural Types: 1255
  • Operations: 171

Expected result

The GetEdmModel() call takes less than 5s (or faster) vs taking 20s+.

Actual result

The GetEdmModel() call is taking upwards of 20s to execute.

Additional detail

Below shows the time spent in the ApplyNavigationSourceConventions() method.

image

@habbes
Copy link
Contributor

habbes commented Jan 5, 2023

20 seconds definitely sounds excessive. Is the GetEdmModel() method getting called once or multiple times in your app? (Ideally it should be called once and cached if your model doesn't change after construction).

@caviyacht
Copy link
Author

So this actually gets called twice during startup as there are different groups of APIs (this is using Sitecore Commerce, not something I can control). But it still doesn't make sense why it takes so long to run the command in v7 vs v6, other than the new nested complex types logic exists, which does this recursive call.

@habbes
Copy link
Contributor

habbes commented Jan 24, 2023

@caviyacht by any chance, would it be possible to share the model classes that were used to generate the EdmModel?

@caviyacht
Copy link
Author

@habbes , unfortunately I cannot share them as most are not mine (Sitecore Commerce) and the rest are inherited on those. Also, there are ~1200 of them that make up the model (out-of-the-box + custom).

Below are some of the core classes that the system uses, and most inherit from these.

// Total inherited count: ~136
public class Model
{
    public IList<Policy> Policies { get; set; }
}
// Total inherited count: ~355
public class Policy
{
    public List<Model> Models { get; set; }
}
// Total inherited count: ~146
public class Component
{
    public IList<Policy> Policies { get; set; }
    public IList<Component> ChildComponents { get; set; }
}
// Total inherited count: ~66
public class CommerceEntity
{
    public IReadOnlyList<Component> EntityComponents { get; set; }
    public IReadOnlyList<Policy> EntityPolicies { get; set; }
}

This makes up around ~703 of the objects added to the Edm model. The other ~500 are just normal classes that do not inherit from these, but could have these as nested properties.

Let me know if there is something else I can try to provide.

@habbes
Copy link
Contributor

habbes commented Feb 3, 2023

Hi @caviyacht

Thanks for sharing the additional information. I've gone through the implementations of the methods in the call stack you shared to try and understand what the issue could be, and I have the following follow-up questions:

  • Do you have deep inheritance trees (like multiple levels of inheritance)? I see that there are calls to ThisAndBaseTypes() and DerivedTypes() which could be expensive if you have a lot of structural types and deep inheritance hierarchies.
  • Do you have long relationship chains between complex types? Like ComplexTypeA has a property of type ComplexTypeB, which has a property of ComplexTypeC, which has a ComplexTypeD, etc.

I think at first glance caching the results of DerivedTypes() and ThisAndBaseTypes() could be a good place to start optimizing, so that repeated lookups of the same type don't have to recompute the inheritance tree each time.

The other potential issue I've observed is that we can visit the same type multiple times. There's code to ensure that we don't process the same type twice if it occurs in the same subgraph to avoid infinite recursions, but there's not code to ensure we don't visit the same type more than once in separate subgraphs. And looking at the code, this appears to be intentional. Let me elaborate more in the next comment.

@habbes
Copy link
Contributor

habbes commented Feb 3, 2023

Let's assume we have a model based on the following classes:

public class EntityA
{
    public int Id { get; set; }
    public EntityB EntityB { get; set; }
    public ComplexA ComplexA { get; set; }
    public ComplexB ComplexB { get; set; }
}

public class EntityB
{
    public int Id { get; set; }
}

public class ComplexA
{
    public ComplexB ComplexB { get; set; }
}

public class ComplexB
{
    public EntityB EntityB { get; set; }
    public ComplexA ComplexA { get; set; }
}

Now the AssociationSetDiscoveryConvention will try to find all the navigation properties for entity EntityA, both direct and indirect. For navigation properties of EntityA, it will simply add them to the search results. But for complex properties, it will perform a recursive search of all its properties. For each navigation property found, it will also return the property path from the root EntityA to that navigation property.

To make it concrete, let's consider the code will handle the different complex and navigation properties of EntityA.

The Id property is skipped cause it's neither a navigation property, complex property or collection property.

The EntityA.EntityB property is a navigation property. A corresponding entry will be added to the list of found navigation properties, the entry will include a reference to the EntityB and the property path [EntityB].

The EntityA.ComplexA property is a complex property. In this case we perform a recursive search. We first add the ComplexA type to the set of processed types, this is to ensure that we detect cycles and avoid infinite recursions. We'll recursively visit the following paths:

  • ComplexA -> ComplexB -> EntityB. We terminate at the navigation property EntityB and add it to the result list with property path [ComplexA, ComplexB, EntityB].
  • ComplexA -> ComplexB -> ComplexA. We detect that ComplexA is already in the set of processed types and skip it to avoid infinite recursion.

And finally, we visit property EntityA.ComplexB and we'll visit the paths:

  • ComplexB -> EntityB. We add the navigation property to the list with property path [ComplexB, EntityB]
  • ComplexB -> ComplexA -> ComplexB. We detect cycle and terminate.

So, the final result will have, the following (structuralType, path from root entity, navigation property) tuples:

  • (EntityA, [EntityB], EntityB)
  • (ComplexB, [ComplexA, ComplexB, EntityB], EntityB)
  • (ComplexB, [ComplexB, EntityB], EntityB)

We see that ComplexA and ComplexB were processed multiple times from different sub-graphs of EntityA and they may return different results. For example, ComplexB produces the EntityB navigation property with different paths depending on which subgraph it was visited from (EntityA->ComplexA subgraph vs EntityA->ComplexB).

This means we can't simply cache the navigation properties on each complex type and add them to the result when we visit the type. We have to keep track of the path from the complex type to the navigation property. Then we visit the complex property we combine the path from the root entity to the current property with the path from the complex type to the navigation property.

If we create a cache, we'd have to balance between storing too much and computing too much. For may have to use a hybrid of lists and linked list nodes to store the cached navigation properties as well as their paths. Each complex type will need an efficient cache of its navigation properties as well as those of its the nested complex properties, and it will need to prepend the nested properties to the property paths of its deeply nested navigation property. However, we go about this, we should be able to save the cost of looking up deeply nested complex property chains which don't have navigation properties (these will be efficient to cache since the results are empty).

@habbes
Copy link
Contributor

habbes commented Feb 3, 2023

The cache idea in the previous comment may not be beneficial if your model has many complex types but no nesting of complex types. In such case it would probably even make things worse. I don't whether the issue you experience is due to deep nesting or just many complex types in a flat schema. For this reason, I propose we first try caching and optimizing the ThisAndBaseTypes() and DerivedTypes() methods and see whether that has material impact. I'd propose created a context object to pass around the ODataModelBuilder internal static methods, and put the cache in that context object.

I don't have bandwidth to implement this at the moment so I can't provide an ETA. But once it's properly assigned in the backlog, we'll get to it. Alternatively, if you're in a position to contribute an optimization, we'll be happy to review your PR @caviyacht

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants