Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bevy_reflect: Function Overloading (Generic & Variadic Functions) #15074

Open
wants to merge 21 commits into
base: main
Choose a base branch
from

Conversation

MrGVSV
Copy link
Member

@MrGVSV MrGVSV commented Sep 7, 2024

Objective

Currently function reflection requires users to manually monomorphize their generic functions. For example:

fn add<T: Add<Output=T>>(a: T, b: T) -> T {
    a + b
}

// We have to specify the type of `T`:
let reflect_add = add::<i32>.into_function();

This PR doesn't aim to solve that problem—this is just a limitation in Rust. However, it also means that reflected functions can only ever work for a single monomorphization. If we wanted to support other types for T, we'd have to create a separate function for each one:

let reflect_add_i32 = add::<i32>.into_function();
let reflect_add_u32 = add::<u32>.into_function();
let reflect_add_f32 = add::<f32>.into_function();
// ...

So in addition to requiring manual monomorphization, we also lose the benefit of having a single function handle multiple argument types.

If a user wanted to create a small modding script that utilized function reflection, they'd have to either:

  • Store all sets of supported monomorphizations and require users to call the correct one
  • Write out some logic to find the correct function based on the given arguments

While the first option would work, it wouldn't be very ergonomic. The second option is better, but it adds additional complexity to the user's logic—complexity that bevy_reflect could instead take on.

Solution

Introduce function overloading.

A DynamicFunction can now be overloaded with other DynamicFunctions. We can rewrite the above code like so:

let reflect_add = add::<i32>
    .into_function()
    .with_overload(add::<u32>)
    .with_overload(add::<f32>);

When invoked, the DynamicFunction will attempt to find a matching overload for the given set of arguments.

And while I went into this PR only looking to improve generic function reflection, I accidentally added support for variadic functions as well (hence why I use the broader term "overload" over "generic").

// Supports 1 to 4 arguments
let multiply_all = (|a: i32| a)
    .into_function()
    .with_overload(|a: i32, b: i32| a * b)
    .with_overload(|a: i32, b: i32, c: i32| a * b * c)
    .with_overload(|a: i32, b: i32, c: i32, d: i32| a * b * c * d);

This is simply an added bonus to this particular implementation. Full variadic support (i.e. allowing for an indefinite number of arguments) will be added in a later PR.

Alternatives & Rationale

I explored a few options for handling generic functions. This PR is the one I feel the most confident in, but I feel I should mention the others and why I ultimately didn't move forward with them.

Adding GenericDynamicFunction

TL;DR: Adding a distinct GenericDynamicFunction type unnecessarily splits and complicates the API.

Details

My initial explorations involved a dedicated GenericDynamicFunction to contain and handle the mappings.

This was initially started back when DynamicFunction was distinct from DynamicClosure. My goal was to not prevent us from being able to somehow make DynamicFunction implement Copy. But once we reverted back to a single DynamicFunction, that became a non-issue.

But that aside, the real problem was that it created a split in the API. If I'm using a third-party library that uses function reflection, I have to know whether to request a DynamicFunction or a GenericDynamicFunction. I might not even know ahead of time which one I want. It might need to be determined at runtime.

And if I'm creating a library, I might want a type to contain both DynamicFunction and GenericDynamicFunction. This might not be possible if, for example, I need to store the function in a HashMap.

The other concern is with IntoFunction. Right now DynamicFunction trivially implements IntoFunction since it can just return itself. But what should GenericDynamicFunction do? It could return itself wrapped into a DynamicFunction, but then the API for DynamicFunction would have to account for this. So then what was the point of having a separate GenericDynamicFunction anyways?

And even apart from IntoFunction, there's nothing stopping someone from manually creating a generic DynamicFunction through lying about its FunctionInfo and wrapping a GenericDynamicFunction.

That being said, this is probably the "best" alternative if we added a Function trait and stored functions as Box<dyn Function>.

However, I'm not convinced we gain much from this. Sure, we could keep the API for DynamicFunction the same, but consumers of Function will need to account for GenericDynamicFunction regardless (e.g. handling multiple FunctionInfo, a ranged argument count, etc.). And for all cases, except where using DynamicFunction directly, you end up treating them all like GenericDynamicFunction.

Right now, if we did go with GenericDynamicFunction, the only major benefit we'd gain would be saving 24 bytes. If memory ever does become an issue here, we could swap over. But I think for the time being it's better for us to pursue a clearer mental model and end-user ergonomics through unification.

Using the FunctionRegistry

TL;DR: Having overloads only exist in the FunctionRegistry unnecessarily splits and complicates the API.

Details

Another idea was to store the overloads in the FunctionRegistry. Users would then just call functions directly through the registry (i.e. registry.call("my_func", my_args)).

I didn't go with this option because of how it specifically relies on the functions being registered. You'd not only always need access to the registry, but you'd need to ensure that the functions you want to call are even registered.

It also means you can't just store a generic DynamicFunction on a type. Instead, you'll need to store the function's name and use that to look up the function in the registry—even if it's only ever used by that type.

Doing so also removes all the benefits of DynamicFunction, such as the ability to pass it to functions accepting IntoFunction, modify it if needed, and so on.

Like GenericDynamicFunction this introduces a split in the ecosystem: you either store DynamicFunction, store a string to look up the function, or force DynamicFunction to wrap your generic function anyways. Or worse yet: have DynamicFunction wrap the lookup function using FunctionRegistryArc.

Generic ArgInfo

TL;DR: Allowing ArgInfo and ReturnInfo to store the generic information introduces a footgun when interpreting FunctionInfo.

Details

Regardless of how we represent a generic function, one thing is clear: we need to be able to represent the information for such a function.

This PR does so by introducing a FunctionInfoType enum to wrap one or more FunctionInfo values.

Originally, I didn't do this. I had ArgInfo and ReturnInfo allow for generic types. This allowed us to have a single FunctionInfo to represent our function, but then I realized that it actually lies about our function.

If we have two ArgInfo that both allow for either i32 or u32, what does this tell us about our function? It turns out: nothing! We can't know whether our function takes (i32, i32), (u32, u32), (i32, u32), or (u32, i32).

It therefore makes more sense to just represent a function with multiple FunctionInfo since that's really what it's made up of.

Flatten FunctionInfo

TL;DR: Flattening removes additional per-overload information some users may desire and prevents us from adding more information in the future.

Details

Why don't we just flatten multiple FunctionInfo into just one that can contain multiple signatures?

This is something we could do, but I decided against it for a few reasons:

  • The only thing we'd be able to get rid of for each signature would be the name. While not enough to not do it, it doesn't really suggest we have to either.
  • Some consumers may want access to the names of the functions that make up the overloaded function. For example, to track a bug where an undesirable function is being added as an overload. Or to more easily locate the original function of an overload.
  • We may eventually allow for more information to be stored on FunctionInfo. For example, we may allow for documentation to be stored like we do for TypeInfo. Consumers of this documentation may want access to the documentation of each overload as they may provide documentation specific to that overload.

Testing

This PR adds lots of tests and benchmarks, and also adds to the example.

To run the tests:

cargo test --package bevy_reflect --all-features

To run the benchmarks:

cargo bench --bench reflect_function --all-features

To run the example:

cargo run --package bevy --example function_reflection --all-features

Benchmarks

One of my goals with this PR was to leave the typical case of non-overloaded functions largely unaffected by the changes introduced in this PR. And while the static size of DynamicFunction has increased by 17% (from 136 to 160 bytes), the performance has generally stayed the same:

main 7d293ab
into/function 37 ns 46 ns
with_overload/01_simple_overload - 149 ns
with_overload/01_complex_overload - 332 ns
with_overload/10_simple_overload - 1266 ns
with_overload/10_complex_overload - 2544 ns
call/function 57 ns 58 ns
call/01_simple_overload - 255 ns
call/01_complex_overload - 595 ns
call/10_simple_overload - 740 ns
call/10_complex_overload - 1824 ns

For the overloaded function tests, the leading number indicates how many overloads there are: 01 indicates 1 overload, 10 indicates 10 overloads. The complex cases have 10 unique generic types and 10 arguments, compared to the simple 1 generic type and 2 arguments.

I aimed to prioritize the performance of calling the functions over creating them, hence creation speed tends to be a bit slower.

There may be other optimizations we can look into but that's probably best saved for a future PR.

The important bit is that the standard into/function and call/function benchmarks show minimal regressions.


Showcase

Function reflection now supports function overloading! This can be used to simulate generic functions:

fn add<T: Add<Output=T>>(a: T, b: T) -> T {
    a + b
}

let reflect_add = add::<i32>
    .into_function()
    .with_overload(add::<u32>)
    .with_overload(add::<f32>);

let args = ArgList::default().push_owned(25_i32).push_owned(75_i32);  
let result = func.call(args).unwrap().unwrap_owned();  
assert_eq!(result.try_take::<i32>().unwrap(), 100);  
  
let args = ArgList::default().push_owned(25.0_f32).push_owned(75.0_f32);  
let result = func.call(args).unwrap().unwrap_owned();  
assert_eq!(result.try_take::<f32>().unwrap(), 100.0);

You can also simulate variadic functions:

#[derive(Reflect, PartialEq, Debug)]
struct Player {
    name: Option<String>,
    health: u32,
}

// Creates a `Player` with one of the following:  
// - No name and 100 health  
// - A name and 100 health  
// - No name and custom health  
// - A name and custom health
let create_player = (|| Player {
        name: None,
        health: 100,
    })
    .into_function()
    .with_overload(|name: String| Player {
        name: Some(name),
        health: 100,
    })
    .with_overload(|health: u32| Player {
        name: None,
        health
    })
    .with_overload(|name: String, health: u32| Player {
        name: Some(name),
        health,
    });

let args = ArgList::default()
    .push_owned(String::from("Urist"))
    .push_owned(55_u32);
    
let player = create_player
    .call(args)
    .unwrap()
    .unwrap_owned()
    .try_take::<Player>()
    .unwrap();
	
assert_eq!(
    player,
    Player {
        name: Some(String::from("Urist")),
        health: 55
    }
);

@MrGVSV MrGVSV added C-Feature A new feature, making something new possible A-Reflection Runtime information about types D-Complex Quite challenging from either a design or technical perspective. Ask for help! S-Needs-Review Needs reviewer attention (from anyone!) to move forward labels Sep 7, 2024
@MrGVSV MrGVSV added this to the 0.15 milestone Sep 7, 2024
This will be a foundational piece to enabling overloaded functions.
Allows for manually defining basic generic function reflection
Function reflection doesn't really support dynamic types currently
so it makes more sense to start stricter with Reflect.

This will make it easier to ensure all arguments have valid types
for generic function reflection.
Since ArgumentSignature is just a wrapper around Box<[Type]>,
it should already contain a "high-quality hash"
Turns out that NoOpHash works by only using the last u64 hash,
making collisions very likely when used with ArgumentSignature
Mainly usage of HashMap::insert_unique_unchecked
Used to help reduce code duplication for overloaded functions
and to give users the option to pretty-print FunctionInfo
Added the `simple_` qualifier
@MrGVSV MrGVSV force-pushed the mrgvsv/reflect/reflect-function-overloading branch from 7d293ab to 0ecda65 Compare September 7, 2024 01:38
@nixpulvis
Copy link
Contributor

nixpulvis commented Sep 7, 2024

My first thought while reading the showcase code:

let reflect_add = add::<i32>
    .into_function()
    .with_overload(add::<u32>)
    .with_overload(add::<f32>);

was that it’s a little strange to have the i32 version of the function first then overload it, when we really want to describe a function valid for i32, u32, and f32 the same way.

More generally, does the order of with_overload matter? And if so, could users depend on this for functionality?

@MrGVSV
Copy link
Member Author

MrGVSV commented Sep 7, 2024

My first thought while reading the showcase code:

let reflect_add = add::<i32>

    .into_function()

    .with_overload(add::<u32>)

    .with_overload(add::<f32>);

was that it’s a little strange to have the i32 version of the function first then overload it, when we really want to describe a function valid for i32, u32, and f32 the same way.

Yeah that makes sense. Unfortunately we need DynamicFunction to be in a valid state (i.e. callable with correct FunctionInfo) in the case no overload is given.

We could potentially solve this by moving the overload creation to a builder that can validate that at least one function is given.

Would that be a better/clearer way of doing this?

Something like:

let reflect_add = DynamicFunctionBuilder::new()
    .with(add::<i32>)
    .with(add::<u32>)
    .with(add::<f32>)
    .unwrap();

More generally, does the order of with_overload matter? And if so, could users depend on this for functionality?

Great question! No the order does not matter: the correct overload is chosen based on the arguments given to the function when calling it (and returns an error if none was found).

@nixpulvis
Copy link
Contributor

So my initial thought was essentially DynamicFunction::overloaded((add::<i32>, add::<u32>, ...)), or what you have for a builder notation. I like this because it doesn't make any one type feel special, but then I read the example you wrote where the user just creates a normal ol' closure, calls into_function() on it, and then uses that as the builder. I worry it's not worth the extra overhead just to make the operations a bit more clear, though I'm not sure how people will want to use this feature myself.

No the order does not matter...

I'm sorry for just asking instead of testing it myself (I'll check out the code and play with it later I promise), but what happens if I overload two conflicting functions? Is it a static error?

@MrGVSV
Copy link
Member Author

MrGVSV commented Sep 7, 2024

I'm sorry for just asking instead of testing it myself (I'll check out the code and play with it later I promise), but what happens if I overload two conflicting functions? Is it a static error?

DynamicFunction[Mut]::with_overload will panic if there are conflicting signatures. I included a DynamicFunction[Mut]::try_with_overload for a non-panicking version that will just return an error (along with the original function since we take by value).

@pablo-lua
Copy link
Contributor

We could potentially solve this by moving the overload creation to a builder that can validate that at least one function is given.

Would that be a better/clearer way of doing this?

I Think I like the builder the most. But that depends: Do we want to support adding overloads at runtime? I think this is something useful to have for modding, so maybe this is the better way.
But yeah, would be good to have a method for adding all the overloads at once.

Copy link
Contributor

@bushrat011899 bushrat011899 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Really interesting change and very well documented!

Copy link
Contributor

@pablo-lua pablo-lua left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a great API for Reflected functions that will greatly improve this feature. LGTM

@pablo-lua pablo-lua added S-Ready-For-Final-Review This PR has been approved by the community. It's ready for a maintainer to consider merging it and removed S-Needs-Review Needs reviewer attention (from anyone!) to move forward labels Oct 2, 2024
@alice-i-cecile
Copy link
Member

@MrGVSV I'm down to merge this, but this needs merge conflict resolution :)

@alice-i-cecile alice-i-cecile modified the milestones: 0.15, 0.16 Oct 8, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-Reflection Runtime information about types C-Feature A new feature, making something new possible D-Complex Quite challenging from either a design or technical perspective. Ask for help! S-Ready-For-Final-Review This PR has been approved by the community. It's ready for a maintainer to consider merging it
Projects
Status: In Progress
Development

Successfully merging this pull request may close these issues.

5 participants