Optimize GLES3 CanvasLight calculations and make it precise. #161
Can you upload a project with that test scene? I would like to benchmark it.
I have edited the project to use the default physics engine, and froze all the physics nodes after 0.5 seconds, because Godot's 2D physics performance would otherwise affect the benchmark.
e8f7665 to 309cdc1
I am getting slightly worse performance in the test with the changes. I am going to try production builds next to see if it changes anything. Before
After
With
After
I see that "after" is faster by nearly 10 frames most of the time. Also, we are just skipping some useless calculations that happen 4 times for each light update frame.
// Before
for (int i = 0; i < 4; i++) {
    Vector3 cam_target = Basis::from_euler(Vector3(0, 0, Math_TAU * ((i + 3) / 4.0))).xform(Vector3(0, 1, 0));
    projection = projection * Projection(Transform3D().looking_at(cam_target, Vector3(0, 0, -1)).affine_inverse());
}
// After
Projection projections[4] = {
    Projection(Vector4(0, 0, -1, 0), Vector4(1, 0, 0, 0), Vector4(0, -1, 0, 0), Vector4(0, 0, 0, 1)),
    Projection(Vector4(-1, 0, 0, 0), Vector4(0, 0, -1, 0), Vector4(0, -1, 0, 0), Vector4(0, 0, 0, 1)),
    Projection(Vector4(0, 0, 1, 0), Vector4(-1, 0, 0, 0), Vector4(0, -1, 0, 0), Vector4(0, 0, 0, 1)),
    Projection(Vector4(1, 0, 0, 0), Vector4(0, 0, 1, 0), Vector4(0, -1, 0, 0), Vector4(0, 0, 0, 1))
};
for (int i = 0; i < 4; i++) {
    projection = projection * projections[i];
}
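To illustrate why hardcoding the four matrices can also help precision, here is a minimal standalone sketch. It does not use Godot's types: `Vec2`, `rotate_up`, `computed_directions`, `exact_directions` and `max_error` are hypothetical names, and it assumes that the Euler rotation in the "before" code is a plain rotation about the Z axis. It only covers the target directions, not the full `looking_at` matrices.

```cpp
#include <algorithm>
#include <array>
#include <cmath>

// Hypothetical stand-in for the X/Y part of a direction vector.
struct Vec2 {
    double x;
    double y;
};

// Rotate the unit vector (0, 1) about the Z axis by `angle` radians.
// Standard rotation: (x, y) -> (x cos a - y sin a, x sin a + y cos a).
Vec2 rotate_up(double angle) {
    return { -std::sin(angle), std::cos(angle) };
}

// Directions computed at run time with the angle formula from the
// "before" snippet: TAU * ((i + 3) / 4.0).
std::array<Vec2, 4> computed_directions() {
    const double tau = 6.283185307179586476925286766559;
    std::array<Vec2, 4> out{};
    for (int i = 0; i < 4; i++) {
        out[i] = rotate_up(tau * ((i + 3) / 4.0));
    }
    return out;
}

// The same four directions written out as exact axis-aligned constants.
std::array<Vec2, 4> exact_directions() {
    return {{ {1.0, 0.0}, {0.0, 1.0}, {-1.0, 0.0}, {0.0, -1.0} }};
}

// Largest absolute difference between a computed component and its exact value.
double max_error() {
    const std::array<Vec2, 4> computed = computed_directions();
    const std::array<Vec2, 4> exact = exact_directions();
    double worst = 0.0;
    for (int i = 0; i < 4; i++) {
        worst = std::max(worst, std::fabs(computed[i].x - exact[i].x));
        worst = std::max(worst, std::fabs(computed[i].y - exact[i].y));
    }
    return worst;
}
```

The angles are all multiples of a quarter turn, so the targets are always axis-aligned. Evaluating them with `sin` and `cos` at run time costs trigonometric calls and may leave tiny rounding residue where an exact 0 is expected, whereas constants avoid both.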
309cdc1 to 15574c6
I have made some more changes to make it slightly faster. // Before // After
15574c6 to ca21b41
I have made some more changes to make it slightly faster. Building both locally with optimize=none.
I tried it with optimize=none as well, and the performance improvement is clearly beyond the margin of error.
Before
Project FPS: 668 (1.49 mspf)
Project FPS: 670 (1.49 mspf)
Project FPS: 669 (1.49 mspf)
Project FPS: 670 (1.49 mspf)
Project FPS: 670 (1.49 mspf)
Project FPS: 664 (1.50 mspf)
Project FPS: 670 (1.49 mspf)
After
Project FPS: 684 (1.46 mspf)
Project FPS: 681 (1.46 mspf)
Project FPS: 681 (1.46 mspf)
Project FPS: 683 (1.46 mspf)
Project FPS: 685 (1.45 mspf)
Project FPS: 684 (1.46 mspf)
Project FPS: 684 (1.46 mspf)
production=yes still just barely performs better, though within the margin of error. I think this will be most beneficial on older systems, which need performance improvements the most anyway.
test-light.mp4