Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Math: Optimise 16-bit matrix multiplication functions. #9088

Open
wants to merge 7 commits into
base: main
Choose a base branch
from
16 changes: 8 additions & 8 deletions src/math/matrix.c
Original file line number Diff line number Diff line change
Expand Up @@ -51,12 +51,12 @@ int mat_multiply(struct mat_matrix_16b *a, struct mat_matrix_16b *b, struct mat_
x = a->data + a->columns * i;
y = b->data + j;
for (k = 0; k < b->rows; k++) {
s += (int32_t)(*x) * (*y);
x++;
/* Enhanced pointer arithmetic */
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sorry, I don't understand this commend. "Enhanced" as compared to what or when? Does it help understanding the code? Or is "enhanced point" a Maths term that I don't know?

s += (int32_t)(*x++) * (*y);
y += y_inc;
}
*z = (int16_t)s; /* For Q16.0 */
z++;
/* Enhanced pointer arithmetic */
*z++ = (int16_t)s;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I personally also like expressions like

*x++ = value;

but I don't think they actually improve readability, they're just more compact, so help scan the code faster. And I certainly don't expect these changes to affect performance. If you really want to convince me, you need to compare assembly before and after this change.

}
}

Expand All @@ -69,12 +69,12 @@ int mat_multiply(struct mat_matrix_16b *a, struct mat_matrix_16b *b, struct mat_
x = a->data + a->columns * i;
y = b->data + j;
for (k = 0; k < b->rows; k++) {
s += (int32_t)(*x) * (*y);
x++;
/* Enhanced pointer arithmetic */
s += (int32_t)(*x++) * (*y);
y += y_inc;
}
*z = (int16_t)(((s >> shift_minus_one) + 1) >> 1); /*Shift to Qx.y */
z++;
/* Enhanced pointer arithmetic */
*z++ = (int16_t)(((s >> shift_minus_one) + 1) >> 1); /*Shift to Qx.y */
}
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lots of not really helpful comments, very confusing whether this changes anything. With this commit I don't think I can approve the PR

return 0;
Expand Down