Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Math: Optimise 16-bit matrix multiplication functions. #9088

Open
wants to merge 7 commits into
base: main
Choose a base branch
from
67 changes: 39 additions & 28 deletions src/math/matrix.c
Original file line number Diff line number Diff line change
Expand Up @@ -31,50 +31,61 @@ int mat_multiply(struct mat_matrix_16b *a, struct mat_matrix_16b *b, struct mat_
if (a->columns != b->rows || a->rows != c->rows || b->columns != c->columns)
return -EINVAL;

int32_t s; /* Changed from int64_t to int32_t */
int16_t *x;
int16_t *y;
int16_t *z = c->data;
int i, j, k;
int y_inc = b->columns;
const int shift_minus_one = a->fractions + b->fractions - c->fractions - 1;
int32_t acc; /* Accumulator for dot product calculation */
int16_t *x, *y, *z = c->data; /* Pointers for matrices a, b, and c */
int i, j, k; /* Loop counters */
int y_inc = b->columns; /* Column increment for matrix b elements */
/* Calculate shift amount for adjusting fractional bits in the result */
const int shift = a->fractions + b->fractions - c->fractions - 1;
cujomalainey marked this conversation as resolved.
Show resolved Hide resolved

/* Check shift to ensure no integer overflow occurs during shifting */
if (shift_minus_one < -1 || shift_minus_one > 31)
if (shift < -1 || shift > 31)
return -ERANGE;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

your "cycles" figures before and after are very similar to the previous commit. This tells me that you probably measured a state before several changes (possibly the whole PR) and after them, not before and after each commit. So "cycles" don't seem correct.


/* If all data is Q0 */
if (shift_minus_one == -1) {
/* Special case when shift is -1 (Q0 data) */
if (shift == -1) {
/* Matrix multiplication loop */
for (i = 0; i < a->rows; i++) {
for (j = 0; j < b->columns; j++) {
s = 0;
/* Initialize accumulator for each element */
acc = 0;
/* Set x at the start of ith row of a */
x = a->data + a->columns * i;
/* Set y at the top of jth column of b */
y = b->data + j;
/* Dot product loop */
for (k = 0; k < b->rows; k++) {
/* Enhanced pointer arithmetic */
s += (int32_t)(*x++) * (*y);
/* Multiply & accumulate */
acc += (int32_t)(*x++) * (*y);
/* Move to next row in the current column of b */
y += y_inc;
}
/* Enhanced pointer arithmetic */
*z++ = (int16_t)s;
*z = (int16_t)acc;
z++; /* Move to the next element in the output matrix */
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you just changed *z = x; z++; to *z++ = x; in the previous commit and claimed that it was an improvement.

}
}

return 0;
}

for (i = 0; i < a->rows; i++) {
for (j = 0; j < b->columns; j++) {
s = 0;
x = a->data + a->columns * i;
y = b->data + j;
for (k = 0; k < b->rows; k++) {
} else {
/* General case for other shift values */
for (i = 0; i < a->rows; i++) {
for (j = 0; j < b->columns; j++) {
/* Initialize accumulator for each element */
acc = 0;
/* Set x at the start of ith row of a */
x = a->data + a->columns * i;
/* Set y at the top of jth column of b */
y = b->data + j;
/* Dot product loop */
for (k = 0; k < b->rows; k++) {
/* Multiply & accumulate Enhanced pointer arithmetic */
acc += (int32_t)(*x++) * (*y);
/* Move to next row in the current column of b */
y += y_inc;
}
/* Enhanced pointer arithmetic */
s += (int32_t)(*x++) * (*y);
y += y_inc;
*z = (int16_t)(((acc >> shift) + 1) >> 1); /*Shift to Qx.y */
z++; /* Move to the next element in the output matrix */
}
/* Enhanced pointer arithmetic */
*z++ = (int16_t)(((s >> shift_minus_one) + 1) >> 1); /*Shift to Qx.y */
}
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lots of not really helpful comments, very confusing whether this changes anything. With this commit I don't think I can approve the PR

return 0;
Expand Down