-
Notifications
You must be signed in to change notification settings - Fork 213
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
xcp ref implementation #2895
xcp ref implementation #2895
Conversation
This routine computes the matrix of cross product of data stored in column major format, in batches. For matrix X of dimensions p x n, the i,j th entry of the cross product matrix is C_ij = \sum_k (x_ik-\mu_i) (x_jk-\mu_k) where x_ij is the jth element of the ith row, of the matrix X. Implementation uses the BLAS routine GEMM. Signed-off-by: Dhanus M Lal <Dhanus.MLal@fujitsu.com>
Signed-off-by: Dhanus M Lal <Dhanus.MLal@fujitsu.com>
double alpha; | ||
if (accWtOld != 0) | ||
{ | ||
double * sumOld = daal::services::internal::service_malloc<double, cpu>(nFeatures, sizeof(double)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
double * sumOld = daal::services::internal::service_malloc<double, cpu>(nFeatures, sizeof(double)); | |
double* const sumOld = daal::services::internal::service_malloc<double, cpu>(nFeatures, sizeof(double)); |
sumOld[i] = sum[i]; | ||
} | ||
// S_old S_old^t/accWtOld | ||
alpha = 1.0 / accWtOld; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any checks for overflow?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does onedal have some macros to check floating point overflow?
transb = 'T'; | ||
alpha = 1.0; | ||
beta = 1.0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would recommend not to use the same variables. It's confusing and also can be dangerous if someone will forgot to change them.
transb = 'T'; | ||
alpha = 1.0; | ||
beta = 1.0; | ||
blasInst.xgemm(&transa, &transb, &nFeatures, &nFeatures, &nVectors, &alpha, data, &nFeatures, data, &nFeatures, &beta, crossProduct, | ||
&nFeatures); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
transb = 'T'; | |
alpha = 1.0; | |
beta = 1.0; | |
blasInst.xgemm(&transa, &transb, &nFeatures, &nFeatures, &nVectors, &alpha, data, &nFeatures, data, &nFeatures, &beta, crossProduct, | |
&nFeatures); | |
{ | |
constexpr char transb = 'T'; | |
constexpr double alpha = 1.0; | |
constexpr double beta = 1.0; | |
blasInst.xgemm(&transa, &transb, &nFeatures, &nFeatures, &nVectors, &alpha, data, &nFeatures, data, &nFeatures, &beta, crossProduct, | |
&nFeatures); | |
} |
Signed-off-by: Dhanus M Lal <Dhanus.MLal@fujitsu.com>
/intelci: run |
/intelci: run |
@KulikovNikita, were your comments addressed? |
/intelci: run |
/intelci: run |
Can you provide details of the check that is failing? |
I suppose these are sporadic failures. Restarting CI |
/intelci: run |
/intelci: run |
1 similar comment
/intelci: run |
Notation
Data is a matrix$X\in\mathbb{R}^{p\times n}$ . Each column is a $p$ -dimensional vector sampled independently. The matrix $X$ is assumed to be stored in column-major fashion.
xcp
Matrix of cross products is a$p\times p$ matrix whose $(i,j)$th entry is given by
$$C_{ij} = \sum_{k=1}^n(x_{ik}-\mu_i)(x_{jk}-\mu_j).$$ $x_{ik}$ is the $i$th component of the $k$th random vector.
Here
The implementation requires this to be computed in batches of$X$ , where the raw sum is passed between the batch computations. This can be done using the following:
Let$\mu'$ be the mean of the data until the previous batch and let $\mu$ be the mean including the data from current batch. Let $n'$ be the total number of data points until the previous batch and $S'$ be their sum (i.e., $\mu' = S'/n'$ and $\mu = S/n$ ).
Then the$(i,j)$ entry of $C$ can be computed using
$$C_{ij}\leftarrow C_{ij}+(\mu_j'-\mu_j)S_i' + (\mu_i'-\mu_i)S_j' + (\mu_i\mu_j-\mu_i'\mu_j')n' + \sum_{k=n'+1}^n(x_{ik}-\mu_i)(x_{jk}-\mu_j).$$
This can be simplified to
$$C_{ij} \leftarrow C_{ij} + \frac{S_i' S_j'}{n'} - \frac{S_i S_j}{n} + \sum_{k=n'+1}^n x_{ik}x_{jk}$$
$$C \leftarrow C + \frac{S'S'^t}{n'} - \frac{SS^t}{n} + XX^t$$
or
The level3 BLAS routine GEMM is used to compute the matrix products.