Matlab code for matrix sketching using Frequent Directions (FD) variants. Implements the original and fast FD algorithms (Liberty, 2013), a parameterization that varies smoothly between iterative SVD and FD (Desai et al, 2016), as well as a randomized variant suited for sparse inputs (Teng & Chu, 2017).
Add the single file FrequentDirections.m
to your Matlab path. Adding the directory Examples
to the path is useful for running the examples. Unit tests can be run from the Testing
directory.
To sketch an in-memory matrix:
k = 16; % sketch size
sketcher = FrequentDirections(k); % Initialize object
d = 64; % data dimensionality
data = randn(1000,d);
sketcher(data); % process samples
get(sketcher) % return sketch
sketcher.coverr(data) % covariance error
sketcher.projerr(data) % projection error
To sketch streaming data:
d = 512; % different data dimensionality
sketcher = FrequentDirections(32); % Initialize object
count = 0;
while count < 1000
data = randn(1,d); % random sample
sketcher(data); % consume sample
count = count + 1;
end
get(sketcher) % return sketch
The script exampleDesai.m
reproduces a figure from Desai et al. 2016 illustrating the performance and runtime of different FD variants:
- Desai, A., Ghashami, M., & Phillips, J. M. (2016). Improved practical matrix sketching with guarantees. IEEE Transactions on Knowledge and Data Engineering, 28(7), 1678-1690.
- Ghashami, M., Liberty, E., Phillips, J. M., & Woodruff, D. P. (2016). Frequent directions: Simple and deterministic matrix sketching. SIAM Journal on Computing, 45(5), 1762-1792.
- Liberty, E. (2013). Simple and deterministic matrix sketching. In Proceedings of the 19th ACM SIGKDD international conference on Knowledge discovery and data mining (pp. 581-588).
- Teng, D. & Chu, D. (2017). Low-Rank approximation via sparse frequent directions. arXiv preprint arXiv:1705.07140.
Copyright (c) 2017 Brian Lau brian.lau@upmc.fr, see LICENSE
Please feel free to fork and contribute!