Add tile and repeat_values procedures #648

Merged
merged 3 commits into from
May 7, 2024

AngelEzquerra
Contributor

Add the following new procedures:

  • repeat_values: Procedures that let you repeat the values of a Tensor multiple times. This functionality exists both in numpy (repeat) and in Matlab (repelem).
  • tile: Procedure that lets you construct a new tensor by repeating the input tensor a number of times on one or more axes. This functionality exists both in numpy (tile) and in Matlab (repmat).

I didn't follow numpy's naming convention for repeat_values, to avoid confusion with Nim's sequtils.repeat, which does not repeat individual values but the whole sequence (like tile does).

I measured the performance of these functions and it is comparable to numpy's. repeat_values is faster than numpy in --d:danger mode and a bit slower in --d:release mode. tile is slower in both cases (but the difference is not too large). tile's implementation could be improved (to avoid intermediate tensor allocations) in the future.

These procedures let you repeat the values of a Tensor multiple times. This functionality exists both in numpy (`repeat`) and in Matlab (`repelem`).
A different name was chosen here to avoid confusion with Nim's `repeat` function, which behaves differently (it repeats the whole input sequence, like numpy's `tile` or Matlab's `repmat` functions), and to make the name more self-explanatory.

There are two versions of this procedure (with multiple overloads):

- One that repeats all values the same number of times over a given axis.
- One that repeats each value a different number of times, but returns a rank-1 tensor.

Note that the second one is implemented as two procs with different argument types (`openArray[int]` and `Tensor[int]`).
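
To make the distinction between the two versions (and the whole-sequence `repeat` behaviour that motivated the name) concrete, here is a minimal pure-Python sketch on flat lists. It only illustrates the intended semantics and is not the Arraymancer implementation. The helper names `repeat_each`, `repeat_each_by` and `repeat_whole` are hypothetical, and the length check in `repeat_each_by` is an assumption of this sketch.

```python
from itertools import chain, repeat


def repeat_each(values, reps):
    """Repeat every element `reps` times, keeping copies adjacent."""
    return list(chain.from_iterable(repeat(v, reps) for v in values))


def repeat_each_by(values, reps):
    """Repeat values[i] exactly reps[i] times; the result is always flat."""
    if len(values) != len(reps):
        # Assumption of this sketch: one repetition count per value.
        raise ValueError("values and reps must have the same length")
    return list(chain.from_iterable(repeat(v, r) for v, r in zip(values, reps)))


def repeat_whole(values, reps):
    """Repeat the whole sequence, which is what a tile-like operation does."""
    return list(values) * reps


print(repeat_each([1, 2, 3], 2))             # [1, 1, 2, 2, 3, 3]
print(repeat_each_by([1, 2, 3], [2, 0, 3]))  # [1, 1, 3, 3, 3]
print(repeat_whole([1, 2, 3], 2))            # [1, 2, 3, 1, 2, 3]
```

The first two helpers correspond to the two versions listed above, restricted to one dimension; the third shows the behaviour the name `repeat_values` is meant to be distinguished from.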

I measured the performance using the `timeit` library. The results show that the performance is comparable to numpy's `repeat` function. In particular, a small example that takes ~2-3 usec per iteration with numpy's `repeat` takes ~4 usec in --d:release mode and ~1-2 usec in --d:danger mode.

This procedure lets you construct a new tensor by repeating the input tensor a number of times on one or more axes. This is similar to numpy's `tile` and Matlab's `repmat` functions.
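
As an illustration of tiling on more than one axis, the following pure-Python sketch tiles a 2-D nested list. It is only a model of the semantics described above (matching what numpy's `tile` does for a 2-D input with two repetition counts), not the code added in this PR. The name `tile_2d` and its parameters are hypothetical.

```python
def tile_2d(rows, row_reps, col_reps):
    """Tile a 2-D nested list.

    Each row is first repeated `col_reps` times along axis 1, then the
    resulting block of rows is repeated `row_reps` times along axis 0.
    """
    widened = [list(row) * col_reps for row in rows]
    # Copy each row so the tiled blocks do not share list objects.
    return [list(row) for _ in range(row_reps) for row in widened]


for row in tile_2d([[1, 2], [3, 4]], 2, 3):
    print(row)
# [1, 2, 1, 2, 1, 2]
# [3, 4, 3, 4, 3, 4]
# [1, 2, 1, 2, 1, 2]
# [3, 4, 3, 4, 3, 4]
```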

I measured the performance using the `timeit` library. The results show that the performance is comparable to (but not as good as) numpy's `tile`. In particular, a small example that takes ~3-4 usec per iteration with numpy's `tile` takes ~8-9 usec in --d:release mode and ~5-6 usec in --d:danger mode.

I believe that the performance could be improved further by preallocating the result Tensor before the tiling operation. The current implementation is not as efficient as it could be because it is based on calling `concat` multiple times, which requires at least as many tensor allocations (of increasing size).
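
The difference between the two strategies can be sketched in pure Python, with lists standing in for tensors. This is a simplified one-dimensional model under that assumption, not Arraymancer's `concat` or the actual `tile` code, and both function names are hypothetical.

```python
def tile_by_concat(values, reps):
    """Build the result by repeated concatenation.

    Every iteration allocates a new, longer list and copies the previous
    contents into it, so the number of allocations grows with `reps`.
    """
    result = []
    for _ in range(reps):
        result = result + list(values)  # new list on every iteration
    return result


def tile_preallocated(values, reps):
    """Allocate the full result once, then copy each block into place."""
    n = len(values)
    result = [None] * (n * reps)
    for i in range(reps):
        result[i * n:(i + 1) * n] = values  # same-length slice assignment
    return result


print(tile_by_concat([1, 2], 3))     # [1, 2, 1, 2, 1, 2]
print(tile_preallocated([1, 2], 3))  # [1, 2, 1, 2, 1, 2]
```

Both functions return the same result; the preallocated variant performs a single allocation of the final size instead of one per repetition.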

@Vindaar merged commit 53afa73 into mratsim:master on May 7, 2024
6 checks passed
@AngelEzquerra deleted the tile_and_repeat_values branch on May 7, 2024 18:43