-
Hi, I want to use it as the core component in my personal TA and investing setup; nevertheless, I am a bit confused about the main focus. I was experimenting with running talib indicators through vectorbt versus calling talib directly. Here is the link to the performance comparison - https://colab.research.google.com/drive/1szojVeKW_HWHr6iXEJogD0PsPJ6Caa7s?usp=sharing Thanks in advance!
-
A follow-up question could be: why should we use Numba rather than NumPy for vector/matrix-based operations?
-
Answering your first question: vectorbt does diverse pre- and postprocessing on your inputs, parameters, and outputs. If it simply called the particular talib function, the `vbt.talib` method would yield the same performance. But what actually happens is that it broadcasts inputs and parameters, converts outputs back to pandas, builds multi-level columns, prepares mappers for indexing, etc. This all introduces a millisecond-scale overhead: if you run the same comparison on an input of length 1e6, the talib function executes in 27.2 ms while vectorbt takes 48 ms. The difference narrows as the number of elements grows.

Remember that a major focus of vectorbt is hyperparameter optimization through the construction of large hyperparameter grids, which involves wide 2-dimensional arrays. Talib doesn't handle 2D, so you must explicitly loop through the columns and call the talib function on each one. Vectorbt handles this automatically: all you have to do is pass input of any shape and a list of parameters, and the rest is done for you (see the first sketch below). The price you pay for this is slightly slower execution. If you don't want all these features, of course, you should use the talib library and nothing else.

Answering the question on what the use case of vectorbt is: everything you mentioned. Talib is just a library of technical indicators - many sophisticated strategies don't use those at all. Running indicators is not vectorbt's main function. The main competitive advantage of vectorbt is its backtesting speed, that is, the amount of time needed to convert an array of signals into a performance metric. Without exaggeration, there is nothing comparable in open source. The only backtesters that can beat vectorbt are written in pure Java and C(++), but they don't have the flexibility of Python that vectorbt enjoys.

On the last question: because in Numba you can write any operation as a loop and it will run at a slightly lower but still comparable speed to a vectorized operation in NumPy, except that your code is now more readable and supports many more use cases, such as calling functions inside functions. Plus, some backtesting operations cannot be expressed in NumPy, or are not worth the effort (just try implementing an expanding standard deviation - see the second sketch below).
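To make the 2D/parameter-grid point concrete, here is a minimal sketch, assuming vectorbt's `vbt.talib` factory and the TA-Lib `talib` package are installed, and that the SMA output is exposed under TA-Lib's output name `real`; the sample data and parameter values are illustrative:

```python
import numpy as np
import pandas as pd
import talib
import vectorbt as vbt

# Wide 2D input: 1000 bars x 5 assets.
close = pd.DataFrame(
    np.random.uniform(-1, 1, size=(1000, 5)).cumsum(axis=0) + 100.0,
    columns=["A", "B", "C", "D", "E"],
)

# Plain talib: 1D only, so a parameter grid means two explicit loops.
out = {}
for timeperiod in (10, 20, 50):
    for col in close.columns:
        out[(timeperiod, col)] = talib.SMA(close[col].values, timeperiod=timeperiod)
manual = pd.DataFrame(out, index=close.index)

# vectorbt: one call broadcasts the input against the parameter list and
# returns a pandas object with multi-level columns (timeperiod x asset).
sma = vbt.talib("SMA").run(close, timeperiod=[10, 20, 50])
print(sma.real.shape)  # (1000, 15)
```

The single `run` call replaces the double loop and labels the result with multi-level columns, which is the preprocessing that costs the extra milliseconds.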
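And for the Numba point, a minimal sketch of an expanding standard deviation written as a plain loop; the one-pass formulation (Welford's algorithm) is my choice for illustration, not necessarily what vectorbt uses internally:

```python
import numpy as np
from numba import njit

@njit
def expanding_std_nb(arr):
    """Standard deviation of arr[0..i] for every i, in one O(n) pass."""
    out = np.empty(arr.shape[0])
    mean = 0.0
    m2 = 0.0  # running sum of squared deviations (Welford's algorithm)
    for i in range(arr.shape[0]):
        delta = arr[i] - mean
        mean += delta / (i + 1)
        m2 += delta * (arr[i] - mean)
        out[i] = np.sqrt(m2 / (i + 1))  # population std of the expanding window
    return out

arr = np.random.normal(size=1_000_000)
print(expanding_std_nb(arr)[-5:])
```

The loop reads like the textbook definition, yet after JIT compilation it runs at near-C speed; expressing the same thing as pure vectorized NumPy is far more contorted.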
-
Some smart guy makes a very fast backtester and decides to turn it into a library. More and more features get added, other people request yet more features, and you cannot argue against any of them - each is only a few ms of induced delay - but they add up, and suddenly it's not very fast anymore. `df["GC"] = df.ta.sma(50, append=True) > df.ta.sma(200, append=True)` For comparison, using pandas/numpy exclusively I can backtest 4 million candles in 100 ms (that's 25,900 times as fast), on a test using twice as many bools for entry, and I know that you can enhance this speed further by multiples (this is just on a single core, so trust me when I say multiples). PS: sorry for commenting on an old thread, I'm not really sure whether the community thinks that is appropriate to do or not.
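For readers wondering what such a fully vectorized pandas/numpy backtest looks like, here is a minimal sketch of the golden-cross idea above; the signal logic and the simple next-bar-return accounting are my own illustrative assumptions, not the commenter's actual code:

```python
import numpy as np
import pandas as pd

# Synthetic random-walk prices: 4 million candles.
n = 4_000_000
close = pd.Series(np.random.normal(0.0, 0.1, n).cumsum() + 100.0)

fast = close.rolling(50).mean()
slow = close.rolling(200).mean()

# Boolean golden-cross signal; shift so today's signal trades the next bar.
position = (fast > slow).astype(np.int8).shift(1, fill_value=0)

# Per-bar strategy returns and equity curve, all as array operations.
returns = close.pct_change().fillna(0.0) * position
equity = (1.0 + returns).cumprod()
print(equity.iloc[-1])
```

There is no Python-level loop anywhere, which is why this style can chew through millions of candles in a fraction of a second on a single core.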