
semi-endless loop around add_candle call #447

Closed
movy opened this issue May 19, 2024 · 27 comments
Labels: bug (Something isn't working), stale (No recent activity)

movy (Contributor) commented May 19, 2024

Describe the bug
Until this week I'd been running 0.44 without any problems, but 0.48 added some welcome changes, so I switched my backtesting to 0.48 and encountered a very long loop around the call at https://github.com/jesse-ai/jesse/blob/master/jesse/store/state_candles.py#L102, which sometimes causes a backtest to hang for 10-15 minutes. This happens sporadically, maybe once in 100-200 backtest runs, but it effectively blocks the whole pipeline until the problematic test finishes the loop.

This happens on all symbols and timeframes.

When I say 'around', I mean it's impossible to track down the exact line that hangs, but here is what I could gather via py-spy:

  %Own   %Total  OwnTime  TotalTime  Function (filename)
 48.00% 100.00%   627.62s    652.15s   add_candle (jesse/store/state_candles.py)
 52.00%  52.00%   624.50s    624.50s   __getitem__ (jesse/libs/dynamic_numpy_array/__init__.py)
  0.00%   0.00%   0.310s    0.320s   _var (numpy/core/_methods.py)
  0.00%   0.00%   0.070s    0.180s   mean (numpy/core/fromnumeric.py)
  0.00%   0.00%   0.070s    0.110s   _mean (numpy/core/_methods.py)
  0.00%   0.00%   0.070s    0.390s   _std (numpy/core/_methods.py)
  0.00%   0.00%   0.060s    0.060s   timeframe_to_one_minutes (jesse/helpers.py)
  0.00%   0.00%   0.050s    0.050s   _count_reduce_items (numpy/core/_methods.py)
  0.00% 100.00%   0.040s    53.00s   _step_simulator (jesse/modes/backtest_mode.py)
  0.00%   0.00%   0.030s    0.630s   peaks (jesse-bot/custom_indicators/peaks.py)
  0.00%  52.00%   0.020s    25.74s   _simulate_price_change_effect (jesse/modes/backtest_mode.py)
  0.00%   0.00%   0.020s    0.020s   stochrsi (tulipy/__init__.py)
  0.00%   0.00%   0.020s    0.020s   is_collecting_data (jesse/helpers.py)
  0.00%   0.00%   0.020s    0.020s   wrapper (talib/__init__.py)
  0.00%   0.00%   0.010s    0.190s   mean (<__array_function__ internals>)
  0.00%   0.00%   0.010s    0.010s   _sum (numpy/core/_methods.py)
  0.00%   0.00%   0.010s    0.010s   is_live (jesse/helpers.py)
  0.00%   0.00%   0.010s    0.400s   std (numpy/core/fromnumeric.py)
  0.00%   0.00%   0.010s    0.010s   get_candles (jesse/store/state_candles.py)
  0.00%   0.00%   0.010s    0.010s   get_candle_source (jesse/helpers.py)
  0.00%   0.00%   0.010s    0.410s   std (<__array_function__ internals>)
  0.00%   0.00%   0.010s    0.010s   is_debuggable (jesse/helpers.py)
  0.00%   0.00%   0.010s    0.010s   _get_fixed_jumped_candle (jesse/modes/backtest_mode.py)
  0.00%   0.00%   0.010s    0.700s   _check (jesse/strategies/Strategy.py)
  0.00% 100.00%   0.000s    53.00s   backtest_rungs (jesse-bot/backtest_new.py)
  0.00% 100.00%   0.000s    53.00s   _resume_span (ray/util/tracing/tracing_helper.py)
  0.00%   0.00%   0.000s    0.690s   decorated (jesse/services/cache.py)

I've added some logging:

    def get_storage(self, exchange: str, symbol: str, timeframe: str) -> DynamicNumpyArray:
        key = jh.key(exchange, symbol, timeframe)
        print(key)  # <-- added logging
[...]


    def add_candle(
        [...]
        print('getting arr')  # <-- added logging
        arr: DynamicNumpyArray = self.get_storage(exchange, symbol, timeframe)
        print('got arr', len(arr))  # <-- added logging

From these prints we can see this method is called millions of times by a single backtest process:
[screenshot: repeated log output]

Eventually the number of calls drops:
[screenshot: log output]

This makes me think there's a loop inside a loop somewhere.

I could not immediately pinpoint the recent changes responsible for this. I will roll back to 0.44 for now and slowly work my way up to 0.48; maybe I can find where the regression first appeared.

Any help from those familiar with the recent changes is appreciated, @yakir4123.

movy added the bug label May 19, 2024
yakir4123 (Contributor) commented May 19, 2024

@movy, yesterday I pushed a bug fix; it may be the problem, but I can't be sure: #446

Please pull the new master version and try it out, and let me know if it fixes things. If not, we need to investigate 👍 Thanks for the issue.

Can you also tell me what timeframe your example used, the number of routes, and the simulation start/end time?

movy (Contributor, Author) commented May 19, 2024

Thanks for the prompt response.
I use one route plus one extra_route, and I'm not using the new fast_mode for now, as it fails for different reasons with my strategies (I will investigate this later, since it's only barely faster for strategies with complex, computation-heavy indicators).

I import candles for ~6 months, then split them into 40-day periods and get candles for each period. Basic code:

for index, date_range in enumerate(backtest_dates, start=0):
    print(f"Importing candles for {date_range[0]} - {date_range[1]}...")
    try:
        candles_warmup, candles_period = research.get_candles(
            exchange, symbol, tf,
            int(datetime.strptime(date_range[0], '%Y-%m-%d').timestamp() * 1000),
            int(datetime.strptime(date_range[1], '%Y-%m-%d').timestamp() * 1000),
            warmup_candles_num=config["warm_up_candles"],
            caching=True,
            is_for_jesse=True,
        )
        period_candles = {
            jh.key(exchange, symbol): {
                'exchange': exchange,
                'symbol': symbol,
                # backtest candles tf is always 1m
                'candles': candles_period
            }
        }
        warmup_candles = {
            jh.key(exchange, symbol): {
                'exchange': exchange,
                'symbol': symbol,
                'candles': candles_warmup
            }
        }
        candles_refs.append(ray.put(period_candles))
        # print(candles_refs)
        warmup_candles_refs.append(ray.put(warmup_candles))
[...]

        for candles_ref in candles_refs:
            result = backtest(
                config=ray.get(config_ref),
                routes=ray.get(routes_refs[params["tf"]]),
                extra_routes=ray.get(extra_routes_refs[anchor_timeframe(params["tf"])]),
                candles=ray.get(candles_ref),  # candles within the backtest date range
                warmup_candles=ray.get(warmup_candles_refs[candles_refs.index(candles_ref)]),
                generate_csv=False,
                generate_json=True,
                hyperparameters=params,
                fast_mode=False,
            )

Will try master branch now.

yakir4123 (Contributor) commented:

The higher the timeframe you use, the better the performance of fast_mode.
If you use 5m or 15m the changes will be noticeable (as long as the indicator calculations are not too heavy), but not as much as with 1h or 4h.
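
A rough illustration of the scaling (not Jesse's actual loop; timeframe_to_one_minutes is the real helper that shows up in the py-spy table above):

from jesse.helpers import timeframe_to_one_minutes

# In fast mode the simulator can advance a whole trading-route candle per step,
# so the iteration count shrinks by the candle size in minutes:
print(timeframe_to_one_minutes('5m'))   # 5   -> 5x fewer steps than a 1m route
print(timeframe_to_one_minutes('15m'))  # 15
print(timeframe_to_one_minutes('4h'))   # 240 -> where the speed-up really shows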

saleh-mir (Member) commented:

Hey Movy,

I pushed the patch that Yakir is talking about. It is version 0.48.1. Please give it a try and see if it fixes your issue.

movy (Contributor, Author) commented May 19, 2024

So, good news and bad news: this regression was not caused by 0.48 (i.e. 0.48.1 did not fix it). The last version that does not hang is 0.45, so I'm sticking with it for now.

I guess some change introduced in 0.46, related to warmup_candles being passed as a parameter to the backtest() call, causes this behaviour. It is also worth looking into why add_candle() is called millions of times with ever-increasing arr length: even if this is intended behaviour, it is highly suboptimal (especially in Python) and should be replaced with a single batched call.
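
For illustration, here is the batching idea in plain NumPy (hypothetical names, independent of Jesse's internals):

import numpy as np

new_candles = np.zeros((1_000_000, 6))  # stand-in for the incoming 1m candles

storage = np.empty((0, 6))
# per-candle appends -- millions of Python-level calls and repeated reallocations:
# for candle in new_candles:
#     storage = np.vstack([storage, candle.reshape(1, 6)])

# batched alternative -- one call, one allocation, one copy:
storage = np.concatenate([storage, new_candles], axis=0)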

P.S. The inconsistent parameter names 'warm_up_candles' vs 'warmup_candles' gave me some avoidable pain while I was refactoring backtest() calls. This can be improved as well; I think warm_up_* was used from the beginning, so maybe we can stick with it?

yakir4123 (Contributor) commented:

@movy, do you maybe cancel a lot of orders in your strategy?

movy (Contributor, Author) commented May 20, 2024

Yes, I cancel on every candle if the order was not executed. This was never a problem prior to 0.46.

yakir4123 (Contributor) commented:

@movy, cool, I didn't know that v0.46 didn't have this problem.

Please try this PR; I made some fixes on exactly these cases!

#450

movy (Contributor, Author) commented May 22, 2024

Thank you, @yakir4123. I tried your branch; unfortunately, same result. I will try to create a small script that consistently reproduces the problematic behaviour and will post it here later.

yakir4123 (Contributor) commented:

@movy, yeah, I'm facing the same problem and am trying to solve it. I know the root of the problem. I guess I need to go back to v0.46 to see what was there.

movy (Contributor, Author) commented May 22, 2024

0.45 is the last version that works without hanging; 0.46 introduced this bug.

yakir4123 (Contributor) commented May 24, 2024

@movy, do you mind trying this branch? It has fixes for the cases that I believe you're hitting too:
#455

https://github.com/yakir4123/jesse/tree/yakir/feat/delete-orders

movy (Contributor, Author) commented May 25, 2024

@yakir4123, it still hangs, unfortunately.
But I think you're moving in the right direction by identifying orders that are updated on each candle. My strategy logic updates take_profit on every update_position(self) call and cancels non-executed orders on each candle via should_cancel_entry(self). I will really try to find time next week to post a simple repro.
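
For reference, a stripped-down sketch of that pattern (hook names as mentioned above; the entry logic is elided, so this is not my real strategy):

from jesse.strategies import Strategy

class ReproSketch(Strategy):
    def should_long(self) -> bool:
        return False  # real entry condition elided

    def go_long(self):
        pass  # real entry order placement elided

    def should_short(self) -> bool:
        return False

    def go_short(self):
        pass

    def should_cancel_entry(self) -> bool:
        # cancel the entry order on every new candle if it was not executed
        return True

    def update_position(self):
        # re-submit the take-profit on every candle while in a position
        self.take_profit = self.position.qty, self.price * 1.05  # illustrative values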

yakir4123 (Contributor) commented:

@movy, I updated the branch an hour ago; if your last fetch was before that, please give it another try.
I do something very similar to you, and now it works really fast for me.

movy (Contributor, Author) commented May 25, 2024

The last commit I tested was 5ed48de, from 2 hours ago.

yakir4123 (Contributor) commented:

:/ I wonder what it is. Please give me a simple strategy that reproduces it and I'll probably find the problem.

movy (Contributor, Author) commented May 26, 2024

DUAL_THRUST.py.zip

Here's one of Jesse's default strategies. I added a trailing take-profit and offset entries (i.e. entering not at the current price but at price minus an offset percentage, so as not to enter right at the candle open). These changes do not necessarily make this particular strategy more profitable, but they provide an example that hangs backtests after 70-80 iterations on any Jesse version after v0.45, including @yakir4123's fixes.
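
For clarity, the offset-entry idea looks roughly like this (offset_pc is a hypothetical hyperparameter; self.buy = qty, price is Jesse's limit-entry form):

from jesse.strategies import Strategy
from jesse.utils import size_to_qty

class OffsetEntrySketch(Strategy):
    # other required hooks elided
    def go_long(self):
        offset_pc = 0.3  # hypothetical: enter 0.3% below the current price
        entry_price = self.price * (1 - offset_pc / 100)
        qty = size_to_qty(self.balance * 0.1, entry_price)
        self.buy = qty, entry_price  # limit entry below the candle open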

Tested on multiple coins with 4h candles for speed; same result. I run tests using Ray Tune (to enable more complex hyperparameter perturbations and schedulers), but this particular code should run with Jesse out of the box.

stale bot commented Jul 25, 2024

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

stale bot added the stale label Jul 25, 2024
stale bot closed this as completed Aug 2, 2024
movy (Contributor, Author) commented Aug 30, 2024

Hello,
Sorry for the delay in testing, but unfortunately the bug still exists in v1.1. I am testing the same strategy on v0.44 and v1.1: 0.44 never gets stuck, but 1.1 is almost guaranteed to loop, sometimes for hours. This strategy is brand new, i.e. not related to my previous tests in any way; the entry and exit rules are quite different. should_cancel_entry is true on each candle.

yakir4123 (Contributor) commented Sep 1, 2024

I'll profile your strategy this week. @movy, if you can provide the symbol, timeframe, and timestamps, that would help.

movy (Contributor, Author) commented Sep 2, 2024

I believe any symbol/timeframe will cause looping, but you can try ALICE-USDT, HIGH-USDT, or ASTR-USDT on 4h at Binance Perpetual Futures.

yakir4123 (Contributor) commented:

@movy, hey mate, I can't reproduce it. As I asked, it would really help if you gave me the exact parameters you use.
[screenshot]

movy (Contributor, Author) commented Sep 4, 2024

Okay, I think I found the culprit. It was a mistake in my own code (thanks to Python's archaic scoping rules) made while migrating to the updated backtest function, but finding this mistake was so hard because of the difference in how the backtest() function handles the warmup_candles argument when fast_mode is set to True versus False.

Consider the following generic backtest:

import importlib
from datetime import datetime

import jesse.helpers as jh
from jesse.research import backtest, import_candles, get_candles
from jesse.utils import anchor_timeframe

exchange = 'Binance Perpetual Futures'
symbol = 'ACH-USDT'
tf = '1h'
strategy_name = 'DUAL_THRUST'
strategy = getattr(importlib.import_module(f"strategies.{strategy_name}"), f"{strategy_name}")
timeframes = [tf]
config = {
    'starting_balance': 1000,
    'fee': 0.001,
    'type': 'futures',
    'futures_leverage': 1,
    'futures_leverage_mode': 'cross',
    'exchange': exchange,
    'warm_up_candles': 255,
}

tests_start_date = '2024-01-01'
tests_end_date = '2024-08-30'

import_candles(exchange, symbol, tests_start_date, show_progressbar=False)
warmup_candles, candles = get_candles(
    exchange, symbol, tf,
    int(datetime.strptime(tests_start_date, '%Y-%m-%d').timestamp() * 1000),
    int(datetime.strptime(tests_end_date, '%Y-%m-%d').timestamp() * 1000),
    warmup_candles_num=config["warm_up_candles"],
    caching=True,
    is_for_jesse=True,
)

wrong_tests_start_date = '2024-05-01'  # !!! IMPORTING WRONG CANDLES !!!
wrong_warmup_candles, wrong_candles = get_candles(
    exchange, symbol, tf,
    int(datetime.strptime(wrong_tests_start_date, '%Y-%m-%d').timestamp() * 1000),
    int(datetime.strptime(tests_end_date, '%Y-%m-%d').timestamp() * 1000),
    warmup_candles_num=config["warm_up_candles"],
    caching=True,
    is_for_jesse=True,
)

params={'stop_loss_atr_rate': 0.8, 'down_length': 30, 'up_length': 24, 'down_coeff': 1.8000000000000003, 'up_coeff': 0.5, 'tp_pc': 5.0, 'sl_pc': 0.5, 'trailing_profit': 5.0}

result = backtest(
    config,
    [{'exchange': exchange, 'strategy': strategy, 'symbol': symbol, 'timeframe': tf}],
    [{'exchange': exchange, 'strategy': strategy, 'symbol': symbol, 'timeframe': anchor_timeframe(tf)}],
    candles={
        jh.key(exchange, symbol): {
            'exchange': exchange,
            'symbol': symbol,
            'candles': candles,
        }
    },
    warmup_candles={
        jh.key(exchange, symbol): {
            'exchange': exchange,
            'symbol': symbol,
            'candles': wrong_warmup_candles,  # !!! USING WRONG CANDLES !!!
        }
    },
    generate_csv=False,
    generate_json=True,
    hyperparameters=params,
    fast_mode=False,
)

print(result)

As you can see, here I deliberately pass wrong_warmup_candles to the backtest() call, and with fast_mode=False the test runs SILENTLY, although very slowly, and in many cases it even finishes successfully (!).

With fast_mode=True, however, the backtest fails with:

  File "/home/ubuntu/.local/lib/python3.11/site-packages/jesse/modes/backtest_mode.py", line 681, in _skip_simulator
    _simulate_new_candles(candles, i, candles_step)
  File "/home/ubuntu/.local/lib/python3.11/site-packages/jesse/modes/backtest_mode.py", line 754, in _simulate_new_candles
    _simulate_price_change_effect_multiple_candles(
  File "/home/ubuntu/.local/lib/python3.11/site-packages/jesse/modes/backtest_mode.py", line 856, in _simulate_price_change_effect_multiple_candles
    store.candles.add_multiple_1m_candles(
  File "/home/ubuntu/.local/lib/python3.11/site-packages/jesse/store/state_candles.py", line 429, in add_multiple_1m_candles
    raise IndexError(f"Could not find the candle with timestamp {jh.timestamp_to_time(candles[0, 0])} in the storage. Last candle's timestamp: {jh.timestamp_to_time(arr[-1][0])}. exchange: {exchange}, symbol: {symbol}")
IndexError: Could not find the candle with timestamp 2023-12-31T00:00:00+00:00 in the storage. Last candle's timestamp: 2024-04-29T23:59:00+00:00. exchange: Binance Perpetual Futures, symbol: ACH-USDT

This error message is expected with wrong_warmup_candles but is somewhat misleading. A better error would be Could not find the candle [...] First candle's timestamp [...]. Make sure your warmup_candles are imported properly!
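
For instance, reusing the variables already present in that raise (just a sketch of the wording, not a patch):

raise IndexError(
    f"Could not find the candle with timestamp {jh.timestamp_to_time(candles[0, 0])} in the storage. "
    f"First candle's timestamp: {jh.timestamp_to_time(arr[0][0])}, "
    f"last candle's timestamp: {jh.timestamp_to_time(arr[-1][0])}. "
    f"exchange: {exchange}, symbol: {symbol}. "
    "Make sure your warmup_candles are imported properly!"
)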

I believe this difference in the handling of warmup_candles can lead us to some interesting discoveries about how fast_mode works. It is really fast, I can confirm, but I will run further tests to see whether its results match "slow mode" for complex exit strategies.

movy (Contributor, Author) commented Sep 7, 2024

I have run a number of backtests using v0.46 and v1.1.1, and there are slight but important differences in the outcomes, even though the resulting charts look similar to the human eye (but not identical). In addition, I found that with v1.1.1 (or, I assume, any version above v0.46) results get really weird when importing 1-minute candles to use with higher timeframes, i.e. research.get_candles(exchange, symbol, '1m'... for a 1h route, for example. Is this expected?

With v0.46 and below I always imported 1m candles for any timeframe, and backtest results matched live trading quite well; that was why I liked Jesse so much -- good backtests gave me the confidence to expect similar results in live trading. With the new versions, 1m candles produce absurd backtests (fast and slow modes alike), and timeframe candles (i.e. 1h, 2h, etc.) produce slightly inaccurate results (fast_mode even more so), similar to what TradingView does. That is a bad thing here, as TV simulates linear OHLC movement, which produces backtests detached from reality on any timeframe above 5m.

It might look insignificant for basic strategies, a single symbol, or short periods, but over longer periods, with a portfolio of symbols and complex entry/exit rules, such backtests lead to false conclusions. I am really at a loss here: I was very happy after finally figuring out the problem with the new backtests, but now I am quite frustrated by these regressions.

Could any particular change around v0.47+ have caused such behaviour? I think it's related to how Jesse handles on_reduced_position or something else intra-candle.

Below are comparisons between Jesse versions. I used my private strategy here (not Donchian), but I'm sure any strategy with LIMIT entries or trailing stops will have similar issues.

1.1.1 slow mode:
[screenshot]

1.1.1 fast mode:
[screenshot]

0.46:
[screenshot]

yakir4123 (Contributor) commented Sep 7, 2024

@movy, I understand your frustration. If I recall correctly, in v0.47 there was no fast mode yet, so there may be two issues here.
Jesse works by taking one step of one candle per iteration, so if your backtest trading route is 1m there will be no runtime difference. But I'm not sure why there is such a difference between fast and slow mode; I'll take a look to check whether the main loops somehow differ.

About the differences between the versions: I have found a few very minor and rare bugs, so it might be this issue. (Does your strategy have a lot of entry/exit orders open concurrently?)

movy (Contributor, Author) commented Sep 8, 2024

To clarify: I use 1h routes, but in versions prior to 0.47 we always had to request 1m candles via research.get_candles, and any timeframe was built from the 1m data. I think once that changed, results began to differ from live trading. It made tests much slower, but I never had a problem with their accuracy.
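
For concreteness, the old pattern was roughly this (same get_candles signature as in my repro script above; start_ts/end_ts stand for the millisecond timestamps computed the same way):

# the route runs on 1h, but candles are requested as 1m and Jesse builds the 1h bars itself
warmup_candles, candles = get_candles(
    exchange, symbol, '1m',
    start_ts, end_ts,
    warmup_candles_num=config['warm_up_candles'],
    caching=True,
    is_for_jesse=True,
)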

I think fast_mode is very useful for many use cases, but slow mode should fall back to whatever logic was used prior to 0.47.

In my strategies I use only one entry, but build a "ladder" of exits similar to:

[(qty / 10, entry_price * (1 + self.tp_pc / 100)), *[(qty / 10, exit_price * (1 + self.trailing_pc / 100 * i)) for i in range(1,10)]]
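
The same ladder, expanded for readability (this assumes Jesse accepts a list of (qty, price) tuples for self.take_profit, which is what the one-liner relies on):

# first rung: a fixed take-profit tp_pc percent above the entry
ladder = [(qty / 10, entry_price * (1 + self.tp_pc / 100))]
# remaining nine rungs: exits stepped trailing_pc percent apart above exit_price
for i in range(1, 10):
    ladder.append((qty / 10, exit_price * (1 + self.trailing_pc / 100 * i)))
self.take_profit = ladder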

yakir4123 (Contributor) commented:

I'm not sure I understand, because fast mode was introduced in 0.49, and in 0.48 there was an update to how closed trades are iterated, affecting both fast and slow mode (since fast mode didn't exist at the time). Maybe it was that change.

Let's talk on Discord to find the root of this issue.
