Hello maintainers, I want to understand why the following scenario happens. I have this time series:
```python
import pandas as pd

data = {
    'date': pd.date_range(start='2023-01-01', periods=10, freq='MS'),
    'value': [1, 3, 3, 4, 3, 2, 1, 1, 3, 2]
}
df = pd.DataFrame(data)
df.set_index('date', inplace=True)
```
This yields the following time series:

```
            value
date
2023-01-01      1
2023-02-01      3
2023-03-01      3
2023-04-01      4
2023-05-01      3
2023-06-01      2
2023-07-01      1
2023-08-01      1
2023-09-01      3
2023-10-01      2
```
When I fit the model, it reports the following:

```python
from pmdarima import auto_arima

fitted_model = auto_arima(
    y=df['value'],
    max_iter=15,
    max_d=1,
    method='nm',
    seasonal=False)
fitted_model
```
ARIMA(2,0,2)(0,0,0)[0]
Then I try to predict:

```python
fitted_model.predict(
    n_periods=2,
    return_conf_int=False)
```

which raises the error below:
```
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In [1047], line 1
----> 1 fitted_model.predict(
      2     n_periods=2,
      3     return_conf_int=False)

File ~/cluster-env/clonedenv/lib/python3.10/site-packages/pmdarima/arima/arima.py:791, in ARIMA.predict(self, n_periods, X, return_conf_int, alpha, **kwargs)
    788 arima = self.arima_res_
    789 end = arima.nobs + n_periods - 1
--> 791 f, conf_int = _seasonal_prediction_with_confidence(
    792     arima_res=arima,
    793     start=arima.nobs,
    794     end=end,
    795     X=X,
    796     alpha=alpha)
    798 if return_conf_int:
    799     # The confidence intervals may be a Pandas frame if it comes from
    800     # SARIMAX & we want Numpy. We will to duck type it so we don't add
    801     # new explicit requirements for the package
    802     return f, check_array(conf_int, force_all_finite=False)

File ~/cluster-env/clonedenv/lib/python3.10/site-packages/pmdarima/arima/arima.py:203, in _seasonal_prediction_with_confidence(arima_res, start, end, X, alpha, **kwargs)
    199 conf_int[:, 0] = f - q * np.sqrt(var)
    200 conf_int[:, 1] = f + q * np.sqrt(var)
    202 return check_endog(f, dtype=None, copy=False), \
--> 203     check_array(conf_int, copy=False, dtype=None)

File ~/cluster-env/clonedenv/lib/python3.10/site-packages/sklearn/utils/validation.py:899, in check_array(array, accept_sparse, accept_large_sparse, dtype, order, copy, force_all_finite, ensure_2d, allow_nd, ensure_min_samples, ensure_min_features, estimator, input_name)
    893     raise ValueError(
    894         "Found array with dim %d. %s expected <= 2."
    895         % (array.ndim, estimator_name)
    896     )
    898 if force_all_finite:
--> 899     _assert_all_finite(
    900         array,
    901         input_name=input_name,
    902         estimator_name=estimator_name,
    903         allow_nan=force_all_finite == "allow-nan",
    904     )
    906 if ensure_min_samples > 0:
    907     n_samples = _num_samples(array)

File ~/cluster-env/clonedenv/lib/python3.10/site-packages/sklearn/utils/validation.py:146, in _assert_all_finite(X, allow_nan, msg_dtype, estimator_name, input_name)
    124 if (
    125     not allow_nan
    126     and estimator_name
   (...)
    130     # Improve the error message on how to handle missing values in
    131     # scikit-learn.
    132     msg_err += (
    133         f"\n{estimator_name} does not accept missing values"
    134         " encoded as NaN natively. For supervised learning, you might want"
   (...)
    144         "#estimators-that-handle-nan-values"
    145     )
--> 146 raise ValueError(msg_err)
    148 # for object dtype data, we only check for NaNs (GH-13254)
    149 elif X.dtype == np.dtype("object") and not allow_nan:

ValueError: Input contains NaN.
```
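Reading the traceback, the interval bounds are computed as `f ± q * np.sqrt(var)` and validated with `check_array` before `return_conf_int` is even consulted, so a NaN in either the forecast or its variance would surface as this error even though no interval was requested. The sketch below is not pmdarima code; it is a plain-Python illustration of that arithmetic, and `interval_bounds` and `all_finite` are hypothetical helper names. Which quantity is actually NaN for this series is not visible from the traceback, so the NaN variance here is only an assumption.

```python
import math
from statistics import NormalDist


def interval_bounds(forecast, variance, alpha=0.05):
    """Return (lower, upper) as forecast -/+ q * sqrt(variance)."""
    # Two-sided normal quantile, about 1.96 for alpha=0.05.
    q = NormalDist().inv_cdf(1 - alpha / 2)
    # A NaN or negative variance cannot give a finite standard error.
    if math.isnan(variance) or variance < 0:
        half_width = float("nan")
    else:
        half_width = q * math.sqrt(variance)
    return forecast - half_width, forecast + half_width


def all_finite(values):
    """True only if every value is a finite number (no NaN, no inf)."""
    return all(math.isfinite(v) for v in values)


good = interval_bounds(2.0, 1.0)            # well-behaved variance
bad = interval_bounds(2.0, float("nan"))    # assumed degenerate variance

print(all_finite(good))  # finite bounds pass a finiteness check
print(all_finite(bad))   # NaN bounds are what a finiteness check rejects
```

If this reading is right, the failure says more about the fitted model's forecast variance on such a short series than about the `predict` call itself.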
However, when I extend the data by one data point
```python
data = {
    'date': pd.date_range(start='2023-01-01', periods=11, freq='MS'),
    'value': [1, 3, 3, 4, 3, 2, 1, 1, 3, 2, 2]
}
```
or when I change to these values
```python
data = {
    'date': pd.date_range(start='2023-01-01', periods=10, freq='MS'),
    'value': [5, 8, 11, 4, 6, 6, 6, 5, 6, 9]
}
```
or when I set the `seasonal` parameter to `True` for the exact same data, the model returned is `ARIMA(0,0,0)(0,0,0)[0] intercept` and the predictions work without errors.
Another workaround is to add a guardrail that caps the maximum p, q, and d at 1, which also works.
Can you help me understand why this happens? Is placing a guardrail the correct way to fix this?
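Besides capping the orders, another defensive option would be to catch the failure at prediction time and fall back to something simpler. The sketch below is only an illustration under stated assumptions: `predict_with_fallback` is a hypothetical helper, the naive "repeat the last value" fallback is my own choice rather than anything pmdarima provides, and `_AlwaysNaN` is a stand-in object that mimics the `ValueError` from the traceback so the example runs without a fitted model.

```python
def predict_with_fallback(model, history, n_periods):
    """Try model.predict; on ValueError, repeat the last observed value.

    `model` is any object exposing predict(n_periods=...), such as the
    fitted model in this issue. Returns (forecast, source).
    """
    try:
        return list(model.predict(n_periods=n_periods)), "model"
    except ValueError:
        # Naive fallback (an assumption for illustration only).
        return [history[-1]] * n_periods, "naive"


class _AlwaysNaN:
    """Hypothetical stand-in that fails the same way as the traceback."""

    def predict(self, n_periods):
        raise ValueError("Input contains NaN.")


history = [1, 3, 3, 4, 3, 2, 1, 1, 3, 2]
values, source = predict_with_fallback(_AlwaysNaN(), history, n_periods=2)
print(values, source)
```

Returning the source alongside the forecast makes it visible downstream when the fallback was used, rather than silently hiding a model that could not forecast.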
Thank you in advance :)
Here is a video of a cute otter as a digital bribe: https://www.youtube.com/watch?v=8O8iEz2p7rQ

Can you help me understand this behaviour?
```
System:
    python: 3.10.6 | packaged by conda-forge | (main, Aug 22 2022, 20:35:26) [GCC 10.4.0]
    executable: /home/trusted-service-user/cluster-env/clonedenv/bin/python
    machine: Linux-4.15.0-1174-azure-x86_64-with-glibc2.27

Python dependencies:
    pip: 23.3
    setuptools: 65.5.1
    sklearn: 1.1.3
    statsmodels: 0.14.0
    numpy: 1.23.4
    scipy: 1.10.1
    Cython: 0.29.32
    pandas: 1.5.3
    joblib: 1.3.2
    pmdarima: 1.8.5

Linux-4.15.0-1174-azure-x86_64-with-glibc2.27
Python 3.10.6 | packaged by conda-forge | (main, Aug 22 2022, 20:35:26) [GCC 10.4.0]
pmdarima 1.8.5
NumPy 1.23.4
SciPy 1.10.1
Scikit-Learn 1.1.3
Statsmodels 0.14.0

/home/trusted-service-user/cluster-env/clonedenv/lib/python3.10/site-packages/_distutils_hack/__init__.py:33: UserWarning: Setuptools is replacing distutils.
  warnings.warn("Setuptools is replacing distutils.")
```