Add time series forecasting support #611
Conversation
frameworks/FEDOT/exec_ts.py
Outdated
training_params.update({k: v for k, v in config.framework_params.items() if not k.startswith('_')})
n_jobs = training_params["n_jobs"]

log.info('Running FEDOT with a maximum time of %ss on %s cores, optimizing %s.',
Nit: f-strings are a bit more readable and are compatible with all Python versions used by AMLB
f"Running FEDOT with a maximum time of {config.max_runtime_seconds}s on {n_jobs} cores, optimizing {scoring_metric}"
frameworks/FEDOT/exec_ts.py
Outdated
predictions = []

for label in train_df[id_column].unique():
    train_sub_df = train_df[train_df[id_column] == label].drop(columns=[id_column], axis=1)
This can be very inefficient for large dataframes with many time series (it scales as O(N^2)). A better option would be to use groupby, e.g.

for label, ts in train_df.groupby(id_column, sort=False):
    train_series = ts[dataset.target].to_numpy()
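For illustration, a minimal self-contained version of that pattern (toy data and column names, not the PR's actual schema):

```python
import pandas as pd

df = pd.DataFrame({
    'item_id': ['a', 'a', 'b', 'b', 'c'],
    'target': [1.0, 2.0, 3.0, 4.0, 5.0],
})

# groupby makes a single pass over the frame, whereas filtering with a
# boolean mask re-scans the whole frame once per label.
for label, ts in df.groupby('item_id', sort=False):
    series = ts['target'].to_numpy()
    print(label, series)
```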
Thanks, tried that. IMO using zip may be inappropriate here, so I would stick with the janky bad-asymptotic solution to ensure that the labels match.
e6f19e7
frameworks/FEDOT/exec_ts.py
Outdated
train_sub_df = train_df[train_df[id_column] == label].drop(columns=[id_column], axis=1)
train_series = np.array(train_sub_df[dataset.target])
train_input = InputData(
    idx=train_sub_df.index.to_numpy(),
This code uses the range index [0, 1, 2, ..., n-1] instead of the timestamps of the time series. Should this be `train_sub_df[timestamp_column]` instead, or does FEDOT ignore the timestamps?
FEDOT ignores timestamps, so that idx should be fine
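For context, a sketch of the range-index pattern based on FEDOT's documented time-series usage (exact import paths may differ between FEDOT versions):

```python
import numpy as np
from fedot.core.data.data import InputData
from fedot.core.repository.dataset_types import DataTypesEnum
from fedot.core.repository.tasks import Task, TaskTypesEnum, TsForecastingParams

series = np.array([112.0, 118.0, 132.0, 129.0, 121.0, 135.0])
task = Task(TaskTypesEnum.ts_forecasting, TsForecastingParams(forecast_length=2))

# FEDOT keys the series on a plain positional index; real timestamps are
# not required for ts_forecasting.
train_input = InputData(idx=np.arange(len(series)),
                        features=series,
                        target=series,
                        task=task,
                        data_type=DataTypesEnum.ts)
```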
frameworks/FEDOT/exec_ts.py
Outdated
)

test_sub_df = test_df[test_df[id_column] == label].drop(columns=[id_column], axis=1)
test_series = np.array(test_sub_df[dataset.target])
This contains the future values of the time series; I doubt that these values need to be fed to the model at prediction time. For example, if the training data contains time steps [1, 2, 3, ..., T] and the goal is to predict [T+1, ..., T+H], then based on your current code `train_series` contains time steps [1, 2, 3, ..., T] and `test_series` contains time steps [T+1, ..., T+H]. My guess is that we need to pass `train_series` as input to both `fit()` and `predict()`.
In essence `predict` and `forecast` are the same thing, but we could pass `test_input` with `features` from `train_series` and `target` from `test_series` to get the prediction horizon in a `predict`. Changed to an explicit `forecast` method:
a357726
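The resulting call pattern is then roughly as follows (a sketch assuming a FEDOT version where `Fedot.forecast` is available, with `train_input` built as in the previous sketch):

```python
from fedot.api.main import Fedot
from fedot.core.repository.tasks import TsForecastingParams

fedot = Fedot(problem='ts_forecasting',
              task_params=TsForecastingParams(forecast_length=2),
              timeout=1)

# Both fit and forecast consume only the historical series; the horizon
# comes from TsForecastingParams, so no future targets are passed.
fedot.fit(features=train_input)
forecast = fedot.forecast(train_input)
```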
frameworks/FEDOT/exec_ts.py
Outdated
timeout=runtime_min,
metric=scoring_metric,
seed=config.seed,
max_pipeline_fit_time=runtime_min / 10,
Why is `/ 10` necessary here?
Generally speaking, this is a small safety measure to ensure that the training time of a single pipeline stays within the total timeout. The classification and regression integration in #563 uses the same empirical approach; it should be patched in the future.
frameworks/FEDOT/exec_ts.py
Outdated
training_duration=training_duration,
predict_duration=predict_duration)
Nit: I think it would be cleaner to remove the line `training_duration, predict_duration = 0, 0` above and just return `training_duration=training.duration, predict_duration=predict.duration` here.
Since we train the FEDOT model for each series (for each label), the `training.duration` value isn't cumulative and would only reflect the time spent on the last iteration.
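I.e., the timings have to be accumulated across iterations, along these lines (a sketch: `Timer` and its import path follow other AMLB exec scripts, and the per-series `fedot`/`train_input` construction is omitted):

```python
from frameworks.shared.utils import Timer

training_duration, predict_duration = 0, 0

for label, ts in train_df.groupby(id_column, sort=False):
    # ... build train_input and a fresh Fedot instance for this series ...
    with Timer() as training:
        fedot.fit(features=train_input)
    training_duration += training.duration

    with Timer() as predict:
        forecast = fedot.forecast(train_input)
    predict_duration += predict.duration
```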
My bad, I didn't realize this was happening inside the loop over individual series
models_count += fedot.current_pipeline.length

save_artifacts(fedot, config)
return result(output_file=config.output_predictions_file,
It's necessary to return `dataset.repeated_item_id` and `dataset.repeated_abs_seasonal_error` as `optional_columns` in the result for MASE computation to work correctly (see https://github.com/openml/automlbenchmark/blob/master/frameworks/AutoGluon/exec_ts.py#L63C1-L67C1). This is a rather ugly hack that is necessary to make history-dependent metrics like MASE compatible with the AMLB results API.
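Concretely, something in the shape of the linked AutoGluon script (a sketch: it assumes the two dataset attributes are array-likes aligned with the prediction rows; adapt the loading step to however this script materializes them, and keep the other `result(...)` fields from the PR):

```python
optional_columns = dict(
    repeated_item_id=dataset.repeated_item_id,
    repeated_abs_seasonal_error=dataset.repeated_abs_seasonal_error,
)

return result(output_file=config.output_predictions_file,
              training_duration=training_duration,
              predict_duration=predict_duration,
              optional_columns=optional_columns)
```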
@shchur, thanks for the initial review :)
@shchur sorry it took ages to finally return to this PR.
@shchur would you be available for another look? :)
Hi @Lopa10ko @PGijsbers, I will try to have a look at the PR today
Two small comments that probably need to be addressed, but otherwise looks good to me.
frameworks/FEDOT/exec_ts.py
Outdated
timeout=runtime_min,
metric=scoring_metric,
seed=config.seed,
max_pipeline_fit_time=runtime_min / 10,
Would it make sense to split the time limit evenly across the series? Right now it seems that 10% of the total time limit is given to each series, which may lead to overruns if >10 series are available.
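A sketch of the even split, reusing the names from the PR's exec_ts.py (the `max(..., 1)` guard is an addition, not PR code):

```python
# Divide the global budget across series so that the per-series FEDOT
# timeouts sum to at most config.max_runtime_seconds.
n_series = train_df[id_column].nunique()
runtime_min_per_series = config.max_runtime_seconds / 60 / max(n_series, 1)

fedot = Fedot(problem=TaskTypesEnum.ts_forecasting.value,
              timeout=runtime_min_per_series,
              max_pipeline_fit_time=runtime_min_per_series / 10)
```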
fedot = Fedot(
    problem=TaskTypesEnum.ts_forecasting.value,
    task_params=task.task_params,
As far as I understand, FEDOT currently only supports point forecasting, but AMLB may also include probabilistic forecasting tasks (see https://github.com/openml/automlbenchmark/blob/master/amlb/results.py#L767-L792). It would probably make sense to raise an exception if someone tries to evaluate FEDOT on such a probabilistic forecasting task.
Is there a way to distinguish a probabilistic forecasting task based on the benchmark run config? The `get_fedot_metrics` function already emits logs in case of unsupported metrics (like `mql`, `wql`, and `sql`).
I think that filtering by the mql, wql, sql metrics is the simplest way to accomplish this.
Another option is to repeat the point forecast for each of the quantile levels and raise a warning.
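The filtering option could be as small as the following sketch (assuming the requested metric name is exposed as `config.metric`, as elsewhere in AMLB integration scripts):

```python
# Quantile-based metrics imply a probabilistic forecasting task, which
# FEDOT's point forecasts cannot satisfy.
PROBABILISTIC_METRICS = {'mql', 'wql', 'sql'}

if config.metric in PROBABILISTIC_METRICS:
    raise ValueError(
        f"FEDOT only produces point forecasts; metric '{config.metric}' "
        "requires quantile forecasts."
    )
```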
@shchur thanks for the review.
In principle it looks fine. Does FEDOT support the provided memory constraints? If so, I would appreciate it if you added that to the integration script (…).
@PGijsbers sorry, FEDOT doesn't have the capability to handle memory limitations.
Sorry for the delay, I thought this was merged.
Summary
Add support for the time series forecasting functionality of the FEDOT framework.
Context
closes #610