[Bug]: The truth value of a DataFrame is ambiguous. #1298

DRMPN · 2024-06-01T19:14:12Z

Expected Behavior

Pipeline starts tuning with provided input data.

tuned_pipiline = auto_model.tune(input_data=orig_data, timeout=10, cv_folds=10, n_jobs=4)

Current Behavior

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[17], line 1
----> 1 tuned_pipiline = auto_model.tune(input_data=train, timeout=10, cv_folds=10, n_jobs=4)

File c:\Users\nnikitin-user\AppData\Local\Programs\Python\Python310\lib\site-packages\fedot\api\main.py:230, in Fedot.tune(self, input_data, metric_name, iterations, timeout, cv_folds, n_jobs, show_progress)
    227     raise ValueError(NOT_FITTED_ERR_MSG)
    229 with fedot_composer_timer.launch_tuning('post'):
--> 230     if not input_data: 
    231         input_data = self.train_data
    232     cv_folds = cv_folds or self.params.get('cv_folds')

File c:\Users\nnikitin-user\AppData\Local\Programs\Python\Python310\lib\site-packages\pandas\core\generic.py:1527, in NDFrame.__nonzero__(self)
   1525 @final
   1526 def __nonzero__(self) -> NoReturn:
-> 1527     raise ValueError(
   1528         f"The truth value of a {type(self).__name__} is ambiguous. "
   1529         "Use a.empty, a.bool(), a.item(), a.any() or a.all()."
   1530     )

ValueError: The truth value of a DataFrame is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

Possible Solution

Change line 230 in fedot/api/main.py to an explicit None check, so that the DataFrame is never evaluated as a boolean:

if input_data is None:
    input_data = self.train_data
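
A common way to avoid this error is to test for a missing argument with an identity check instead of a truthiness test. The sketch below illustrates the difference without requiring pandas: `FakeFrame` is a hypothetical stand-in that refuses truth-value testing the way the traceback shows pandas objects do, and `resolve_input` is an illustrative helper, not FEDOT code.

```python
class FakeFrame:
    """Hypothetical stand-in for pd.DataFrame: truth-value testing raises."""

    def __bool__(self):
        raise ValueError(
            f"The truth value of a {type(self).__name__} is ambiguous."
        )


def resolve_input(input_data, default):
    """Fall back to `default` only when no data was passed at all."""
    # An identity check never calls __bool__, so it is safe for such objects.
    if input_data is None:
        return default
    return input_data


frame = FakeFrame()
fallback = "train_data"  # placeholder for the stored training data

try:
    bool(frame)  # this is what `if not input_data:` triggers
    truthiness_failed = False
except ValueError:
    truthiness_failed = True

resolved_given = resolve_input(frame, fallback)   # the frame itself
resolved_missing = resolve_input(None, fallback)  # the fallback
```

The identity check also keeps an explicitly passed but empty container from being silently replaced by the default, which a truthiness test would do.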

Steps to Reproduce

Data from https://www.kaggle.com/competitions/playground-series-s4e6

from fedot.api.main import Fedot
import pandas as pd

train = pd.read_csv("/automl-june/playground-series-s4e6/train.csv")
test = pd.read_csv("/automl-june/playground-series-s4e6/test.csv")

train.drop(columns=["id"], inplace=True)
test.drop(columns=["id"], inplace=True)

auto_model = Fedot(
    problem="classification",
    metric=["precision", "accuracy", "roc_auc"],
    preset="best_quality",
    with_tuning=True,
    timeout=60,
    cv_folds=10,
    seed=42,
    n_jobs=1,
    logging_level=10,
    use_pipelines_cache=False,
    use_auto_preprocessing=False,
)

auto_model.fit(features=train, target="Target")

prediction = auto_model.predict(features=test, save_predictions=True)

print(auto_model.return_report().head(10))

print(auto_model.get_metrics(target=train.Target))

tuned_pipiline = auto_model.tune(input_data=train, timeout=10, cv_folds=10, n_jobs=4)

Context [OPTIONAL]

Participating in a Kaggle competition PS4E6.


Lopa10ko · 2024-07-19T10:10:59Z

Note

Closed as irrelevant (can be reopened if necessary).

The signature of the tune function indicates that it expects an instance of InputData, but the snippet in Steps to Reproduce passes a pd.DataFrame object.

For this particular run, you can do the following:

from fedot.core.data.data import array_to_input_data

...

input_data = array_to_input_data(features_array=train.loc[:, train.columns != 'Target'].values,
                                 target_array=train.Target.values)
tuned_pipiline = auto_model.tune(input_data=input_data, timeout=2, cv_folds=10, n_jobs=4)

This seems to work as expected on the Kaggle data.

aPovidlo · 2024-07-22T10:01:16Z

@Lopa10ko I think it would be better to follow the other API methods. For example, fit() expects features: FeaturesType, so I think tune() should work the same way.
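
One way to read this suggestion is that the user-facing method accepts several input shapes and normalizes them into the internal container before tuning. The following is only a rough sketch of that pattern under stated assumptions: `WrappedData`, `FeaturesLike`, and `to_wrapped` are hypothetical names invented for illustration, and the real FeaturesType and InputData in FEDOT may look quite different.

```python
from dataclasses import dataclass
from typing import Any, Optional, Sequence, Union


@dataclass
class WrappedData:
    """Hypothetical stand-in for an internal container such as InputData."""
    features: Sequence[Sequence[Any]]
    target: Optional[Sequence[Any]] = None


# Hypothetical union of accepted inputs; the real FeaturesType may differ.
FeaturesLike = Union[WrappedData, dict, Sequence[Sequence[Any]]]


def to_wrapped(features: FeaturesLike,
               target_name: Optional[str] = None) -> WrappedData:
    """Normalize user-facing input into the internal container."""
    if isinstance(features, WrappedData):
        return features  # already in the internal format
    if isinstance(features, dict):
        # Column-oriented mapping: split the target column off if it is named.
        columns = {k: v for k, v in features.items() if k != target_name}
        target = features.get(target_name) if target_name is not None else None
        rows = [list(row) for row in zip(*columns.values())]
        return WrappedData(features=rows, target=target)
    # Otherwise treat the input as a row-oriented table without a target.
    return WrappedData(features=[list(row) for row in features])


table = {"a": [1, 2], "b": [3, 4], "Target": [0, 1]}
wrapped = to_wrapped(table, target_name="Target")
```

With a conversion step like this at the top of the method, callers could pass the same kind of object to both fitting and tuning, and the internal code would only ever see one type.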

DRMPN added the bug (Something isn't working) and api (Anything related to user-facing interfaces & parameter passing) labels on Jun 1, 2024

Lopa10ko closed this as completed Jul 19, 2024

Lopa10ko mentioned this issue on Jul 22, 2024 in "hotfix: go for FeaturesType instead of InputData in a pipeline tuning" #1311 (merged)
