edit sort and loop for fitted pipeline #159

perib · 2024-10-31T03:49:39Z


What does this PR do?

Previously, the best fitted pipeline was selected by identifying the single pipeline with the maximum value of the first objective function. If that pipeline failed to fit, TPOT crashed without setting fitted_estimator_.

Two changes

a) Pipelines are now sorted by all objective functions, in order. When multiple pipelines have the same score on the first objective, they are also sorted by the second score, and so on. Previously, a random pipeline among those with the best first score was selected, which may not have been the optimal pipeline given the other scores.

b) There is a very rare but not impossible chance that a pipeline will work correctly in the objective functions but fail on the full dataset. For example, a selector that happens to select only columns with positive values when evaluated on the CV folds might select a different column, one that does include negative values, when trained on the full dataset. If the final estimator is MultinomialNB, the pipeline will execute correctly in the objective function but throw an error on the full dataset, because MultinomialNB cannot accept negative values. This could cause TPOT to crash.

To resolve this, TPOT will now loop through the best pipelines. If a pipeline fails, it will catch the error and try the next best pipeline. This prevents a terminal error from occurring.
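
As a rough illustration of the two changes together, here is a minimal pure-Python sketch. It is not the code in this PR: the names rank_candidates and fit_first_working, the (pipeline, scores) tuple layout, and the convention that a weight of +1 means "maximize" and -1 means "minimize" are all assumptions made for the example.

```python
from typing import Any, Callable, List, Sequence, Tuple

# Hypothetical layout: each candidate is (pipeline, scores), one score per objective.
Candidate = Tuple[Any, Sequence[float]]


def rank_candidates(candidates: List[Candidate],
                    weights: Sequence[float]) -> List[Candidate]:
    """Return candidates best-first, comparing objectives in order.

    Assumption: weight > 0 means higher is better, weight < 0 means lower
    is better. Ties on the first objective fall through to the second, etc.
    """
    def sort_key(item: Candidate) -> Tuple[float, ...]:
        _, scores = item
        # Negate the weighted score so that an ascending sort puts the best first.
        return tuple(-w * s for w, s in zip(weights, scores))

    return sorted(candidates, key=sort_key)


def fit_first_working(candidates: List[Candidate],
                      weights: Sequence[float],
                      fit: Callable[[Any], Any]) -> Any:
    """Try to fit candidates in ranked order and return the first success."""
    for pipeline, _ in rank_candidates(candidates, weights):
        try:
            return fit(pipeline)
        except Exception as exc:
            # A pipeline can pass CV but still fail on the full dataset;
            # report the failure and fall back to the next best one.
            print(f"{pipeline!r} failed on the full dataset: {exc}")
    raise RuntimeError("no candidate pipeline could be fitted")


if __name__ == "__main__":
    # Objectives: (accuracy, complexity); maximize the first, minimize the second.
    demo = [("A", (0.90, 12)), ("B", (0.95, 30)), ("C", (0.95, 8))]

    def demo_fit(name: str) -> str:
        if name == "C":  # simulate a failure that only shows up on the full data
            raise ValueError("negative values in input")
        return f"fitted {name}"

    print(fit_first_working(demo, (1.0, -1.0), demo_fit))
```

In the demo, B and C tie on the first objective, so the second objective decides between them and C is tried first. Because the simulated fit of C raises, the loop falls back to B instead of crashing.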

Where should the reviewer start?

How should this PR be tested?

Double-check that the sort order is correct and that TPOT runs without issue on test data.

edit sort and loop for fitted pipeline

1f2027f

nickotto merged commit 94af584 into EpistasisLab:main Nov 5, 2024
1 check passed
