input_fn called multiple times in Estimator.train #143

Open
shendiaomo opened this issue Jan 13, 2020 · 5 comments
@shendiaomo

# Inside adanet.Estimator.train, each AdaNet iteration re-enters train on a
# temporary estimator, rebuilding the input pipeline from input_fn:
result = temp_estimator.train(
    input_fn=input_fn,
    hooks=hooks,
    max_steps=max_steps,
    saving_listeners=saving_listeners)

This seems problematic because adanet.Estimator.train reloads the data from scratch at every AdaNet iteration.
As noted in tensorflow/tensorflow#19062 (comment), canned TF estimators call input_fn only once per train call.
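
One minimal way to observe this (a sketch with dummy data, not the adanet source): wrap input_fn with a counter. With a canned estimator, a single train call builds the pipeline once; with adanet.Estimator, the message below would be printed once per AdaNet iteration because train is re-entered as shown above. The features, labels, batch size, and counter are purely illustrative.

import numpy as np
import tensorflow as tf

# Dummy data, for illustration only.
features = {"x": np.random.rand(1000, 4).astype(np.float32)}
labels = np.random.randint(0, 2, size=(1000,)).astype(np.int32)

calls = {"count": 0}

def input_fn():
    # Each invocation rebuilds the tf.data pipeline from scratch.
    calls["count"] += 1
    print("input_fn invocation #%d" % calls["count"])
    dataset = tf.data.Dataset.from_tensor_slices((features, labels))
    return dataset.batch(32)

# Passing this input_fn to a canned estimator's train() prints the line once;
# the report above suggests adanet.Estimator.train() would print it once per
# AdaNet iteration.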

@shendiaomo (Author)

There seem to be two negative effects of this:

  1. A repeated Dataset (for example, dataset.repeat(10)) cannot stop training via OutOfRangeError or StopIteration; we have to set steps or max_steps instead, which is inconsistent with canned Estimators.
  2. If a user doesn't shuffle the dataset, AdaNet may reuse the first max_iteration_steps * batch_size samples at every iteration, thus fitting only a subset of the training data (see the sketch after this list).
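
A minimal sketch of a mitigation for effect 2, assuming a tf.data input pipeline (the dummy data, buffer size, and batch size are illustrative only): shuffle without a fixed seed and repeat, so a rebuilt pipeline does not always start with the same samples; training length is then bounded by max_iteration_steps / max_steps rather than OutOfRangeError, per effect 1.

import numpy as np
import tensorflow as tf

# Dummy data, for illustration only.
features = {"x": np.random.rand(5000, 4).astype(np.float32)}
labels = np.random.randint(0, 2, size=(5000,)).astype(np.int32)

def input_fn():
    dataset = tf.data.Dataset.from_tensor_slices((features, labels))
    # Shuffle (no fixed seed) so that, even though the pipeline is rebuilt at
    # every AdaNet iteration, the first max_iteration_steps * batch_size
    # samples are not always the same subset (effect 2).
    dataset = dataset.shuffle(buffer_size=5000, reshuffle_each_iteration=True)
    # Repeat indefinitely; training length must then be bounded by steps,
    # max_steps, or max_iteration_steps rather than OutOfRangeError (effect 1).
    return dataset.repeat().batch(50)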

Am I right? @cweill

@cweill (Contributor)

cweill commented Jan 14, 2020

@shendiaomo: You are correct on both counts. For this reason, we request that the user configure max_iteration_steps to match the desired number of repetitions, which unfortunately requires some extra math (max_iteration_steps = num_examples / batch_size * num_epochs_per_iteration).

Assuming each adanet iteration trains over several epochs, point 2 should be less of an issue in practice if your base learners are randomly initialized: they will tend to learn different biases and form a strong ensemble regardless.
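
A back-of-the-envelope sketch of that formula (the helper name and the example numbers are made up; this is not part of the adanet API):

def compute_max_iteration_steps(num_examples, batch_size, num_epochs_per_iteration):
    # Integer division drops the partial final batch; round up instead if your
    # input pipeline keeps it.
    steps_per_epoch = num_examples // batch_size
    return steps_per_epoch * num_epochs_per_iteration

# For example, 60000 examples, a batch size of 64, and 10 epochs per AdaNet
# iteration: 60000 // 64 = 937 steps per epoch, i.e. 9370 steps per iteration.
print(compute_max_iteration_steps(60000, 64, 10))  # 9370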

@shendiaomo (Author)

@shendiaomo: You are correct on both counts. For this reason, we request that the user configure max_iteration_steps to match the desired number of repetitions, which unfortunately requires some extra math (max_iteration_steps = num_examples / batch_size * num_epochs_per_iteration).

Assuming each adanet iteration trains over several epochs, point 2 should be less of an issue in practice if your base learners are randomly initialized: they will tend to learn different biases and form a strong ensemble regardless.

Great! Thanks for the explanation. However, doing that math by hand isn't very convenient: imagine someone who wants to swap the DNNClassifier in her application for adanet.Estimator; that could be a lot of work. Do you have a plan to improve this? Or will the Keras version avoid the same situation?

@le-dawg

le-dawg commented Apr 19, 2020

@cweill

@shendiaomo: You are correct on both counts. For this reason, we request that the user configure max_iteration_steps to match the desired number of repetitions, which unfortunately requires some extra math (max_iteration_steps = num_examples / batch_size * num_epochs_per_iteration).

Assuming each adanet iteration trains over several epochs, point 2 should be less of an issue in practice if your base learners are randomly initialized: they will tend to learn different biases and form a strong ensemble regardless.

From the tutorials:

max_iteration_steps=TRAIN_STEPS // ADANET_ITERATIONS,

If I want to train for 100 epochs within a single AdaNet iteration, with num_examples / batch_size steps per epoch, should I set max_iteration_steps accordingly?

I have a sample size of 5265 and a batch size of 50, so about 105 update steps per epoch. Should my max_iteration_steps be 10500?
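
For what it's worth, a quick check of that arithmetic (a sketch only, not an authoritative answer from the maintainers):

num_examples = 5265
batch_size = 50
num_epochs = 100

steps_per_epoch = num_examples // batch_size  # 105 when the partial final batch is dropped
print(steps_per_epoch * num_epochs)           # 10500

# If the partial final batch is kept (drop_remainder=False), an epoch is
# ceil(5265 / 50) = 106 steps, i.e. 10600 steps for 100 epochs.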

@le-dawg

le-dawg commented Apr 29, 2020

Pinging
