Hyperparameter 'train_size' is bigger than actual train size

timmy-ops · October 11, 2021, 5:15pm

I am following the tutorial on this link:

The task.py file of the GitHub repo (https://github.com/GoogleCloudPlatform/tensorflow-lifetime-value/blob/0f8c16ea70a2e7da370965e23e9e2154978364fa/clv_mle/trainer/task.py)
defines the default hyperparameters. It sets ‘train_size’ to 100,000, but the actual training set contains only 883 examples.

‘train_size’ is used to compute the following:

checkpoint_steps = int((train_size/batch_size) * (num_epochs/num_eval))
train_steps = (train_size/batch_size) * num_epochs
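To make the gap concrete, here is a small sketch of the arithmetic for both sizes. The values for batch_size, num_epochs and num_eval are placeholders I picked for illustration, not the defaults from task.py.

```python
def compute_steps(train_size, batch_size, num_epochs, num_eval):
    """Mirror the two formulas above; returns (checkpoint_steps, train_steps)."""
    checkpoint_steps = int((train_size / batch_size) * (num_epochs / num_eval))
    train_steps = (train_size / batch_size) * num_epochs
    return checkpoint_steps, train_steps


# Hypothetical values, chosen only for illustration (not taken from task.py).
BATCH_SIZE = 10
NUM_EPOCHS = 20
NUM_EVAL = 5

# Step counts with the default train_size vs. the real dataset length.
default_steps = compute_steps(100_000, BATCH_SIZE, NUM_EPOCHS, NUM_EVAL)
actual_steps = compute_steps(883, BATCH_SIZE, NUM_EPOCHS, NUM_EVAL)

print("default train_size:", default_steps)
print("actual train_size: ", actual_steps)
```

With these placeholder values, the default gives step counts more than a hundred times larger than the real dataset length would.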

which in turn are used like this:

lr_schedule = tf.keras.optimizers.schedules.ExponentialDecay(
    initial_learning_rate = 0.096505,
    decay_steps = checkpoint_steps,
    decay_rate = 0.7,
    staircase = True)
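As far as I understand it, with staircase=True the schedule computes initial_learning_rate * decay_rate ^ floor(step / decay_steps). Below is a plain-Python sketch of that formula (it does not call TensorFlow), to show why the size of decay_steps matters. The two decay_steps values and the step at which training stops are made-up numbers for illustration.

```python
def staircase_lr(step, initial_lr, decay_steps, decay_rate):
    """Plain-Python version of the staircase exponential decay formula."""
    return initial_lr * decay_rate ** (step // decay_steps)


INITIAL_LR = 0.096505
DECAY_RATE = 0.7

# Hypothetical numbers: training stops at step 1766, and decay_steps is
# either large (derived from a big train_size) or small (from a small one).
LAST_STEP = 1766
lr_large_decay_steps = staircase_lr(LAST_STEP, INITIAL_LR, 40_000, DECAY_RATE)
lr_small_decay_steps = staircase_lr(LAST_STEP, INITIAL_LR, 353, DECAY_RATE)

print(lr_large_decay_steps)  # never decayed: still the initial rate
print(lr_small_decay_steps)  # decayed five times (1766 // 353 == 5)
```

So if decay_steps is much larger than the number of steps actually run, the learning rate stays at its initial value for the whole run.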

…and this:

train_spec = tf.estimator.TrainSpec(
    input_fn = read_train,
    max_steps = train_steps)
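One way to read max_steps is to convert it back into passes over the real data. The sketch below assumes the input function repeats the dataset indefinitely (I have not verified that read_train does this), and batch_size and the max_steps values are again made-up numbers.

```python
def effective_epochs(max_steps, batch_size, num_examples):
    """How many passes over the data max_steps corresponds to,
    assuming the input function repeats the dataset indefinitely."""
    return max_steps * batch_size / num_examples


NUM_EXAMPLES = 883   # actual training set length from the post
BATCH_SIZE = 10      # hypothetical, not taken from task.py

# Hypothetical max_steps values: one derived from a large train_size,
# one derived from the real dataset length.
epochs_large = effective_epochs(200_000, BATCH_SIZE, NUM_EXAMPLES)
epochs_small = effective_epochs(1_766, BATCH_SIZE, NUM_EXAMPLES)

print(epochs_large)
print(epochs_small)
```

Under these assumptions, a max_steps derived from the large train_size means the model would see the same 883 examples thousands of times rather than the intended number of epochs.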

I don’t get it. Why is the default ‘train_size’ so much larger than the actual training set?

(P.S.: I have a few more questions, in case anyone is interested in helping out a lost student.)