
python - understanding loss in tf keras model fit - Stack Overflow


I am trying to replicate the loss reported after training a tf.keras model. From my understanding, history.history['loss'] should return the average loss per epoch after calling model.fit. However, the value it returns does not match the loss shown in the training progress bar.

Here's the code snippet:

history = model.fit(X, y, batch_size=32, shuffle=True, epochs=1, verbose=1)

Output: 4050/4050 ━━━━━━━━━━━━━━━━━━━━ 66s 13ms/step - loss: 290.9271

print(history.history)    # Output: {'loss': [56.9575309753418]}

I created a custom callback to collect losses from each batch:

import numpy as np
from tensorflow.keras.callbacks import LambdaCallback

batch_losses = []
collect_losses = LambdaCallback(on_batch_end=lambda batch, logs: batch_losses.append(logs['loss']))
model.fit(X, y, batch_size=32, shuffle=True, epochs=1, verbose=1, callbacks=[collect_losses])
batch_losses_array = np.array(batch_losses, dtype=np.float32)

Here are the results:

print(float(np.mean(batch_losses_array)))  # Output: 290.98486328125
print(batch_losses_array[-1])              # Output: 56.95753

It appears that history.history['loss'] returns the loss of the last batch rather than the average per epoch. A simple average of the batch losses comes very close to the value shown in the progress bar after training, but there is still a small discrepancy.
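One plausible source of the small remaining gap is floating-point accumulation order: averaging thousands of float32 batch losses all at once versus incrementally (as a stateful running mean would) generally gives results that agree to several digits but are not bit-identical. A minimal sketch of the effect, using synthetic values rather than the model's actual losses:

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic per-batch losses on a similar scale to the question (4050 float32 values)
losses = rng.uniform(50.0, 500.0, size=4050).astype(np.float32)

# Simple mean with float32 accumulation
mean_f32 = np.mean(losses, dtype=np.float32)

# Same data accumulated as an incremental running mean in float32
running = np.float32(0.0)
for i, v in enumerate(losses, start=1):
    running += (v - running) / np.float32(i)

# Higher-precision reference mean
mean_f64 = np.mean(losses, dtype=np.float64)

print(mean_f32, running, mean_f64)
```

The three results are close but usually differ in the last few digits, which is the same order of discrepancy described above.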

All my data are in np.float32. The TensorFlow version is 2.18.0. The batch size is a consistent 32 with no smaller final batch, so the sample-weighted average and the simple average of the batch losses should be identical.
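That equal-batch-size claim is easy to verify numerically: when every batch contributes the same number of samples, the sample-weighted average of per-batch mean losses reduces to their simple average. A quick sanity check with made-up numbers:

```python
import numpy as np

batch_size = 32
per_batch_loss = np.array([2.0, 3.0, 4.0, 5.0])      # made-up per-batch mean losses
weights = np.full(per_batch_loss.shape, batch_size)  # every batch the same size

weighted = np.average(per_batch_loss, weights=weights)  # sample-weighted average
simple = per_batch_loss.mean()                          # plain average

print(weighted, simple)  # identical when all weights are equal
```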

My questions are:

  1. Is history.history['loss'] supposed to return the last batch's loss or the average loss per epoch?
  2. How can I directly obtain the reported loss value after training?

Thanks in advance for any insights!
