
Number of batches and epochs

Posted on 2020-12-06 18:18:42

I have been trying to understand the concepts of epoch and batch size. You can see the training results of my CNN below:

Epoch 160/170
32/32 [==============================] - 90s 3s/step - loss: 0.5461 - accuracy: 0.8200 - val_loss: 0.6561 - val_accuracy: 0.7882
Epoch 161/170
32/32 [==============================] - 92s 3s/step - loss: 0.5057 - accuracy: 0.8356 - val_loss: 0.62020 - val_accuracy: 0.7882
Epoch 162/170
32/32 [==============================] - 90s 3s/step - loss: 0.5178 - accuracy: 0.8521 - val_loss: 0.6652 - val_accuracy: 0.7774
Epoch 163/170
32/32 [==============================] - 94s 3s/step - loss: 0.5377 - accuracy: 0.8418 - val_loss: 0.6733 - val_accuracy: 0.7822

So there are 163 epochs with a batch size of 32. Since the batch size is the number of samples for each epoch, there would have to be 163*32 = 5216 samples, but there are only 3459 samples in the dataset. So does it start taking images from the beginning of the dataset again when there are not enough?

Questioner: Cagatayemm

Answered by Rika on 2020-12-07 13:17:04

Batch size is the number of samples you feed to your model in each iteration.
For example, if you have a dataset with 10,000 samples and you use a batch size of 100, then it takes 10,000 / 100 = 100 iterations to complete one epoch.
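
A quick way to check that arithmetic (a minimal sketch in plain Python, using the example figures above):

import math

num_samples = 10_000   # total samples in the dataset
batch_size = 100       # samples fed to the model per iteration

# One epoch is one full pass over the dataset, split into batches,
# so the number of iterations (steps) per epoch is:
steps_per_epoch = math.ceil(num_samples / batch_size)
print(steps_per_epoch)  # -> 100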

What you see in your log is the number of epochs and the number of iterations.
Epoch 160/170 means you are currently running epoch 160 out of a total of 170 epochs, and each of your epochs takes 32 iterations.

Knowing that you only have 3,459 samples, the batch size would be roughly 3459 / 32 ≈ 108.
You should already know what batch size you set, but this gives you the answer as well.
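
Applying the same arithmetic to your run (a small sketch; the batch size is only an estimate, since your log shows the 32 steps per epoch but not the batch size itself):

num_samples = 3459      # samples in your dataset
steps_per_epoch = 32    # the "32/32" shown in the Keras progress bar

# Working backwards from steps per epoch to an approximate batch size:
approx_batch_size = num_samples / steps_per_epoch
print(approx_batch_size)  # -> ~108.1, i.e. a batch size of roughly 108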

As to how the last batch is constructed, it depends on the implementation: some simply drop the samples that do not fill a full batch, some use a smaller final batch (whatever is left over becomes its own batch), and some carry over images from previous iterations to make up the missing count.
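
For example, with a tf.data input pipeline in TensorFlow you can choose explicitly whether that last, smaller batch is kept or dropped (a sketch with dummy data, assuming a TensorFlow/Keras setup like the one that produced your log):

import tensorflow as tf

# 3459 dummy samples, just to illustrate the batching behaviour.
dataset = tf.data.Dataset.range(3459)

# Default: keep the leftover samples as a final, smaller batch.
with_remainder = dataset.batch(108)                          # 32 full batches + 1 batch of 3
# Alternative: drop the leftover so every batch has exactly 108 samples.
full_batches_only = dataset.batch(108, drop_remainder=True)  # 32 batches of 108

print(len(list(with_remainder)), len(list(full_batches_only)))  # -> 33 32

Keras' fit() counts each of those batches as one step, which is why a partial final batch shows up as an extra step in the progress bar.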