Understanding token count in GPT model fine-tuning
We recently experimented with fine-tuning the GPT-4o model in OpenAI Studio. We trained with the same data and the same prompt, varying only the batch size and the number of epochs, and we are confused about the token counts reported after training finished:
| Batch Size | Epochs | Tokens |
|------------|--------|--------------|
| 15         | 2      | 23 million   |
| 32         | 4      | 92.2 million |

We are under the impression that the token count for the 4-epoch run should simply be twice that of the 2-epoch run, since it processes the exact same records, just for twice as many passes. Can you explain why the tokens are actually about 4x (92.2 million vs. 23 million)? Also, if you could explain how these token counts are calculated, that would be great!
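For context, here is a minimal sketch of the arithmetic we are assuming, namely that trained tokens ≈ tokens in the training file × number of epochs. The file name, the chat-format JSONL layout, and the use of tiktoken's o200k_base encoding are just our assumptions for illustration, not necessarily how OpenAI computes the billed number:

```python
# Sketch of the scaling we expected: trained tokens ~= tokens per epoch * epochs.
# This is our assumption, not a statement of how OpenAI actually bills tokens.
import json

import tiktoken

# o200k_base is the GPT-4o encoding in recent tiktoken releases; treat the
# resulting counts as an approximation of what the service reports.
enc = tiktoken.get_encoding("o200k_base")


def count_file_tokens(path: str) -> int:
    """Rough token count for a chat-format JSONL training file."""
    total = 0
    with open(path, "r", encoding="utf-8") as f:
        for line in f:
            example = json.loads(line)
            for message in example["messages"]:
                # Count only message text; tool calls etc. are ignored here.
                total += len(enc.encode(message.get("content") or ""))
    return total


# Placeholder path for our training data.
tokens_per_epoch = count_file_tokens("training_data.jsonl")

for epochs in (2, 4):
    print(f"{epochs} epochs -> expected trained tokens: {tokens_per_epoch * epochs}")
```

Under this assumption the 4-epoch run should report twice the tokens of the 2-epoch run, which is why the roughly 4x figure surprised us.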