Num examples = 243
Num Epochs = 100
Instantaneous batch size per device = 4
Total train batch size (w. parallel, distributed & accumulation) = 16
Gradient Accumulation steps = 4
Total optimization steps = 1,500
Number of trainable parameters = 1,949,696
These values are the hyperparameters reported at the start of a deep learning training run; the effective batch size and the total number of optimization steps follow directly from them, as the sketch below shows.
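A minimal Python sketch of how these quantities relate, assuming a single device (the device count is not stated in the log; it is inferred from the reported totals) and the usual convention that the total batch size is the per-device batch size times the number of devices times the gradient accumulation steps:

```python
import math

# Values reported in the training log above.
num_examples = 243
num_epochs = 100
per_device_batch_size = 4
gradient_accumulation_steps = 4
num_devices = 1  # assumption: inferred from the totals, not stated in the log

# Effective (total) train batch size:
# per-device batch size x number of devices x gradient accumulation steps.
total_batch_size = per_device_batch_size * num_devices * gradient_accumulation_steps
assert total_batch_size == 16

# Micro-batches per epoch on one device: ceil(243 / 4) = 61.
# Every 4 of them (gradient accumulation) trigger one optimizer update.
batches_per_epoch = math.ceil(num_examples / per_device_batch_size)
updates_per_epoch = batches_per_epoch // gradient_accumulation_steps  # 15

# Total optimization steps over the full run: 15 * 100 = 1,500.
total_optimization_steps = updates_per_epoch * num_epochs
assert total_optimization_steps == 1_500

print(f"total batch size: {total_batch_size}")
print(f"total optimization steps: {total_optimization_steps}")
```

With these numbers the arithmetic matches the log: 16 examples per optimizer step, 15 steps per epoch, and 1,500 steps over 100 epochs.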