Num examples = 243
Num Epochs = 100
Instantaneous batch size per device = 4
Total train batch size (w. parallel, distributed & accumulation) = 16
Gradient Accumulation steps = 4
Total optimization steps = 1,500
Number of trainable parameters = 1,949,696

These parameters are printed during deep learning model training...
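To see how these log values relate to one another, here is a minimal sketch of the batch-size and step accounting, assuming a single device and Hugging Face Trainer-style gradient accumulation (the variable names are illustrative, not taken from the original post):

```python
import math

# Values copied from the training log above
num_examples = 243
per_device_batch_size = 4
grad_accum_steps = 4
num_epochs = 100
num_devices = 1  # assumed: 16 / (4 * 4) = 1 device

# Effective (total) train batch size with parallelism and accumulation
total_train_batch_size = per_device_batch_size * grad_accum_steps * num_devices  # 16

# Micro-batches seen per epoch (last batch may be smaller)
micro_batches_per_epoch = math.ceil(num_examples / (per_device_batch_size * num_devices))  # 61

# Optimizer updates per epoch: one update every `grad_accum_steps` micro-batches
optimizer_steps_per_epoch = micro_batches_per_epoch // grad_accum_steps  # 15

# Total optimization steps over the full run
total_optimization_steps = optimizer_steps_per_epoch * num_epochs  # 1,500

print(total_train_batch_size)      # 16
print(total_optimization_steps)    # 1500
```

Under these assumptions the computed values (16 and 1,500) match the logged "Total train batch size" and "Total optimization steps". The small trainable-parameter count (1,949,696) is typical of parameter-efficient fine-tuning, though the log alone does not confirm which method was used.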