Part II Deeper Tricks
AdamW + Super-Convergence
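Currently, the fastest way to train a neural network is the AdamW optimization algorithm combined with super-convergence. Why do the classic papers to date all train with SGDM (SGD with momentum), and why is SGD widely believed to converge better than Adam? What causes Adam's problems?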
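To make the combination concrete, here is a minimal pure-Python sketch of the two ingredients: an AdamW update, where weight decay is applied directly to the weight rather than folded into the gradient, and a one-cycle learning-rate schedule of the kind usually associated with super-convergence. The function names (`one_cycle_lr`, `adamw_step`, `train`), the toy loss `(w - 1)^2`, and all hyperparameter values are assumptions chosen for illustration, not settings taken from this page.

```python
import math


def one_cycle_lr(step, total_steps, max_lr, pct_warmup=0.3, div_factor=25.0):
    """Hypothetical one-cycle schedule: linear warm-up to max_lr, then cosine decay."""
    warmup_steps = max(1, int(total_steps * pct_warmup))
    start_lr = max_lr / div_factor
    if step < warmup_steps:
        # Warm-up phase: ramp linearly from start_lr up to max_lr.
        return start_lr + (max_lr - start_lr) * step / warmup_steps
    # Annealing phase: cosine decay from max_lr toward zero.
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return 0.5 * max_lr * (1.0 + math.cos(math.pi * progress))


def adamw_step(w, grad, m, v, t, lr,
               beta1=0.9, beta2=0.999, eps=1e-8, weight_decay=0.01):
    """One AdamW update for a scalar weight; t is the 1-based step count."""
    # Exponential moving averages of the gradient and the squared gradient.
    m = beta1 * m + (1.0 - beta1) * grad
    v = beta2 * v + (1.0 - beta2) * grad * grad
    # Bias correction for the zero-initialised moving averages.
    m_hat = m / (1.0 - beta1 ** t)
    v_hat = v / (1.0 - beta2 ** t)
    # Decoupled weight decay: shrink the weight directly, outside the
    # adaptive gradient step (this is what separates AdamW from Adam + L2).
    w = w - lr * weight_decay * w
    # Adaptive gradient step.
    w = w - lr * m_hat / (math.sqrt(v_hat) + eps)
    return w, m, v


def train(total_steps=200, max_lr=0.1):
    """Minimise the toy loss (w - 1)^2 with AdamW under the one-cycle schedule."""
    w, m, v = 5.0, 0.0, 0.0
    for t in range(1, total_steps + 1):
        grad = 2.0 * (w - 1.0)  # gradient of (w - 1)^2
        lr = one_cycle_lr(t - 1, total_steps, max_lr)
        w, m, v = adamw_step(w, grad, m, v, t, lr)
    return w


if __name__ == "__main__":
    print(train())
```

With plain Adam plus L2 regularization, the decay term would be added to `grad` and then rescaled by the adaptive denominator. Here it bypasses that rescaling, which is the design choice AdamW is named for.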