Abstract: We empirically demonstrate a universal frequency principle, namely that deep neural networks learn low-frequency components faster, on high-dimensional benchmark datasets such as MNIST and CIFAR10 and on deep architectures such as VGG16. We then utilize the frequency principle to provide a promising mechanism for understanding why deeper learning is faster: we propose a deep frequency principle, which states that the effective target function for a deeper hidden layer biases towards lower frequency during training.
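As a minimal illustrative sketch of the frequency principle (not the paper's experimental setup), the snippet below trains a small MLP on a 1-D toy target sin(x) + sin(5x) and tracks, via the discrete Fourier transform, how fast the low- and high-frequency components of the network output converge. The target function, network width, learning rate, and step counts are all illustrative assumptions.

```python
# Sketch: observe the frequency principle on a 1-D toy problem.
# Expectation: the low-frequency component (bin 1) converges before the
# high-frequency component (bin 5). Hyperparameters are illustrative.
import numpy as np
import torch
import torch.nn as nn

torch.manual_seed(0)

# Evenly spaced training grid so the discrete Fourier transform is well defined.
n = 256
x = torch.linspace(-np.pi, np.pi, n).unsqueeze(1)
y = torch.sin(x) + torch.sin(5 * x)  # low-frequency + higher-frequency component

model = nn.Sequential(
    nn.Linear(1, 200), nn.Tanh(),
    nn.Linear(200, 200), nn.Tanh(),
    nn.Linear(200, 1),
)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

def spectrum(v):
    """Magnitude of the discrete Fourier transform of a 1-D signal."""
    return np.abs(np.fft.rfft(v.detach().numpy().ravel()))

target_spec = spectrum(y)
# Frequencies present in the target: sin(x) and sin(5x) over a 2*pi window
# correspond to rfft bins 1 and 5.
low_k, high_k = 1, 5

for step in range(2001):
    opt.zero_grad()
    pred = model(x)
    loss = loss_fn(pred, y)
    loss.backward()
    opt.step()
    if step % 400 == 0:
        spec = spectrum(pred)
        # Relative error of each Fourier component of the output; the
        # low-frequency bin should shrink first, per the frequency principle.
        err_low = abs(spec[low_k] - target_spec[low_k]) / target_spec[low_k]
        err_high = abs(spec[high_k] - target_spec[high_k]) / target_spec[high_k]
        print(f"step {step:5d}  loss {loss.item():.4f}  "
              f"low-freq err {err_low:.3f}  high-freq err {err_high:.3f}")
```

Printing the per-bin relative errors during training should show the bin-1 error decaying well before the bin-5 error, a 1-D analogue of the behavior the abstract reports on MNIST/CIFAR10 with networks such as VGG16.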