89.89% on CIFAR-10 in PyTorch

The full code is available here; just clone it to your machine and it is ready to run. As a former Torch7 user, I am trying to reproduce the results from the Torch7 post.

My friends Wu Jun and Zhang Yujing claimed that Batch Normalization [1] is useless. I want to prove them wrong (a friendly slap in the face), and CIFAR-10 is a nice playground to start with.

[Figure: CIFAR-10 images]

CIFAR-10 contains 60,000 labeled 32x32 images in 10 classes; the training set has 50,000 images and the test set has 10,000.

The dataset is quite small by today's standards, but it is still a good playground for machine learning algorithms. The only data augmentation I use is horizontal flipping. You will need an NVIDIA GPU with at least 3 GB of memory.
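
To make the augmentation concrete, here is a minimal sketch of a random horizontal flip in plain Python. It is not the code from the repository: the function names `hflip` and `random_hflip` and the flip probability `p=0.5` are assumptions made for illustration. In a PyTorch pipeline the same effect is usually obtained with `torchvision.transforms.RandomHorizontalFlip`.

```python
import random


def hflip(image):
    """Mirror an image left-to-right.

    `image` is a list of rows, each row a list of pixel values.
    """
    return [row[::-1] for row in image]


def random_hflip(image, rng, p=0.5):
    """Flip the image with probability p (p=0.5 is an assumed default)."""
    if rng.random() < p:
        return hflip(image)
    return image


# Tiny 2x3 "image" so the effect is easy to see.
tiny = [[1, 2, 3],
        [4, 5, 6]]
rng = random.Random(0)  # seeded only to make the example repeatable
augmented = random_hflip(tiny, rng)
```

Because a flipped cat is still a cat, the label is left unchanged; only the pixels are mirrored.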

The post and the code consist of 2 parts/files:

  • model definition
  • training

The model Vgg.py

It is a VGG16-like [2] network (not identical: I removed the first FC layer) built from many 3x3 convolutions with padding 1, so the convolutions leave the feature-map size unchanged. The size only changes after max-pooling. Weights of the convolutional layers are initialized MSR-style (He initialization). Batch Normalization and Dropout are used together.
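
The claim about feature-map sizes can be checked with a few lines of arithmetic. The sketch below assumes five stages that each end in a 2x2 max-pool with stride 2, as in the original VGG16; the exact stage layout of Vgg.py is not shown in this post, so treat the stage count as an assumption.

```python
def conv_out(size, kernel=3, padding=1, stride=1):
    """Output width of a convolution: (n + 2p - k) // s + 1."""
    return (size + 2 * padding - kernel) // stride + 1


def pool_out(size, kernel=2, stride=2):
    """Output width of a max-pool without padding."""
    return (size - kernel) // stride + 1


size = 32          # CIFAR-10 images are 32x32
sizes = [size]
for _ in range(5):  # assumed: five conv stages, each followed by one pool
    size = conv_out(size)  # 3x3 conv with padding 1 keeps the size
    size = pool_out(size)  # 2x2 max-pool halves it
    sizes.append(size)

print(sizes)  # [32, 16, 8, 4, 2, 1]
```

Under these assumptions the last stage produces a 1x1 map per channel, so the classifier sees one value per channel.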

Training train.py

That's it, you can start training:

python train.py

The parameters with which the model achieves its best performance are the defaults in the code. I used SGD (a little out of date) with cross-entropy loss, learning rate 0.01, momentum 0.9 and weight decay 0.0005, dropping the learning rate every 25 epochs. After a few hours you will have the model. The accuracy record and the model at each checkpoint are saved in the 'save' folder.
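
The learning-rate schedule is a plain step decay. The sketch below shows the rule in isolation; the base rate 0.01 and the 25-epoch step come from the text above, while the decay factor `gamma=0.5` and the function name `step_lr` are assumptions, since the post does not say by how much the rate is dropped. In PyTorch the same rule is available as `torch.optim.lr_scheduler.StepLR(optimizer, step_size, gamma)`.

```python
def step_lr(epoch, base_lr=0.01, step=25, gamma=0.5):
    """Learning rate at a given (0-based) epoch under step decay.

    gamma=0.5 is an assumed decay factor, not a value taken from the post.
    """
    return base_lr * gamma ** (epoch // step)


# The rate stays constant inside each 25-epoch window and drops at its end.
for epoch in (0, 24, 25, 49, 50):
    print(epoch, step_lr(epoch))
```

Check train.py for the factor actually used before relying on these numbers.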

How accuracy improves:

[Figure: CIFAR-10 accuracy]

The best accuracy is 89.89%. Removing BN or Dropout alone gives 88.67% and 88.73% accuracy, respectively. Batch Normalization also accelerates the training of deep networks. Removing both BN and Dropout gives 86.65% accuracy, and overfitting becomes visible.

References

  1. S. Ioffe, C. Szegedy. Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift. [arxiv]
  2. K. Simonyan, A. Zisserman. Very Deep Convolutional Networks for Large-Scale Image Recognition. [arxiv]