Pushing CIFAR-10 SOTA using ResNets
This project aimed to replicate and improve upon state-of-the-art ResNet results on CIFAR-10. I achieved a 6.90% error rate using a 20-layer ResNet with 0.27M parameters, improving upon the original ResNet20's 8.75% error rate and matching ResNet56's 6.97% (presented here). Replacing the ResNet blocks with ResNeXt blocks further reduced the error rate to 5.32%.
MODEL | TEST ERR (%) | TEST ACC (%) |
---|---|---|
ResNet20 | 8.75 | 91.25 |
XResNet20 | 8.18 | 91.82 |
MXResNet20 | 7.93 | 92.07 |
SE-MXResNet20 | 7.81 | 92.19 |
+ cosine decay | 7.53 | 92.47 |
+ label smoothing | 7.49 | 92.51 |
+ mixup | 7.03 | 92.97 |
+ reflection padding | 6.90 | 93.10 |
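Of the training tweaks above, mixup gave the largest single gain. As a rough sketch of the idea (a hypothetical NumPy helper, not the repository's actual implementation): each batch is blended with a randomly permuted copy of itself, and the one-hot labels are blended with the same coefficient.

```python
import numpy as np

def mixup_batch(x, y, alpha=0.2, seed=0):
    """Blend each example (and its one-hot label) with a random partner.

    Hypothetical helper illustrating mixup (Zhang et al., 2017); the
    repository's actual implementation may differ in detail.
    """
    rng = np.random.default_rng(seed)
    lam = rng.beta(alpha, alpha)            # mixing coefficient in [0, 1]
    idx = rng.permutation(len(x))           # random pairing of examples
    x_mix = lam * x + (1.0 - lam) * x[idx]  # convex blend of inputs
    y_mix = lam * y + (1.0 - lam) * y[idx]  # convex blend of labels
    return x_mix, y_mix
```

Because the blend is convex, mixed labels still sum to 1, so the usual cross-entropy loss applies unchanged.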
Model architecture updates
Updates to the training procedure
MODEL | PARAMS | TEST ERR (%) | TEST ACC (%) |
---|---|---|---|
SE-MXResNet20 | 0.27M | 6.90 | 93.10 |
SE-MXResNet32 | 0.47M | 6.20 | 93.80 |
SE-MXResNet44 | 0.67M | 6.12 | 93.88 |
SE-MXResNet56 | 0.86M | 5.64 | 94.36 |
I have updated the repository with ResNeXt-based models to assess their influence on performance. For this purpose, I modified the original ResNeXt models presented here so that they have roughly the same complexity as their ResNet counterparts.
Cardinality (C) | Bottleneck width (d) | Group conv width |
---|---|---|
1 | 16 | 16 |
2 | 10 | 20 |
4 | 6 | 24 |
16 | 2 | 32 |
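The third column is simply the product C × d: splitting a convolution into C groups of width d gives a total grouped-convolution width of C·d, which grows slowly with cardinality so each configuration stays close to the baseline's (C=1, d=16) complexity. A quick check of the table's values:

```python
# (cardinality, bottleneck width) pairs from the table above
configs = [(1, 16), (2, 10), (4, 6), (16, 2)]

# effective group-conv width = cardinality * bottleneck width
widths = [c * d for c, d in configs]
print(widths)  # matches the table's third column
```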
Following this process, I developed a model called XResNeXt29_16x2d. With 0.32M parameters, it is comparable to XResNet20. The extra nine layers come from using bottleneck blocks, which consist of three conv layers as opposed to the basic block's two. The performance of this model is shown below.
| MODEL | PARAMS | TEST ERR (%) | TEST ACC (%) |
| :---: | :---: | :---: | :---: |
| XResNet29 | 0.31M | 7.62 | 92.38 |
| XResNeXt29_16x2d | 0.32M | 6.70 | 93.30 |
| SE-MXResNeXt29_16x2d | 0.36M | 5.32 | 94.68 |
After adding all the updates, this 29-layer model outperforms the 56-layer SE-MXResNet while using less than half the number of parameters.
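Assuming the standard CIFAR ResNet layer-count convention (one stem conv, three stages of n residual blocks each, and a final dense layer), the jump from 20 to 29 layers follows directly from swapping 2-conv basic blocks for 3-conv bottlenecks at the same n:

```python
def depth(n, convs_per_block, stages=3):
    """Layer count for a CIFAR-style ResNet.

    Counts the initial 3x3 conv, the convs inside the stacked residual
    blocks, and the final dense layer (the usual 6n+2 / 9n+2 convention).
    """
    return stages * n * convs_per_block + 2

print(depth(3, 2))  # basic blocks, n=3  -> 20 (ResNet20)
print(depth(3, 3))  # bottlenecks, n=3  -> 29 (the 29-layer models)
print(depth(9, 2))  # basic blocks, n=9  -> 56 (ResNet56)
```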
```
$ git clone https://github.com/iamVarunAnand/image_classification.git
$ cd image_classification
$ pip install -r requirements.txt
```
```python
MODEL_NAME = "xresnet20"  # model to be used for training
EPOCHS = 180              # number of training epochs
START_EPOCH = 0           # epoch to start training at (useful for stop-start training)
BS = 128                  # batch size to be used while training
INIT_LR = 1e-1            # starting learning rate (the original ResNet paper recommends 1e-1)
USE_LBL_SMOOTH = False    # determines if label smoothing is used while training
USE_COSINE = False        # determines if the learning rate follows the cosine decay policy
```
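When `USE_COSINE` is enabled, the learning rate follows a cosine decay from `INIT_LR` down to zero over the full run. A minimal sketch of that schedule (a standalone helper for illustration, not the repository's scheduler):

```python
import math

def cosine_lr(epoch, init_lr=1e-1, total_epochs=180):
    """Cosine decay: init_lr at epoch 0, halfway at the midpoint, 0 at the end."""
    return 0.5 * init_lr * (1.0 + math.cos(math.pi * epoch / total_epochs))
```

A function of this shape can be plugged into a `tf.keras.callbacks.LearningRateScheduler`.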
[**NOTE**] For the complete list of supported models, refer to the *dispatcher.py* file in the *utils* folder. This file contains a dictionary mapping model names to the corresponding *tf.keras.Model* objects.
- After setting all the necessary parameters in the configuration file, train the model by running the following command ***from the base directory of the project***.
```
$ python train.py
```
During training, the following callbacks are invoked at the end of every batch or every epoch, depending on the particular callback.
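As a rough illustration of this dispatch pattern (method names follow *tf.keras.callbacks.Callback*; the simplified loop below stands in for the actual training loop and is not the repository's code):

```python
class Callback:
    """Minimal stand-in for the tf.keras.callbacks.Callback interface."""
    def on_batch_end(self, batch, logs=None):
        pass
    def on_epoch_end(self, epoch, logs=None):
        pass

class LossLogger(Callback):
    """Records the loss reported at the end of each epoch."""
    def __init__(self):
        self.history = []
    def on_epoch_end(self, epoch, logs=None):
        self.history.append((logs or {}).get("loss"))

def run_training(callbacks, epochs=2, batches_per_epoch=3):
    # the training loop fires callbacks at batch and epoch boundaries
    for epoch in range(epochs):
        for batch in range(batches_per_epoch):
            for cb in callbacks:
                cb.on_batch_end(batch, logs={"loss": 1.0 / (batch + 1)})
        for cb in callbacks:
            cb.on_epoch_end(epoch, logs={"loss": 1.0 / (epoch + 1)})
```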