Accuracy is ~80% after 350 epochs #4

Open
ChesterAiGo opened this issue Aug 4, 2017 · 5 comments
ChesterAiGo commented Aug 4, 2017

Hi @vibrantabhi19,

Thank you for sharing your code! It's been very helpful for understanding All-CNN.

In addition, I trained your model last night for 350 epochs, but found that its validation accuracy (val_acc) plateaued at about 0.81 after epoch 49 and stayed there to the end.

Any ideas? :) 👍

The model I used:

```python
from keras.models import Sequential
from keras.layers import Conv2D, Activation, Dropout, GlobalAveragePooling2D
from keras.optimizers import SGD

model = Sequential()

# Block 1: three 3x3 convs with 96 filters; the last conv downsamples
# with stride 2 (replacing max pooling, as in the All-CNN paper)
model.add(Conv2D(96, (3, 3), padding="same", input_shape=(32, 32, 3)))
model.add(Activation('relu'))
model.add(Conv2D(96, (3, 3), padding="same"))
model.add(Activation('relu'))
model.add(Conv2D(96, (3, 3), padding="same", strides=2))
model.add(Dropout(0.5))

# Block 2: same pattern with 192 filters
model.add(Conv2D(192, (3, 3), padding="same"))
model.add(Activation('relu'))
model.add(Conv2D(192, (3, 3), padding="same"))
model.add(Activation('relu'))
model.add(Conv2D(192, (3, 3), padding="same", strides=2))
model.add(Dropout(0.5))

# Block 3: a 3x3 conv, then two 1x1 convs; the final 1x1 conv maps to
# the 10 CIFAR-10 classes
model.add(Conv2D(192, (3, 3), padding="same"))
model.add(Activation('relu'))
model.add(Conv2D(192, (1, 1), padding="valid"))
model.add(Activation('relu'))
model.add(Conv2D(10, (1, 1), padding="valid"))

# Global average pooling instead of fully connected layers
model.add(GlobalAveragePooling2D())
model.add(Activation('softmax'))

sgd = SGD(lr=0.01, decay=1e-6, momentum=0.9, nesterov=True)
model.compile(loss='categorical_crossentropy', optimizer=sgd, metrics=['accuracy'])
```
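
For completeness, a minimal training setup matching the run described above might look like this (the batch size and preprocessing here are assumptions; only the 350 epochs come from my report):

```python
from keras.datasets import cifar10
from keras.utils import to_categorical

# Load CIFAR-10, scale pixels to [0, 1], and one-hot encode the labels.
(x_train, y_train), (x_test, y_test) = cifar10.load_data()
x_train = x_train.astype('float32') / 255.0
x_test = x_test.astype('float32') / 255.0
y_train = to_categorical(y_train, 10)
y_test = to_categorical(y_test, 10)

# 350 epochs, validating on the test split (reported above as val_acc).
model.fit(x_train, y_train,
          batch_size=32,
          epochs=350,
          validation_data=(x_test, y_test))
```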


iabhi7 commented Aug 4, 2017

Hi @ChesterAiGo,
Thanks.
As far as I can tell, you should try a different set of learning parameters; maybe use Adam as your optimizer, since the network does not seem to be converging.
Also, the original paper uses a schedule S = {e1, e2, e3} in which the learning rate γ is multiplied by a fixed factor of 0.1 after e1, e2, and e3 epochs respectively (where e1 = 200, e2 = 250, e3 = 300). A sketch of that schedule follows below.
Maybe you can have a go at that.
What's your training accuracy? Comparing it with the validation accuracy would show whether the model is overfitting.
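
That schedule can be wired up with Keras' LearningRateScheduler callback; here is a minimal sketch, assuming the Keras 2 API. The milestones (200, 250, 300) and factor 0.1 come from the paper; the base rate 0.01 matches the SGD setup above, and everything else is illustrative, not this repository's exact code:

```python
from keras.callbacks import LearningRateScheduler
from keras.optimizers import Adam

def step_decay(epoch):
    # Multiply gamma by 0.1 after each milestone epoch (paper's schedule).
    lr = 0.01
    for milestone in (200, 250, 300):
        if epoch >= milestone:
            lr *= 0.1
    return lr

lr_schedule = LearningRateScheduler(step_decay)

# Alternatively, swap the optimizer for Adam with default settings:
# model.compile(loss='categorical_crossentropy',
#               optimizer=Adam(), metrics=['accuracy'])

# Then pass the callback to training:
# model.fit(x_train, y_train, epochs=350, callbacks=[lr_schedule], ...)
```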

ChesterAiGo (Author) commented

Hi @vibrantabhi19

Thanks for your prompt reply! I will try different optimizers, and also try varying γ during training (I think that's probably the cause).

In addition, there was something very interesting about the accuracies: the training accuracy kept increasing steadily (from epoch 1 to epoch 350), while the validation accuracy became stable after epoch 49 (it was not increasing, but it was not decreasing either.. that's weird xD).

It looked something like this:

Epoch 1: Val: 0.1, Train: 0.1
...
Epoch 49: Val: 0.8, Train: 0.8
...
Epoch 350: Val: 0.8, Train: 0.94

Thanks again ! :)


iabhi7 commented Aug 4, 2017

Oh, that's weird; the network shouldn't be able to overfit, since we are already using a dropout of 0.5.
And since the network is converging (train_acc = 0.94 is proof of that), I don't think trying different optimizers will help; go ahead with the experiment anyway and post your results here.
I will try investigating on my end (the same code has worked for a lot of people, so I can't figure out the exact error).


marcj commented Sep 23, 2017

I can confirm that using the original code (with the fix in #5) and removing the multi_gpu code gives an accuracy above 81%. My best after 350 epochs with this repository's code was 90.88%, and it already cracked 90% at epoch 140.

See accuracy (as CSV):

[screenshot: accuracy plot, 2017-09-23]

And loss (as CSV):

[screenshot: loss plot, 2017-09-23]

The learning rate decay produced this (as CSV):

[screenshot: learning rate decay plot, 2017-09-23]

See also full console log.

All source code and weights are here: https://aetros.com/marcj/keras:all-conv/view/refs/aetros/job/92fcd671c6814c375edd404a65edc66c00ba5aec or in the analytics tool at https://trainer.aetros.com/model/marcj/keras:all-conv/job/92fcd671c6814c375edd404a65edc66c00ba5aec (requires login first).

Hyperparameters and other information:

[screenshot: hyperparameters, 2017-09-23]

So what I can say: I cannot reproduce getting stuck at 81%. @ChesterAiGo, you can fork my model at https://aetros.com/marcj/keras:all-conv and run it on your hardware, so we have all the information needed to debug it.

However, I'd also like to know why this code does not reproduce the results from the linked paper, and what concretely is needed to achieve 95.59% on CIFAR-10 using All-CNN.

JaeDukSeo commented

These are some sexy plots: 90 percent accuracy!
