Performance Study of InceptionV3 & CNN
A comparative study of the performance of a CNN built from scratch versus a transfer-learning approach (InceptionV3).
| Properties | Details |
|---|---|
| Name | Intel Image Classification |
| Source | Kaggle |
| Author | PUNEET BANSAL |
| Data Type | Images |
| Problem Type | Image Classification |
| Files | 3 Folders (14K Training, 3K Testing, 7K Prediction) |
| Classes | 6 Classes (Buildings, Forest, Glacier, Mountain, Sea, Street) |
Preprocessing Steps
- Image Resizing: images have been resized to `(150, 150)`.
- Image Normalisation: normalising pixel values from 0-255 to between 0 and 1.
- Encoding Image Labels: transforming labels from categorical format to numerical using `LabelEncoder`.
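As a rough illustration, a minimal preprocessing sketch along these lines is shown below; the function and variable names (`preprocess_images`, `encode_labels`, `image_paths`) are hypothetical and not taken from the original code:

```python
import cv2
import numpy as np
from sklearn.preprocessing import LabelEncoder

IMG_SIZE = (150, 150)

def preprocess_images(image_paths):
    """Resize each image to 150x150 and scale pixel values from 0-255 to 0-1."""
    images = []
    for path in image_paths:
        img = cv2.imread(path)           # BGR uint8 image
        img = cv2.resize(img, IMG_SIZE)  # (150, 150, 3)
        images.append(img)
    return np.array(images, dtype="float32") / 255.0

def encode_labels(labels):
    """Turn string class names (e.g. 'Forest') into integer ids 0-5."""
    encoder = LabelEncoder()
    return encoder.fit_transform(labels), encoder
```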
Modelling
In this part, we go through the different CNN versions, both built from scratch and using transfer learning, and compare their results, including fine-tuning. Illustrative code sketches for the from-scratch and transfer-learning setups follow the list below:
- Simple CNN:
  - 1 convolutional layer (32 units)
  - 1 MaxPooling layer
  - 2 Dense layers (128 units, 6)
  - 10 epochs
- Deep CNN:
  - 4 convolutional layers (2 × 32 units + 64 units + 128 units)
  - 4 MaxPooling layers
  - 3 Dense layers (128 units + 64 units, 6)
  - 10 epochs
- Deep CNN + Fine-tuning:
  - Same architecture as above
  - Optimizer changed to `Adamax`
  - Added callbacks (early stopping + LR scheduler)
  - 20 epochs
- InceptionV3 I:
  - Excluding the original classification layers (`include_top=False`)
  - Using ImageNet weights (`weights='imagenet'`)
  - Unfreezing the last 25 layers (`inception1.layers[-25:]`)
  - Adding a custom classification head:
    - Dense layer (1024 units)
    - Dropout layer (dropping 20% of units, 0.2)
    - Classification layer (last dense layer, 6 units)
  - Optimizer `Adamax` with a starting learning rate of `0.0001`
  - Callbacks (early stopping + retrieve best weights)
  - 20 epochs
- InceptionV3 II:
  - Same architecture as the previous version
  - Unfreezing all layers except the first 25 instead (`inception2.layers[25:]`)
  - 20 epochs
- InceptionV3 III:
  - Using the full architecture (48 layers)
  - Different custom classification head:
    - Batch Normalization layer
    - 2 Dense layers (256 units + 128 units)
    - 1 classification layer (6 units)
  - Same callbacks
  - 20 epochs
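For context, here is a minimal Keras sketch of the from-scratch Deep CNN with the fine-tuning changes (Adamax, early stopping, LR scheduler). The layer sizes follow the list above, but the kernel sizes, callback settings, and training call are assumptions, not the original implementation:

```python
from tensorflow import keras
from tensorflow.keras import layers

def build_deep_cnn(input_shape=(150, 150, 3), num_classes=6):
    """Deep CNN: 4 conv blocks (32, 32, 64, 128) + dense head (128, 64, 6)."""
    model = keras.Sequential([
        layers.Input(shape=input_shape),
        layers.Conv2D(32, 3, activation="relu"), layers.MaxPooling2D(),
        layers.Conv2D(32, 3, activation="relu"), layers.MaxPooling2D(),
        layers.Conv2D(64, 3, activation="relu"), layers.MaxPooling2D(),
        layers.Conv2D(128, 3, activation="relu"), layers.MaxPooling2D(),
        layers.Flatten(),
        layers.Dense(128, activation="relu"),
        layers.Dense(64, activation="relu"),
        layers.Dense(num_classes, activation="softmax"),
    ])
    # Fine-tuned variant: Adamax optimizer; integer labels -> sparse loss.
    model.compile(optimizer="adamax",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model

callbacks = [
    keras.callbacks.EarlyStopping(patience=3, restore_best_weights=True),
    keras.callbacks.ReduceLROnPlateau(factor=0.5, patience=2),  # LR scheduler
]
# model = build_deep_cnn()
# model.fit(x_train, y_train, validation_split=0.2, epochs=20, callbacks=callbacks)
```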
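And a minimal sketch of the InceptionV3 transfer-learning setup (version I). The base-model arguments and custom head follow the list above, but the pooling layer between the base and the head, the callback settings, and the training call are illustrative assumptions:

```python
from tensorflow import keras
from tensorflow.keras import layers
from tensorflow.keras.applications import InceptionV3

# Base model without its original classification layers, using ImageNet weights.
inception1 = InceptionV3(include_top=False, weights="imagenet",
                         input_shape=(150, 150, 3))

# Freeze all layers except the last 25 (inception1.layers[-25:] stay trainable).
inception1.trainable = True
for layer in inception1.layers[:-25]:
    layer.trainable = False

# Custom classification head: Dense(1024) -> Dropout(0.2) -> Dense(6).
model = keras.Sequential([
    inception1,
    layers.GlobalAveragePooling2D(),  # assumed bridge between base and head
    layers.Dense(1024, activation="relu"),
    layers.Dropout(0.2),
    layers.Dense(6, activation="softmax"),
])

model.compile(optimizer=keras.optimizers.Adamax(learning_rate=1e-4),
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

callbacks = [keras.callbacks.EarlyStopping(patience=3, restore_best_weights=True)]
# model.fit(x_train, y_train, validation_split=0.2, epochs=20, callbacks=callbacks)
```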
Results
I - Simple CNN
- Training: The Simple CNN performed poorly, with an evaluation of 68% accuracy and 140% loss.
- Prediction: The poor training explains the poor predictions made by the Simple CNN, which fails to classify many images.
II - Deep CNN
- Training: The Deep CNN did better than the Simple CNN, with an evaluation of 81% accuracy, but it still made errors, with 82% loss.
- Prediction: The imperfect training explains the wrong predictions made by the Deep CNN, which still misclassifies some images.
III - Deep CNN + Fine-tuning
- Training: The fine-tuned Deep CNN did better than the previous models, with an evaluation of 82% accuracy, although the loss remained fairly high at 49%.
- Prediction: The improvement in training is reflected in the predictions made by the fine-tuned CNN.
IV - InceptionV3 (ver 1)
- Training: The advantage of transfer learning is already visible here, with an evaluation of 90% accuracy, though some errors remain, with 27% loss.
- Prediction: The difference is hard to see in individual predictions, but overall this model does far better than the previous ones.
V - InceptionV3 (ver 2)
- Training: Freezing only the first 25 layers instead made a difference, with 91% accuracy, though some errors remain, with 25% loss.
- Prediction: The difference is hard to see in individual predictions, but overall this model does far better than the non-Inception models.
VI - InceptionV3 (ver 3)
- Training: Using the whole model made performance inferior to the previous Inception versions, with an evaluation of 89% accuracy and more errors, with 30% loss.
- Prediction: The difference is hard to see in individual predictions, but overall this model does slightly worse than the other Inception versions.
Bonus
These are the confusion matrices of each model, giving better context to their overall performance:
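A minimal sketch of how such a confusion matrix can be produced for any of the models above, assuming a trained `model` and held-out `x_test` / `y_test` arrays like those in the training sketches (these names are placeholders, not the original notebook's variables):

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.metrics import confusion_matrix, ConfusionMatrixDisplay

CLASS_NAMES = ["Buildings", "Forest", "Glacier", "Mountain", "Sea", "Street"]

# Predicted class = index of the highest softmax probability.
y_pred = np.argmax(model.predict(x_test), axis=1)

cm = confusion_matrix(y_test, y_pred)
ConfusionMatrixDisplay(cm, display_labels=CLASS_NAMES).plot(xticks_rotation=45)
plt.show()
```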