Performance Study of InceptionV3 & CNN
A comparative study of the performance of a CNN built from scratch versus a transfer learning approach (InceptionV3)
| Properties | Details |
|---|---|
| Name | Intel Image Classification |
| Source | Kaggle |
| Author | Puneet Bansal |
| Data Type | Images |
| Problem Type | Image Classification |
| Files | 3 folders (14K training, 3K testing, 7K prediction) |
| Classes | 6 classes (Buildings, Forest, Glacier, Mountain, Sea, Street) |
Preprocessing Steps
- Image Resizing: images are resized to `(150, 150)`.
- Image Normalisation: pixel values are scaled from `0-255` to `0-1`.
- Encoding Image Labels: labels are transformed from categorical to numerical format using a `LabelEncoder`.
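A minimal sketch of these three steps, assuming OpenCV for loading and class-named subfolders as labels; the function and folder names (`load_images`, `seg_train`) are illustrative, not taken from the project code:

```python
import os
import cv2
import numpy as np
from sklearn.preprocessing import LabelEncoder

def load_images(root_dir, size=(150, 150)):
    """Load images from class-named subfolders, resize and normalise them."""
    images, labels = [], []
    for class_name in sorted(os.listdir(root_dir)):
        class_dir = os.path.join(root_dir, class_name)
        for file_name in os.listdir(class_dir):
            img = cv2.imread(os.path.join(class_dir, file_name))
            img = cv2.resize(img, size)        # Image Resizing -> (150, 150)
            images.append(img / 255.0)         # Normalisation -> 0-1 range
            labels.append(class_name)
    return np.array(images), np.array(labels)

X_train, y_train = load_images("seg_train")    # hypothetical folder name
encoder = LabelEncoder()
y_train = encoder.fit_transform(y_train)       # Buildings -> 0, Forest -> 1, ...
```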
Modelling
In this part, we go through the different CNN versions built from scratch and the transfer learning variants (including fine-tuning), and compare their results:
- Simple CNN (see the first sketch after this list):
  - 1 convolutional layer `(32 units)`
  - 1 MaxPooling layer
  - 2 Dense layers `(128 units, 6)`
  - 10 epochs
- Deep CNN:
  - 4 convolutional layers `(2 × 32 units + 64 units + 128 units)`
  - 4 MaxPooling layers
  - 3 Dense layers `(128 units + 64 units, 6)`
  - 10 epochs
- Deep CNN + Fine-tuning (see the second sketch after this list):
  - Same architecture as above
  - Optimizer changed to `Adamax`
  - Added callbacks (`early stopping` + `LR scheduler`)
  - 20 epochs
- InceptionV3 I (see the third sketch after this list):
  - Excluding the original classification layers (`include_top = False`)
  - Using ImageNet weights (`weights='imagenet'`)
  - Unfreezing the last 25 layers (`inception1.layers[-25:]`)
  - Adding a custom classification head:
    - Dense layer `(1024 units)`
    - Dropout layer `(dropping 20% of units, 0.2)`
    - Classification layer (final dense layer) `(6)`
  - Optimizer `Adamax` with a starting learning rate of `0.0001`
  - Callbacks (`early stopping` + `restore best weights`)
  - 20 epochs
- InceptionV3 II:
  - Same architecture as above
  - Freezing the first 25 layers and unfreezing the rest instead (`inception2.layers[25:]`)
  - 20 epochs
- InceptionV3 III:
  - Using the full architecture (`48 layers`)
  - A different custom classification head:
    - Batch Normalization layer
    - 2 Dense layers `(256 units + 128 units)`
    - 1 classification layer `(6)`
  - Same callbacks
  - 20 epochs
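The sketches below roughly mirror these configurations in Keras; the unit counts follow the list above, while kernel sizes, activations, the loss, and the Simple CNN's optimizer are assumptions. First, the Simple CNN:

```python
from tensorflow.keras import layers, models

# Simple CNN: 1 conv layer (32 units) + 1 max-pooling layer + 2 dense layers (128, 6)
simple_cnn = models.Sequential([
    layers.Conv2D(32, (3, 3), activation='relu', input_shape=(150, 150, 3)),
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),
    layers.Dense(128, activation='relu'),
    layers.Dense(6, activation='softmax'),                    # 6 classes
])
simple_cnn.compile(optimizer='adam',                          # optimizer assumed
                   loss='sparse_categorical_crossentropy',    # integer labels from LabelEncoder
                   metrics=['accuracy'])
# simple_cnn.fit(X_train, y_train, epochs=10, validation_split=0.2)
```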
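Next, a sketch of the Deep CNN together with the fine-tuning changes (`Adamax`, early stopping, LR scheduler); the specific scheduler and patience values are assumptions:

```python
from tensorflow.keras import layers, models, callbacks, optimizers

# Deep CNN: 4 conv layers (32, 32, 64, 128) + 4 max-pooling layers + dense head (128, 64, 6)
deep_cnn = models.Sequential([
    layers.Conv2D(32, (3, 3), activation='relu', input_shape=(150, 150, 3)),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(32, (3, 3), activation='relu'),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, (3, 3), activation='relu'),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(128, (3, 3), activation='relu'),
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),
    layers.Dense(128, activation='relu'),
    layers.Dense(64, activation='relu'),
    layers.Dense(6, activation='softmax'),
])

# Fine-tuning variant: Adamax optimizer + early stopping + LR scheduler, 20 epochs
deep_cnn.compile(optimizer=optimizers.Adamax(),
                 loss='sparse_categorical_crossentropy',
                 metrics=['accuracy'])
cbs = [
    callbacks.EarlyStopping(patience=3, restore_best_weights=True),
    callbacks.ReduceLROnPlateau(factor=0.5, patience=2),   # one possible LR scheduler
]
# deep_cnn.fit(X_train, y_train, epochs=20, validation_split=0.2, callbacks=cbs)
```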
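Finally, a sketch of the InceptionV3 I setup (`include_top=False`, ImageNet weights, last 25 layers unfrozen, custom head, `Adamax` at `0.0001`); the pooling layer between the base and the head and the patience value are assumptions:

```python
from tensorflow.keras import layers, models, callbacks, optimizers
from tensorflow.keras.applications import InceptionV3

# Base model: ImageNet weights, original classification layers excluded
base = InceptionV3(include_top=False, weights='imagenet', input_shape=(150, 150, 3))

# Freeze all layers, then unfreeze the last 25 for fine-tuning
for layer in base.layers:
    layer.trainable = False
for layer in base.layers[-25:]:
    layer.trainable = True

# Custom classification head: Dense(1024) + Dropout(0.2) + Dense(6)
inception1 = models.Sequential([
    base,
    layers.GlobalAveragePooling2D(),        # assumed pooling between base and head
    layers.Dense(1024, activation='relu'),
    layers.Dropout(0.2),
    layers.Dense(6, activation='softmax'),
])

inception1.compile(optimizer=optimizers.Adamax(learning_rate=0.0001),
                   loss='sparse_categorical_crossentropy',
                   metrics=['accuracy'])
cbs = [callbacks.EarlyStopping(patience=3, restore_best_weights=True)]
# inception1.fit(X_train, y_train, epochs=20, validation_split=0.2, callbacks=cbs)
```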
Results
I - Simple CNN
- Training: The Simple CNN performed poorly, evaluating at `68%` accuracy with a loss of `1.40`.
- Prediction: The poor training explains the poor predictions made by the Simple CNN, which failed to classify many images.
II - Deep CNN
- Training: The Deep CNN did a better job than the Simple CNN, evaluating at `81%` accuracy, but still showed errors with a loss of `0.82`.
- Prediction: The imperfect training explains the wrong predictions made by the Deep CNN, which still failed to classify some images.
III - Deep CNN + Fine-tuning
- Training: The fine-tuned Deep CNN did better than the previous models, evaluating at `82%` accuracy, though its loss of `0.49` remained fairly high.
- Prediction: The improvement in the training results can be seen in the predictions made by the fine-tuned CNN.
IV - InceptionV3 (ver 1)
- Training: The advantage of transfer learning is already visible here, with an evaluation of `90%` accuracy and a loss of `0.27`.
- Prediction: The difference is hard to see in individual predictions, but overall it performs far better than the previous models.
V - InceptionV3 (ver 2)
- Training: Freezing only the first 25 layers instead made a difference, reaching `91%` accuracy with a loss of `0.25`.
- Prediction: The difference is hard to see in individual predictions, but overall it performs far better than the previous models.
VI - InceptionV3 (ver 3)
- Training: Using the whole model made the performance slightly inferior to the previous Inception versions, with an evaluation of `89%` accuracy and a loss of `0.30`.
- Prediction: The difference is hard to see in individual predictions, but overall it does slightly worse than the other Inception versions.
Bonus
These are the confusion matrices of each model, giving better context to their overall performance:
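A matrix like these can be generated with scikit-learn; `model`, `X_test`, and `y_test` are placeholder names, not identifiers from the project code:

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.metrics import ConfusionMatrixDisplay

class_names = ['Buildings', 'Forest', 'Glacier', 'Mountain', 'Sea', 'Street']

# Predicted class = argmax over the 6-way softmax output
y_pred = np.argmax(model.predict(X_test), axis=1)
ConfusionMatrixDisplay.from_predictions(y_test, y_pred, display_labels=class_names)
plt.show()
```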