This post describes the method and tips I picked up from participating in the [Dog Breed challenge] on Kaggle. I managed to get a final score of 0.13783, which put me in 158th place. Considering that a lot of competitors leveraged a third-party dataset (which already contains the test data), I believe my approach is worth sharing, because nothing is used except the pretrained ImageNet models. (However, there is an ensemble :O)

[Dog Breed challenge]: http://www.kaggle.com/c/dog-breed-identification
import os
os.chdir(r'D:\Machine Learning\Kaggle\Dog Breed Identification\pytorch')
I first delved into this problem using Keras. However, I could not find powerful pretrained models like NASNet in Keras. Then I found this awesome package of [Pytorch pretrained models]; all the models I tried come from this package.

[Pytorch pretrained models]: http://github.com/Cadene/pretrained-models.pytorch
from PIL import Image
import torch
from torch.utils.data import Dataset,DataLoader,TensorDataset,ConcatDataset
from torchvision import transforms as trans
from torchvision import models,utils
from pathlib import Path
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
%matplotlib inline
from tqdm import tqdm
import pretrainedmodels
from torch import nn
from torch import optim
from torch.autograd import Variable
A dog image reading class is created using the PyTorch Dataset API. Please refer to the file for details; a rough sketch follows the import below.
from dataset.dataset import Dogs
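The actual Dogs class lives in dataset/dataset.py and is not shown in this post. For the curious, here is a minimal sketch of what such a class could look like, assuming labels.csv maps image ids to breed names and the images are stored as '<id>.jpg'. The constructor mirrors the calls further below; the resize flag handling and the label encoding are my assumptions, not the real implementation.

# A minimal sketch, NOT the actual dataset/dataset.py -- the real file may differ.
class DogsSketch(Dataset):
    def __init__(self, image_folder, df_train, df_test, is_train=True,
                 resize=False, transforms=None):
        self.image_folder = Path(image_folder)
        self.is_train = is_train
        self.transforms = transforms
        self.resize = resize  # the real class presumably pre-resizes images when True
        if is_train:
            self.ids = list(df_train.index)
            breeds = sorted(df_train['breed'].unique())  # the 120 breed names
            breed_to_idx = {b: i for i, b in enumerate(breeds)}
            self.labels = [breed_to_idx[b] for b in df_train['breed']]
        else:
            self.ids = list(df_test.index)

    def __len__(self):
        return len(self.ids)

    def __getitem__(self, idx):
        img = Image.open(self.image_folder / (self.ids[idx] + '.jpg')).convert('RGB')
        if self.transforms is not None:
            img = self.transforms(img)
        # the test set has no labels, so return the index as a placeholder
        label = self.labels[idx] if self.is_train else idx
        return img, label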
Set all the hyperparameters. You can set batch_size larger if you have enough GPU memory.
work_folder = Path(r'D:\Machine Learning\Kaggle\Dog Breed Identification')
train_image_folder = work_folder/'train'
test_image_folder = work_folder/'test'
bottlenecks_folder = work_folder/'pytorch'/'bottlenecks'
pred_folder = work_folder/'pred'
df_train = pd.read_csv(work_folder/'labels.csv',index_col=0)
df_test = pd.read_csv(work_folder/'sample_submission.csv',index_col=0)
img_size = 331
batch_size = 4
batch_size_top = 4096
use_cuda = torch.cuda.is_available()
date = '0222'
model_name = 'nasnet'
learning_rate = 0.0001
dropout_ratio = 0.5
input_shape = 331
crop_mode = 'center'
use_bias = True
name = '{}__model={}__lr={}__input_shape={}__drop={}__crop_mode={}__bias={}'.format(date,model_name,learning_rate,input_shape,dropout_ratio,crop_mode,use_bias)
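With the settings above, name evaluates to '0222__model=nasnet__lr=0.0001__input_shape=331__drop=0.5__crop_mode=center__bias=True', and it is used to tag the cached bottleneck files and the submission.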
I found out there are two ways to preprocess the differently sized images into the same shape: resize and center cropping. The difference between them is subtle, hence it becomes part of the hyperparameters. The following transforms were tested for image preprocessing; after several checks I can say center cropping gives a better result than resizing. It looks like, at least on this dataset, it is better to keep the original aspect ratio (center crop) than to keep the image margins (resize).
if crop_mode == 'center':
    transforms = trans.Compose([
        trans.Resize(input_shape),
        trans.CenterCrop(input_shape),
        trans.ToTensor(),
        trans.Normalize([0.5, 0.5, 0.5], [0.5, 0.5, 0.5])])
elif crop_mode == 'resize':
    transforms = trans.Compose([
        trans.Resize((input_shape, input_shape)),
        trans.ToTensor(),
        trans.Normalize([0.5, 0.5, 0.5], [0.5, 0.5, 0.5])])
Create the corresponding datasets and dataloaders.
train_dataset = Dogs(train_image_folder,df_train,df_test,is_train=True,resize=False,transforms=transforms)
test_dataset = Dogs(test_image_folder,df_train,df_test,False,resize=False,transforms=transforms)
train_dataset_resize = Dogs(train_image_folder,df_train,df_test,is_train=True,resize=True,transforms=transforms)
train_loader = DataLoader(train_dataset,batch_size,num_workers=0,shuffle=False)
test_loader = DataLoader(test_dataset,batch_size,num_workers=0,shuffle=False)
We can see the difference between center crop and resize here.
img_center_crop = train_dataset[0][0] * 0.5 + 0.5  # undo the normalization for display
transforms_resize = trans.Compose([
trans.Resize((input_shape,input_shape)),
trans.ToTensor(),
trans.Normalize([0.5, 0.5, 0.5],[0.5, 0.5, 0.5])])
train_dataset_resize = Dogs(train_image_folder,df_train,df_test,is_train=True,resize=False,transforms=transforms_resize)
img_resize = train_dataset_resize[0][0] * 0.5 + 0.5
trans.ToPILImage()(img_center_crop)  # displays the center-cropped image (own notebook cell)
trans.ToPILImage()(img_resize)  # displays the resized image
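To compare them directly, here is a small matplotlib snippet (not in the original notebook) that puts both images side by side:

fig, axes = plt.subplots(1, 2, figsize=(8, 4))
axes[0].imshow(img_center_crop.numpy().transpose(1, 2, 0))
axes[0].set_title('center crop')
axes[1].imshow(img_resize.numpy().transpose(1, 2, 0))
axes[1].set_title('resize')
for ax in axes:
    ax.axis('off')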
The key to transfer learning is getting the bottleneck outputs. Normally we use the second-to-last layer, the one just before the final softmax classifier.
For NASNet in the pretrained package, we can realize this simply by turning the last two layers of the original model into identity mappings.
def get_extraction_model():
    nasnet = pretrainedmodels.nasnetalarge(num_classes=1000)
    nasnet = nasnet.eval()
    # adaptive pooling lets the model accept inputs whose size differs from 331
    nasnet.avg_pool = nn.AdaptiveAvgPool2d(1)
    # replace the dropout and the classifier with identity mappings,
    # so the forward pass returns the 4032-d bottleneck features
    del nasnet.dropout
    del nasnet.last_linear
    nasnet.dropout = lambda x: x
    nasnet.last_linear = lambda x: x
    return nasnet
extraction_nasnet = get_extraction_model()
if use_cuda:
    extraction_nasnet.cuda()
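As a quick sanity check (my addition, not in the original notebook), you can push one batch through the truncated model and confirm that the output is the flat 4032-dimensional feature vector the top classifier expects:

x_sample, _ = next(iter(train_loader))
x_sample = Variable(x_sample, volatile=True)  # inference only, no gradients needed
if use_cuda:
    x_sample = x_sample.cuda()
print(extraction_nasnet(x_sample).size())  # expected: (batch_size, 4032)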
A function to get the bottleneck outputs. Notice that we keep the dataloaders unshuffled, so the outputs stay in the same order as the input files.
def get_bottlenecks(data_loader, extraction_model, test_mode=False):
    x_pieces = []
    y_pieces = []
    for x, y in tqdm(iter(data_loader)):
        # volatile: inference only, no gradients needed
        x = Variable(x, volatile=True)
        if use_cuda:
            x = x.cuda()
        x_pieces.append(extraction_model(x).cpu().data.numpy())
        if not test_mode:
            y_pieces.append(y.numpy())
    bottlenecks_x = np.concatenate(x_pieces)
    bottlenecks_y = np.concatenate(y_pieces) if not test_mode else None
    return bottlenecks_x, bottlenecks_y
bottlenecks_x, bottlenecks_y = get_bottlenecks(train_loader, extraction_nasnet)
# np.save(bottlenecks_folder/(name+'_x'),bottlenecks_x)
# np.save(bottlenecks_folder/(name+'_y'),bottlenecks_y)
# bottlenecks_x = np.load(bottlenecks_folder/(name + '_x.npy'))
# bottlenecks_y = np.load(bottlenecks_folder/(name + '_y.npy'))
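The commented lines above cache the features to disk. A convenience pattern (my assumption of the intent, not in the original) is to load the cache when it exists and compute otherwise:

x_cache = bottlenecks_folder / (name + '_x.npy')
y_cache = bottlenecks_folder / (name + '_y.npy')
if x_cache.exists():
    bottlenecks_x = np.load(x_cache)
    bottlenecks_y = np.load(y_cache)
else:
    bottlenecks_x, bottlenecks_y = get_bottlenecks(train_loader, extraction_nasnet)
    np.save(x_cache, bottlenecks_x)
    np.save(y_cache, bottlenecks_y)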
Delete the model to free GPU memory.
del extraction_nasnet
Create the top classifier: a linear layer whose input is the bottleneck features and whose output is the 120 breed classes.
class TopModule(nn.Module):
    def __init__(self, dropout_ratio):
        super(TopModule, self).__init__()
        # register dropout as a submodule so that model.eval() actually disables it;
        # creating nn.Dropout inside forward() would ignore the train/eval mode
        self.dropout = nn.Dropout(p=dropout_ratio)
        self.aff = nn.Linear(4032, 120, bias=use_bias)

    def forward(self, x):
        x = self.dropout(x)
        x = self.aff(x)
        return x
criterion = nn.CrossEntropyLoss()
if use_cuda:
    criterion = criterion.cuda()
Split the data into training and validation sets.
permutation = np.random.permutation(bottlenecks_x.shape[0])
n_val = bottlenecks_x.shape[0] // 5  # hold out 20% for validation
x_train = bottlenecks_x[permutation][:-n_val]
x_val = bottlenecks_x[permutation][-n_val:]
y_train = bottlenecks_y[permutation][:-n_val]
y_val = bottlenecks_y[permutation][-n_val:]
top_only_train_dataset = TensorDataset(torch.FloatTensor(x_train),torch.LongTensor(y_train))
top_only_val_dataset = TensorDataset(torch.FloatTensor(x_val),torch.LongTensor(y_val))
top_only_train_loader = DataLoader(top_only_train_dataset,batch_size=batch_size_top,shuffle=True)
top_only_val_loader = DataLoader(top_only_val_dataset,batch_size=batch_size_top,shuffle=True)
total_dataset = ConcatDataset([top_only_train_dataset,top_only_val_dataset])
total_loader = DataLoader(total_dataset,batch_size=batch_size_top,shuffle=True)
The training function.
def fit(loader, optimizer, criterion, model, epochs=1500, evaluate=True):
    val_loss_history = []
    val_acc_history = []
    for epoch in range(epochs):
        running_loss = 0.0
        for i, data in enumerate(loader, 0):
            inputs, labels = data
            inputs, labels = Variable(inputs), Variable(labels)
            if use_cuda:
                inputs = inputs.cuda()
                labels = labels.cuda()
            optimizer.zero_grad()
            # forward + backward
            outputs = model(inputs)
            loss = criterion(outputs, labels)
            loss.backward()
            optimizer.step()
            running_loss += loss.data[0]
        print('[%d, %5d] Train_loss: %.3f' % (epoch + 1, i + 1, running_loss / len(loader)))
        if evaluate:
            # eval mode disables dropout during validation
            model.eval()
            x_val_v = Variable(torch.FloatTensor(x_val), volatile=True)
            y_val_v = Variable(torch.LongTensor(y_val), volatile=True)
            if use_cuda:
                x_val_v = x_val_v.cuda()
                y_val_v = y_val_v.cuda()
            outputs = model(x_val_v)
            loss = criterion(outputs, y_val_v)
            _, pred = torch.max(outputs, 1)
            val_acc = np.mean(pred.cpu().data.numpy() == y_val_v.cpu().data.numpy())
            val_loss_history.append(loss.data[0])
            val_acc_history.append(val_acc)
            print('[%d] Val_loss: %.3f' % (epoch + 1, loss.data[0]))
            print('[%d] Val_acc: %.3f' % (epoch + 1, val_acc))
            model.train()
    print('Finished Training')
    return val_loss_history, val_acc_history
top_only_model = TopModule(dropout_ratio)
if use_cuda:
    top_only_model = top_only_model.cuda()
optimizer = optim.Adam(top_only_model.parameters(), lr=learning_rate)
val_loss_history, val_acc_history = fit(top_only_train_loader, optimizer, criterion, top_only_model, epochs=1500, evaluate=True)
Get the number of epochs with the best validation loss, and use it to retrain on all the data (yes, I'd like every tiny bit of improvement :)
best_epochs = int(np.argmin(np.array(val_loss_history))) + 1  # argmin is 0-based, epochs count from 1
best_epochs
best_val_loss = min(val_loss_history)
best_val_loss
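Before trusting a single argmin, it can help to eyeball the validation curves; a quick plot (not in the original notebook):

plt.plot(val_loss_history, label='val loss')
plt.plot(val_acc_history, label='val acc')
plt.axvline(best_epochs - 1, color='gray', linestyle='--')  # position of the best epoch
plt.xlabel('epoch')
plt.legend()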
# retrain a fresh model on all the data for the best number of epochs
top_only_model = TopModule(dropout_ratio)
if use_cuda:
    top_only_model = top_only_model.cuda()
optimizer = optim.Adam(top_only_model.parameters(), lr=learning_rate)
fit(total_loader, optimizer, criterion, top_only_model, epochs=best_epochs, evaluate=False)
extraction_nasnet = get_extraction_model()
if use_cuda:
    extraction_nasnet.cuda()
Get the test bottleneck features.
bottlenecks_test_x,test_y = get_bottlenecks(test_loader,extraction_nasnet,True)
np.save(bottlenecks_folder/(name+'_test_x'),bottlenecks_test_x)
del extraction_nasnet
Remember to switch the top model to eval mode, because it uses dropout.
top_only_model.eval()
Generate the final prediction.
x_test = Variable(torch.FloatTensor(bottlenecks_test_x), volatile=True)
if use_cuda:
    x_test = x_test.cuda()
pred_np = nn.Softmax(1)(top_only_model(x_test)).cpu().data.numpy()  # logits -> per-class probabilities
df_pred = pd.DataFrame(pred_np,index=df_test.index,columns=df_test.columns)
df_pred.to_csv(pred_folder/(name+'.csv'))
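A quick sanity check on the submission (my own habit, not from the original): the frame should match the sample submission's shape and each row should sum to one.

assert df_pred.shape == df_test.shape
assert np.allclose(df_pred.sum(axis=1), 1.0, atol=1e-4)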
- By simply using the NASNet pretrained model, I got a score of 0.157.
- The final score of 0.137 was achieved through pseudo labeling and ensembling with results from other models.
- Center cropping is better than resizing.
- I tried data augmentation, which did not help. I think this is because ImageNet already contains a lot of pictures of different dog breeds, so the model has already learned enough feature detectors in its upper layers.
- I tried nasnet, inceptionv4, inceptionresnetv2, dpn107, xception, resnet152, inceptionv3 and some other models; their relative performance on this task mirrors their ImageNet results. Hence, I guess a better ImageNet model gives better transfer learning performance, at least here. Fair enough.
- I also tried binding the bottleneck features from different models together and training a linear classifier on them. It works, and served as an important part of the ensemble (a sketch follows at the end of the post).
Things I could still try:
- Find the classes with the highest error rates and do something about them.
- Play more with the input resolution. I just used the original ImageNet input size; I wonder whether higher-resolution pictures would help.
- Another round of pseudo labeling.
- K-fold validation.
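As a sketch of the feature-binding ensemble mentioned above: concatenate the cached bottleneck features from several extractors along the feature axis, then train the same kind of linear classifier on the wider vectors. The file names and the 1536-d second model below are placeholders, not the exact ones I used.

# Hypothetical cache file names -- substitute your own.
x_nasnet = np.load(bottlenecks_folder / 'nasnet_x.npy')      # (N, 4032)
x_incv4 = np.load(bottlenecks_folder / 'inceptionv4_x.npy')  # (N, 1536)
y = np.load(bottlenecks_folder / 'nasnet_y.npy')

# bind the features along the feature axis: (N, 4032 + 1536)
x_combined = np.concatenate([x_nasnet, x_incv4], axis=1)

# the same kind of top classifier as before, just with a wider input
class CombinedTop(nn.Module):
    def __init__(self, in_features, dropout_ratio):
        super(CombinedTop, self).__init__()
        self.dropout = nn.Dropout(p=dropout_ratio)
        self.aff = nn.Linear(in_features, 120, bias=use_bias)

    def forward(self, x):
        return self.aff(self.dropout(x))

combined_model = CombinedTop(x_combined.shape[1], dropout_ratio)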