Commit

rename repo and add instructions in README file
RocketFlash committed Sep 7, 2019
1 parent 5b0f3c5 commit 0ab9710
Showing 17 changed files with 9,478 additions and 9,249 deletions.
83 changes: 81 additions & 2 deletions README.md
@@ -1,2 +1,81 @@
# SiameseNet
Siamese network for image classification
# Siamese and Triplet networks for image classification

This repository contains an implementation of deep neural networks for embedding learning using Siamese and Triplet approaches with different negative sample mining strategies.
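
As an illustration of the mining strategies implemented in `embedding_net/data_loader.py`, the semi-hard strategy picks a negative whose triplet loss is positive but still below the margin. A simplified NumPy sketch of that selection rule:

```python
import numpy as np

def semihard_negative(loss_values, margin=0.5):
    # Semi-hard negatives: triplet loss is positive (the negative is not
    # already "easy") but smaller than the margin (not the hardest case).
    semihard = np.where(np.logical_and(loss_values < margin, loss_values > 0))[0]
    return np.random.choice(semihard) if len(semihard) > 0 else None

# loss_values[i] = d(anchor, positive) - d(anchor, negative_i) + margin
loss_values = np.array([-0.1, 0.2, 0.8])
print(semihard_negative(loss_values, margin=0.5))  # -> 1
```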

# Installation

## Install dependencies

### Requirements

- keras
- tensorflow
- scikit-learn
- opencv
- matplotlib
- plotly - for interactive t-SNE plot visualization
- [albumentations](https://github.com/albu/albumentations) - for online augmentation during training
- [image-classifiers](https://github.com/qubvel/classification_models) - for different backbone models
- [keras-rectified-adam](https://github.com/CyberZHG/keras-radam) - for the RAdam (Rectified Adam) optimizer

```bash
$ pip3 install -r requirements.txt
```

# Train

The data for training and validation should be placed in separate folders, each containing one subfolder per class with that class's images. The dataset should have the following structure:

```
Dataset
└───train
│ └───class_1
│ │ image1.jpg
│ │ image2.jpg
│ │ ...
│ └───class_2
│ │ image1.jpg
│ │ image2.jpg
│ │ ...
│ └───class_N
│ │ ...
└───val
│ └───class_1
│ │ image1.jpg
│ │ image2.jpg
│ │ ...
│ └───class_2
│ │ image1.jpg
│ │ image2.jpg
│ │ ...
│ └───class_N
│ │ ...
```

For training, create a configuration file specifying all network and training parameters. Examples of configuration files can be found in the **configs** folder; a minimal sketch is also shown below.
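
A minimal configuration, mirroring the fields used by the configs in this commit (values and paths are illustrative only), might look like this:

```yaml
input_shape : [48, 48, 3]        # input image size (H, W, C)
encodings_len: 256               # embedding dimension
margin: 0.5                      # margin used by the triplet loss and mining
mode : 'triplet'                 # 'triplet' or 'siamese'
distance_type : 'l1'
backbone : 'resnet18'
backbone_weights : 'imagenet'
optimizer : 'radam'
learning_rate : 0.0001
project_name : 'my_project/'
freeze_backbone : False
embeddings_normalization: True

#paths
dataset_path : '/path/to/dataset/'
tensorboard_log_path : 'tf_log/'
weights_save_path : 'weights/'
plots_path : 'plots/'
encodings_path : 'encodings/'
model_save_name : 'best_model.h5'
```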

After the configuration file is created, you can modify the **train.py** file and then start training:

```bash
$ python3 train.py
```
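
As a rough illustration (not the actual **train.py** from this commit), a training script built on the `EmbeddingNet` API shown in this repository might look like the sketch below; the exact required arguments may differ.

```python
# Hypothetical training script; the real train.py may differ.
from embedding_net.model import EmbeddingNet

# Build the model from a config file (see the configs/ folder);
# the constructor takes a cfg_file argument in this commit.
model = EmbeddingNet(cfg_file='configs/road_signs_resnet18.yml')

# Train with online triplet mining. Argument names follow
# EmbeddingNet.train_generator_mining as it appears in this commit;
# the numeric values are placeholders.
model.train_generator_mining(steps_per_epoch=200,
                             epochs=50,
                             callbacks=[],
                             val_steps=100,
                             n_classes=4,
                             n_samples=4,
                             val_batch=8,
                             negative_selection_mode='semihard')
```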

# Test

The trained model can be tested using the following command:

```bash
$ python3 test.py [--weights (path to trained model weights file)]
[--encodings (path to trained model encodings file)]
[--image (path to image file)]
```
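
For example (the file paths below are illustrative; point them at your own trained weights, encodings, and test image):

```bash
$ python3 test.py --weights weights/road_signs/best_model_resnet18.h5 \
                  --encodings encodings/road_signs/encodings.pkl \
                  --image /path/to/test_image.png
```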

It is also possible to use the [test_network.ipynb](https://github.com/RocketFlash/SiameseNet/blob/master/test_network.ipynb) notebook to test the trained network and visualize the input data as well as the output encodings.

# Embeddings visualization

The resulting encodings can be visualized interactively using the **plot_tsne_interactive** function in [utils.py](https://github.com/RocketFlash/SiameseNet/blob/master/embedding_net/utils.py).
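
One plausible call pattern is sketched below; the exact signature of `plot_tsne_interactive` is assumed here, and the encodings path is illustrative:

```python
from embedding_net.utils import load_encodings, plot_tsne_interactive

# Load encodings saved after training and open an interactive plotly
# t-SNE plot (assumed signature: the function takes the loaded encodings).
encodings = load_encodings('encodings/road_signs/encodings.pkl')
plot_tsne_interactive(encodings)
```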

t-SNE plot of Russian traffic sign image embeddings (107 classes):
![t-SNE example](images/t-sne.png)
5 changes: 3 additions & 2 deletions configs/road_signs_resnet18.yml
@@ -1,5 +1,6 @@
input_shape : [48, 48, 3]
encodings_len: 1024
encodings_len: 256
margin: 0.5
mode : 'triplet'
distance_type : 'l1'
backbone : 'resnet18'
@@ -16,4 +17,4 @@ tensorboard_log_path : 'tf_log/'
weights_save_path : 'weights/'
plots_path : 'plots/'
encodings_path : 'encodings/'
model_save_name : 'best_model.h5'
model_save_name : 'best_model_resnet18.h5'
20 changes: 20 additions & 0 deletions configs/road_signs_resnet18_merged_dataset.yml
@@ -0,0 +1,20 @@
input_shape : [48, 48, 3]
encodings_len: 256
margin: 0.7
mode : 'triplet'
distance_type : 'l1'
backbone : 'resnet18'
backbone_weights : 'imagenet'
optimizer : 'radam'
learning_rate : 0.0001
project_name : 'road_signs/'
freeze_backbone : False
embeddings_normalization: True

#paths
dataset_path : '/home/rauf/datasets/road_signs_merged/'
tensorboard_log_path : 'tf_log/'
weights_save_path : 'weights/'
plots_path : 'plots/'
encodings_path : 'encodings/'
model_save_name : 'best_model_resnet18_merged.h5'
2 changes: 1 addition & 1 deletion configs/road_signs_resnext50_merged_dataset.yml
@@ -1,6 +1,6 @@
input_shape : [48, 48, 3]
encodings_len: 256
margin: 0.5
margin: 0.7
mode : 'triplet'
distance_type : 'l1'
backbone : 'resnext50'
File renamed without changes.
File renamed without changes.
7 changes: 2 additions & 5 deletions siamese_net/backbones.py → embedding_net/backbones.py
@@ -1,6 +1,5 @@
from keras.layers import Dense, Input, Lambda, Dropout, Flatten
from keras.layers import Conv2D, MaxPool2D, BatchNormalization, concatenate
from classification_models import Classifiers
from keras.models import Model
from keras.regularizers import l2
import keras.backend as K
@@ -71,6 +70,7 @@ def get_backbone(input_shape,
base_model = Model(
inputs=[input_image], outputs=[encoded_output])
else:
from classification_models import Classifiers
classifier, preprocess_input = Classifiers.get(backbone_type)
backbone_model = classifier(input_shape=input_shape,
weights=backbone_weights,
@@ -82,10 +82,7 @@

after_backbone = backbone_model.output
x = Flatten()(after_backbone)
# x = Dense(512, activation="relu")(x)
# x = Dropout(0.5)(x)
# x = Dense(512, activation="relu")(x)
# x = Dropout(0.5)(x)

encoded_output = Dense(encodings_len, activation="relu")(x)
if embeddings_normalization:
encoded_output = Lambda(lambda x: K.l2_normalize(
29 changes: 13 additions & 16 deletions siamese_net/data_loader.py → embedding_net/data_loader.py
@@ -6,17 +6,13 @@
from itertools import combinations
from sklearn.metrics import pairwise_distances

# TODO
# -[ ] Implement online semi-hard triplets mining
# -[ ] Implement offline semi-hard triplets mining


class SiameseImageLoader:
class EmbeddingNetImageLoader:
"""
Image loader for Siamese network
Image loader for Embedding network
"""

def __init__(self, dataset_path, margin=0.5, input_shape=None, augmentations=None, data_subsets=['train', 'val']):
def __init__(self, dataset_path, input_shape=None, augmentations=None, data_subsets=['train', 'val']):
self.dataset_path = dataset_path
self.data_subsets = data_subsets
self.images_paths = {}
@@ -31,7 +27,6 @@ def __init__(self, dataset_path, margin=0.5, input_shape=None, augmentations=None,
self.n_samples = {d: len(self.images_paths[d]) for d in data_subsets}
self.indexes = {d: {cl: np.where(np.array(self.images_labels[d]) == cl)[
0] for cl in self.classes[d]} for d in data_subsets}
self.margin = margin

def _load_images_paths(self):
for d in self.data_subsets:
@@ -142,23 +137,24 @@ def get_batch_triplets(self, batch_size, s='train'):
def get_batch_triplets_batch_all(self):
pass

def hardest_negative(self, loss_values):
def hardest_negative(self, loss_values, margin=0.5):
hard_negative = np.argmax(loss_values)
return hard_negative if loss_values[hard_negative] > 0 else None

def random_hard_negative(self, loss_values):
def random_hard_negative(self, loss_values, margin=0.5):
hard_negatives = np.where(loss_values > 0)[0]
return np.random.choice(hard_negatives) if len(hard_negatives) > 0 else None

def semihard_negative(self, loss_values):
semihard_negatives = np.where(np.logical_and(loss_values < self.margin, loss_values > 0))[0]
def semihard_negative(self, loss_values, margin=0.5):
semihard_negatives = np.where(np.logical_and(loss_values < margin, loss_values > 0))[0]
return np.random.choice(semihard_negatives) if len(semihard_negatives) > 0 else None


def get_batch_triplets_mining(self,
embedding_model,
n_classes,
n_samples,
n_samples,
margin = 0.5,
negative_selection_mode='semihard',
s='train'):
if negative_selection_mode == 'semihard':
@@ -206,9 +202,9 @@ def get_batch_triplets_mining(self,

ap_distances = distance_matrix[anchor_positives[:,0], anchor_positives[:,1]]
for anchor_positive, ap_distance in zip(anchor_positives, ap_distances):
loss_values = ap_distance - distance_matrix[anchor_positive[0], negative_indices] + self.margin
loss_values = ap_distance - distance_matrix[anchor_positive[0], negative_indices] + margin
loss_values = np.array(loss_values)
hard_negative = negative_selection_fn(loss_values)
hard_negative = negative_selection_fn(loss_values, margin = margin)
if hard_negative is not None:
hard_negative = negative_indices[hard_negative]
triplet_anchors.append(all_images[anchor_positive[0]])
@@ -239,11 +235,12 @@ def generate(self, batch_size, mode='siamese', s='train'):
data, targets = self.get_batch_triplets(batch_size, s)
yield (data, targets)

def generate_mining(self, embedding_model, n_classes, n_samples, negative_selection_mode='semihard', s='train'):
def generate_mining(self, embedding_model, n_classes, n_samples, margin = 0.5, negative_selection_mode='semihard', s='train'):
while True:
data, targets = self.get_batch_triplets_mining(embedding_model,
n_classes,
n_samples,
margin = margin,
negative_selection_mode='semihard',
s=s)
yield (data, targets)
File renamed without changes.
10 changes: 6 additions & 4 deletions siamese_net/model.py → embedding_net/model.py
@@ -13,7 +13,7 @@
import matplotlib.pyplot as plt
from sklearn.neighbors import KNeighborsClassifier

class SiameseNet:
class EmbeddingNet:
"""
SiameseNet for image classification
distance_type = 'l1' -> l1_loss
@@ -60,6 +60,8 @@ def __init__(self, cfg_file=None):
self._create_model_triplet()

self.encoded_training_data = {}
else:
self.margin = 0.5


def _create_base_model(self):
@@ -100,7 +102,6 @@ def _create_model_siamese(self):
prediction = distance
metric = lac.accuracy

# self.l_model = Model(inputs=[image_encoding_1, image_encoding_2], outputs=[prediction])
self.model = Model(
inputs=[input_image_1, input_image_2], outputs=prediction)

@@ -170,12 +171,13 @@ def train_generator_mining(self,
with_val=True,
n_classes=4,
n_samples=4,
val_batch=8,
negative_selection_mode='semihard',
verbose=1):

train_generator = self.data_loader.generate_mining(self.base_model, n_classes, n_samples, negative_selection_mode=negative_selection_mode, s="train")
train_generator = self.data_loader.generate_mining(self.base_model, n_classes, n_samples, margin=self.margin, negative_selection_mode=negative_selection_mode, s="train")
# val_generator = self.data_loader.generate_mining(self.base_model, n_classes, n_samples, negative_selection_mode=negative_selection_mode, s="val")
val_generator = self.data_loader.generate(8, mode=self.mode, s="val")
val_generator = self.data_loader.generate(val_batch, mode=self.mode, s="val")

history = self.model.fit_generator(train_generator, steps_per_epoch=steps_per_epoch, epochs=epochs,
verbose=verbose, validation_data = val_generator, validation_steps = val_steps, callbacks=callbacks)
5 changes: 2 additions & 3 deletions siamese_net/utils.py → embedding_net/utils.py
@@ -6,7 +6,7 @@
import yaml
from keras import optimizers
from .augmentations import get_aug
from .data_loader import SiameseImageLoader
from .data_loader import EmbeddingNetImageLoader


def load_encodings(path_to_encodings):
@@ -110,8 +110,7 @@ def parse_net_params(filename='configs/road_signs.yml'):
cfg['project_name'])
params['model_save_name'] = cfg['model_save_name']
if 'dataset_path' in cfg:
params['loader'] = SiameseImageLoader(cfg['dataset_path'],
margin = cfg['margin'],
params['loader'] = EmbeddingNetImageLoader(cfg['dataset_path'],
input_shape=cfg['input_shape'],
augmentations=augmentations)

Binary file added images/._t-sne.png
Binary file added images/t-sne.png
4 changes: 3 additions & 1 deletion requirements.txt
@@ -3,4 +3,6 @@ keras
tensorflow-gpu
matplotlib
albumentations
pydot
scikit-learn
opencv-python
keras-rectified-adam
27 changes: 20 additions & 7 deletions test.py
@@ -1,9 +1,22 @@
from siamese_net.model import SiameseNet
from embedding_net.model import EmbeddingNet
import argparse

model = SiameseNet()
model.load_model('weights/road_signs/best_model_4.h5')
model.load_encodings('encodings/road_signs/encodings.pkl')
if __name__ == "__main__":
parser = argparse.ArgumentParser()
parser.add_argument("--weights", type=str,
help="path to trained model weights file")
parser.add_argument("--encodings", type=str,
help="path to trained model encodings file")
parser.add_argument("--image", type=str, help="path to image file")
opt = parser.parse_args()

image_path = '/home/rauf/datasets/road_signs/road_signs_separated/val/1_1/rtsd-r1_train_006470.png'
model_prediction = model.predict(image_path)
print('Model prediction: {}'.format(model_prediction))
weights_path = opt.weights
encodings_path = opt.encodings
image_path = opt.image

model = EmbeddingNet()
model.load_model(weights_path)
model.load_encodings(encodings_path)

model_prediction = model.predict(image_path)
print('Model prediction: {}'.format(model_prediction))