
# Lyft Perception Challenge


<!-- ## About the challenge -->
The [Lyft Perception Challenge](https://www.udacity.com/lyft-challenge), in association with Udacity, was an image segmentation task where candidates had to submit an algorithm that could segment road and car pixels precisely in real time. The challenge ran from *May 1st, 2018* through *June 3rd, 2018*.

## Approach
The loss function consists of 3 losses: *L = L<sub>class</sub> + L<sub>box</sub> + L<sub>mask</sub>*.
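
As a rough sketch of what this objective means (not the repository's implementation), the total loss is simply the sum of the classification, box-regression, and per-pixel mask terms:

```python
import numpy as np

def mask_loss(pred_mask, gt_mask, eps=1e-7):
    """Per-pixel binary cross-entropy over an m x m mask (the L_mask term)."""
    pred_mask = np.clip(pred_mask, eps, 1.0 - eps)
    return -np.mean(gt_mask * np.log(pred_mask) +
                    (1 - gt_mask) * np.log(1 - pred_mask))

def total_loss(l_class, l_box, pred_mask, gt_mask):
    """L = L_class + L_box + L_mask, as described above."""
    return l_class + l_box + mask_loss(pred_mask, gt_mask)
```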

The masks are predicted by a [Fully Convolutional Network](https://arxiv.org/pdf/1605.06211.pdf) for each RoI. This preserves the m×m spatial dimensions of each mask, so every instance of an object gets its own distinct mask.
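
To illustrate the idea, a per-RoI mask branch can be sketched as a small fully convolutional Keras model that keeps the spatial dimensions and emits one sigmoid mask per class (a simplified sketch, not the matterport implementation; the layer sizes are assumptions):

```python
from keras import layers, models

def build_mask_head(pool_size=14, depth=256, num_classes=3):
    """Hypothetical per-RoI mask branch."""
    roi = layers.Input(shape=(pool_size, pool_size, depth))   # RoI-aligned features
    x = roi
    for _ in range(4):
        x = layers.Conv2D(256, 3, padding="same", activation="relu")(x)
    # Upsample once, then predict one m x m sigmoid mask per class
    x = layers.Conv2DTranspose(256, 2, strides=2, activation="relu")(x)
    masks = layers.Conv2D(num_classes, 1, activation="sigmoid")(x)
    return models.Model(roi, masks, name="mask_head_sketch")
```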

The model description after compiling the Keras model can be found at [model](./assets/model.png).
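
For reference, that diagram can be regenerated with Keras' plotting utility (assuming `model` is the matterport `MaskRCNN` wrapper, whose underlying Keras model is exposed as `model.keras_model`; the output path is illustrative):

```python
from keras.utils import plot_model

# `model` is assumed to be the MaskRCNN object built elsewhere in the training script.
plot_model(model.keras_model, to_file="./assets/model.png", show_shapes=True)
```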

## Training

#### Pre-processing Data
The provided CARLA dataset contains train and test images in PNG format, where the class labels of the ground-truth mask are encoded in its red channel.
```python
def process_labels(self, labels):

    # label 6 - lane line pixels
    # label 7 - lane pixels
    # label 10 - car pixels

    labels_new = np.zeros(labels.shape)
    labels_new_car = np.zeros(labels.shape)

    lane_line_idx = (labels == 6).nonzero()
    lane_idx = (labels == 7).nonzero()
    car_pixels = (labels == 10).nonzero()

    # remove car hood pixels (rows >= 495 belong to the ego vehicle's hood)
    car_hood_idx = (car_pixels[0] >= 495).nonzero()[0]
    car_hood_pixels = (car_pixels[0][car_hood_idx],
                       car_pixels[1][car_hood_idx])

    # lane lines and lanes both map to the road channel
    labels_new[lane_line_idx] = 1
    labels_new[lane_idx] = 1

    labels_new_car[car_pixels] = 1
    labels_new_car[car_hood_pixels] = 0

    # (H, W, 2) output: channel 0 = road, channel 1 = car
    return np.dstack([labels_new, labels_new_car])
```
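
A possible usage sketch follows; the file path, the `PIL` loader, and the `dataset` object that owns `process_labels` are assumptions for illustration, not the repository's actual names:

```python
import numpy as np
from PIL import Image

# Illustrative path; the actual CARLA ground-truth layout may differ.
gt = np.array(Image.open("Train/CameraSeg/0.png"))
red_channel = gt[:, :, 0]                      # class labels live in the red channel

masks = dataset.process_labels(red_channel)    # hypothetical dataset object
road_mask, car_mask = masks[..., 0], masks[..., 1]
```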

#### MaskRCNN Configuration
For this application, ResNet-50 was used as the backbone by setting `BACKBONE = "resnet50"` in the config shown below.

```python
class LyftChallengeConfig(Config):
    """Configuration for training on the CARLA dataset of the
    Lyft Perception Challenge. Derives from the base Config class
    and overrides values specific to this dataset.
    """
    # Give the configuration a recognizable name
    NAME = "shapes"

    # Backbone network architecture
    # Supported values are: resnet50, resnet101
    # BACKBONE = "resnet101"
    BACKBONE = "resnet50"

    # Train on 1 GPU with 1 image per GPU.
    # Batch size is 1 (GPUs * images/GPU).
    GPU_COUNT = 1
    IMAGES_PER_GPU = 1

    # Number of classes (including background)
    NUM_CLASSES = 1 + 2  # background + road + car

    # Set the limits of the small side and the large side;
    # together they determine the image shape.
    IMAGE_MIN_DIM = 128
    IMAGE_MAX_DIM = 1024

    # Anchor side lengths (in pixels) for the region proposal network
    RPN_ANCHOR_SCALES = (8, 16, 32, 64, 128)

    # Number of ROIs per image to feed to the classifier/mask heads.
    # Aim to allow ROI sampling to pick 33% positive ROIs.
    TRAIN_ROIS_PER_IMAGE = 32

    # Keep epochs short since the dataset is small
    STEPS_PER_EPOCH = 100

    # Use small validation steps since the epoch is small
    VALIDATION_STEPS = 5
```
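
With this configuration, a training model can be instantiated along the lines of the matterport API (the `model_dir` path below is illustrative):

```python
import mrcnn.model as modellib

config = LyftChallengeConfig()
config.display()

# Build Mask R-CNN in training mode; checkpoints and logs are written to model_dir.
model = modellib.MaskRCNN(mode="training", config=config, model_dir="./logs")
```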


#### Data Augmentation
As only about 1,000 training samples were provided, data augmentation was necessary to avoid overfitting. [imgaug](https://imgaug.readthedocs.io/en/latest/) is a Python module that came in handy for augmenting the dataset. The `train` function of MaskRCNN takes an augmentation object and augments the images in its data generator according to the rules defined in that object.

```python
from imgaug import augmenters as iaa

# Apply a random subset (anywhere from none to all) of the augmenters below to each image
augmentation = iaa.SomeOf((0, None), [
    iaa.Fliplr(0.5),                         # horizontal flip with 50% probability
    iaa.Flipud(0.5),                         # vertical flip with 50% probability
    iaa.OneOf([iaa.Affine(rotate=45),
               iaa.Affine(rotate=90),
               iaa.Affine(rotate=135)]),     # one of three fixed rotations
    iaa.Multiply((0.8, 1.5)),                # brightness change
    iaa.GaussianBlur(sigma=(0.0, 5.0)),      # gaussian blur
    iaa.Affine(scale=(0.5, 1.5)),            # uniform scaling
    iaa.Affine(scale={"x": (0.5, 1.5), "y": (0.5, 1.5)}),  # independent x/y scaling
])
```

Examples of augmentation are given in the [Mask_RCNN repository](https://github.com/matterport/Mask_RCNN).
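
The augmentation object is then handed to the training call; a sketch along the lines of the matterport `train` API (the dataset variables, epoch count, and layer selection are illustrative, not taken from the repository):

```python
# `model`, `dataset_train`, and `dataset_val` are assumed to be built elsewhere.
model.train(dataset_train, dataset_val,
            learning_rate=config.LEARNING_RATE,
            epochs=30,
            layers="heads",
            augmentation=augmentation)
```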

#### Training Loss

