
# Lyft Perception Challenge


<!-- ## About the challenge -->
The [Lyft Perception Challenge](https://www.udacity.com/lyft-challenge), in association with Udacity, was an image segmentation task where candidates had to submit an algorithm that could segment road and car pixels precisely in real time. The challenge ran from *May 1st, 2018* through *June 3rd, 2018*.

## Approach
The loss function consists of 3 losses: *L = L<sub>class</sub> + L<sub>box</sub> + L<sub>mask</sub>*.
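
As a rough sketch of what this objective means (not the repository's implementation), the total loss is simply the sum of the classification, box-regression, and per-pixel mask terms:

```python
import numpy as np

def mask_loss(pred_mask, gt_mask, eps=1e-7):
    """Per-pixel binary cross-entropy over an m x m mask (the L_mask term)."""
    pred_mask = np.clip(pred_mask, eps, 1.0 - eps)
    return -np.mean(gt_mask * np.log(pred_mask) +
                    (1 - gt_mask) * np.log(1 - pred_mask))

def total_loss(l_class, l_box, pred_mask, gt_mask):
    """L = L_class + L_box + L_mask, as described above."""
    return l_class + l_box + mask_loss(pred_mask, gt_mask)
```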

The masks are predicted by a [Fully Convolutional Network](https://arxiv.org/pdf/1605.06211.pdf) for each RoI. This preserves the m×m spatial dimensions of each mask, so every instance of an object gets its own distinct mask.
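
To illustrate the idea, a per-RoI mask branch can be sketched as a small fully convolutional Keras model that keeps the spatial dimensions and emits one sigmoid mask per class (a simplified sketch, not the matterport implementation; the layer sizes are assumptions):

```python
from keras import layers, models

def build_mask_head(pool_size=14, depth=256, num_classes=3):
    """Hypothetical per-RoI mask branch."""
    roi = layers.Input(shape=(pool_size, pool_size, depth))   # RoI-aligned features
    x = roi
    for _ in range(4):
        x = layers.Conv2D(256, 3, padding="same", activation="relu")(x)
    # Upsample once, then predict one m x m sigmoid mask per class
    x = layers.Conv2DTranspose(256, 2, strides=2, activation="relu")(x)
    masks = layers.Conv2D(num_classes, 1, activation="sigmoid")(x)
    return models.Model(roi, masks, name="mask_head_sketch")
```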

The model description after compiling the Keras model can be found at [model](./assets/model.png).
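
For reference, that diagram can be regenerated with Keras' plotting utility (assuming `model` is the matterport `MaskRCNN` wrapper, whose underlying Keras model is exposed as `model.keras_model`; the output path is illustrative):

```python
from keras.utils import plot_model

# `model` is assumed to be the MaskRCNN object built elsewhere in the training script.
plot_model(model.keras_model, to_file="./assets/model.png", show_shapes=True)
```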

## Training

#### Pre-processing Data
The provided CARLA dataset contains train and test images in PNG format, where the class labels of the ground-truth mask are encoded in its red channel.
```python
def process_labels(self, labels):

    # label 6 - lane line pixels
    # label 7 - lane pixels
    # label 10 - car pixels

    labels_new = np.zeros(labels.shape)
    labels_new_car = np.zeros(labels.shape)

    lane_line_idx = (labels == 6).nonzero()
    lane_idx = (labels == 7).nonzero()
    car_pixels = (labels == 10).nonzero()

    # remove car hood pixels (rows >= 495 belong to the ego vehicle's hood)
    car_hood_idx = (car_pixels[0] >= 495).nonzero()[0]
    car_hood_pixels = (car_pixels[0][car_hood_idx],
                       car_pixels[1][car_hood_idx])

    # lane lines and lanes both map to the road channel
    labels_new[lane_line_idx] = 1
    labels_new[lane_idx] = 1

    labels_new_car[car_pixels] = 1
    labels_new_car[car_hood_pixels] = 0

    # (H, W, 2) output: channel 0 = road, channel 1 = car
    return np.dstack([labels_new, labels_new_car])
```
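
A possible usage sketch follows; the file path, the `PIL` loader, and the `dataset` object that owns `process_labels` are assumptions for illustration, not the repository's actual names:

```python
import numpy as np
from PIL import Image

# Illustrative path; the actual CARLA ground-truth layout may differ.
gt = np.array(Image.open("Train/CameraSeg/0.png"))
red_channel = gt[:, :, 0]                      # class labels live in the red channel

masks = dataset.process_labels(red_channel)    # hypothetical dataset object
road_mask, car_mask = masks[..., 0], masks[..., 1]
```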

#### MaskRCNN Configuration
For this application, ResNet-50 was used as the backbone by setting `BACKBONE = "resnet50"` in the config shown below.

```python
class LyftChallengeConfig(Config):
    """Configuration for training on the CARLA dataset of the
    Lyft Perception Challenge. Derives from the base Config class
    and overrides values specific to this dataset.
    """
    # Give the configuration a recognizable name
    NAME = "shapes"

    # Backbone network architecture
    # Supported values are: resnet50, resnet101
    # BACKBONE = "resnet101"
    BACKBONE = "resnet50"

    # Train on 1 GPU with 1 image per GPU.
    # Batch size is 1 (GPUs * images/GPU).
    GPU_COUNT = 1
    IMAGES_PER_GPU = 1

    # Number of classes (including background)
    NUM_CLASSES = 1 + 2  # background + road + car

    # Set the limits of the small side and the large side;
    # together they determine the image shape.
    IMAGE_MIN_DIM = 128
    IMAGE_MAX_DIM = 1024

    # Anchor side lengths (in pixels) for the region proposal network
    RPN_ANCHOR_SCALES = (8, 16, 32, 64, 128)

    # Number of ROIs per image to feed to the classifier/mask heads.
    # Aim to allow ROI sampling to pick 33% positive ROIs.
    TRAIN_ROIS_PER_IMAGE = 32

    # Keep epochs short since the dataset is small
    STEPS_PER_EPOCH = 100

    # Use small validation steps since the epoch is small
    VALIDATION_STEPS = 5
```
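
With this configuration, a training model can be instantiated along the lines of the matterport API (the `model_dir` path below is illustrative):

```python
import mrcnn.model as modellib

config = LyftChallengeConfig()
config.display()

# Build Mask R-CNN in training mode; checkpoints and logs are written to model_dir.
model = modellib.MaskRCNN(mode="training", config=config, model_dir="./logs")
```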


#### Data Augmentation
As only about 1,000 training samples were provided, data augmentation was necessary to avoid overfitting. [imgaug](https://imgaug.readthedocs.io/en/latest/) is a Python module that came in handy for augmenting the dataset. The `train` function of MaskRCNN takes an augmentation object and augments the images in its data generator according to the rules defined in that object.

```python
from imgaug import augmenters as iaa

# Apply a random subset (anywhere from none to all) of the augmenters below to each image
augmentation = iaa.SomeOf((0, None), [
    iaa.Fliplr(0.5),                         # horizontal flip with 50% probability
    iaa.Flipud(0.5),                         # vertical flip with 50% probability
    iaa.OneOf([iaa.Affine(rotate=45),
               iaa.Affine(rotate=90),
               iaa.Affine(rotate=135)]),     # one of three fixed rotations
    iaa.Multiply((0.8, 1.5)),                # brightness change
    iaa.GaussianBlur(sigma=(0.0, 5.0)),      # gaussian blur
    iaa.Affine(scale=(0.5, 1.5)),            # uniform scaling
    iaa.Affine(scale={"x": (0.5, 1.5), "y": (0.5, 1.5)}),  # independent x/y scaling
])
```

Examples of augmentation are given in the [Mask_RCNN repository](https://github.com/matterport/Mask_RCNN).
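
The augmentation object is then handed to the training call; a sketch along the lines of the matterport `train` API (the dataset variables, epoch count, and layer selection are illustrative, not taken from the repository):

```python
# `model`, `dataset_train`, and `dataset_val` are assumed to be built elsewhere.
model.train(dataset_train, dataset_val,
            learning_rate=config.LEARNING_RATE,
            epochs=30,
            layers="heads",
            augmentation=augmentation)
```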

#### Training Loss

