gif added, README updated
AmeyaWagh committed Jun 3, 2018
1 parent 8469400 commit 0fb9e99
Showing 4 changed files with 21 additions and 18 deletions.
3 changes: 2 additions & 1 deletion .gitignore
@@ -1,4 +1,5 @@
Train/
*.h5
logs/
*.zip
*.zip
*.mp4
13 changes: 12 additions & 1 deletion README.md
@@ -13,6 +13,17 @@ The [lyft Perception challenge](https://www.udacity.com/lyft-challenge) in assoc
## Approach
Although this was a segmentation problem and did not require instance segmentation, I went ahead with [MASK-RCNN](https://arxiv.org/pdf/1703.06870.pdf) because it was the state-of-the-art algorithm in image segmentation and I had always wanted to learn about it. Also, I started on the *28th*, just after finishing my first term, so transfer learning was my only shot. :sweat:


*Click to watch on YouTube*


<div style="text-align:center;">
<a href="https://www.youtube.com/watch?v=Q56fzNjmYKc">
<img src="./assets/final.gif" width="640" height="480">
</a>
</div>


#### Mask-RCNN (A brief overview)

Mask-RCNN, implemented in [Detectron](https://github.com/facebookresearch/Detectron), the object detection research platform developed by Facebook Research, is mainly a modification of Faster RCNN with a segmentation branch parallel to the class predictor and bounding box regressor. A vanilla ResNet is used in an FPN setting as the backbone to Faster RCNN so that features can be extracted at multiple levels of the feature pyramid.
@@ -66,7 +77,7 @@ def process_labels(self,labels):
```

#### MaskRCNN Configuration
For this application Resnet-50 was used by setting `BACKBONE = "resnet50"` in config.
For this application, ResNet-50 was used by setting `BACKBONE = "resnet50"` in the config, along with `NUM_CLASSES = 1+2` (background plus the 2 classes, car and road) and `IMAGE_MAX_DIM = 1024`, as the input images are 800x600.
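
As a rough illustration of the `IMAGE_MAX_DIM` choice: assuming the `mrcnn` package here is the Matterport Mask R-CNN implementation (suggested by the imports in `inference.py`, not stated in this diff), the model builder requires the input height and width to be divisible by 2 six times, i.e. by 64. The helpers `is_valid_input_dim` and `next_valid_dim` below are hypothetical and only check that constraint; they are not part of the repository or of `mrcnn`.

```python
# Hypothetical helpers, not part of the repository or of mrcnn.
# Assumption: the network downsamples 6 times, so each side must be a multiple of 2**6.
DOWNSCALE_STEPS = 6


def is_valid_input_dim(dim, steps=DOWNSCALE_STEPS):
    """Return True if `dim` can be halved `steps` times without a remainder."""
    return dim > 0 and dim % (2 ** steps) == 0


def next_valid_dim(dim, steps=DOWNSCALE_STEPS):
    """Return the smallest multiple of 2**steps that is >= dim."""
    factor = 2 ** steps
    return ((dim + factor - 1) // factor) * factor


if __name__ == "__main__":
    # Raw frame sides (800, 600) versus the configured IMAGE_MAX_DIM (1024).
    for candidate in (600, 800, 1024):
        print(candidate, is_valid_input_dim(candidate))
```

Under that assumption, 1024 passes the check while the raw frame sides 800 and 600 do not, which is consistent with picking a larger configured dimension than the frames themselves.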

```python
class LyftChallengeConfig(Config):
Binary file added assets/final.gif
23 changes: 7 additions & 16 deletions inference.py
@@ -27,6 +27,7 @@
import mrcnn.model as modellib
from mrcnn import visualize
from mrcnn.model import log
from moviepy.editor import VideoFileClip



@@ -165,21 +166,11 @@ def segment_images_batch(original_images):
except KeyboardInterrupt as e:
    break

# video_len = video.shape[0]
# offset=1
# a = time.time()
# for idx in range(0,video_len,4):
# # if video_len-idx >4:
# # offset = 4
# # else:
# # offset = video_len-idx
# # model.config.IMAGES_PER_GPU = offset
# rgb_frames = video[idx:idx+offset,:,:,:]
# print(rgb_frames.shape)
# final_img = segment_images_batch(rgb_frames)
# b=time.time()
# _secs = (b-a)%60

# print("FPS:",video_len/_secs)
def process_video(INPUT_FILE,OUTPUT_FILE):
    video = VideoFileClip(INPUT_FILE)
    processed_video = video.fl_image(segment_images)
    processed_video.write_videofile(OUTPUT_FILE, audio=False)

process_video('./test_video.mp4','./output_video.mp4')

exit()
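
For context on `fl_image`: it applies a function to every frame of the clip, passing each frame as an RGB NumPy array of shape (height, width, 3) and expecting an array of the same kind back. `segment_images` is not shown in this diff, so the sketch below uses a hypothetical stand-in, `process_frame`, that blends a placeholder mask into the frame instead of running the model. The names `overlay_mask` and `process_frame`, the colour and the `alpha` value are all assumptions made for illustration.

```python
import numpy as np


def overlay_mask(frame, mask, color=(0, 255, 0), alpha=0.5):
    """Blend `color` into `frame` wherever `mask` is True.

    frame: uint8 array of shape (H, W, 3), the kind of frame fl_image passes in.
    mask:  boolean array of shape (H, W); here a placeholder for model output.
    """
    out = frame.astype(np.float32)
    # Weighted average of the original pixel and the overlay colour.
    out[mask] = (1.0 - alpha) * out[mask] + alpha * np.array(color, dtype=np.float32)
    return out.astype(np.uint8)


def process_frame(frame):
    """Hypothetical per-frame callback with the signature fl_image expects."""
    # Placeholder mask: mark the bottom half of the frame rather than run a model.
    mask = np.zeros(frame.shape[:2], dtype=bool)
    mask[frame.shape[0] // 2:, :] = True
    return overlay_mask(frame, mask)


if __name__ == "__main__":
    dummy = np.zeros((600, 800, 3), dtype=np.uint8)  # one 800x600 RGB frame
    result = process_frame(dummy)
    print(result.shape, result.dtype)
```

A function with this signature could be passed as `video.fl_image(process_frame)`, which is the role `segment_images` plays in `process_video`.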
