ADVSCD: Anomaly Driven Video Summarization Using Change Detection

Objective

Perform video summarization on lengthy CCTV footage focused on public crimes.

Dataset Used:

UCF-Crime Dataset : https://www.crcv.ucf.edu/projects/real-world/

Highlights

  • Efficiently summarizes long videos around the crime anomalies in the UCF-Crime dataset when tested on a small scale
  • Introduces a novel quantitative evaluation of video summarization
  • Produces a compact output video

The figure below shows the algorithm's effectiveness on a few randomly chosen videos from the UCF-Crime dataset.

image

Proposed Architecture

The final architecture followed in the project is shown below:

image

Stages of the Architecture

1. CDM (Change Detection Module)

This module identifies the frames in which the scene changes significantly and passes them to the next stage of the architecture.

The figure below illustrates the operations happening in the CDM module:

image
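As a rough illustration of the frame-differencing idea behind this stage, the sketch below flags a frame when its mean absolute pixel difference from the previous frame exceeds a threshold. It is not the repository's implementation: the function name `changed_frame_indices`, the threshold value, and the synthetic frames are all assumptions made for the example.

```python
import numpy as np


def changed_frame_indices(frames, threshold=10.0):
    """Return indices of frames that differ strongly from their predecessor.

    `frames` is a sequence of equally sized grayscale arrays. `threshold`
    is a hypothetical mean-absolute-difference cutoff, not a project value.
    """
    changed = []
    for i in range(1, len(frames)):
        # Cast to a signed type so uint8 subtraction cannot wrap around.
        diff = np.abs(frames[i].astype(np.int16) - frames[i - 1].astype(np.int16))
        if diff.mean() > threshold:
            changed.append(i)
    return changed


# Synthetic clip: four static frames, then a bright block appears and stays.
frames = [np.zeros((8, 8), dtype=np.uint8) for _ in range(6)]
frames[4][2:6, 2:6] = 255
frames[5][2:6, 2:6] = 255

print(changed_frame_indices(frames))  # [4]
```

Only frame 4 is reported, because frame 5 is identical to frame 4. A real pipeline would read frames from the video file and would probably need a threshold tuned per camera.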

2. ADM (Anomaly Detection Module)

This module identifies the crime anomalies happening in the scenes and passes the potential anomalies to the next stage of the architecture.

The figure below shows the architecture of the adopted pre-trained LSTM-based autoencoder, fine-tuned on the UCF-Crime dataset:

image
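An autoencoder-based detector usually turns per-frame reconstruction error into a score, in the spirit of the reconstruction score credited to Hasan et al. in the table further down. The sketch below assumes the errors have already been computed by the model. The min-max normalisation, the `min_regularity` cutoff, and the sample numbers are illustrative assumptions rather than values taken from this project.

```python
def regularity_scores(errors):
    """Min-max normalise reconstruction errors into regularity scores in [0, 1].

    A score near 1 means the frame was reconstructed well (regular);
    a score near 0 means it was reconstructed poorly (potential anomaly).
    """
    lo, hi = min(errors), max(errors)
    if hi == lo:
        # All frames are equally well reconstructed; treat them as regular.
        return [1.0] * len(errors)
    return [1.0 - (e - lo) / (hi - lo) for e in errors]


def flag_anomalies(errors, min_regularity=0.5):
    """Return indices of frames whose regularity falls below the cutoff."""
    scores = regularity_scores(errors)
    return [i for i, s in enumerate(scores) if s < min_regularity]


# Made-up per-frame reconstruction errors; frames 3 and 4 stand out.
errors = [0.10, 0.12, 0.11, 0.45, 0.50, 0.13]
print(flag_anomalies(errors))  # [3, 4]
```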

3. CBT (Clustering-Based Technique)

K-means clustering is performed on the selected anomalies to separate out the critical anomalies without manual intervention.

image
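The README does not say which features are clustered, so the sketch below makes two assumptions: each candidate anomaly is represented by a single scalar score, and the cluster with the highest centroid is treated as "critical". It runs plain Lloyd's-algorithm K-means on scalars with a deterministic initialisation and uses k = 3 to match the suggested NUMBER_OF_CLUSTERS. The helper name `kmeans_1d` and the scores are hypothetical.

```python
def kmeans_1d(values, k, iterations=20):
    """Lloyd's algorithm on scalar values; returns (centroids, labels)."""
    ordered = sorted(values)
    # Deterministic start: spread the initial centroids across the sorted data.
    centroids = [ordered[(2 * j + 1) * len(ordered) // (2 * k)] for j in range(k)]
    labels = [0] * len(values)
    for _ in range(iterations):
        # Assignment step: attach each value to its nearest centroid.
        labels = [min(range(k), key=lambda j: abs(v - centroids[j])) for v in values]
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [v for v, lab in zip(values, labels) if lab == j]
            if members:
                centroids[j] = sum(members) / len(members)
    return centroids, labels


# Made-up anomaly scores for seven candidate frames.
scores = [0.05, 0.07, 0.06, 0.40, 0.42, 0.90, 0.95]
centroids, labels = kmeans_1d(scores, k=3)

# Assumption: the cluster with the largest centroid holds the critical anomalies.
critical_cluster = max(range(3), key=lambda j: centroids[j])
critical = [i for i, lab in enumerate(labels) if lab == critical_cluster]
print(labels)    # [0, 0, 0, 1, 1, 2, 2]
print(critical)  # [5, 6]
```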

4. VSM (Video Summarization Module)

This module builds the summary from the selected frames, combining a subset of the original video's frames into a compact, meaningful output video centred on the anomaly.
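One common way to turn selected frames into a watchable skim is to keep a little context around each one and merge ranges that overlap. The sketch below shows that idea only. The padding size, the function name `build_segments`, and the frame indices are assumptions, and the repository may choose its frames differently.

```python
def build_segments(frame_indices, padding, total_frames):
    """Expand each selected frame by `padding` frames and merge overlaps.

    Returns a list of inclusive (start, end) frame ranges to copy into
    the summary video.
    """
    segments = []
    for idx in sorted(frame_indices):
        start = max(0, idx - padding)
        end = min(total_frames - 1, idx + padding)
        if segments and start <= segments[-1][1] + 1:
            # Overlaps or touches the previous range: extend it.
            segments[-1][1] = max(segments[-1][1], end)
        else:
            segments.append([start, end])
    return [tuple(s) for s in segments]


print(build_segments([100, 103, 400], padding=5, total_frames=500))
# [(95, 108), (395, 405)]
```

Frames 100 and 103 collapse into one segment because their padded ranges overlap, which avoids repeating frames in the output.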

State-of-the-art work adopted as-is or modified for this project

| Literature | Aim | Adopted Idea | Part of the Proposed Architecture |
| --- | --- | --- | --- |
| Thapa et al. | Moving Object Detection | Frame Differencing/Summing | Change Detection |
| Chong and Tay | Anomaly Detection in Videos | Spatiotemporal Autoencoder, ConvLSTM | Anomaly Detection Model |
| Hasan et al. | Learning Regularity in Videos | Use of Autoencoder, Reconstruction Score | Anomaly Detection |
| Jadon and Jasim | Video Summarization | Video Skimming | Video Summarization |

Evaluation

This project evaluates the final architecture using SSIM (Structural Similarity Index Measure), IP (Inclusion Percentile of Anomalies), a Compactness Measure and the G-mean value. A multi-objective evaluation technique, the Pareto front, is also used. This evaluation has not been incorporated into the repository but will be explained in a forthcoming paper.
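Since the evaluation code is not in the repository, the sketch below only illustrates the two generic building blocks named above: a geometric mean and a Pareto front. The metrics that enter the G-mean and the objectives that are traded off are not specified here, so the (inclusion, compactness) pairs are hypothetical and both objectives are assumed to be maximised.

```python
import math


def g_mean(values):
    """Geometric mean of a list of positive scores."""
    return math.prod(values) ** (1.0 / len(values))


def pareto_front(points):
    """Return the points not dominated by any other point.

    Every objective is assumed to be maximised. A point q dominates p
    when q is at least as good everywhere and strictly better somewhere.
    """
    front = []
    for p in points:
        dominated = any(
            all(qi >= pi for qi, pi in zip(q, p))
            and any(qi > pi for qi, pi in zip(q, p))
            for q in points
        )
        if not dominated:
            front.append(p)
    return front


# Hypothetical (inclusion, compactness) scores for four parameter settings.
candidates = [(0.90, 0.60), (0.80, 0.80), (0.70, 0.70), (0.95, 0.40)]
print(pareto_front(candidates))  # [(0.9, 0.6), (0.8, 0.8), (0.95, 0.4)]
```

The setting (0.70, 0.70) drops out because (0.80, 0.80) beats it on both objectives. The remaining three represent genuine trade-offs.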

How to Run the project

  • Clone the repository
  • Create a virtual environment with conda from the environment.yml file: conda env create -f environment.yml
  • Set up config.py by providing the correct paths to the relevant files. The suggested value of NUMBER_OF_CLUSTERS is 3 for the UCF-Crime dataset.
  • The entry point is main_cluster_based.py, so running python main_cluster_based.py performs video summarization.
  • The output video will be in the directory specified as TO_SAVE_DIRECTORY in the config.py file.

File Structure

Models

All the models associated with this project are in the directory /models. The final fine-tuned model selected for the architecture after several tests is auto_encoder5.hdf5.

Notebooks

All the associated notebooks are in the directory /notebooks. Model training, visualization and unit testing were performed in the respective notebooks. The notebooks have not been cleaned up, however, so their readability is limited.

Utils

The functions for each module and the frequently reused helper functions are written in their own files and stored in the /utils directory.

Remarks and Open Issues

Initial Approach and its future scope

  • The file main.py contains an approach that uses potentially non-anomalous frames to derive a threshold for flagging frames as anomalous.
  • In the future, this method could be extended with feedback loops.
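The exact rule used in main.py is not described here. One simple way to derive a threshold from presumed-normal frames is the mean of their errors plus a few standard deviations, sketched below. The factor `k`, the function name `anomaly_threshold`, and the error values are assumptions made for illustration.

```python
import statistics


def anomaly_threshold(normal_errors, k=3.0):
    """Mean plus k standard deviations of errors on presumed-normal frames."""
    return statistics.mean(normal_errors) + k * statistics.pstdev(normal_errors)


# Made-up reconstruction errors from frames assumed to be non-anomalous.
normal_errors = [0.10, 0.12, 0.11, 0.09, 0.13]
threshold = anomaly_threshold(normal_errors)

# Flag new frames whose error exceeds the learned threshold.
flagged = [e for e in [0.11, 0.30, 0.14] if e > threshold]
print(flagged)  # [0.3]
```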

Limitations

  • This architecture does not perform well on subtle human crime anomalies such as pickpocketing and shoplifting.
  • This could be addressed by training the model on more videos and improving the architecture.

Acknowledgements