The Action Recognition in the Dark (ARID) dataset targets human action recognition under challenging lighting conditions such as low light and darkness. This project uses ARID v1.5 to train a model that recognizes human actions under these conditions.
The ARID dataset consists of video clips of 11 distinct human actions performed in a variety of indoor and outdoor scenes under varying lighting conditions. More details about the ARID v1.5 dataset are available on the ARID project page.
From each video, 20 frames are extracted; these per-video frame sequences serve as the primary dataset elements.
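A minimal sketch of this extraction step using OpenCV. Uniform spacing across the clip and the 64x64 target size are assumptions, since the source does not specify the sampling strategy or resolution:

```python
import cv2
import numpy as np

def extract_frames(video_path, num_frames=20, size=(64, 64)):
    """Extract a fixed number of frames from a video.
    Uniform spacing and the frame size are illustrative assumptions."""
    cap = cv2.VideoCapture(video_path)
    total = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
    # Indices of the frames to sample across the whole clip.
    indices = np.linspace(0, max(total - 1, 0), num_frames).astype(int)
    frames = []
    for idx in indices:
        cap.set(cv2.CAP_PROP_POS_FRAMES, int(idx))
        ok, frame = cap.read()
        if not ok:
            break
        frames.append(cv2.resize(frame, size))
    cap.release()
    return frames
```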
To improve visibility, reduce noise, and standardize frames, histogram equalization is applied to each frame.
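One common way to apply histogram equalization to color frames is to equalize only the luma channel of the YCrCb representation, which brightens dark footage without shifting colors. Whether the project uses this global variant or an adaptive one such as CLAHE is an assumption; this is a minimal sketch:

```python
import cv2

def equalize_frame(frame_bgr):
    """Histogram-equalize a BGR frame via its luma (Y) channel.
    Global equalization is assumed; CLAHE would be a drop-in alternative."""
    ycrcb = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2YCrCb)
    y, cr, cb = cv2.split(ycrcb)
    y = cv2.equalizeHist(y)  # stretch the luma histogram for visibility
    return cv2.cvtColor(cv2.merge((y, cr, cb)), cv2.COLOR_YCrCb2BGR)
```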
Key-point information is extracted from each frame using the YOLOv8-pose key-point detection model.
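A sketch of the key-point extraction with the `ultralytics` package. The `yolov8n-pose.pt` checkpoint is an assumption (any YOLOv8 pose variant exposes the same API), and taking only the first detected person is an illustrative simplification:

```python
from ultralytics import YOLO

# Pretrained YOLOv8 pose model; the "n" (nano) variant is assumed here.
pose_model = YOLO("yolov8n-pose.pt")

def extract_keypoints(frame_bgr):
    """Return normalized (x, y) key points for the first detected person,
    or None if no person is found in the frame."""
    results = pose_model(frame_bgr, verbose=False)
    kpts = results[0].keypoints
    if kpts is None or kpts.xyn.shape[0] == 0:
        return None
    # 17 COCO key points per person, normalized to [0, 1] by image size.
    return kpts.xyn[0].cpu().numpy()
```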
The model architecture follows the Long-term Recurrent Convolutional Networks (LRCN) approach, in which a per-frame CNN feeds a recurrent layer that models the temporal sequence. The detailed code for the architecture can be found above.
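For orientation, a minimal Keras sketch of an LRCN of this kind. The layer counts, filter sizes, and LSTM width are illustrative assumptions, not the project's exact configuration:

```python
import tensorflow as tf
from tensorflow.keras import layers, models

NUM_FRAMES, HEIGHT, WIDTH, CHANNELS = 20, 64, 64, 3  # frame size assumed
NUM_CLASSES = 11  # 11 ARID action classes

def build_lrcn():
    """Per-frame CNN wrapped in TimeDistributed, then an LSTM over the
    20-frame sequence, then a softmax over the 11 action classes."""
    return models.Sequential([
        layers.Input(shape=(NUM_FRAMES, HEIGHT, WIDTH, CHANNELS)),
        layers.TimeDistributed(layers.Conv2D(16, 3, padding="same", activation="relu")),
        layers.TimeDistributed(layers.MaxPooling2D(4)),
        layers.TimeDistributed(layers.Conv2D(32, 3, padding="same", activation="relu")),
        layers.TimeDistributed(layers.MaxPooling2D(4)),
        layers.TimeDistributed(layers.Conv2D(64, 3, padding="same", activation="relu")),
        layers.TimeDistributed(layers.MaxPooling2D(2)),
        layers.TimeDistributed(layers.Flatten()),
        layers.LSTM(32),
        layers.Dense(NUM_CLASSES, activation="softmax"),
    ])
```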
The model is compiled with categorical cross-entropy loss and optimized with the Adam optimizer. The training history is saved for later plotting and evaluation.
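A sketch of that compile-and-train step. `train_x`, `train_y`, `val_x`, and `val_y` are placeholder names for the prepared frame sequences and one-hot labels, and the epoch and batch-size values are assumptions:

```python
import pickle

model = build_lrcn()  # LRCN from the sketch above
model.compile(
    loss="categorical_crossentropy",
    optimizer="adam",
    metrics=["accuracy"],
)

# Train on the prepared sequences; hyperparameters here are illustrative.
history = model.fit(
    train_x, train_y,
    validation_data=(val_x, val_y),
    epochs=50,
    batch_size=4,
)

# Persist the training history for later plotting and evaluation.
with open("training_history.pkl", "wb") as f:
    pickle.dump(history.history, f)
```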
The model is evaluated on the test dataset, and a confusion matrix is generated to analyze its per-class performance.
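A sketch of that evaluation step using scikit-learn; `test_x` and `test_y` are placeholders for the held-out frame sequences and their one-hot labels:

```python
import numpy as np
from sklearn.metrics import confusion_matrix, classification_report

# Predict class probabilities, then reduce both predictions and
# one-hot ground truth to class indices.
probs = model.predict(test_x)
y_pred = np.argmax(probs, axis=1)
y_true = np.argmax(test_y, axis=1)

print(confusion_matrix(y_true, y_pred))
print(classification_report(y_true, y_pred))
```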
This project built a model that detects and classifies human actions in challenging lighting conditions using the ARID v1.5 dataset. The LRCN approach, combined with the preprocessing steps described above (frame sampling, histogram equalization, and pose key-point extraction), yielded improved accuracy and overall performance.