This repository contains the code, model, and deployment configs for the paper "Sign-to-Speech Model for Sign Language Understanding: A Case Study of Nigerian Sign Language", which first appeared at the NeurIPS 2021 workshop on Machine Learning for the Developing World (ML4D) and subsequently as a demo paper at the Thirty-First International Joint Conference on Artificial Intelligence (IJCAI-22).
Our dataset is a novel Nigerian Sign Language dataset comprising 5000 images of 137 sign words and phrases, including the letters of the alphabet. The data collectors were 20+ individuals, comprising a TV sign language broadcaster and students and teachers from 2 special education schools in Nigeria. The dataset is hosted on Lanfrica Records.
- Clone the repository and run `pip install -r requirements.txt`.
- If you are on Linux, TTS engines might not be pre-installed on your platform. Use the command below to install them.
sudo apt-get update && sudo apt-get install espeak ffmpeg libespeak1
- Install the Python dependencies.
pip install -r requirements.txt
- From the project root directory, start the DeepStack custom model server by running the command below:
sudo docker run -v ~/path/to/project_folder/savedmodels_configs/yolo_model/weights:/modelstore/detection -p 88:5000 deepquestai/deepstack
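Before running the detection scripts, it can help to confirm that something is actually accepting connections on the mapped port. The snippet below is a minimal sketch using only Python's standard library. The helper name `is_port_open` is hypothetical and not part of this repository, and the sketch assumes the server runs on `localhost` with the `88:5000` port mapping shown above.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds (hypothetical helper)."""
    try:
        # create_connection raises OSError when nothing accepts the connection.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Port 88 matches the `-p 88:5000` mapping in the docker command above.
    if is_port_open("localhost", 88):
        print("A server is accepting connections on port 88.")
    else:
        print("Nothing is listening on port 88; is the container running?")
```

This only checks that the port is open. It does not verify that the custom model itself loaded correctly.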
- Run the image detection script on an image:
python image_detection.py image_filename.file_extension
The default port number is 88. If the DeepStack server is running on a different port, specify it like this instead:
python image_detection.py image_filename.file_extension --deepstack-port port_number
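To illustrate how a command line like the one above could be handled, here is a minimal sketch using `argparse`. It is an assumption about the general shape of such a script, not the actual contents of `image_detection.py`. The function names `build_parser` and `custom_model_url` and the model name `my_model` are hypothetical; the `/v1/vision/custom/<model name>` path is the endpoint DeepStack uses for custom models.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the documented command line (hypothetical sketch)."""
    parser = argparse.ArgumentParser(description="Detect a sign in an image.")
    parser.add_argument("image", help="path to the input image")
    parser.add_argument(
        "--deepstack-port",
        type=int,
        default=88,  # default port stated in this README
        help="port on which the DeepStack server is running",
    )
    return parser


def custom_model_url(port: int, model_name: str) -> str:
    """Return the DeepStack custom model endpoint on localhost for the given port."""
    return f"http://localhost:{port}/v1/vision/custom/{model_name}"


if __name__ == "__main__":
    # Parse an explicit argument list so the sketch runs without real CLI input.
    args = build_parser().parse_args(["hello.jpg", "--deepstack-port", "5000"])
    print(args.image, args.deepstack_port)
    # "my_model" is a placeholder; the real name depends on the weights file.
    print(custom_model_url(args.deepstack_port, "my_model"))
```

Note that `argparse` converts the `--deepstack-port` flag into the attribute `deepstack_port`.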
Running the script creates two new files in your project root directory:
- a copy of the image with a bounding box around the detected sign and its meaning above the box,
- an audio file speaking the meaning of the detected sign.
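Both outputs are derived from the server's detection response. DeepStack returns JSON with a `success` flag and a `predictions` list, where each prediction carries `label`, `confidence`, `x_min`, `y_min`, `x_max`, and `y_max`. The sketch below shows one way to pick the prediction to annotate and speak. The helper `best_prediction`, the 0.5 threshold, and the sample response are all made up for illustration and are not taken from this repository.

```python
from typing import Optional


def best_prediction(response: dict, min_confidence: float = 0.5) -> Optional[dict]:
    """Return the highest-confidence prediction, or None if none qualifies."""
    if not response.get("success"):
        return None
    candidates = [
        p for p in response.get("predictions", [])
        if p["confidence"] >= min_confidence
    ]
    if not candidates:
        return None
    return max(candidates, key=lambda p: p["confidence"])


if __name__ == "__main__":
    # Made-up response in the DeepStack format; labels are illustrative only.
    sample = {
        "success": True,
        "predictions": [
            {"label": "hello", "confidence": 0.91,
             "x_min": 40, "y_min": 60, "x_max": 220, "y_max": 300},
            {"label": "thank you", "confidence": 0.42,
             "x_min": 10, "y_min": 15, "x_max": 90, "y_max": 120},
        ],
    }
    best = best_prediction(sample)
    if best is not None:
        # The box corners locate the rectangle; the label is the text to draw and speak.
        box = (best["x_min"], best["y_min"], best["x_max"], best["y_max"])
        print(best["label"], box)
```

Drawing the box and synthesizing the audio are left out here, since they depend on the imaging and TTS libraries the script uses.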
- Run the live feed detection script:
python livefeed_detection.py
The default port number is 88. If the DeepStack server is running on a different port, specify it like this instead:
python livefeed_detection.py --deepstack-port port_number
This starts the webcam and automatically detects any sign language words in view of the camera,
displaying each sign's meaning and playing its speech equivalent immediately through the PC's audio system. Press **q**
to quit the live video.
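A live feed can yield a detection on nearly every frame, so speaking each one would repeat the same word many times per second. One possible way to handle this is to speak only when the detected label changes. The sketch below illustrates that idea. It is an assumption about a reasonable design, not a description of what `livefeed_detection.py` does, and the class name `SpeakOnChange` is hypothetical.

```python
from typing import Optional


class SpeakOnChange:
    """Decide whether a detected label should be spoken (hypothetical helper)."""

    def __init__(self) -> None:
        self._last: Optional[str] = None

    def should_speak(self, label: Optional[str]) -> bool:
        if label is None:
            # Nothing detected: reset so the same sign can be spoken again later.
            self._last = None
            return False
        if label == self._last:
            # Same sign as the previous frame: stay silent.
            return False
        self._last = label
        return True


if __name__ == "__main__":
    gate = SpeakOnChange()
    # Made-up per-frame labels; None stands for a frame with no detection.
    frames = ["hello", "hello", None, "hello", "thank you"]
    spoken = [label for label in frames if gate.should_speak(label)]
    print(spoken)
```

With this gate, holding a sign steady produces one utterance, while lowering the hand and signing again produces a new one.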
@inproceedings{ijcai2022p855,
title = {Sign-to-Speech Model for Sign Language Understanding: A Case Study of Nigerian Sign Language},
author = {Kolawole, Steven and Osakuade, Opeyemi and Saxena, Nayan and Olorisade, Babatunde Kazeem},
booktitle = {Proceedings of the Thirty-First International Joint Conference on
Artificial Intelligence, {IJCAI-22}},
publisher = {International Joint Conferences on Artificial Intelligence Organization},
editor = {Luc De Raedt},
pages = {5924--5927},
year = {2022},
month = {7},
note = {Demo Track},
doi = {10.24963/ijcai.2022/855},
url = {https://doi.org/10.24963/ijcai.2022/855},
}