Inspired by Google Home and Amazon Alexa, this project set out to build a more customizable voice assistant that behaves like a music player: it plays a song the user asks for, stores favorites in a playlist, and plays the playlist either in order or at random.
To convert the user's speech into text, the project uses the speech_recognition library. The program selects the microphone, listens for audio input, and attempts to transcribe the captured audio using Google's speech recognition service. If the conversion fails, the program reports that an error has occurred.
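The listening step described above could look roughly like the following sketch. It assumes the SpeechRecognition package (imported as speech_recognition) and PyAudio are installed. The function name listen_once, the lowercasing of the result, and the printed messages are illustrative and not taken from the project.

```python
def listen_once():
    """Capture one phrase from the default microphone.

    Returns the transcribed text in lowercase, or None if transcription fails.
    """
    # Imported lazily so the module loads even without the audio stack installed.
    import speech_recognition as sr

    recognizer = sr.Recognizer()
    with sr.Microphone() as source:
        # Sample the room briefly so quiet speech is not lost in background noise.
        recognizer.adjust_for_ambient_noise(source, duration=0.5)
        audio = recognizer.listen(source)

    try:
        # Sends the audio to Google's web speech service and returns plain text.
        return recognizer.recognize_google(audio).lower()
    except sr.UnknownValueError:
        print("Error: could not understand the audio.")
    except sr.RequestError as exc:
        print(f"Error: speech service unavailable ({exc}).")
    return None
```

Returning None on failure lets the main loop skip the turn and listen again instead of crashing.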
To respond to user commands, the program uses IBM Watson's Text to Speech library. Note: the API key for this service must be set up on IBM's website. The function takes the program's reply and synthesizes it into a .wav file, which vocalizes the response. The file is played for the user and then deleted.
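The synthesize, play, and delete cycle can be sketched as below. This is a minimal sketch, not the project's code: it assumes the ibm-watson SDK and simpleaudio are installed, that simpleaudio is also what plays the reply (the text only confirms it for music), and that the voice name, the reply.wav path, and all function names are placeholders.

```python
import os


def watson_synthesize(text, path, api_key, service_url,
                      voice="en-US_AllisonV3Voice"):
    """Write `text` as speech to a WAV file with IBM Watson Text to Speech."""
    from ibm_watson import TextToSpeechV1
    from ibm_cloud_sdk_core.authenticators import IAMAuthenticator

    tts = TextToSpeechV1(authenticator=IAMAuthenticator(api_key))
    tts.set_service_url(service_url)
    response = tts.synthesize(text, voice=voice, accept="audio/wav").get_result()
    with open(path, "wb") as wav_file:
        wav_file.write(response.content)


def play_wav(path):
    """Play a WAV file and block until playback finishes."""
    import simpleaudio as sa

    sa.WaveObject.from_wave_file(path).play().wait_done()


def speak(text, synthesize, play=play_wav, path="reply.wav"):
    """Synthesize `text` to `path`, play it, and always delete the file."""
    try:
        synthesize(text, path)
        play(path)
    finally:
        # Clean up even if synthesis or playback raised an error.
        if os.path.exists(path):
            os.remove(path)
```

With real credentials, a call might look like `speak("Hello", lambda t, p: watson_synthesize(t, p, API_KEY, SERVICE_URL))`, where API_KEY and SERVICE_URL come from the IBM Cloud dashboard.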
The Transformers library offers pre-trained models for a variety of natural language processing tasks. This project uses the DialoGPT-medium model, which is designed for generating conversational responses and can produce coherent replies based on user interactions. The chatbot runs in a continuous loop: it listens for user input, processes that input, and generates a response. The recognized text is wrapped in a Conversation object, which is passed to the NLP pipeline. The model's reply is then extracted from the output and trimmed to retain only the relevant text.
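A single reply step might look like the sketch below. Note that it swaps in a different technique from the one described: instead of the Conversation object and pipeline, it calls the model's generate method directly, following the usage shown on the DialoGPT model card, because the conversational pipeline is not available in recent Transformers releases. It assumes transformers and PyTorch are installed; the function names and the max_length value are illustrative.

```python
def load_dialogpt(name="microsoft/DialoGPT-medium"):
    """Download (or load from cache) the tokenizer and model."""
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(name)
    model = AutoModelForCausalLM.from_pretrained(name)
    return tokenizer, model


def generate_reply(text, tokenizer, model):
    """Return the model's reply to a single user utterance."""
    # DialoGPT expects each turn to end with the end-of-sequence token.
    input_ids = tokenizer.encode(text + tokenizer.eos_token, return_tensors="pt")
    output_ids = model.generate(
        input_ids,
        max_length=1000,
        pad_token_id=tokenizer.eos_token_id,
    )
    # The output repeats the prompt, so keep only the newly generated tokens.
    new_tokens = output_ids[0][input_ids.shape[-1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True).strip()
```

Loading the model once with `tokenizer, model = load_dialogpt()` and reusing it inside the loop avoids reloading the weights on every turn.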
Play Music Function:
Once the program recognizes that the user wants to play some audio, it cleans up the input text to isolate the music title. It then builds a YouTube search URL from that title, fetches the search results with the requests library, and uses a regular expression (regex) to retrieve the URL of the first video. The video title is extracted with BeautifulSoup. Next, the program checks whether the track is already downloaded (in WAV format) in its directory. If it is, the program plays the track using simpleaudio. If the track is not found locally, the program downloads it using youtube_dl and ffmpeg and saves it in the specified directory. Special characters are removed from the file name to prevent potential errors.
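The text-handling parts of this flow can be sketched with the standard library alone. The trigger word "play", the regex, and every function name below are assumptions made for illustration; YouTube's page markup also changes over time, so the pattern may need adjusting. The download helper assumes youtube_dl and ffmpeg are installed, and fetching the page with requests and reading the title with BeautifulSoup are left out.

```python
import re
from pathlib import Path
from urllib.parse import urlencode


def extract_title(command, trigger="play"):
    """Drop everything up to and including the (assumed) trigger word."""
    words = command.lower().split()
    if trigger in words:
        words = words[words.index(trigger) + 1:]
    return " ".join(words)


def search_url(title):
    """Build a YouTube search URL with the title safely URL-encoded."""
    return "https://www.youtube.com/results?" + urlencode({"search_query": title})


def first_video_url(html):
    """Return the first watch URL found in the results page, or None."""
    match = re.search(r"watch\?v=([\w-]{11})", html)
    return f"https://www.youtube.com/watch?v={match.group(1)}" if match else None


def safe_filename(video_title):
    """Strip special characters so the title is usable as a file name."""
    cleaned = re.sub(r"[^\w\s-]", "", video_title).strip()
    return re.sub(r"\s+", " ", cleaned) + ".wav"


def local_track(video_title, audio_dir):
    """Return the path of an already-downloaded track, or None."""
    path = Path(audio_dir) / safe_filename(video_title)
    return path if path.exists() else None


def download_wav(url, video_title, audio_dir):
    """Download the audio and convert it to WAV (needs youtube_dl + ffmpeg)."""
    import youtube_dl

    stem = Path(safe_filename(video_title)).stem
    options = {
        "format": "bestaudio/best",
        "outtmpl": str(Path(audio_dir) / f"{stem}.%(ext)s"),
        "postprocessors": [
            {"key": "FFmpegExtractAudio", "preferredcodec": "wav"}
        ],
    }
    with youtube_dl.YoutubeDL(options) as ydl:
        ydl.download([url])
```

For example, `safe_filename("AC/DC - Back In Black (Official Video)")` yields `ACDC - Back In Black Official Video.wav`, which is the name `local_track` then looks for before any download is attempted.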
Remove Function: The remove function looks up the video on YouTube using the same search method as the play function. If it finds a matching URL for the user's query, it retrieves the title and then searches the audio directory for a file with that title. If the file is found, it is deleted.
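The local half of the remove step, once the title has been retrieved, could be sketched like this. The function name and the case-insensitive comparison are assumptions; the project may match file names differently.

```python
from pathlib import Path


def remove_track(title, audio_dir):
    """Delete the WAV file whose name matches `title`.

    Returns True if a file was deleted, False if nothing matched.
    """
    wanted = title.strip().lower()
    for path in Path(audio_dir).glob("*.wav"):
        # Compare without the extension and ignore letter case.
        if path.stem.lower() == wanted:
            path.unlink()
            return True
    return False
```

Returning a boolean gives the assistant something to report back, such as confirming the deletion or saying that the track was not in the library.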
Play Playlist and Play Random Functions: The playlist function iterates through all existing audio files and plays them one at a time. In contrast, the random play function selects one file at random from the directory and plays it.
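Both modes can be sketched as follows. The playback function is passed in as a parameter so the sketch stays independent of any audio library; the function names and the alphabetical ordering of the playlist are assumptions.

```python
import random
from pathlib import Path


def list_tracks(audio_dir):
    """Return all WAV files in the directory, sorted by name."""
    return sorted(Path(audio_dir).glob("*.wav"))


def play_playlist(audio_dir, play):
    """Play every track in order and return the list that was played."""
    tracks = list_tracks(audio_dir)
    for track in tracks:
        play(track)
    return tracks


def play_random(audio_dir, play, rng=random):
    """Play one randomly chosen track, or return None if there are none."""
    tracks = list_tracks(audio_dir)
    if not tracks:
        return None
    track = rng.choice(tracks)
    play(track)
    return track
```

In the real program, `play` would be a function that plays a WAV file and waits for it to finish, for instance one built on simpleaudio.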
Download Arduino IDE from https://docs.arduino.cc/software/ide-v1/tutorials/Windows
➊ Github: https://github.com/AumkarMali/
➋ Youtube: https://www.youtube.com/channel/UC7rhCKur9bF01lV0pNJNkvA