A.L.A.I.N.A. - A Marvel Movie Inspired Voice Assistant for the Open-Source Community
Abstract:
Most of us have voice assistants such as Siri, Alexa, or Google Assistant on our phones, but each of them requires a supporting device, which usually has to be bought. ALAINA is intended to be completely open source and available to everyone, whether a developer who wants to integrate IoT devices with an AI system without buying an Alexa-enabled device, or a user with no such device who wants to integrate their inventions with an AI voice assistant. ALAINA supports most spoken languages and leaves room for users to add custom commands. Anyone with a knowledge of Python can use it, and the source code is available for anyone to modify, which allows complete customization and the freedom to experiment.
ALAINA uses a keyword-based neural network architecture: it analyzes the keywords in a sentence to give the user feedback and perform actions based on the user's commands. This approach was combined with the bag-of-words algorithm to keep code redundancy to a minimum.
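The keyword analysis described above can be illustrated with a minimal sketch. It is not ALAINA's actual code: the command names, the keyword sets, and the helper functions are hypothetical and chosen only to show how a bag-of-words view of a sentence can be matched against per-command keywords.

```python
import re
from collections import Counter

# Hypothetical command table: each command name maps to the keywords that trigger it.
COMMAND_KEYWORDS = {
    "play_music": {"play", "music", "song"},
    "get_weather": {"weather", "temperature", "forecast"},
    "lights_on": {"light", "lights", "lamp"},
}


def tokenize(sentence):
    """Lowercase the sentence and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", sentence.lower())


def bag_of_words(sentence, vocabulary):
    """Return a count vector over `vocabulary`; word order is ignored."""
    counts = Counter(tokenize(sentence))
    return [counts[word] for word in vocabulary]


def match_command(sentence):
    """Pick the command whose keywords overlap most with the sentence."""
    words = set(tokenize(sentence))
    scores = {name: len(words & kws) for name, kws in COMMAND_KEYWORDS.items()}
    best = max(scores, key=scores.get)
    # No keyword matched at all: report that nothing was recognized.
    return best if scores[best] > 0 else None
```

Calling `match_command("Please play some music")` selects the hypothetical `play_music` command because two of its keywords appear in the sentence, while a sentence with no known keyword returns `None`.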
INTRODUCTION:
As we know, over the last few years, from 2015 to date, many ways have emerged to communicate with a computer, smartphone, or IoT device with ease by using voice assistants and speech recognition software. But to run any of that software we need expensive devices that cost at least around Rs. 6000-8000, while boards like the Arduino and Raspberry Pi cost in the range of Rs. 1000-2000 and let a person integrate their own system with another device for free, reducing the cost of prototyping and experimentation. None of these boards come with an AI-based voice assistant, however; it has to be built. ALAINA aims to be the base of operations for such prototype applications and experiments, and it can also be used on Android, Windows, and macOS if the user wants to implement it there.
Problem Statement:
Availability issues:
Alexa and Siri are premium products that can only do what is attached to them through services such as Amazon's web service, and they are not completely open-source designs. That is fair enough, but the ordinary user on a low budget has no access to what can and cannot be done. ALAINA aims to fill that void by enabling any user to take her base code and modify it however they want on their own server, instead of using Amazon's or Apple's cloud service. In the Marvel movies, Friday can be accessed by only two people through voice commands, and the whole organization runs with the help of the AI used by Tony Stark; similarly, we aspired to find the fine line between privacy and open source.
Privacy related issues:
The user can use ALAINA as their own server controller and enable access from all the devices they own, without having to buy an Alexa-enabled device. No one knows how much data mining and leakage of user information is carried out by Amazon, Google, and Facebook. It is an open secret that Facebook takes user data and sells it to corporations so that they can target their advertisements at specific users, half of which is done through the voice assistants and microphones embedded in the devices we buy. Privacy is sacrificed for the sake of convenience. This is what ALAINA intends to avoid: it keeps the user's data with the users themselves, while doing everything the user wants, exactly the way they want it, and without any privacy breach.
Reason to be open source:
ALAINA was made keeping in mind the common user, the naïve user, the basic developer, and the curious individual with no resources, which is why we decided to keep ALAINA open source and made sure that it is available for prototyping. Any user can add their own custom commands to extend its functionality, making ALAINA smarter day by day.
STUDIES AND FINDINGS:
ALAINA was made using basic Python and its vast libraries, together with Keras, TensorFlow, and the NLTK package, the current generation of AI-related packages that help in building neural network architectures. ALAINA itself stands for All Language processing Artificial Intelligence using Neural Networks.
On low-powered processors, the large amount of neural processing running in the background caused many issues: lag in text-to-speech conversion, audio delay, and sometimes no response from the system at all. We therefore changed the system from a completely neural architecture to one that uses the bag-of-words algorithm and keywords to define the commands given to ALAINA.
Other voice assistant systems usually work with mobile devices. ALAINA was kept simple so that any developer can build it for any kind of device, be it an IoT device, Windows, Linux, Android installed on a Raspberry Pi, or a system based on a custom Android ROM.
Most AI voice assistants act on the user's input and simply return the requested data through Google, through a prebuilt app installed on the device, or through whatever they were coded to do. They cannot be extended with more functions because they are owned by their respective companies. With ALAINA we have the freedom to make it act and respond however we want while performing the functions we want. Current voice assistants are among the most loved and widely used systems, but they could do much more if they were open source, which is what we plan to achieve with ALAINA.
The current libraries, i.e., NLTK together with iNLTK (a separate toolkit for Indic languages), make it easier for the program to understand Indian languages and speech, bridging the gap between human and machine communication. ALAINA has been made open source so that every programmer has the power to define their own functions for ALAINA to perform, simply by coding them into the existing program. Since ALAINA can also operate inside other programs, we could look into hands-free coding in VS Code: we would speak the code and ALAINA would write it down. We have made sure that human intervention remains possible, so that we can find and rectify errors ourselves.
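As an illustration of how a programmer might add their own functions, the sketch below registers custom commands under trigger keywords. The `command` decorator, the `COMMANDS` table, and the two example handlers are hypothetical names introduced here; the text does not describe ALAINA's actual extension mechanism.

```python
from datetime import datetime

# Hypothetical registry: trigger keyword -> handler function.
COMMANDS = {}


def command(*keywords):
    """Register the decorated function under each of the given keywords."""
    def register(func):
        for keyword in keywords:
            COMMANDS[keyword.lower()] = func
        return func
    return register


@command("time", "clock")
def tell_time(query):
    """Example custom command: report the current local time."""
    return "It is " + datetime.now().strftime("%H:%M")


@command("hello", "hi")
def greet(query):
    """Example custom command: answer a greeting."""
    return "Hello! How can I help you?"


def dispatch(query):
    """Run the first handler whose keyword appears in the query, if any."""
    for word in query.lower().split():
        handler = COMMANDS.get(word.strip("?.!,"))
        if handler is not None:
            return handler(query)
    return None
```

With this layout, adding a new capability only requires writing one more decorated function; the dispatching code does not change.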
The base code will be provided to everyone so that the open-source community, with its millions of programmers, can add their own code snippets and extend ALAINA's functionality even further. Because other users' code could cause malfunctions in the main source code, the main source code will be available to read but not editable online; only iterations that have been verified by the code author will be published back as upgrades.
Building ALAINA in Python is a necessity, because the AI part of ALAINA uses TensorFlow, Keras, and cuDNN (NVIDIA's CUDA Deep Neural Network library) for its neural network architecture. The ever-evolving libraries from NVIDIA and the Python community will therefore always help the open-source community upgrade ALAINA further. Because of this, control always remains in the hands of the users, who also have the means to add new kinds of problems that she can solve, along with a solution for each problem. If ALAINA is not able to perform something or does not understand the user's query, it repeats the query back to the user, and the output is phrased to feel like a normal human asking a question.
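The fallback behaviour described above, where an unrecognized query is repeated back as a question, could look roughly like the following sketch. The `reply` function, its `matcher` parameter, and the wording of the question are assumptions made for illustration; the source does not give the actual phrasing ALAINA uses.

```python
def reply(query, matcher):
    """Answer the query, or echo it back as a question when nothing matches.

    `matcher` is any callable that returns a reply string, or None when the
    query is not understood (a hypothetical interface for this sketch).
    """
    answer = matcher(query)
    if answer is not None:
        return answer
    # Strip surrounding whitespace and trailing punctuation before echoing.
    cleaned = query.strip().rstrip("?.!")
    return f'Did you say "{cleaned}"? I did not understand that, could you rephrase it?'
```

Keeping the fallback separate from the matching logic means the same question-style response is produced no matter which command matcher is plugged in.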
SYSTEM ANALYSIS AND MODELLING:
Design Methodology:
The design methodology is based on neural networks as used in basic AI applications. Neural networks and the bag-of-words technique are used to build ALAINA's neural architecture, which helps it understand how to respond to a user's query.
We have used Selenium as a secondary suite for web-related functionality, such as tracking IP addresses, interacting with web-based controls, and handling all major components of web activity.
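To make the combination of bag-of-words input and a neural network concrete, here is a minimal sketch of a single-layer softmax network trained on bag-of-words vectors. It uses only the Python standard library so that it runs without TensorFlow; the model described in the text is built with Keras and its layer layout is not given, so the training sentences, intent labels, and hyperparameters below are all assumptions.

```python
import math
import re

# Hypothetical training data: (example sentence, intent label).
TRAINING = [
    ("play a song", "music"),
    ("play some music", "music"),
    ("what is the weather", "weather"),
    ("weather forecast today", "weather"),
    ("turn on the lights", "lights"),
    ("switch the lights off", "lights"),
]


def tokenize(text):
    return re.findall(r"[a-z]+", text.lower())


VOCAB = sorted({w for sentence, _ in TRAINING for w in tokenize(sentence)})
LABELS = sorted({label for _, label in TRAINING})


def encode(text):
    """Binary bag-of-words vector over VOCAB (unknown words are ignored)."""
    words = set(tokenize(text))
    return [1.0 if w in words else 0.0 for w in VOCAB]


def softmax(scores):
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def forward(weights, biases, x):
    """One dense layer followed by softmax: class probabilities for x."""
    return softmax([sum(w * xi for w, xi in zip(row, x)) + b
                    for row, b in zip(weights, biases)])


def train(epochs=200, lr=0.5):
    """Fit the layer with plain stochastic gradient descent."""
    weights = [[0.0] * len(VOCAB) for _ in LABELS]
    biases = [0.0] * len(LABELS)
    for _ in range(epochs):
        for sentence, label in TRAINING:
            x = encode(sentence)
            probs = forward(weights, biases, x)
            for k, name in enumerate(LABELS):
                # Gradient of the cross-entropy loss w.r.t. the score of class k.
                grad = probs[k] - (1.0 if name == label else 0.0)
                biases[k] -= lr * grad
                for j, xj in enumerate(x):
                    weights[k][j] -= lr * grad * xj
    return weights, biases


def predict(weights, biases, text):
    probs = forward(weights, biases, encode(text))
    return LABELS[probs.index(max(probs))]
```

After `weights, biases = train()`, a call such as `predict(weights, biases, "please play music")` returns one of the hypothetical intent labels, which a command dispatcher could then act on.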
ALAINA was made in Python because Python is cross-platform, so it works on any kind of system, not just Windows but also Android and Ubuntu. With some tinkering with the source code it can even work with Raspberry Pi and IoT devices, and it can be embedded with a Raspberry Pi or Arduino. As engineers and fans of Iron Man from the movies, we would love to have a voice assistant that not only listens and responds but also does everything around the house with just a voice command. By embedding ALAINA in an IoT device and connecting it to the Blynk app, it is possible to control all the devices in your house with your voice.
ALAINA was made open source and in Python so that users and the open-source community can easily make their own devices without depending on a company's paid devices, and can embed it in anything they invent, reinvent, or discover. ALAINA can give home automation devices their own embedded voice assistant. If it is deployed on a cloud server, you can control everything in your house, your PCs, and all the other devices you own with your own custom commands.
Fig. 1 Design Process
Fig. 2 Decision Model Flowchart
Fig. 3 High-Level Design
UML DIAGRAMS:
- DFD Diagram
Fig. 4 DFD Diagram
- Use Case Diagram
Fig. 5 Use Case Diagram
- Activity Diagram
Fig. 6 Activity Diagram
- Sequence Diagram
Fig. 7 Sequence Diagram