Vignesh Rao edited this page Apr 20, 2023 · 78 revisions

Everything Jarvis

📓   Wiki updated on April 20, 2023

The demo videos were recorded in 2020 and are outdated, but they still give a basic idea.

Jarvis uses specific keywords to trigger the function that performs each task. It is not 100% foolproof, but most common errors are caught and the responses are configured accordingly. If you run into an unexpected exception, please raise an issue.

Restrictions:

Jarvis is a heavy, processor-intensive package. To make sure the host machine doesn't suffer Jarvis' wrath, it runs in LIMITED mode if the machine is low on CPU cores. LIMITED mode disables all background processes. This may limit Jarvis' ability to communicate offline, but voice commands remain fully functional.
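
As a rough illustration of the core-count check described above, the sketch below decides whether to run in LIMITED mode. The threshold of 4 cores and the names MIN_CORES and is_limited are assumptions made for this example; the actual cut-off used by Jarvis is not stated on this page.

```python
import os

# Hypothetical threshold; the real cut-off used by Jarvis is not documented here.
MIN_CORES = 4


def is_limited(cores=None, min_cores=MIN_CORES):
    """Return True when the machine has too few CPU cores for background processes."""
    if cores is None:
        # os.cpu_count() can return None when the count cannot be determined.
        cores = os.cpu_count() or 1
    return cores < min_cores


if __name__ == "__main__":
    mode = "LIMITED" if is_limited() else "FULL"
    print(f"Running in {mode} mode")
```

Passing the core count explicitly, for example is_limited(2), makes the decision easy to check without depending on the host machine.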

Currently, Jarvis can run only on Linux, macOS and Windows.

Tested on macOS High Sierra, Mojave, Catalina, Big Sur, Monterey and Ventura*, Windows 10, and Ubuntu 22.04 LTS.

Known issue with the pyttsx3 module on macOS Ventura 13.0: this version of macOS does not provide the VoiceAge attribute. A workaround has been raised as a PR.

Usage:

  • Jarvis automatically detects the operating system it is running on.
  • Some key features require API keys, but they can be generated for free.
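
A minimal sketch of how operating system detection can be done in Python is shown here. It relies on the standard library's platform.system(); the SUPPORTED mapping and the detect_os name are illustrative and not taken from the Jarvis source.

```python
import platform

# Values returned by platform.system() mapped to the OS names listed on this page.
SUPPORTED = {"Darwin": "macOS", "Linux": "Linux", "Windows": "Windows"}


def detect_os(system=None):
    """Return a friendly OS name, or raise if the OS is not supported."""
    system = system or platform.system()
    try:
        return SUPPORTED[system]
    except KeyError:
        raise RuntimeError(f"Unsupported operating system: {system!r}") from None


if __name__ == "__main__":
    print(detect_os())
```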

None of the features in Jarvis require subscriptions *

Stability

  • There are broad exception clauses implemented to prevent Jarvis from crashing.
  • To make sure Jarvis is always connected to the internet, it runs a connection checker in the background.
  • The connection checker uses a built-in OS-agnostic module to enable Wi-Fi and connect to a given SSID (stored as env vars).
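
The idea of a background connection check can be sketched as follows. This is not the module Jarvis uses: the probe target (8.8.8.8 on port 53), the timeout and the recover callback are assumptions, and the OS-specific step of enabling Wi-Fi and joining an SSID is left as a placeholder.

```python
import socket


def is_connected(host="8.8.8.8", port=53, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_and_recover(probe=is_connected, recover=lambda: None):
    """Run one check; call recover() (e.g. re-enable Wi-Fi) when offline."""
    if probe():
        return True
    # Hypothetical hook: the real recovery step is OS-specific.
    recover()
    return False
```

A background loop would call check_and_recover() every few seconds; keeping the probe and the recovery step as parameters makes the logic testable without touching the network.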

Offline Communication with Jarvis:

  • Using Telegram:

    1. A Telegram bot has to be created using BotFather.
    2. The token has to be added to the env var BOT_TOKEN.
    3. A list of chat IDs has to be added to the env var BOT_CHAT_IDS.
    4. A list of bot usernames has to be added to the env var BOT_USERS.
  • Using FastAPI:

    1. Hosts the offline communicator on localhost.
    2. Requires a port number added to the env var OFFLINE_PORT. Defaults to 4483
    3. Requires a password for authentication added to the env var OFFLINE_PASS. Defaults to OfflineComm
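
The env vars above could be read along these lines. The variable names and the defaults (4483 and OfflineComm) come from this page; the comma-separated format for the list values, the integer chat IDs and the OfflineConfig class are assumptions for illustration, so check the format Jarvis actually expects before relying on this.

```python
import os
from dataclasses import dataclass, field


def _split(value):
    """Split a comma-separated env value into a list of trimmed, non-empty items."""
    return [item.strip() for item in value.split(",") if item.strip()]


@dataclass
class OfflineConfig:
    """Hypothetical container for the offline communication settings."""
    bot_token: str = ""
    bot_chat_ids: list = field(default_factory=list)
    bot_users: list = field(default_factory=list)
    offline_port: int = 4483
    offline_pass: str = "OfflineComm"


def load_config(env=None):
    """Build the config from a mapping (defaults to os.environ)."""
    env = os.environ if env is None else env
    return OfflineConfig(
        bot_token=env.get("BOT_TOKEN", ""),
        # Assumption: chat IDs are integers separated by commas.
        bot_chat_ids=[int(i) for i in _split(env.get("BOT_CHAT_IDS", ""))],
        bot_users=_split(env.get("BOT_USERS", "")),
        offline_port=int(env.get("OFFLINE_PORT", 4483)),
        offline_pass=env.get("OFFLINE_PASS", "OfflineComm"),
    )
```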

If LIMITED mode is disabled, Jarvis automatically tries to launch speech synthesis in a Docker container.

If this launch fails or SPEECH_SYNTHESIS_TIMEOUT is set to 0, this step is skipped.

To run speech synthesis independently:
docker run \
    -it \
    -p 5002:5002 \
    -e "HOME=${HOME}" \
    -v "$HOME:${HOME}" \
    -v /usr/share/ca-certificates:/usr/share/ca-certificates \
    -v /etc/ssl/certs:/etc/ssl/certs \
    -w "${PWD}" \
    --user "$(id -u):$(id -g)" \
    thevickypedia/speech-synthesis

To test speech synthesis running locally:

curl -X POST \
 -H "Content-Type: text/plain" \
 -d 'Welcome to the world of speech synthesis' \
 'http://localhost:5002/api/tts?voice=en-us_northern_english_male-glow_tts&vocoder=medium' \
 --output temp.wav && open temp.wav
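
The same request can be sent from Python using only the standard library. This sketch mirrors the curl command above (same URL, query parameters and Content-Type header); the function names and the 30 second timeout are assumptions, and it expects the container to be listening on localhost:5002.

```python
import urllib.parse
import urllib.request


def build_tts_request(text, host="http://localhost:5002",
                      voice="en-us_northern_english_male-glow_tts",
                      vocoder="medium"):
    """Build the POST request that the curl command above sends."""
    query = urllib.parse.urlencode({"voice": voice, "vocoder": vocoder})
    return urllib.request.Request(
        url=f"{host}/api/tts?{query}",
        data=text.encode("utf-8"),
        headers={"Content-Type": "text/plain"},
        method="POST",
    )


def synthesize(text, output="temp.wav", timeout=30):
    """Send the text to the speech synthesis server and save the audio it returns."""
    with urllib.request.urlopen(build_tts_request(text), timeout=timeout) as response:
        audio = response.read()
    with open(output, "wb") as file:
        file.write(audio)
    return output


if __name__ == "__main__":
    print(synthesize("Welcome to the world of speech synthesis"))
```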

💡   Speech synthesis can run in a Docker container for better voices, but responses might be negligibly slower. If you don't have Docker installed or simply don't want to use it, set the SPEECH_SYNTHESIS_TIMEOUT env var to 0. This also happens automatically if the Docker container fails to launch on startup.

Features:

  • Lock, unlock, honk & blink, remote start/stop and set the AC temperature in any JLR (Jaguar Land Rover) vehicle.
  • Open and close garage door using MyQ garage controller.
  • Control any LG webOS or Roku television.
  • Guard the surroundings using face detection and audio recognition. Sends an email and SMS alert in case of an intruder.
  • Send a text message to most US-based mobile numbers.
  • Email any recipient using the contact name and email address mapping stored in the contacts.yaml file.
  • Facial recognition and detection. Setup instructions are in the Facial Recognition section.
  • Automatically turn on Wi-Fi and connect to a set SSID. This also happens automatically within 10 seconds of an internet disconnect.
  • Scale up/down a VPN server (in any region) on demand using the vpn-server module.
  • Monitor stocks using the stock monitor endpoint.
  • Run cron scheduled jobs using regular cron expressions without the need of a crontab.
  • Run certain tasks in the background every few seconds/minutes/hours/days.
  • Set a reminder for a given time and send a message to your phone and/or email at that time.
    • Reminders can also be set up for a different person if their contact details are present in the contacts.yaml file.
  • Set an alarm/timer at any desired time.
  • Wish you on events/festivals and your birthday using an env var.
  • Increase or decrease master volume of your machine via voice commands.
  • Locate, ring and enable lost mode on any of the user's Apple devices.
  • Control smart lights (that use the MagicHome application) in the same IP range.
  • Get meeting information using an ICS parser from a shared calendar with an ICS URL.
  • Read Outlook/Calendar and inform about meetings in the next 12 hours. macOS only
  • Tell public and private IP address of the running device.
  • Swap voices on demand.
  • Mute on demand.
  • Display realtime microphone usage on a graph.
  • Adjust Screen Brightness.
  • Tell system vitals including boot time, fan speed*, CPU and GPU temperature*. macOS only
  • Restart the running device and suggest a restart in case of high boot time.
  • Scan the local IP range for connected smart devices, which acts as an IP feeder for smart lights and TVs.
  • Tell random facts.
  • Heads or Tails.
  • Take notes and save them to a notes.txt file.
  • Tell a joke.
  • Tell the number of repositories on your GitHub and clone a particular one.
  • Tell the list of Google Home devices in your IP range.
  • Get you the distance between two places or distance from your location to a particular place.
  • Create or remove tasks for todo lists.
  • Scan your Music folder for .mp3 files and play them on any smart device in the IP range.
  • Look for unread emails in your Gmail account.
  • Get the weather information at any location.
  • Get the news update from fox-news.
  • Get your investment details using Robinhood API.
  • Get facts from Wikipedia using the wiki API.
  • Open any application installed on the running machine.
  • Get the current date and time.
  • Open a Google search for any query.
  • Shutdown/restart the running device.
  • Respond to most basic conversations.
  • If anything apart from the above is requested, Jarvis uses Google's Places API to match the requested phrase and suggest options. If the phrase doesn't resolve on the Places API, Jarvis uses a Google search engine parser to get results for it. If that also fails, Jarvis launches a Google search in your default browser.
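
The fallback order in the last item can be pictured as a chain of handlers tried in turn. The sketch below is only an illustration of that pattern: places_lookup and search_parser are hypothetical stand-ins that return canned data, not real calls to the Places API or the search parser.

```python
import urllib.parse


def places_lookup(phrase):
    """Hypothetical stand-in for a Places API lookup; returns None on no match."""
    known = {"coffee": ["Cafe A", "Cafe B"]}
    return known.get(phrase)


def search_parser(phrase):
    """Hypothetical stand-in for the search engine parser; never matches here."""
    return None


def browser_search_url(phrase):
    """Last resort: build a Google search URL for the phrase."""
    return "https://www.google.com/search?q=" + urllib.parse.quote_plus(phrase)


def resolve(phrase, handlers, last_resort):
    """Try each handler in order; fall back to last_resort when none answers."""
    for handler in handlers:
        result = handler(phrase)
        if result:  # None or an empty result means "no match, try the next one"
            return result
    return last_resort(phrase)
```

For example, resolve("coffee", [places_lookup, search_parser], browser_search_url) returns the canned suggestions, while an unknown phrase falls through to the search URL, which could then be passed to webbrowser.open() from the standard library.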

Investment Details:

  • Log in to your Robinhood Web App
  • Go to Account -> Settings
  • Turn on Two-Factor Authentication
    • Select “Authentication App”
    • Click “Can’t Scan It?”, and copy the 16-character QR code
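
A code shown under an authenticator app's "Can't Scan It?" option is normally the base32 secret from which time-based one-time passwords are derived. This page does not say how Jarvis consumes the code, so the sketch below is only an illustration of the standard TOTP computation (RFC 6238 with HMAC-SHA1) using the Python standard library; the totp function name and its parameters are assumptions.

```python
import base64
import hashlib
import hmac
import struct
import time


def totp(secret, at=None, digits=6, step=30):
    """Compute a time-based one-time password (RFC 6238, HMAC-SHA1)."""
    # Authenticator secrets are base32; tolerate spaces and lowercase input.
    cleaned = secret.replace(" ", "").upper()
    cleaned += "=" * (-len(cleaned) % 8)  # restore any stripped padding
    key = base64.b32decode(cleaned)
    # The counter is the number of time steps since the Unix epoch.
    counter = int((time.time() if at is None else at) // step)
    digest = hmac.new(key, struct.pack(">Q", counter), hashlib.sha1).digest()
    # Dynamic truncation: the low 4 bits of the last byte select a 4-byte window.
    offset = digest[-1] & 0x0F
    code = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)
```

Treat the secret like a password: keep it in an env var or a secrets store rather than in source code.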

Facial Recognition:

  • For this feature to work, a bit of setup is required
    • Create a parent directory named train, then create one subdirectory per person inside it, each named after that person
    • Now drop the images directly into the subdirectory named after each person
    • Please note that even if the subdirectories are not added, the face_recognition script is written in a way that it can learn from unrecognized/new faces by storing them with named subdirectories within train
    • The camera id has been made dynamic so that it can choose the camera automatically
  • Try to keep the images as light as possible (which means both the display and file size of each image)
    • Reduced image size will not affect the accuracy as the facial recognition script converts each image into a pixel matrix
    • Reduced image size and display area can actually help in faster scanning and recognition
  • Make good use of the tolerance level in the face recognition script
    • The learning_rate can be adjusted to match your needs
    • Learning rate generally depends on the clarity of the stored images and the lighting in which they were taken
    • Try to avoid images with very low resolution to maintain an adequate learning rate
    • The lower the learning rate, the higher the tolerance level (that is, the lower the learning rate, the stricter, or perhaps more accurate, the matching)
    • Layman's terms:
      • Increased learning_rate: Exact to No match
      • Decreased learning_rate: Incorrect to close match
      • 0.5 - 0.7 should suffice in most cases though
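
To check that the train directory is laid out as described above, a short script can list which images would be picked up for each person. This is a sketch that only walks the folders; it does not perform any recognition, and the accepted file extensions and the collect_training_images name are assumptions.

```python
from pathlib import Path

# Assumed set of image types; adjust to whatever the recognition script accepts.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def collect_training_images(train_dir="train"):
    """Map each person's name (subdirectory) to their sorted image paths."""
    root = Path(train_dir)
    people = {}
    if not root.is_dir():
        return people
    for person_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        images = sorted(
            f for f in person_dir.iterdir()
            if f.is_file() and f.suffix.lower() in IMAGE_SUFFIXES
        )
        if images:  # skip people with no usable images
            people[person_dir.name] = images
    return people


if __name__ == "__main__":
    for name, images in collect_training_images().items():
        print(f"{name}: {len(images)} image(s)")
```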