respeakerd is the server application for SEEED's microphone array solutions. It is based on librespeaker, which combines the audio front-end processing algorithms.
It is also a good example of how to use librespeaker. Users can implement their own server application or daemon that invokes librespeaker.
This manual shows how to compile and run respeakerd, and then introduces the protocol used for communication between respeakerd and a Python client implementation for AVS (https://github.com/respeaker/avs).
- json: https://github.com/nlohmann/json, header only
- base64: https://github.com/tplgy/cppcodec, header only
- inih: https://github.com/benhoyt/inih, source files are included
- libsndfile1-dev libasound2-dev: save PCM to wav file, installed by librespeaker
- libdbus-1-dev: notifies the LED ring server of events
- cmake
- librespeaker-dev, including header files for compilation with librespeaker
$ sudo apt install -y cmake libdbus-1-dev
$ sudo apt update
$ sudo apt-cache policy librespeaker
$ sudo apt install -y librespeaker-dev
$ cd PROJECT-ROOT/build
$ cmake .. #`cmake -DCMAKE_BUILD_TYPE=Debug ..` if you want to build the debug version
$ make
$ sudo apt-get install -y debhelper dh-make fakeroot
$ mkdir -p build && cd build
$ cp -rf ../debian .
$ sudo apt-get update
$ sudo apt-get install -y --allow-downgrades librespeaker-dev/testing #or `librespeaker-dev/stretch` if you want to build release version
# sed -i '6c \\tcmake -DCMAKE_BUILD_TYPE=Release ..' debian/rules #if you want to build release version
$ chmod a+x debian/pack.sh
$ debian/pack.sh
The command line parameters may change during the development of this project. To get up-to-date information on the command line parameters, inspect the options with
$ respeakerd --help
Usage: respeakerd [options]
respeakerd is a server application for the microphone arrays of SEEED.
-m, --mode=MODE the mode of respeakerd, can be: standard, pulse
default: standard
--mic-type=MIC_TYPE the type of microphone array, can be CIRCULAR_6MIC, CIRCULAR_4MIC
default: CIRCULAR_6MIC
-t, --test test the configuration file and exit
--hotword-engine=STRING the hotword engine, can be: snowboy, snips
default: snowboy
--snowboy-res-path=PATH the path to snowboay's resource file
default: /usr/share/respeaker/snowboy/resources/common.res
--snowboy-model-path=PATH the path to snowboay's model file
default: /usr/share/respeaker/snowboy/resources/snowboy.umdl
--snowboy-sensitivity=FLOAT_NUMBER the sensitivity of snowboy
default: 0.5
--snips-model-path=PATH the path to snips's hotword model files
default: /usr/share/respeaker/snips/model
--snips-sensitivity=FLOAT_NUMBER the sensitivity of snips hotword engine
default: 0.5
-s, --source=STRING the source of pulseaudio from which the audio stream is pulled
default: default
-g, --agc-level=INTEGER target dBFS for AGC, the range is [-31, 0]
e.g. -g "-3" or --agc-level="-3", default: -3
-r, --ref-channel=INTEGER the channel index of the AEC reference, 6 or 7
default: 6
--fifo-file=FILE the path of the fifo file which is required by the pulse mode
default: /tmp/music.input
--dynamic-doa if specified, the DoA direction will dynamically track the sound,
otherwise it only changes when hotword detected
-w, --enable-wav-log enable logging audio streams into wav files for debugging purpose
-v, --debug print more messages
-h, --help display this help and exit
--version output version information and exit
respeakerd can work in multiple modes.
- Standard mode (default, or --mode=standard): respeakerd works as a socket server and communicates with clients via the socket protocol. The audio stream and events such as hotword triggers go through this socket in JSON format. The socket is a UNIX domain socket at /tmp/respeakerd.sock. respeakerd recreates this socket file every time it starts up.
- PulseAudio mode (--mode=pulse): respeakerd streams its output into the PulseAudio system. Through PulseAudio, the processed audio stream from respeakerd can then be dispatched to arbitrary consumer applications. PulseAudio needs to be configured for this; see 4. More about PulseAudio mode. After that configuration, PulseAudio creates a fifo file /tmp/music.input to receive and dispatch the audio stream. If you don't know how to configure PulseAudio to create the fifo file at another path, don't change the --fifo-file parameter of respeakerd; just use the default.
All the command line options (except --test and --config) are reflected in the configuration file. The default location of the configuration file is /etc/respeaker/respeakerd.conf.
The settings in the file have lower priority than the command line options: if you specify the same option both on the command line and in the configuration file, respeakerd takes the value from the command line.
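The precedence rule can be illustrated with a small Python sketch. This is not how respeakerd itself is implemented; the [respeakerd] section name and the key spellings in the sample file are hypothetical, and only the default values are taken from the usage text.

```python
import configparser
from collections import ChainMap

# Built-in defaults, copied from the usage text.
DEFAULTS = {"mode": "standard", "agc-level": "-3", "ref-channel": "6"}


def effective_options(conf_text, cli_options):
    """Merge options so that: command line > configuration file > defaults."""
    parser = configparser.ConfigParser()
    # Hypothetical layout: one [respeakerd] section keyed by option name.
    parser.read_string(conf_text)
    if parser.has_section("respeakerd"):
        file_options = dict(parser["respeakerd"])
    else:
        file_options = {}
    # ChainMap looks keys up from left to right, so earlier maps win.
    return dict(ChainMap(cli_options, file_options, DEFAULTS))


conf = "[respeakerd]\nmode = pulse\nagc-level = -10\n"
# agc-level is given in both places; the command line value is kept.
print(effective_options(conf, {"agc-level": "-5"}))
```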
If you're using the Pi, check the following settings. (These modifications are already included in the ReSpeaker Core v2 system image, so you can skip this step if you're using the Core v2.)
- default-sample-format = float32le
- default-sample-rate = 48000
Get your current default settings with pactl info and check them.
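As a rough aid, the check can be scripted. The sketch below assumes that pactl info prints a line of the form "Default Sample Specification: float32le 2ch 48000Hz" (English locale); the helper names are made up for this example.

```python
import re


def parse_default_sample_spec(pactl_info_output):
    """Return (sample_format, rate_hz) from `pactl info` output, or None."""
    match = re.search(
        r"^Default Sample Specification:\s*(\S+)\s+\d+ch\s+(\d+)Hz",
        pactl_info_output,
        re.MULTILINE,
    )
    if match is None:
        return None
    return match.group(1), int(match.group(2))


def matches_expected(pactl_info_output):
    """True if the defaults are float32le at 48000 Hz, as required above."""
    return parse_default_sample_spec(pactl_info_output) == ("float32le", 48000)


# Shortened sample output; a real run would pass the text captured from
# subprocess.run(["pactl", "info"], capture_output=True, text=True).stdout
sample = (
    "Server Name: pulseaudio\n"
    "Default Sample Specification: float32le 2ch 48000Hz\n"
)
print(matches_expected(sample))
```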
To simplify the configuration, we provide a tool, respeakerd-pi-tools. You can install it with
sudo apt install respeakerd-pi-tools
Then apply the configuration with
sudo respeakerd-pi-tools setup-pulse
PulseAudio's module-pipe-source module needs to be loaded. This is handled by respeakerd_safe: it detects whether respeakerd has been configured to work in pulse mode and loads the module automatically. During development, you may want to load the module manually.
pactl load-module module-pipe-source source_name="respeakerd_output" format=s16le rate=16000 channels=1
pactl set-default-source respeakerd_output
Or put the equivalent lines into PulseAudio's configuration file.
$ sudo vim /etc/pulse/default.pa
Add the following lines to the end of this file:
load-module module-pipe-source source_name="respeakerd_output" format=s16le rate=16000 channels=1
set-default-source respeakerd_output
During development, you may want to start respeakerd in pulse mode manually.
$ cd PROJECT-ROOT/build
$ src/respeakerd --mode=pulse --source="alsa_input.platform-sound_0.seeed-8ch" --debug
Add other options if you need.
Please note that if no application is consuming the audio stream from the respeakerd_output source, respeakerd will appear to hang. This is normal, because writing to a Linux pipe blocks when there is no consumer at the other end of the pipe. Everything works again once something starts reading the pipe, e.g. parecord -d respeakerd_output dump.wav.
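The blocking behaviour described above is a general property of pipes and can be reproduced without respeakerd or PulseAudio. The following sketch uses an anonymous pipe in non-blocking mode, so that the point where a blocking writer would stall shows up as an exception instead; it only illustrates pipe semantics and does not touch /tmp/music.input.

```python
import os


def fill_pipe(write_fd, chunk=b"\0" * 4096):
    """Write to a non-blocking pipe until it is full; return bytes written."""
    total = 0
    while True:
        try:
            total += os.write(write_fd, chunk)
        except BlockingIOError:
            # A blocking writer would stall here until a reader drains data.
            return total


read_fd, write_fd = os.pipe()
os.set_blocking(write_fd, False)

# Nobody reads: the pipe buffer fills up and the writer cannot continue.
buffered = fill_pipe(write_fd)
print("pipe is full after", buffered, "bytes")

# As soon as a consumer reads, the writer can make progress again.
drained = os.read(read_fd, 65536)
more = fill_pipe(write_fd)
print("after reading", len(drained), "bytes,", more, "more bytes fit")

os.close(read_fd)
os.close(write_fd)
```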
respeakerd exposes a UNIX domain socket at /tmp/respeakerd.sock. This socket is a duplex stream socket with an input channel and an output channel.
Output channel: respeakerd outputs audio data and events to clients.
Input channel: clients report messages to respeakerd, e.g. the cloud_ready status message.
Please note that for now respeakerd only accepts one client connection.
The messages are wrapped in JSON format and separated by "\r\n", like:
{json-packet}\r\n{json-packet}\r\n{json-packet}\r\n
{"type": "audio", "data": "audio data encoded with base64", "direction": float number in degree unit}
{"type": "event", "data": "hotword", "direction": float number in degree unit}
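A client has to reassemble packets itself, because a stream socket may deliver a packet in several pieces or several packets at once. The sketch below shows one way to do this in Python; PacketSplitter, handle and run are names invented for this example and are not part of the AVS client.

```python
import base64
import json
import socket


class PacketSplitter:
    """Reassemble CRLF-separated JSON packets from arbitrary stream chunks."""

    def __init__(self):
        self._buffer = b""

    def feed(self, chunk):
        """Add received bytes and return the complete packets as dicts."""
        self._buffer += chunk
        # Everything before the last separator is complete; keep the rest.
        *complete, self._buffer = self._buffer.split(b"\r\n")
        return [json.loads(raw) for raw in complete if raw]


def handle(packet):
    """Return a short description of one packet from the output channel."""
    if packet["type"] == "audio":
        pcm = base64.b64decode(packet["data"])
        return "audio: %d bytes, direction %s" % (len(pcm), packet["direction"])
    if packet["type"] == "event":
        return "event: %s, direction %s" % (packet["data"], packet["direction"])
    return "unknown packet type: %s" % packet["type"]


def run(path="/tmp/respeakerd.sock"):
    """Connect to respeakerd and print a line for every packet received."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
        sock.connect(path)
        splitter = PacketSplitter()
        while True:
            chunk = sock.recv(4096)
            if not chunk:
                break  # respeakerd closed the connection
            for packet in splitter.feed(chunk):
                print(handle(packet))
```

Calling run() requires respeakerd to be running in standard mode; the splitter and handle can be exercised without it.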
For now the following messages are supported:
{"type": "status", "data": "ready"}
This is a status message which indicates that the client application has just connected to the cloud, so respeakerd can now accept voice commands. (Here the client is both a client of respeakerd and a client of an ASR cloud, e.g. Alexa Voice Service.) In the rest of this manual, all mentions of the cloud are illustrated with Alexa.
{"type": "status", "data": "connecting"}
This is a status message which indicates that the Alexa client has just lost its connection to the cloud. respeakerd can't accept voice commands until the ready state is reported again.
{"type": "cmd", "data": "stop_capture"}
This is a command message issued by the client. Generally the client gets this message from the Alexa cloud when Alexa has detected the end of a sentence. respeakerd doesn't use this message for now; it just keeps posting data to the client, because the base library the client uses, voice-engine, drops packets when Alexa isn't available to receive input.
{"type": "status", "data": "on_speak"}
This is a status message which indicates that the client has just received the speech synthesis from Alexa and is about to play it. respeakerd uses this status to enhance the algorithms. If you're writing your own client application, it's recommended that the client capture this event and pass it down to respeakerd.
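A client could build these input-channel messages with a small helper like the one below. This is a sketch: encode_message is a hypothetical name, and it assumes that messages sent to respeakerd use the same "\r\n" separator as the messages coming from it.

```python
import json


def encode_message(msg_type, data):
    """Serialize one input-channel message, terminated by CRLF."""
    packet = {"type": msg_type, "data": data}
    return json.dumps(packet).encode("utf-8") + b"\r\n"


# The four messages listed above.
READY = encode_message("status", "ready")
CONNECTING = encode_message("status", "connecting")
STOP_CAPTURE = encode_message("cmd", "stop_capture")
ON_SPEAK = encode_message("status", "on_speak")

print(READY)
```

With a connected stream socket sock, a message would then be sent with sock.sendall(READY).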
respeakerd uses the D-Bus system bus to deliver signals. This is especially useful when it's working with the C++ version of the AVS client, which doesn't communicate with respeakerd via the socket protocol but via PulseAudio instead, so it can no longer receive the critical hotword event from respeakerd as JSON through the socket protocol. It receives the events via D-Bus.
D-Bus object name: "/io/respeaker/respeakerd" Interface: "respeakerd.signal"
respeakerd outputs:
- trigger signal
- respeakerd_ready signal
respeakerd listens to:
- client ready signal
- client connecting signal
- client on_speak signal
The following signals are listened to by pixel_ring_server (scripts/pixel_ring_server, a Python script that drives the RGB LED ring on the board):
- on_idle signal
- on_listen signal
- on_think signal
- on_speak signal
Except for trigger and respeakerd_ready, all other signals are generated by the C++ AVS client.
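For debugging, the signals can be watched with the standard dbus-monitor tool. The sketch below only builds and runs the dbus-monitor command line from the object name and interface given above; monitor_command and watch are names invented for this example, and watching the system bus may need extra permissions depending on the bus policy.

```python
import subprocess

OBJECT_PATH = "/io/respeaker/respeakerd"
INTERFACE = "respeakerd.signal"


def monitor_command(member=None):
    """Build a dbus-monitor command that matches respeakerd's signals.

    `member` optionally restricts the match to one signal name,
    e.g. "trigger".
    """
    rule = "type='signal',interface='%s',path='%s'" % (INTERFACE, OBJECT_PATH)
    if member is not None:
        rule += ",member='%s'" % member
    return ["dbus-monitor", "--system", rule]


def watch(member=None):
    """Print matching signals until interrupted (requires dbus-monitor)."""
    subprocess.run(monitor_command(member), check=True)


print(" ".join(monitor_command("trigger")))
```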