diff --git a/README.md b/README.md index b882263..d6378e0 100644 --- a/README.md +++ b/README.md @@ -26,6 +26,7 @@ end ## Getting started To get started with Elixir WebRTC, check out: +* the [Introduction to Elixir WebRTC](https://hexdocs.pm/ex_webrtc/intro.html) tutorial * the [examples directory](https://github.com/elixir-webrtc/ex_webrtc/tree/master/examples) that contains a bunch of very simple usage examples of the library * the [`apps` repo](https://github.com/elixir-webrtc/apps) with example applications built on top of `ex_webrtc` * the [documentation](https://hexdocs.pm/ex_webrtc/readme.html), especially the [`PeerConnection` module page](https://hexdocs.pm/ex_webrtc/ExWebRTC.PeerConnection.html) diff --git a/guides/mastering_transceivers.md b/guides/advanced/mastering_transceivers.md similarity index 100% rename from guides/mastering_transceivers.md rename to guides/advanced/mastering_transceivers.md diff --git a/guides/introduction/consuming.md b/guides/introduction/consuming.md new file mode 100644 index 0000000..ee6ef03 --- /dev/null +++ b/guides/introduction/consuming.md @@ -0,0 +1,27 @@ +# Consuming media data + +Other than just forwarding, we would probably like to be able to use the media right in the Elixir app, +e.g. to feed it to a machine learning model or create a recording of a meeting. + +In this tutorial, we are going to build on top of the simple app from the previous tutorial by, instead of just sending the packets back, depayloading and decoding +the media, using a machine learning model to somehow augment the video, encoding and payloading it back into RTP packets, and only then sending it to the web browser. + +## Depayloading RTP + +We refer to the process of taking the media payload out of RTP packets as _depayloading_. + +> #### Codecs {: .info} +> A media codec is a program used to encode/decode digital video and audio streams. 
Codecs also compress the media data, +> otherwise, it would be too big to send over the network (bitrate of raw 24-bit color depth, FullHD, 60 fps video is about 3 Gbit/s!). +> +> In WebRTC, most likely you will encounter VP8, H264 or AV1 video codecs and Opus audio codec. Codecs that will be used during the session are negotiated in +> the SDP offer/answer exchange. You can tell what codec is carried in an RTP packet by inspecting its payload type (`packet.payload_type`, +> a non-negative integer field) and match it with one of the codecs listed in this track's transceiver's `codecs` field (you have to find +> the `transceiver` by iterating over `PeerConnection.get_transceivers` as shown previously in this tutorial series). + +_TBD_ + +## Decoding the media to raw format + +_TBD_ + diff --git a/guides/introduction/forwarding.md b/guides/introduction/forwarding.md new file mode 100644 index 0000000..cf91675 --- /dev/null +++ b/guides/introduction/forwarding.md @@ -0,0 +1,195 @@ +# Forwarding media data + +Elixir WebRTC, in contrast to the JavaScript API, provides you with the actual media data transmitted via WebRTC. +That means you can be much more flexible with what you do with the data, but you also need to know a bit more +about how WebRTC actually works under the hood. + +All of the media data received by the `PeerConnection` is sent to the user in the form of messages like this: + +```elixir +receive do + {:ex_webrtc, ^pc, {:rtp, track_id, _rid, packet}} -> + # do something with the packet + # also, for now, you can assume that _rid is always nil and ignore it +end +``` + +The `track_id` corresponds to one of the tracks that we received in `{:ex_webrtc, _from, {:track, %MediaStreamTrack{id: track_id}}}` messages. +The `packet` is an RTP packet. It contains the media data alongside some other useful information. + +> #### The RTP protocol {: .info} +> RTP is a network protocol created for carrying real-time data (like media) and is used by WebRTC. 
+> It provides some useful features like: +> +> * sequence numbers: UDP (which is usually used by WebRTC) does not provide ordering, thus we need this to catch missing or out-of-order packets +> * timestamp: these can be used to correctly play the media back to the user (e.g. using the right framerate for the video) +> * payload type: thanks to this combined with information in the SDP offer/answer, we can tell which codec is carried by this packet +> +> and many more. Check out the [RFC 3550](https://datatracker.ietf.org/doc/html/rfc3550) to learn more about RTP. + +Next, we will learn what you can do with the RTP packets. +For now, we won't actually look into the packets themselves, our goal for this part of the tutorial will be to forward the received data back to the same web browser. + +```mermaid +flowchart LR + subgraph Elixir + PC[PeerConnection] --> Forwarder --> PC + end + + WB((Web Browser)) <-.-> PC +``` + +The only thing we have to implement is the `Forwarder` GenServer. Let's combine the ideas from the previous section to write it. 
+ +```elixir +defmodule Forwarder do + use GenServer + + alias ExWebRTC.{PeerConnection, ICEAgent, MediaStreamTrack, SessionDescription} + + @ice_servers [%{urls: "stun:stun.l.google.com:19302"}] + + @impl true + def init(_) do + {:ok, pc} = PeerConnection.start_link(ice_servers: @ice_servers) + + # we expect to receive two tracks from the web browser - one for audio, one for video + # so we also need to add two tracks here, we will use these to forward media + # from each of the web browser tracks + stream_id = MediaStreamTrack.generate_stream_id() + audio_track = MediaStreamTrack.new(:audio, [stream_id]) + video_track = MediaStreamTrack.new(:video, [stream_id]) + + {:ok, _sender} = PeerConnection.add_track(pc, audio_track) + {:ok, _sender} = PeerConnection.add_track(pc, video_track) + + # in_tracks (tracks we will receive media from) = %{id => kind} + # out_tracks (tracks we will send media to) = %{kind => id} + out_tracks = %{audio: audio_track.id, video: video_track.id} + {:ok, %{pc: pc, out_tracks: out_tracks, in_tracks: %{}}} + end + + # ... +end +``` + +We started by creating the PeerConnection and adding two tracks (one for audio and one for video). +Remember that these tracks will be used to *send* data to the web browser peer. Remote tracks (the ones we will set up on the JavaScript side, like in the previous tutorial) +will arrive as messages after the negotiation is completed. + +> #### Where are the tracks? {: .tip} +> In the context of Elixir WebRTC, a track is simply a _track id_, _ids_ of streams this track belongs to, and a _kind_ (audio/video). +> We can either add tracks to the PeerConnection (these tracks will be used to *send* data when calling `PeerConnection.send_rtp/4` and +> for each one of the tracks, the remote peer should fire the `track` event) +> or handle remote tracks (which you are notified about with messages from the PeerConnection process: `{:ex_webrtc, _from, {:track, track}}`). 
+> These are used when handling messages with RTP packets: `{:ex_webrtc, _from, {:rtp, track_id, _rid, packet}}`. +> You cannot use the same track to send AND receive; keep that in mind. +> +> Alternatively, all of the tracks can be obtained by iterating over the transceivers: +> +> ```elixir +> tracks = +> peer_connection +> |> PeerConnection.get_transceivers() +> |> Enum.map(&(&1.receiver.track)) +> ``` +> +> If you want to know more about transceivers, read the [Mastering Transceivers](https://hexdocs.pm/ex_webrtc/mastering_transceivers.html) guide. + +Next, we need to take care of the offer/answer and ICE candidate exchange. As in the previous tutorial, we assume that there's some kind +of WebSocket relay service available that will forward our offer/answer/candidate messages to the web browser and back to us. + +```elixir +@impl true +def handle_info({:web_socket, {:offer, offer}}, state) do + :ok = PeerConnection.set_remote_description(state.pc, offer) + {:ok, answer} = PeerConnection.create_answer(state.pc) + :ok = PeerConnection.set_local_description(state.pc, answer) + + web_socket_send(answer) + {:noreply, state} +end + +@impl true +def handle_info({:web_socket, {:ice_candidate, cand}}, state) do + :ok = PeerConnection.add_ice_candidate(state.pc, cand) + {:noreply, state} +end + +@impl true +def handle_info({:ex_webrtc, _from, {:ice_candidate, cand}}, state) do + web_socket_send(cand) + {:noreply, state} +end +``` + +Now we can expect to receive messages with notifications about new remote tracks. +Let's handle these and match them with the tracks that we are going to send to. +We need to be careful not to send packets from the audio track on a video track by mistake! + +```elixir +@impl true +def handle_info({:ex_webrtc, _from, {:track, track}}, state) do + state = put_in(state.in_tracks[track.id], track.kind) + {:noreply, state} +end +``` + +We are ready to handle the incoming RTP packets! 
+ +```elixir +@impl true +def handle_info({:ex_webrtc, _from, {:rtp, track_id, nil, packet}}, state) do + kind = Map.fetch!(state.in_tracks, track_id) + id = Map.fetch!(state.out_tracks, kind) + :ok = PeerConnection.send_rtp(state.pc, id, packet) + + {:noreply, state} +end +``` + +> #### RTP packet rewriting {: .info} +> In the example above we just receive the RTP packet and immediately send it back. In reality, a lot of stuff in the packet header must be rewritten. +> That includes SSRC (a number that identifies to which stream the packet belongs), payload type (indicates the codec, even though the codec does not +> change between two tracks, the payload types are dynamically assigned and may differ between RTP sessions), and some RTP header extensions. All of that is +> done by Elixir WebRTC behind the scenes, but be aware - it is not as simple as forwarding the same piece of data! + +Lastly, let's take care of the client-side code. It's nearly identical to what we have written in the previous tutorial. 
+ +```js +const localStream = await navigator.mediaDevices.getUserMedia({audio: true, video: true}); +const pc = new RTCPeerConnection({iceServers: [{urls: "stun:stun.l.google.com:19302"}]}); +localStream.getTracks().forEach(track => pc.addTrack(track, localStream)); + +// these will be the tracks that we added using `PeerConnection.add_track` +pc.ontrack = event => videoPlayer.srcObject = event.streams[0]; + +// sending/receiving the offer/answer/candidates to the other peer is your responsibility +pc.onicecandidate = event => send_to_other_peer(event.candidate); +on_cand_received(cand => pc.addIceCandidate(cand)); + +// remember that we set up the Elixir app to just handle the incoming offer +// so we need to generate and send it (and thus, start the negotiation) here +const offer = await pc.createOffer(); +await pc.setLocalDescription(offer); +send_offer_to_other_peer(offer); + +const answer = await receive_answer_from_other_peer(); +await pc.setRemoteDescription(answer); +``` + +And that's it! The other peer should be able to see and hear the echoed video and audio. + +> #### PeerConnection state {: .info} +> Before we can send anything on a PeerConnection, its state must change to `connected`, which is signaled +> by the `{:ex_webrtc, _from, {:connection_state_change, :connected}}` message. In this particular example, we want +> to send packets on the very same PeerConnection that we received the packets from, so it is already connected +> by the time the first RTP packet is received. + +What you've seen here is a simplified version of the [echo](https://github.com/elixir-webrtc/ex_webrtc/tree/master/examples/echo) example available +in the Elixir WebRTC GitHub repo. Check it out and play with it! + +Now, you might be thinking that forwarding the media back to the same web browser does not seem very useful, and you're probably right! +But thankfully, you can use the gained knowledge to build more complex apps. 
+ +In the next part of the tutorial, we will learn how to actually do something with media data in the Elixir app. diff --git a/guides/introduction/intro.md b/guides/introduction/intro.md new file mode 100644 index 0000000..c28326b --- /dev/null +++ b/guides/introduction/intro.md @@ -0,0 +1,33 @@ +# Introduction to WebRTC + +In this series of tutorials, we are going to learn what is WebRTC, and go through some simple use cases of Elixir WebRTC. +Its purpose is to teach you where you'd want to use WebRTC, show you what the WebRTC API looks like, and how it should +be used, focusing on some common caveats. + +> #### Before You Start {: .info} +> This guide assumes little prior knowledge of the WebRTC API, but it would be highly beneficial +> to go through the [MDN WebRTC tutorial](https://developer.mozilla.org/en-US/docs/Web/API/WebRTC_API) +> as the Elixir API tries to closely mimic the browser JavaScript API. + +## What is WebRTC + +WebRTC is an open, real-time communication standard that allows you to send video, audio, and generic data between peers over the network. +It places a lot of emphasis on low latency (targeting values in low hundreds of milliseconds end-to-end) and was designed to be used peer-to-peer. + +WebRTC is implemented by all of the major web browsers and is available as a JavaScript API, there's also native WebRTC clients for Android and iOS +and implementation in other programming languages ([Pion](https://github.com/pion/webrtc), [webrtc.rs](https://github.com/webrtc-rs/webrtc), +and now [Elixir WebRTC](https://github.com/elixir-webrtc/ex_webrtc)). + +## Where would you use WebRTC + +WebRTC is the obvious choice in applications where low latency is important. It's also probably the easiest way to obtain the voice and video from a user of +your web application. 
Here are some example use cases: + +* videoconferencing apps (one-on-one meetings or fully fledged meeting rooms, like Microsoft Teams or Google Meet) +* ingress for broadcasting services (as a presenter, you can use WebRTC to get media to a server, which will then broadcast it to viewers using WebRTC or different protocols) +* obtaining voice and video from web app users to use it for machine learning model inference on the back end. + +In general, all of the use cases come down to getting media from one peer to another. In the case of Elixir WebRTC, one of the peers is usually a server, +like your Phoenix app (although it doesn't have to be - there's no concept of server/client in WebRTC, so you might as well connect two browsers or two Elixir peers). + +This is what the next section of this tutorial series will focus on - we will try to get media from a web browser to a simple Elixir app. diff --git a/guides/introduction/modifying.md b/guides/introduction/modifying.md new file mode 100644 index 0000000..a75cb93 --- /dev/null +++ b/guides/introduction/modifying.md @@ -0,0 +1,231 @@ +# Modifying the session + +So far, we focused on forwarding the data back to the same peer. Usually, you want to connect with multiple peers, which means adding +more PeerConnections to the Elixir app, like in the diagram below. + +```mermaid +flowchart BR + subgraph Elixir + PC1[PeerConnection 1] <--> Forwarder <--> PC2[PeerConnection 2] + Forwarder <--> PC3[PeerConnection 3] + end + + WB1((Web Browser 1)) <-.-> PC1 + WB2((Web Browser 2)) <-.-> PC2 + WB3((Web Browser 3)) <-.-> PC3 +``` + +In this scenario, we just forward packets from one peer to the other one (or even a bunch of other peers). 
This is a bit more challenging for a bunch of reasons: + +## Negotiation gets more complex + +You need to decide who starts the negotiation for every PeerConnection created - it can be either the client/web browser (so the case we went through +in the previous sections), the server, or both depending on when the peer joined. Also, don't forget that after you add or remove tracks from a PeerConnection, +a new negotiation has to take place! + +> #### The caveats of negotiation {: .tip} +> But wait, doesn't the peer who added new tracks have to start the negotiation? +> +> Certainly, that's the simplest way, but as long as the *number of transceivers* of the offerer (or, to be specific, the number of m-lines in the offer SDP with the appropriate +> `direction` attribute set) is greater than or equal to the number of all tracks added by the answerer, the tracks will be considered in the negotiation. +> +> But what does that even mean? +> Each transceiver is responsible for sending and/or receiving a single track. When you call `PeerConnection.add_track`, we actually look for a free transceiver +> (that is, one that is not sending a track already) and use it, or create a new transceiver if we don't find anything suitable. If you are very sure +> that the remote peer added _N_ new video tracks, you can add _N_ video transceivers (using `PeerConnection.add_transceiver`) and begin the negotiation as +> the offerer. If you didn't add the transceivers, the tracks added by the remote peer (the answerer) would be ignored. + +Let's look at an example: +1. The first peer (Peer 1) joins - here it probably makes more sense for the client (so the Web Browser) to start the negotiation, as the server (Elixir App/ +`Forwarder` in the diagram) does not know how many tracks the client wants to add (the `2. offer/answer` message indicates the offer/answer exchange, where the direction of +the arrow means the direction of the offer message). 
+ +```mermaid +flowchart LR + subgraph P1["Peer 1 (Web Browser)"] + User-- "1. addTrack(track11)" -->PCW1[PeerConnection] + end + + subgraph elixir [Elixir App] + PCE1[PeerConnection 1]-- "3. {:track, track11}" -->Forwarder + end + + PCW1-. "2. offer/answer" .->PCE1 +``` + +2. The second peer (Peer 2) joins - now we need to make a decision: we want Peer 2 to receive track from Peer 1, but Peer 2 also wants to send some tracks. +We can either: + - perform two negotiations: the first one, where Peer 2 is the offerer and adds their tracks, and the second one where the server is the offerer and adds + Peer 1's tracks to Peer 2's PeerConnection. + + ```mermaid + flowchart LR + subgraph elixir [Elixir App] + PCE1[PeerConnection 1] + Forwarder-- "4. add_track(track12)" -->PCE2[PeerConnection 2] + PCE2-- "3. {:track, track22}" -->Forwarder + end + + subgraph P2["Peer 2 (Web Browser)"] + U2[User]-- "1. addTrack(track22)" -->PCW2[PeerConnection] + PCW2-- "6. ontrack(track12)" --> U2 + end + + PCW2-. "2. offer/answer" .->PCE2 + PCE2-. "5. offer/answer" .->PCW2 + ``` + + - assuming that we expect only _N_ tracks from Peer 2, we can use the tip above and + make sure that there are at least _N_ transceivers in Peer 2's PeerConnection on the Elixir side and do just a single negotiation. + Note that you can also add transceivers without associated track, that's what you would need to do if + _N_ in the diagram was greater than 1, because we only have a single track available. + + ```mermaid + flowchart BR + subgraph elixir [Elixir App] + PCE1[PeerConnection 1] + Forwarder-- "2. add_transceiver(track12)" -->PCE2[PeerConnection 2] + PCE2-- "4. {:track, track22}" -->Forwarder + end + + subgraph P2["Peer 2 (Web Browser)"] + U2[User]-- "1. addTrack(track22)" -->PCW2[PeerConnection] + PCW2-- "5. ontrack(track12)" --> U2 + end + + PCE2-. "3. 
offer/answer" .->PCW2 + ``` + +> #### Negotiation needed {: .tip} +> Instead of relying on gut feeling when it comes to performing the renegotiation, you can use the `negotiationneeded` event (or, in the case of Elixir WebRTC, +> the `{:ex_webrtc, _from, :negotiation_needed}` message). It should fire every time renegotiation is needed. Be careful though! If you plan to add five tracks +> at once, do not perform five renegotiations by accident, when you could do only one at the very end! + +3. Lastly, Peer 1 also wants to receive Peer 2's tracks, so we need to add the new tracks to Peer 1's PeerConnection and perform the renegotiation there. + +```mermaid +flowchart LR + subgraph P1["Peer 1 (Web Browser)"] + PCW1[PeerConnection]-- "3. ontrack(track21)" --> U1[User] + end + + subgraph elixir [Elixir App] + Forwarder-- "1. add_track(track21)" -->PCE1[PeerConnection 1] + PCE2[PeerConnection 2] + end + + PCE1-. "2. offer/answer" .->PCW1 +``` + +> #### Who owns the tracks? {: .warning} +> Each of the tracks exists only in the context of its own PeerConnection. That means even if your Elixir App forwards media from one peer to +> another, it only takes RTP packets from a track in the first peer's PeerConnection and feeds them to another track in the second peer's PeerConnection. +> For instance, the role of `Forwarder` in the examples above would be to forward media in the following way: +> +> ```mermaid +> flowchart LR +> subgraph Forwarder +> track11 -.-> track12 +> track22 -.-> track21 +> end +> PC1[PeerConnection 1] --> track11 +> PC2[PeerConnection 2] --> track22 +> track12 --> PC2 +> track21 --> PC1 +> ``` +> +> This might be a bit counterintuitive, as in reality both of the tracks `track11` and `track12` still carry the same media stream. + +A similar process would happen for all of the joining/leaving peers. If you want to see an actual working example, check out +[Nexus](https://github.com/elixir-webrtc/apps/tree/master/nexus) - our Elixir, WebRTC-based videoconferencing demo app. 
+ +## Types of video frames + +When speaking about video codecs, we should also mention the idea of different types of frames. + +We are interested in these types (although there can be more, depending on the codec): +* I-frames (/intra-frames/keyframes) - these are complete, independent images and do not require other frames to be decoded +* P-frames (predicted frames/delta-frames) - these only hold changes in the image from the previous frame. + +Thanks to this, the size of all of the frames other than the keyframe can be greatly reduced, but: +* loss of a keyframe or P-frame will result in a freeze and the receiver signaling that something is wrong and the video cannot be decoded +* video playback can only start from a keyframe + +Thus, it's very important not to lose the keyframes, or in the case of loss, swiftly respond to keyframe requests from the receiving peer and produce a new keyframe, as +typically (at least in WebRTC) intervals between unprompted keyframes in a video stream can be counted in tens of seconds. As you probably realize, a 15-second video +freeze would be quite disastrous! It's also important to request a new keyframe when a new peer that's supposed to receive media joins, so they can start video +playback right away instead of waiting. + +If you want to learn more about digital video, check out the [Digital video introduction](https://github.com/leandromoreira/digital_video_introduction) project. + + +## Matching codecs + +When connecting two peers, you also have to make sure that all of them use the same video and audio codec, as the codec negotiation happens +completely separately between independent PeerConnections. If you're not familiar with how codecs are negotiated in a WebRTC session, get back +to the previous tutorial on consuming media data. 
+ +In a real scenario, you'd have to receive the RTP packet from the PeerConnection, inspect its payload type, find the codec associated with that payload type, find the payload type +associated with that codec on the other PeerConnection, and use it to overwrite the original payload type in the packet. + +Unfortunately, at the moment the `PeerConnection.send_rtp` API forces you to use the topmost negotiated codec, so there's no way to handle RTP streams with changing codecs. +The only real solution is to force `PeerConnection` to negotiate only one codec. + +```elixir +codec = %ExWebRTC.RTPCodecParameters{ + payload_type: 96, + mime_type: "video/VP8", + clock_rate: 90_000 +} +{:ok, pc} = PeerConnection.start_link(video_codecs: [codec]) +``` + +This is not ideal as the remote PeerConnection might not support this particular codec. This tutorial will be appropriately updated once the `PeerConnection` API allows +for more in this regard. + +> #### WebRTC internals {: .tip} +> If you're developing using a Chromium-based browser, be sure to type out `chrome://webrtc-internals` in your address bar, +> you'll access a lot of WebRTC-related stats. +> +> If you ever see a black screen with the "loading" spinning circle instead of your video in the `video` HTML element, be sure +> to find your PeerConnection in the WebRTC internals, go to the `inbound-rtp(type=video, ...)` tab and check the `pliCount` stat. +> If you see it growing, but the video still does not load, you most likely are missing a keyframe and are not responding +> to the PLI (Picture Loss Indication) requests with a new keyframe. + +In the case of a forwarding unit, like the example we have been examining in this section, we cannot really produce a keyframe, as we don't produce any video at all. +The only option is to send the keyframe request to the source, which in `ExWebRTC.PeerConnection` can be accomplished with the `PeerConnection.send_pli` function. 
+PLI (Picture Loss Indication) is simply a type of RTCP packet. + +Usually, when forwarding media between peers, we would: +* send PLI to source tracks when a new receiving peer joins, +* forward PLIs sent by receiving peers to the corresponding source tracks + +This can be achieved with a piece of code similar to this: + +```elixir +def handle_info(:new_peer_joined, state) do + for source <- state.sources do + :ok = PeerConnection.send_pli(source.peer_connection, source.video_track_id) + end + {:noreply, state} +end + +def handle_info({:ex_webrtc, from, {:rtcp, packets}}, state) do + for packet <- packets do + case packet do + %ExRTCP.Packet.PayloadFeedback.PLI{media_ssrc: ssrc} -> + # TODO how to get the ids + + :ok = PeerConnection.send_pli(peer_connection, source_track_id) + + _other -> :ok + end + end + + {:noreply, state} +end +``` + +Just be careful to not overwhelm the source with PLIs! In a real application, we should probably implement some kind of rate limiting for the keyframe +requests. + diff --git a/guides/introduction/negotiation.md b/guides/introduction/negotiation.md new file mode 100644 index 0000000..14156e6 --- /dev/null +++ b/guides/introduction/negotiation.md @@ -0,0 +1,191 @@ +# Negotiating the connection + +Before starting to send or receive media, you need to negotiate the WebRTC connection first, which comes down to: + +* specifying to your WebRTC peer what you want to send and/or receive (like video or audio tracks) +* exchanging information necessary to establish a connection with the other WebRTC peer +* starting the data transmission. + +We'll go through this process step-by-step. + +## Web browser + +Let's start from the web browser side of things. Let's say we want to send the video from your webcam and audio from your microphone to the Elixir app. + +Firstly, we'll create the `RTCPeerConnection` - this object represents a WebRTC connection with a remote peer. Further on, it will be our interface to all +of the WebRTC-related stuff. 
+ +```js +const opts = { iceServers: [{ urls: "stun:stun.l.google.com:19302" }] } +const pc = new RTCPeerConnection(opts) +``` + +> #### ICE servers {: .info} +> Arguably, the most important configuration option of the `RTCPeerConnection` is the `iceServers`. +> It is a list of STUN/TURN servers that the PeerConnection will try to use. You can learn more about +> it in the [MDN docs](https://developer.mozilla.org/en-US/docs/Web/API/RTCPeerConnection/RTCPeerConnection) but +> it boils down to the fact that lack of any STUN servers might cause you trouble connecting with other peers, so make sure +> something is there. + +Next, we will obtain the media tracks from the webcam and microphone using `mediaDevices` JavaScript API. + +```js +// a popup asking for permissions should appear after calling this function +const localStream = await navigator.mediaDevices.getUserMedia({ audio: true, video: true }); +``` + +The `localStream` is an object of type `MediaStream` - it aggregates video or audio tracks. +Now we can add the tracks to our `RTCPeerConnection`. + +```js +for (const track of localStream.getTracks()) { + pc.addTrack(track, localStream); +} +``` + +Finally, we have to create and set an offer. + +```js +const offer = await pc.createOffer(); +// offer == { type: "offer", sdp: ""} +await pc.setLocalDescription(offer); +``` + +> #### Offers, answers, and SDP {: .info} +> Offers and answers contain information about your local `RTCPeerConnection`, like tracks, codecs, IP addresses, encryption fingerprints, and more. +> All of that is encoded in a text format called SDP. You, as the user, generally can very successfully use WebRTC without ever looking into what's in the SDP, +> but if you wish to learn more, check out the [SDP Anatomy](https://webrtchacks.com/sdp-anatomy/) tutorial from _webrtcHacks_. + +Next, we need to pass the offer to the other peer - in our case, the Elixir app. The WebRTC standard does not specify how to do this. 
+Here, we will just assume that the offer was sent to the Elixir app using some kind of WebSocket relay service that we previously connected to, but generally it +doesn't matter how you get the offer to the other peer. + +```js +const json = JSON.stringify(offer); +webSocket.send(json); +``` + +Let's handle the offer in the Elixir app next. + +## Elixir app + +Before we do anything else, we need to set up the `PeerConnection`, similar to what we have done in the web browser. The main difference +between the Elixir and JavaScript WebRTC APIs is that, in Elixir, `PeerConnection` is a process. Also, remember to set up the `ice_servers` option! + +```elixir +# PeerConnection in Elixir WebRTC is a process! +{:ok, pc} = ExWebRTC.PeerConnection.start_link(ice_servers: [%{urls: "stun:stun.l.google.com:19302"}]) +``` + +> #### PeerConnection configuration {: .info} +> There are quite a lot of configuration options for the `ExWebRTC.PeerConnection`. +> You can find all of them in `ExWebRTC.PeerConnection.Configuration`. For instance, all of the JavaScript `RTCPeerConnection` events +> like `track` or `icecandidate` in Elixir WebRTC are simply messages that the `ExWebRTC.PeerConnection` process sends, by default, to the process that +> called `ExWebRTC.PeerConnection.start_link/2`. This can be changed by using the `start_link(controlling_process: pid)` option! + +Then we can handle the SDP offer that was sent from the web browser. + +```elixir +# we will use the Jason library for decoding the JSON message +receive do + {:web_socket, {:offer, json}} -> + offer = + json + |> Jason.decode!() + |> ExWebRTC.SessionDescription.from_json() + + ExWebRTC.PeerConnection.set_remote_description(pc, offer) +end +``` + +> #### Is WebRTC peer-to-peer? {: .info} +> WebRTC itself is peer-to-peer. It means that the audio and video data is sent directly from one peer to another. +> But to even establish the connection itself, we need to somehow pass the offer and answer between the peers. 
+> +> In our case, the Elixir app (e.g. a Phoenix web app) probably has a public-facing IP address - we can send the offer directly to it. +> In the case when we want to connect two web browser WebRTC peers, a relay service might be needed to pass the offer and answer - +> after all, both of the peers might be in private networks, like your home WiFi. + +Now we create the answer, set it, and send it back to the web browser. + +```elixir +{:ok, answer} = ExWebRTC.PeerConnection.create_answer(pc) +:ok = ExWebRTC.PeerConnection.set_local_description(pc, answer) + +answer +|> ExWebRTC.SessionDescription.to_json() +|> Jason.encode!() +|> web_socket_send() +``` + +> #### PeerConnection can be bidirectional {: .tip} +> Here we have only shown you how to receive data from a browser in the Elixir app, but, of course, you +> can also send data from Elixir's `PeerConnection` to the browser. +> +> Just be aware of this for now; you will learn more about sending data using Elixir WebRTC in the next tutorial. + +Now the `PeerConnection` process should send messages to its parent process announcing remote tracks - each of the messages maps to +one of the tracks added on the JavaScript side. + +```elixir +receive do + {:ex_webrtc, ^pc, {:track, %ExWebRTC.MediaStreamTrack{}}} -> + # we will learn what you can do with the track later +end +``` + +> #### ICE candidates {: .info} +> ICE candidates are, simplifying a bit, the IP addresses that PeerConnection will try to use to establish a connection with the other peer. +> A PeerConnection will produce a new ICE candidate every now and then, and each candidate has to be sent to the other WebRTC peer +> (using any medium, e.g. the same WebSocket relay used for the offer/answer exchange, or some other way). 
+> +> In JavaScript: +> +> ```js +> pc.onicecandidate = event => webSocket.send(JSON.stringify(event.candidate)); +> webSocket.onmessage = event => pc.addIceCandidate(JSON.parse(event.data)); +> ``` +> +> And in Elixir: +> +> ```elixir +> receive do +> {:ex_webrtc, ^pc, {:ice_candidate, candidate}} -> +> candidate +> |> ExWebRTC.ICECandidate.to_json() +> |> Jason.encode!() +> |> web_socket_send() +> +> {:web_socket, {:ice_candidate, json}} -> +> candidate = +> json +> |> Jason.decode!() +> |> ExWebRTC.ICECandidate.from_json() +> +> ExWebRTC.PeerConnection.add_ice_candidate(pc, candidate) +> end +> ``` + +Lastly, we need to set the answer on the JavaScript side. + +```js +const answer = JSON.parse(receive_answer()); +await pc.setRemoteDescription(answer); +``` + +The process of the offer/answer exchange is called _negotiation_. After negotiation has been completed, the connection between the peers can be established, and media +flow can start. + +You can determine that the connection was established by listening for the `{:ex_webrtc, _from, {:connection_state_change, :connected}}` message +or by handling the `onconnectionstatechange` event on the JavaScript `RTCPeerConnection`. + +> #### Renegotiations {: .info} +> We've just gone through the first negotiation, but you'll need to repeat the same steps whenever you add or remove tracks +> on your `PeerConnection`. The need for renegotiation is signaled by the `negotiationneeded` event in JavaScript or by the +> `{:ex_webrtc, _from, :negotiation_needed}` message in Elixir WebRTC. You will learn more about how to properly conduct +> a renegotiation with multiple PeerConnections present in section TODO. + +You might be wondering how you can do something with the media data in the Elixir app. +While in the JavaScript API you are limited to e.g. attaching tracks to video elements on a web page, +Elixir WebRTC provides you with the actual media data sent by the other peer in the form +of RTP packets for further processing.
You will learn how to tackle this in the next part of this tutorial series. diff --git a/lib/ex_webrtc/rtp_receiver.ex b/lib/ex_webrtc/rtp_receiver.ex index 63552c6..6ce9be0 100644 --- a/lib/ex_webrtc/rtp_receiver.ex +++ b/lib/ex_webrtc/rtp_receiver.ex @@ -150,6 +150,7 @@ defmodule ExWebRTC.RTPReceiver do {rid, receiver} end + @doc false @spec receive_rtx(receiver(), ExRTP.Packet.t(), non_neg_integer()) :: {:ok, ExRTP.Packet.t()} | :error def receive_rtx(receiver, packet, apt) do @@ -173,6 +174,7 @@ defmodule ExWebRTC.RTPReceiver do end end + @doc false @spec receive_report(receiver(), ExRTCP.Packet.SenderReport.t()) :: receiver() def receive_report(receiver, report) do rid = SimulcastDemuxer.demux_ssrc(receiver.simulcast_demuxer, report.ssrc) diff --git a/mix.exs b/mix.exs index df296b4..14e7e8e 100644 --- a/mix.exs +++ b/mix.exs @@ -73,14 +73,23 @@ defmodule ExWebRTC.MixProject do end defp docs() do + intro_guides = ["intro", "negotiation", "forwarding", "consuming", "modifying"] + [ main: "readme", logo: "logo.svg", - extras: ["README.md", "guides/mastering_transceivers.md"], + extras: + ["README.md"] ++ + Enum.map(intro_guides, &"guides/introduction/#{&1}.md") ++ + Path.wildcard("guides/advanced/*.md"), source_ref: "v#{@version}", formatters: ["html"], before_closing_body_tag: &before_closing_body_tag/1, nest_modules_by_prefix: [ExWebRTC], + groups_for_extras: [ + Introduction: Path.wildcard("guides/introduction/*.md"), + Advanced: Path.wildcard("guides/advanced/*.md") + ], groups_for_modules: [ MEDIA: ~r"ExWebRTC\.Media\..*", RTP: ~r"ExWebRTC\.RTP\..*" @@ -90,6 +99,7 @@ defmodule ExWebRTC.MixProject do defp before_closing_body_tag(:html) do # highlight JS code blocks + # and mermaid graphs """ @@ -106,6 +116,29 @@ defmodule ExWebRTC.MixProject do } }); + + + """ end end