How and why Daily is using Rust for our WebRTC APIs

rictic · on April 11, 2022

> This means that whenever we decide to throw the switch to multi-threaded WebAssembly, essentially no code needs to change as all of our logic is already written in a concurrent-friendly manner. We simply reap the benefits of a multi-threaded runtime on the Web.

In a multithreaded wasm app, do all threads have equal access to the DOM, or is it more like JS where there is a UI thread which can access all APIs and a bunch of worker threads which can only access a more limited set of APIs?

If the latter, has anyone sought to expose that constraint to the Rust type system? e.g. via a capabilities object that's neither Sync nor Send
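For illustration, a minimal sketch of what such a capability object could look like. The names `MainThreadToken`, `assume_main_thread`, and `set_title` are hypothetical and not taken from any real crate; the only real mechanism used is that a `PhantomData<*const ()>` field makes a type neither `Send` nor `Sync`.

```rust
use std::marker::PhantomData;

/// Hypothetical capability token meaning "this code runs on the UI thread".
/// Raw pointers are neither Send nor Sync, so the PhantomData field opts
/// the whole struct out of both auto traits.
pub struct MainThreadToken {
    _not_send_sync: PhantomData<*const ()>,
}

impl MainThreadToken {
    /// Safety (assumption of this sketch): the caller must really be on
    /// the UI thread. A real binding would obtain the token from the runtime.
    pub unsafe fn assume_main_thread() -> Self {
        MainThreadToken { _not_send_sync: PhantomData }
    }
}

/// Hypothetical stand-in for a DOM call: it can only be invoked by code
/// that holds a reference to the token. It returns the JS it would run.
pub fn set_title(_token: &MainThreadToken, title: &str) -> String {
    format!("document.title = {:?}", title)
}

fn main() {
    let token = unsafe { MainThreadToken::assume_main_thread() };
    println!("{}", set_title(&token, "hello"));

    // Moving the token into another thread is rejected at compile time,
    // because MainThreadToken is not Send:
    // std::thread::spawn(move || set_title(&token, "oops"));
}
```

With this shape, any function that needs main-thread-only APIs takes `&MainThreadToken`, and the compiler refuses to let worker threads obtain one by moving or sharing it.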

monocasa · on April 11, 2022

Last I checked, no WASM has direct DOM access, but instead it all has to go through a bridge to JS.

kjsthree · on April 11, 2022

The question still stands though. Does each individual worker have its own bridge to JS? If so it’s just a constant race condition between them.

monocasa · on April 11, 2022

Yes, sort of. They can all call into JS, but that JS context is single threaded and serializes access. There can be races depending on what assumptions you make, but those races can't cause the kind of memory unsafety that Sync and Send are meant to prevent.

Matthias247 · on April 11, 2022

How would it serialize? That means potentially interrupting any other JavaScript or WASM code running on the main UI thread and executing other JS code there? That seems kind of against what JavaScript is (100% singlethreaded).

I assume if it’s even possible, it might execute the code only if the main thread is not executing any events - which happens only once per eventloop iteration.

I guess what is happening is that workers can call into JS, but cannot access the DOM or anything else living on the main thread. They have to send messages to the other thread, just as JavaScript workers would need to do.

monocasa · on April 11, 2022

It serializes because it's structured as WASM posting a normal event onto the event queue of the normal JS main thread. There's no 'now run JS on the WASM thread but with DOM access'.
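As a rough analogy in plain Rust (not actual browser or wasm-bindgen code), the model can be sketched with a channel standing in for the main thread's event queue. `DomRequest` and `run_event_loop` are hypothetical names, and the `Vec<String>` is only a stand-in for the DOM.

```rust
use std::sync::mpsc;
use std::thread;

/// Hypothetical message a worker can post to the main thread.
enum DomRequest {
    AppendText(String),
}

/// Spawns `workers` threads that each post one request, then drains the
/// queue on the calling ("main") thread, which alone owns the DOM stand-in.
fn run_event_loop(workers: usize) -> Vec<String> {
    let (tx, rx) = mpsc::channel::<DomRequest>();

    let handles: Vec<_> = (0..workers)
        .map(|id| {
            let tx = tx.clone();
            thread::spawn(move || {
                // Workers never touch `dom`; they can only enqueue a request.
                tx.send(DomRequest::AppendText(format!("worker {}", id)))
                    .unwrap();
            })
        })
        .collect();
    drop(tx); // the queue closes once every worker's sender is dropped

    // Requests are handled one at a time, in arrival order.
    let mut dom: Vec<String> = Vec::new();
    for request in rx {
        match request {
            DomRequest::AppendText(text) => dom.push(text),
        }
    }

    for handle in handles {
        handle.join().unwrap();
    }
    dom
}

fn main() {
    let mut dom = run_event_loop(3);
    dom.sort(); // arrival order is not deterministic, so sort for display
    println!("{:?}", dom);
}
```

The order in which requests arrive can still vary between runs, which is the kind of race mentioned above, but only one thread ever mutates the shared state.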

fosefx · on April 11, 2022

> 100% singlethreaded

You can (and should) use Web Workers[1], those run off the UI thread, but also don't have access to the DOM.

[1] https://developer.mozilla.org/en-US/docs/Web/API/Web_Workers...

yolo69420 · on April 11, 2022

Even if it sounds like it makes sense, something tells me that these (not just this one, all of these kinds of 'why x and y is happening') are really after-the-fact rationalizations after some dev or devs decided to engage in a pet project that would have happened even if it had no sensible justifications at all.

kwindla · on April 11, 2022

FWIW we did two pretty extensive proof-of-concept prototypes (one in pure C++ and one in Rust) before choosing Rust for this project. I, personally, thought that we would choose C++. I'm a big "choose boring technology" advocate. [1]

In other words, we tried hard to document a lot of before-the-fact rationalizations to go with the inevitable after-the-fact rationalizations. :-)

[1] https://mcfunley.com/choose-boring-technology

Sean-Der · on April 11, 2022

I do that all the time. Is anything wrong with that? Sometimes you go and build something to solve a specific problem, and realize it solves lots of different things.

I don't think it matters what the Daily team originally intended. It is exciting/inspiring to me that they are trying to solve things differently. I think security for RTC is really important, so seeing more memory safe language usage makes me hopeful for the future.

All this new code also means people are really learning what goes into these problems. A whole new generation of coders are getting into the space and rethinking things. It is great.

NoraCodes · on April 11, 2022

Why do you say that?

yolo69420 · on April 12, 2022

it's just kind of obvious from looking at the distribution of these kinds of posts. lots of people posting blogs about why they use x and y new technology with these objectively good reasons to do so, and then two years later everyone jumps on another pet project hype train.

it seems like these reasons to do something mysteriously only stay valid for the short period of time in which a technology has some kind of hype status, and quickly fade when people realize that it's actually not that much of an improvement in practice and the hassles (training devs in a new language, worse language ecosystem etc.) aren't actually worth it.

NoraCodes · on April 12, 2022

I'd be interested to see follow-up posts by these companies later. As an example, my former employer started using Rust in 2017 and is still happily and productively using it today, but they only put out a blog post when they started, and they haven't talked about it since.

Sean-Der · on April 11, 2022

What is `daily-core` actually doing? Is it signaling code?

Have you evaluated webrtc-rs yet? I haven't kept up to date with mediasoup, but I saw they have a Rust effort. Have you evaluated that yet?

Seems possible to drop lots of C++ dependencies :)

jpgneves · on April 11, 2022

Hi, I'm João, one of the co-authors of this post!

Regarding mediasoup, their Rust effort so far has focused mostly on the server side of things, whereas for this particular project we would be looking more towards the client side.

That would definitely be something we are considering contributing to, as it would, as you say, drop quite a significant amount of C++ (with additional benefits on the web side, too).

7sidedmarble · on April 11, 2022

I'm curious about the approach you guys are taking here. The client side of doing WebRTC is by far the easiest part. Maintaining separate client side implementations for mobile and web is... no harder than maintaining a web and mobile app separately anyways.

The server side is the hard part to work on. Also, one of the downsides of wasm is that it can be a pain to communicate out to the JS APIs, especially for doing lots of DOM manipulation. I'm not sure how sound this approach is really. It seems like you're going to do a lot of extra work in order to solve a problem that isn't much of a problem in the first place.

kwindla · on April 11, 2022

I would say that both the client and the media server (SFU) side of the work are challenging in their own ways, if you are trying to support a large variety of use cases, features, and sessions with large numbers of participants.

The client-side and server-side code end up being tightly coupled and you end up having a lot more client-side code than maybe is obvious if you're building an application that uses WebRTC in one specific way. For example, handling fast subscription to and un-subscription from batches of tracks is non-trivial, but important if you're implementing "grid mode" client views.
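To make the batching idea concrete, here is a small sketch of one way a client might compute a single batched update when the set of visible grid tiles changes. This is not Daily's implementation; `TrackId`, `SubscriptionBatch`, and `diff_subscriptions` are hypothetical names used only for illustration.

```rust
use std::collections::BTreeSet;

/// Hypothetical track identifier; a real SDK would use a richer type.
type TrackId = u32;

/// One batched update: tracks to start receiving and tracks to stop receiving.
#[derive(Debug, PartialEq)]
struct SubscriptionBatch {
    subscribe: Vec<TrackId>,
    unsubscribe: Vec<TrackId>,
}

/// Compares the currently subscribed tracks with the tracks that are now
/// visible and returns the minimal change as a single batch.
fn diff_subscriptions(
    current: &BTreeSet<TrackId>,
    visible: &BTreeSet<TrackId>,
) -> SubscriptionBatch {
    SubscriptionBatch {
        // Visible but not yet subscribed.
        subscribe: visible.difference(current).copied().collect(),
        // Subscribed but no longer visible.
        unsubscribe: current.difference(visible).copied().collect(),
    }
}

fn main() {
    // The user pages the grid from tiles 1-4 to tiles 3-6.
    let current = BTreeSet::from([1, 2, 3, 4]);
    let visible = BTreeSet::from([3, 4, 5, 6]);
    let batch = diff_subscriptions(&current, &visible);
    println!("{:?}", batch);
}
```

Sending one diff per grid change, rather than one request per track, is one plausible way to keep signaling traffic bounded; the hard parts described above (ordering, latency, coordination with the SFU) are outside what this sketch covers.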

The goal of the approach we're taking here is to be able to support a bunch of different platforms at the same level of performance, stability, and feature parity. Web, iOS, and Android are the three most important platforms. But people are also using WebRTC on Flutter, native Linux, macOS, Windows, Unreal, Unity, and various embedded platforms.

kwindla · on April 11, 2022

We're fans of webrtc-rs, but interop considerations pretty much mean that building on top of libwebrtc is necessary for our client libraries for now.

`daily-core` is signaling, the (always evolving) low-level trickery needed to scale calls to 10,000+ participants, state management, and an opinionated internal API that lets us efficiently maintain public-facing API features like track management and device selection.

mwcampbell · on April 11, 2022

Have you considered releasing your Rust bindings for libwebrtc separately, either as open source or as another commercial product? I could have really used that for a desktop application I started developing a few months ago. In the absence of such bindings, I went with Electron instead, reluctantly. And no, I don't think your high-level API would be a good fit; what I'm doing with WebRTC is too custom.

Edit: Also, webrtc-rs doesn't work for me either; AFAIK, it doesn't yet have enough of the media stack, including hard stuff like echo cancellation.

Sean-Der · on April 11, 2022

You should help out with the gaps in the media stack! That is the only way it will get better :) I don't believe echo cancellation is a hard problem either. I don't know the specific details, but I have heard this argument so many times.

People told me that unless you were a developer at a big company you can't build DTLS, SCTP and RTP Congestion Control either. Maybe the community implementations aren't as good yet, but I think it is a tortoise vs hare.

regexident · on April 12, 2022

> You should help out with the gaps in the media stack! That is the only way

As it happens I am the author of the audio-buffer and constraint-algorithm implementations in webrtc-rs/media: https://github.com/webrtc-rs/media/commits?author=regexident

So even though we are not using webrtc-rs in Daily today, we are (or at least I am) contributing to it, in hopes of it becoming a feasible option at some point.

yolo3000 · on April 11, 2022

Are there use cases for calls with so many participants?

vr000m · on April 11, 2022

One thing to consider is whether you could have one API and a few knobs to control the experience, without having to worry about the underlying protocol, reliability, and latency aspects needed to build that experience (and in particular without worrying about the shenanigans around tuning RTMP, RTP, HLS, or WebRTC).

There are some use-cases in every industry.

* Finance/Company: earnings calls as mentioned before, all-hands meetings at companies (with several people queuing for questions and answers)
* Live Events (music or talkshows)
* Tutoring or education in general.
* Healthcare and education -- Surgeries which are broadcast to several schools and have active collaboration from some doctors. The large-scale interaction can begin much earlier before the surgery.

VWWHFSfQ · on April 11, 2022

People have started turning to WebRTC for broadcast streaming (1->many) instead of HLS/DASH because of the lower glass-to-glass latency. But it has scalability problems which make it enormously expensive by comparison. There are low-latency variants of HLS and DASH emerging now, though.

dgunay · on April 11, 2022

WebRTC is some of the lowest latency streaming you can get, so I would imagine any realtime event with a large broadcasting audience would benefit. Interactive/2-way though, that's another story.

tebbers · on April 11, 2022

Company earnings calls with analysts? Livestreams?

regexident · on April 12, 2022

> Have you evaluated webrtc-rs yet?

Vincent here, fellow dev at Daily, working on daily-core and Daily for iOS. :)

We did evaluate webrtc-rs, but we found that it simply isn't yet where it would need to be in order to be a realistic replacement for libwebrtc.a today. As a matter of fact I am a member of the webrtc-rs org (https://github.com/orgs/webrtc-rs/people), having contributed the audio-buffer and constraint-algorithm implementations so far (https://news.ycombinator.com/item?id=31000261).

I personally have high hopes for being able to make the jump at some point. Being able to toss the monstrous `build.rs` that currently ties ninja/gn (libwebrtc), cmake (libmediasoupclient), clang (our own bridging code) and bindgen (for header imports) together and makes things work across multiple platforms would, on its own, almost be worth it for me. With Rust implementations of mediasoupclient and webrtc none of this hassle would be necessary. In pure Rust, stuff just works.