This is exciting. About 5 years ago I tried to set up a homemade Sonos clone using two Raspberry Pis to stream synchronised audio across my network. I did get it working, but it was a huge hassle finding the right combination of PulseAudio configuration flags, and I had to set up a dedicated wireless network to handle the bandwidth, because it fell over if I used compression. I figured the best approach would be to write a PulseAudio module in C, but I never had the time or skill to do it.
I've built something similar at home with three RPis and Snapcast [0]. It has an integration with Librespot [1] that shows it as a Spotify destination. It works really well!
So far I have mostly tested Roc on several 2.4 GHz Wi-Fi networks. You can usually expect 100-500 ms in this case (depending on the network). See "Typical configuration" in this article [1].
Most likely you will be able to achieve lower latencies on a better channel, but I haven't done any serious testing at lower latencies yet. This is on my to-do list.
You can choose the target latency. Presumably, the larger that value, the less effect dropped packets and network jitter have on the quality of the output.
You can also configure the FEC block size (it should be smaller than the target latency), the length of network packets, the length of internal audio frames in the pipeline, and the resampler window. And also the I/O latency, e.g. the PulseAudio buffer size. So basically you can configure all (or almost all) parameters that can affect the resulting latency.
I'll document these parameters and their configuration a bit later. (Currently you can find all of them in the man pages and in the API, but there is no overview page that explains how exactly they affect the total latency.)
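In the meantime, here is a small self-contained C sketch of how these knobs relate. It is not the Roc API; the struct fields and the rough budget formula are just illustrative stand-ins for the parameters listed above (target latency, FEC block size, packet and frame lengths, I/O latency) and for the constraint that the FEC block has to fit inside the target latency.

    #include <stdio.h>

    /* Illustrative stand-in for the tunable knobs described above.
     * These are NOT the real Roc API names. */
    typedef struct {
        unsigned target_latency_ms; /* session latency on the receiver */
        unsigned fec_block_ms;      /* FEC block; must fit inside target latency */
        unsigned packet_ms;         /* length of one network packet */
        unsigned frame_ms;          /* internal pipeline frame */
        unsigned io_latency_ms;     /* sound-card / PulseAudio buffer */
    } stream_params;

    /* Very rough (assumed) end-to-end budget: the target latency dominates,
     * one pipeline frame and the I/O buffers on both ends add on top. */
    static unsigned latency_budget_ms(const stream_params *p)
    {
        return p->target_latency_ms + p->frame_ms + 2 * p->io_latency_ms;
    }

    int main(void)
    {
        stream_params p = {
            .target_latency_ms = 200, /* plausible value for 2.4 GHz Wi-Fi */
            .fec_block_ms      = 100,
            .packet_ms         = 5,
            .frame_ms          = 10,
            .io_latency_ms     = 20,
        };

        if (p.fec_block_ms >= p.target_latency_ms) {
            fprintf(stderr, "FEC block must be smaller than the target latency\n");
            return 1;
        }

        printf("rough end-to-end latency: ~%u ms\n", latency_budget_ms(&p));
        return 0;
    }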
I once wanted to do something similar, so I wrote my own realtime audio streaming tool [0]; the UX is more akin to a foot-gun than to the polished (and much more feature-complete) Roc, but it does have excellent latency properties, since it runs at the absolute ragged edge of what it can.
You could connect multiple clients to stream music to a single server (or multiple servers if you wanted to); the server kept a list of pending audio buffers for each client, mixing them all together into a ring buffer that got spat out to PortAudio. If a client underran, it simply missed that ring buffer rotation, and it "fell behind" by one buffer length (we request the minimum latency from PortAudio, so this is usually measured in single-digit milliseconds). That would cause a few crackles and pops in the first second or two as the natural jitter of the network caused the client to underrun a few times, but then it would stabilize. (This didn't bother me much, as I was usually playing silence when I first connected it anyway.)
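A minimal sketch of that mixing idea, with made-up names and without the actual network or PortAudio callback plumbing: each client owns one pending frame, and anything that didn't arrive in time is simply left out of the current rotation.

    #include <stdint.h>
    #include <stdio.h>
    #include <string.h>

    #define FRAME_SAMPLES 256  /* one ring-buffer rotation, ~5 ms at 48 kHz */

    /* Per-client state: one pending frame plus a flag saying whether
     * the client delivered audio in time for the current rotation. */
    typedef struct {
        int16_t pending[FRAME_SAMPLES];
        int     ready; /* 0 = underrun: skip this rotation, client falls behind */
    } client_t;

    /* Mix whatever arrived in time into the buffer handed to the audio
     * backend (in the real tool, the PortAudio callback buffer). Clients
     * that missed the deadline are simply left out of this rotation. */
    static void mix_rotation(client_t *clients, int n_clients, int32_t *out)
    {
        memset(out, 0, FRAME_SAMPLES * sizeof(*out));
        for (int c = 0; c < n_clients; c++) {
            if (!clients[c].ready)
                continue;                  /* fell behind by one buffer length */
            for (int i = 0; i < FRAME_SAMPLES; i++)
                out[i] += clients[c].pending[i];
            clients[c].ready = 0;          /* consumed; wait for the next packet */
        }
    }

    int main(void)
    {
        client_t clients[2] = {0};
        int32_t  out[FRAME_SAMPLES];

        /* Pretend client 0 delivered a frame and client 1 underran. */
        for (int i = 0; i < FRAME_SAMPLES; i++)
            clients[0].pending[i] = (int16_t)(i % 128);
        clients[0].ready = 1;

        mix_rotation(clients, 2, out);
        printf("first mixed samples: %d %d %d\n",
               (int)out[0], (int)out[1], (int)out[2]);
        return 0;
    }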
In my experiments, the overall system latency was pretty close to the perceptual limit (I would estimate around 10 ms), streaming over Wi-Fi from my laptop to a Raspberry Pi.
> Research. Learn to measure the full network latency, test Roc on different network types and conditions, determine the minimum possible latency that we can handle on different channels.
It sounds like latency is unknown, but my guess is that it's comparable to RTP, plus a few extra milliseconds for error correction and some constant buffering latency.
I'm including Dante/Ravenna/Livewire in that question.
My guess is that this is more of an application toolkit for home/consumer applications than for professional use, but a lot of the problems they have listed on their roadmap are already solved by a family of open* standards. The name escapes me, but there's an industry initiative to provide FOSS APIs on top of web tech to build these kinds of applications. An engineering manager from Fox/Disney gave a talk about it last month (maybe it was March?) at the LA SMPTE meeting. Lots of money is going into this domain.
* with an AES membership, but you should have that for this kind of project anyway
AES67 is just an interoperability standard; it's more of a device-level protocol than anything. You buy AES67-compatible gear for your application, then use the vendors' tools like Dante Virtual Soundcard (so you can essentially treat the networked audio system as a normal soundcard on your machine through CoreAudio/WASAPI/JACK/etc.).
It's actually pretty great; most of the time there's no need for a separate API just to handle streaming. It "just works."
Kinda? You need hardware at some point. AES67 was all about creating an open protocol for connecting different proprietary stuff, and frankly there's only a handful of places where I've seen open hardware worth its salt in audio. If you need high capacity, low latency audio over networked machines, you're going to need proprietary hardware/software in the chain somewhere.
I'll definitely be giving this a go!