Ultra Low Latency Video Streaming: The Current State

At Zender.tv we specialise in interactive livestreaming. Ultra Low Latency Streaming is very exciting for us: audience engagement increases when viewers feel they are heard or can make an immediate difference. No more awkward video delays or spoilers via social media, but almost instant feedback and interaction with the presenter and the show.

You don't have to take our word for it: just look at the many social streaming platforms like Twitch, YouTube, Facebook and Periscope. They are all adding low latency options to increase audience engagement. Mixer, a game streaming solution acquired by Microsoft, takes pride in providing ultra low latency as an industry game changer (pun intended).

So... where does this leave the 'non-platform streamers' aka broadcasters?

I decided to see what low latency solutions are available on the market today. Many thanks to Wowza, Akamai, IrisPlatform, NetInsight, Nanocosmos and Phenix for allowing us to evaluate their Ultra Low Latency solutions.

Before the rise of Low Latency, a broadcaster typically had their own streaming solution in place, and had often even invested in their own video player. In that case we would simply integrate one of the excellent standard players into our product and add the interactivity layer on top of it.

While this worked nicely, show producers always had to take into account the possible lag between the time something happened in the studio and the time the audience saw it at home on their devices. This lag has many names in the industry: 'livestream latency', 'glass-to-glass latency' or 'hand-waving latency' (see video below).

The number of seconds it took for the 2016 presidential debate to reach the audience.

To make matters even more complicated, this latency could vary considerably between viewers, causing video player drift: one viewer might see their favourite team win while another only sees the same action 5 seconds later. An interesting Mashable infographic on the 2016 presidential debate shows that it's even worse with different broadcasters and social platforms in the mix: Facebook streaming was close to the latency of cable distribution, and the spread between regular TV networks ranged from 10 to 55 seconds.

Also watch Anders Cedronius's (Net Insight) talk showing the delays in broadcast networks.

Latency sucks

A variety of solutions have been used in the past to 'synchronise' viewers: some added extra metadata to the video signal to know how far a device is behind. Others used audio fingerprinting to 'listen' to where the user is in the video. These solutions were merely stopgaps for the real problem: high latency.

The Protocol Parade

The current most common web streaming protocols are HLS (m3u8) and MPEG-DASH. These protocols are supported by all the major Content Delivery Networks (CDNs) and video players, on both web and mobile devices. However… they are not optimised for interactivity but for mass distribution, much like traditional television broadcasting.

Will Law on low latency streaming

Interestingly enough, the first internet streaming protocols had much lower latency than the current web streaming protocols: RTSP, SDLP and most notably RTMP. RTMP was developed as part of Adobe Media Services and became a de facto standard.

And then… lawyers & patents… Adobe claimed ownership and a fierce industry battle followed. According to Will Law (Akamai) this mainly impacts the US and Europe, as there are open source RTMP implementations in Asia:

Playing RTMP in the browser required the Adobe Flash player (and a license). Last year Flash received a deadly blow when Chrome decided to kill support, for many reasons - not least the many security flaws in the Flash Player.

Who will win the race to deliver ultra low latency streaming?

Wowza powers ultra low latency streaming applications in a plug-in free environment.

Everyone wanted an HTTP-friendly protocol rather than a proprietary one. The industry needed to find a new solution... and so the race began: who can stream and deliver ultra low latency at scale?

To demonstrate the difference between the old and new, have a look at the timing in this Ultra Low Latency Streaming Service demo video made by our friends at Wowza:

The video nicely demonstrates the time between the first hand wave and its appearance in the player. Ultra low latency clocks in at around 2 to 4 seconds, while the untuned HLS stream takes as much as 30 seconds!

Current options for ultra low latency streaming

[Image: The current state of Ultra Low Latency streaming]

Now that we've established the problem and a bit of streaming history, let's see what the current options are for ultra low latency streaming.

The process of delivering a video stream looks like this:

 

  • The encoder takes the actual video signal and converts it into a digital format. Typically the encoder uses MPEG, RTMP or HEVC as the digital format.

  • The original digital format is often converted into several other digital formats. For example, using a transcoder, an RTMP ingest signal can be converted into both MPEG-DASH and HLS, and different qualities (mobile, desktop, TV) can be generated.

  • The CDN delivers these digital formats to the viewers. Sometimes the CDN provider offers an integrated transcoder in their cloud solution.

  • On the end-user device, the video player takes the digital format and shows the video on the user's screen.
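As a mental model, the chain can be sketched in a few TypeScript interfaces. This is purely illustrative: the type names are ours, not from any real SDK, and only label the stages described above.

```typescript
// Purely illustrative model of the delivery chain - these names label the
// stages above and do not correspond to any real SDK.
type Format = "RTMP" | "HLS" | "MPEG-DASH";
type Quality = "mobile" | "desktop" | "tv";

interface Encoder {
  // Camera/studio signal in, digital ingest format out.
  encode(rawSignal: ArrayBuffer): { format: Format; data: ArrayBuffer };
}

interface Transcoder {
  // One ingest format fans out into several delivery formats and qualities.
  transcode(
    ingest: { format: Format; data: ArrayBuffer },
    targets: Array<{ format: Format; quality: Quality }>
  ): Array<{ format: Format; quality: Quality; data: ArrayBuffer }>;
}

interface Cdn {
  // Distributes the delivery formats to a large number of viewers.
  publish(stream: { format: Format; quality: Quality; data: ArrayBuffer }): void;
}

interface Player {
  // Runs on the end-user device and renders the stream.
  play(format: Format): void;
}
```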
     

Most of the innovation happens in the delivery protocol. The basic requirements are:

  • video is delivered as fast as possible from encoder to player
     

  • video distribution is cost efficient even when delivering to a high number of people
     

  • ideally it uses open standards so it can work on all devices without extra technology
     

Delivery protocols: towards a new standard?

We mainly see four options (or a mixture) being used by Ultra Low Latency providers:

1. Use one of the old technologies like RTMP

2. Enhance the existing protocols like HLS/MPEG-DASH (CMAF)

3. Create a new protocol based on websockets/http2 to deliver video frames to the browser

4. Use WebRTC, the new kid on the block

Option 1: RTMP is dead, long live RTMP

The first option (using RTMP) is still a valid choice on mobile devices: Flash was only needed to play RTMP streams in the browser, not in native mobile apps. It might be old but it's proven technology: HQ Trivia, for example, currently uses a mixture of RTMP/RTSP to stream their quiz through a proprietary video player. But as RTMP is dying in the browser, fewer and fewer CDNs are delivering RTMP streams, so it seems like a dead end for the future.

Option 2: HLS & Mpeg-Dash enhanced

The second option, enhancing HLS & MPEG-DASH, seems to be the most open-standards-oriented solution. In a nutshell: to make it low latency, you reduce the size of the video segments sent to the player.

Apple used to have a default of 8-second segments for HLS; recently its advised segment size went down to 2 seconds. By reducing the segment size even further (100-200 milliseconds), video players can now reach much lower latency. This approach is sometimes referred to as Fragmented MP4. Periscope championed it and was the first to deliver Low Latency HLS. For more details, watch Mark Kalman's Periscope LHLS Media Streaming video below, and read the blog post Introducing LHLS Media Streaming by Mark Kalman, Geraint Davies, Michael Hill, and Benjamin Pracht.
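To get a feel for why segment size dominates latency: players typically buffer a few segments before starting playback. A back-of-the-envelope sketch (the three-segment buffer is a common rule of thumb, not a spec requirement):

```typescript
// Rough lower bound on player-side latency: players commonly buffer ~3
// segments before starting, so latency >= bufferedSegments * segmentDuration
// (ignoring encoding, upload and CDN time).
function minPlayerLatencySeconds(segmentSeconds: number, bufferedSegments = 3): number {
  return segmentSeconds * bufferedSegments;
}

console.log(minPlayerLatencySeconds(8));   // 24 s  - the old 8-second default
console.log(minPlayerLatencySeconds(2));   // 6 s   - the current advice
console.log(minPlayerLatencySeconds(0.2)); // 0.6 s - 200 ms chunks
```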

Because broadcasters had to deliver two types of streams (HLS and MPEG-DASH) for different players, a new standard, CMAF, was defined. It is a standard originally agreed upon by Apple and Microsoft for streaming: it combines the two currently most used protocols, HLS and MPEG-DASH, and allows the same media segments to be used to deliver both types of streams. It's well documented and already supported by all Apple devices today.

[Image: Segmented stream startup]

Presentation by Romain Bouqueau (GPAC Licensing) and Nicolas Weil (then Akamai, now AWS)

For more detail, see Will Law's (Akamai) excellent presentation (see higher) and the Paris Video meetup talk on low-latency streaming using CMAF chunks:

The current solution uses HTTP/1.1 chunked encoding to deliver the segments. This can be further enhanced by using websockets, HTTP/2 and even moving from TCP to UDP; the latter is crystallizing in another standard called QUIC (see Mattias Geniar's nice write-up on it).
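In the browser, such chunk-by-chunk delivery can be consumed with the Fetch API's streaming body, which hands data over as it arrives instead of waiting for the full segment. A minimal sketch (the URL is a placeholder):

```typescript
// Minimal sketch: read an in-progress segment chunk by chunk as the server
// flushes it using HTTP/1.1 chunked transfer encoding.
async function readChunkedSegment(
  url: string,                          // placeholder, e.g. a CMAF chunk URL
  onChunk: (chunk: Uint8Array) => void  // hand partial data to the player
): Promise<void> {
  const response = await fetch(url);
  if (!response.body) throw new Error("streaming response bodies not supported");
  const reader = response.body.getReader();
  for (;;) {
    const { done, value } = await reader.read();
    if (done) break; // the server finished the segment
    if (value) onChunk(value);
  }
}
```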

Option 3 : Let's be creative with websockets

To deliver data to your browser with low latency, there is a standard protocol called websockets. In essence it's a bi-directional connection between your browser and the server, ideal for sending & receiving packets without too much network overhead.
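A minimal sketch of how this is typically used for video: receive binary fMP4 chunks over a WebSocket and append them to a Media Source Extensions buffer. The endpoint and codec string below are made up for illustration; they do not describe any vendor's actual protocol.

```typescript
// Illustrative only: feed binary fMP4 chunks from a WebSocket into MSE.
function playOverWebSocket(video: HTMLVideoElement): void {
  const mediaSource = new MediaSource();
  video.src = URL.createObjectURL(mediaSource);

  mediaSource.addEventListener("sourceopen", () => {
    const buffer = mediaSource.addSourceBuffer(
      'video/mp4; codecs="avc1.42E01E, mp4a.40.2"' // example codec string
    );
    const socket = new WebSocket("wss://example.com/live/stream"); // hypothetical endpoint
    socket.binaryType = "arraybuffer";
    socket.onmessage = (event) => {
      // A production player would queue appends while buffer.updating is true.
      if (!buffer.updating) buffer.appendBuffer(event.data as ArrayBuffer);
    };
  });
}
```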

Therefore it's only logical that companies like Wowza used websockets to create their own new protocol. They were already using their own delivery protocol, WOWZ, inside their CDN, and extended it for delivery from the edge to the browser.

Presented at the Demuxed 2017 conference by Scott Kellicker and Jamie Sherry - 3 Second End to End Latency At Scale:

[Image: Websockets at Wowza]

It's clear that technically this is very similar to option 2, but with a proprietary protocol. But who knows, it might result in an open standard.

Truth is, they are almost out of the preview phase of their solution and are the first big CDN to deliver Ultra Low Latency for the masses.

[Image: H5Live workflow]

Nanocosmos, on the other hand, is already fully operational. They apply a hybrid approach based on the device: even though the websockets part is not standard, the protocols are.

 

Option 4: The future is WebRTC?

WebRTC is the new kid on the block. Created for low latency by design, it works much more like conferencing tools than livestreaming services. As opposed to HLS and similar protocols, it trades streaming quality for low latency, possibly resulting in garbled images.

As it is slowly being implemented in newer browser versions and native devices, WebRTC offers low latency to the same extent as RTMP, with better media quality. Beyond compatibility, there is currently a lack of support from the major CDN parties to deliver it at scale. But that doesn't mean it's not used: some parties have implemented their own CDN to overcome this problem.

Another thing to note is that WebRTC can easily combine the transfer of data/metadata with the video stream. This enables frame-accurate events to be triggered in the browser, to synchronize things outside the video content.
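A sketch of that combination: render the incoming video track and listen on a data channel for timed events. Signaling (the offer/answer exchange) is omitted, and the event payload is a made-up example:

```typescript
// Illustrative sketch: a WebRTC video track plus a data channel carrying
// timed metadata events. Signaling is omitted for brevity.
function watchWithEvents(video: HTMLVideoElement): RTCPeerConnection {
  const pc = new RTCPeerConnection();

  pc.ontrack = (event) => {
    video.srcObject = event.streams[0]; // render the incoming live stream
  };

  pc.ondatachannel = (event) => {
    event.channel.onmessage = (msg) => {
      // Hypothetical payload, e.g. {"type":"poll-open","at":123.4}
      const metadata = JSON.parse(msg.data);
      console.log("synchronized event", metadata);
    };
  };

  return pc;
}
```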

Send a link, get people to join you in your livestream

Phenix is a vendor that offers a scalable interactive broadcasting solution, including synchronisation.

A nice touch with Phenix is that not only can you get everything to play synchronised on every device, you can also send a link to invite people into your livestream.

This makes your stream social: you can watch with friends, invite a guest on a show, bring in a commentator...

 

No article about Ultra Low Latency is complete without the Wowza infographic that defines the thresholds for latency.

[Image: Ultra Low Latency Streaming for When Every Second Counts (Wowza)]

Video Player Support for Ultra Low Latency Streaming

"With great video protocol comes great video player" - Uncle Video

As the new Ultra Low Latency protocols emerge, video players need to support the new streaming formats. If you go with one of the enhanced standards, chances are that protocol support will end up magically in one of your existing players (JwPlayer, TheoPlayer), though it might take a while for the standards to settle down.

How to evaluate new video players:

While evaluating the new players, the following aspects are worth drilling down into:

Device & Browser & Network Compatibility:

The first thing to determine is the kind of devices you need to support for playback:

1. web browser (desktop): Chrome, Firefox, Safari, Internet Explorer, Opera

2. web browser (mobile): Mobile Safari, Chrome on iOS and Android

3. mobile applications: via an sdk iOS/Android/Windows

4. other appliances: Xbox, Apple TV, Samsung TV, ...

Browser support is usually pretty good. On the mobile applications side you would probably have to go for iOS 10/11 and Android 4.4+ support.

Of all the ultra low latency providers, we found only Nanocosmos, with its H5Live solution, able to stream on all browsers and devices, including Mobile Safari! All the others point out that Apple's Mobile Safari does not support Media Source Extensions, a standard allowing the handling of video & audio elements in code (see CanIUse graphic below). For now your best bet is to refer Apple mobile web users to either install your app or use Chrome for iOS.
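In practice you can feature-detect Media Source Extensions and route unsupported browsers to a fallback. A simple sketch:

```typescript
// Feature detection: use an MSE-based low latency player where supported,
// otherwise fall back (e.g. native HLS playback in Mobile Safari's <video>).
function supportsMse(): boolean {
  return typeof window !== "undefined"
    && "MediaSource" in window
    && MediaSource.isTypeSupported('video/mp4; codecs="avc1.42E01E"');
}

function pickPlayback(): "low-latency-player" | "native-hls-fallback" {
  return supportsMse() ? "low-latency-player" : "native-hls-fallback";
}
```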

The last thing to check is whether the protocol is 'firewall friendly'. Some corporate networks or ISPs block certain network ports. Be sure to check if there is proxy support for the protocol or, if needed, a fallback protocol such as HLS or audio-only.

Overall, during our tests we found the latency of the new solutions to be consistently between 2 and 4 seconds (in the real world). What a huge improvement over the traditional 15+ seconds! All players integrated nicely with our interactive livestreaming solution Zender.tv.

Buffering & Drift control:

With all the talk about latency we have ignored a few other problems: buffering and drift.

Nothing is more annoying than the spinner of death kicking in during your favourite game or show.

First of all, you need to advise your viewers to be on a good network connection. Ultra low latency livestreaming cannot buffer much, because buffering would introduce delay.

We find it good practice if the player/protocol implements the following (a sketch follows the list):

  • detailed buffer stats so we can monitor the 'health' of the stream
     

  • detection of buffer starvation so we can at least show a nice message
     

  • detection of drift so we know a player is falling behind live
     

  • control of the 'catch-up strategy':

      • sometimes it makes sense to just skip to the latest video frames

      • other times it makes sense to slightly increase playback speed
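Here is a sketch of what drift detection and a catch-up strategy could look like on a plain HTML5 video element. The thresholds are arbitrary examples, and the 'live edge' is approximated by the end of the buffered range:

```typescript
// Illustrative drift control: approximate the live edge by the end of the
// buffered range, then either skip ahead or speed up slightly to catch up.
function controlDrift(video: HTMLVideoElement): void {
  setInterval(() => {
    const buffered = video.buffered;
    if (buffered.length === 0) return; // buffer starvation: show a message instead
    const liveEdge = buffered.end(buffered.length - 1);
    const drift = liveEdge - video.currentTime;

    if (drift > 5) {
      video.currentTime = liveEdge - 0.5; // large drift: skip to the latest frames
    } else if (drift > 2) {
      video.playbackRate = 1.05;          // small drift: catch up gently
    } else {
      video.playbackRate = 1.0;           // on time: play normally
    }
  }, 1000);
}
```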

Obviously drift and buffering are reduced by adapting the video quality to the network speed. Not all protocols have automatic multi-bitrate selection, requiring you to implement your own bandwidth checking and video quality selection algorithm. On a similar note, not all devices/browsers accept the same audio sample rate: some will only work with 44.1 kHz, others with 48 kHz. So if your audio sounds garbled, it might be worth checking.
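If you do have to roll your own selection, the basic idea is to estimate throughput from recent downloads and pick the highest rendition that fits with some headroom. A naive sketch with a made-up bitrate ladder:

```typescript
// Naive bitrate selection: pick the highest rendition whose bitrate fits
// within the measured throughput, leaving headroom for network jitter.
interface Rendition { name: string; bitrateKbps: number; }

const ladder: Rendition[] = [ // hypothetical ladder, highest first
  { name: "tv", bitrateKbps: 4000 },
  { name: "desktop", bitrateKbps: 1500 },
  { name: "mobile", bitrateKbps: 500 },
];

function pickRendition(measuredKbps: number, headroom = 0.8): Rendition {
  const budget = measuredKbps * headroom;
  return ladder.find(r => r.bitrateKbps <= budget) ?? ladder[ladder.length - 1];
}
```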

We want to give a special shoutout to NetInsight for being able to do true frame-accurate synchronised playback across all connected players.

Conclusion

These are exciting times for content providers. Ultra Low Latency enables new digital formats where audiences can be truly engaged and be part of the action. Unidirectional broadcasting makes room for interaction-focused broadcasting. With Zender.tv we can help you make that transition and provide out-of-the-box experiences your audiences will love.

Patrick Debois