All streaming is not created equal
A deeper dive into the nuances and challenges of live streaming
With Super Bowl LV approaching this week, it's worth reminding ourselves that every year this event sets the US record for live concurrent streaming — and live concurrent streaming is hard. Really hard, especially at scale.
There are few teams in the world that can manage high-quality live streaming at scale, and the Super Bowl would be a test for most of them. Audiences expect to enter the stream immediately, experience the highest picture quality with minimal latency and zero buffering, and enjoy a seamless ad experience. All very fair consumer expectations 😎
At FOX, we delivered last year's Super Bowl LIV, setting records in terms of scale while innovating with the delivery of the first 4K HDR Super Bowl event. Delivering this technically for a live event is significantly different from video on demand (VOD) for a few reasons - 👀👇
1. The stampeding herd...especially if you lose them the first time
Kick-off time doesn't move, and the ramp in consumers entering a service is generally close to vertical. You can plan for the raw scale of video delivery, with Content Delivery Networks (CDNs) doing a lot of the heavy lifting, but you also need to scale all the supporting transactional services. These services - which determine whether a request is valid, whether a user is eligible to see the content, and gather information to inform the client how to render content - have different constraints and scaling concerns.
At FOX, we designed this tier as three micro-services, each handling tens of thousands of requests per second. Sure, companies like AWS — which is a significant partner of ours in delivering these high-scale events — make it easier to access this scale of infrastructure, but hardware alone doesn't solve this issue. Smart architecture does. At kick-off time, we enabled 900,000 new joiners into the stream in under 60 seconds (and many more after that).
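To make the shape of that tier concrete, here is a minimal sketch in Python (hypothetical names and numbers; this is not FOX's actual code) of one pattern that helps a transactional service absorb a vertical join spike: caching entitlement decisions for a short TTL, so the expensive backing call happens roughly once per user rather than once per request.

```python
import time

# Hypothetical sketch (not FOX's actual code): cache entitlement
# decisions for a short TTL so a vertical join spike hits the backing
# store roughly once per user instead of once per request.
CACHE_TTL_SECONDS = 30
_cache: dict[str, tuple[bool, float]] = {}  # user_id -> (decision, cached_at)

def lookup_entitlement_from_store(user_id: str) -> bool:
    """Stand-in for the expensive call (database, partner API, etc.)."""
    return True  # assume eligibility for the purposes of this sketch

def is_entitled(user_id: str) -> bool:
    now = time.monotonic()
    entry = _cache.get(user_id)
    if entry is not None and now - entry[1] < CACHE_TTL_SECONDS:
        return entry[0]  # serve the cached decision, no backing call
    decision = lookup_entitlement_from_store(user_id)
    _cache[user_id] = (decision, now)
    return decision
```

With a short TTL like this, the retries and heartbeats that pile up during a join spike are served from memory, and the backing store only sees the first request per user.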
Another related key principle for us was resiliency. At this scale, you are hitting new peaks and Murphy's law is sure to strike. You don't get a linear ramp-up in scale ahead of time to test and evaluate your architecture pre-event. Not only does the Super Bowl set new records, it demolishes previous ones. In our case, we more than 4x'ed our previous record on Super Bowl day.
To use a mountain climbing analogy (and to keep the math honest), rather than training at Denali to summit Mount Everest, this would be like training at Mount Baldy (the #52 peak in the USA) to summit Mount Everest. Reasonable scenarios for us included an AWS region going down, a CDN failing, our video stack falling over - the list went on and on, and these scenarios were central to the 18-month planning window that preceded the event.
Coming out of this planning came design decisions such as a multi-CDN architecture, redundant origins, and parallel independent video stacks running hot/hot. We planned a series of graceful degradations to optimize for live playback at the expense of all else. We also planned for this question:
What if we lose ALL traffic and need to get everyone back into the stream immediately?
That is the ultimate stampeding herd, and it drove our audacious requests-per-second goals. Thankfully, we didn't need to test this in production on game day. Still, we processed over 100 Jira tickets on game day as we hit new issues that we were able to handle without user-facing impact, thanks to the more than one dozen dry runs and tabletop exercises we ran in preparation. Resiliency requires both technical and operational planning.
A view of how Super Bowl viewers entered and remained in-stream in 2020
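One common client-side piece of the answer to that question (a sketch under assumptions, not a description of FOX's apps) is exponential backoff with full jitter, so that a million players dropped at the same instant don't all try to rejoin at the same instant:

```python
import random

# Sketch under assumptions (not FOX's client code): exponential backoff
# with "full jitter" spreads a mass rejoin across a window instead of
# letting every client retry at the same instant.
def rejoin_delay_seconds(attempt: int, base: float = 1.0, cap: float = 30.0) -> float:
    """Seconds to wait before retry number `attempt` (1, 2, 3, ...)."""
    ceiling = min(cap, base * (2 ** (attempt - 1)))
    return random.uniform(0.0, ceiling)  # full jitter: anywhere in [0, ceiling]

# Example: by the fifth attempt each client waits a random 0-16 seconds,
# turning a vertical retry spike into a spread-out ramp.
print([round(rejoin_delay_seconds(a), 1) for a in range(1, 6)])
```

With a cap like 30 seconds, a total-loss rejoin arrives spread across a window the transactional tier can absorb, rather than as a single vertical wall.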
2. Concurrent scale vs elongated demand
Live events, by nature, are viewed simultaneously by the audience. For large-scale events this creates massive, simultaneous bandwidth demand across every part of the network, and if this isn't managed proactively, the consumer experience is significantly impacted.
VOD is different — you pre-populate the network (especially the edge of the network) with the video assets, and when the time is right, you light up access to them. In general, those assets are consumed across a very elongated window of time, reducing the bandwidth drain and making quality of service easier to deliver.
Consumer behaviors around entertainment content in particular have dramatically changed. Today, most entertainment content, like FOX's hit Bob's Burgers, is consumed well after it airs live. This leads to greater overall viewership (more than double), but close to 80% of it is spread beyond the first week of availability.
Time-shifting in action
For something like the Super Bowl, you’re shifting the signal from the stadium through to the consumer in a matter of seconds and the network paths have to be clear and ready to accept concurrent demands for the stream.
The CDN marketplace is more vibrant than ever, with established players like Akamai and disruptors like Fastly competing heavily for business. Dynamic multi-CDN routing is now ready for prime time, and it's an approach we used at FOX for Super Bowl LIV. Depending upon performance, availability, and sometimes price, you dynamically pick the CDN path to send the stream down to the consumer. Just three years ago this would have been fraught with risk and prohibitive from a latency perspective, but today it really is the only way to carry off massive-scale events. With re-buffering ratio as a core input into our CDN decisioning engine, we kept the rate below 1% even as we hit peak concurrencies.
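As an illustration of the idea (the CDN names, metrics, and weights here are assumptions, not our production decisioning engine), the core of such an engine reduces to a scoring function over live per-CDN telemetry, with rebuffering weighted heavily:

```python
# Illustrative sketch only: the CDN names, metrics, and weights are
# assumptions. The idea matches the approach above: score each CDN on
# live telemetry and route the next session down the best path.
CDNS = {
    "cdn_a": {"rebuffer_ratio": 0.004, "availability": 0.9999, "cost_per_gb": 0.010},
    "cdn_b": {"rebuffer_ratio": 0.012, "availability": 0.9990, "cost_per_gb": 0.007},
}

def score(m: dict) -> float:
    # Lower is better: weight rebuffering heavily, availability gaps
    # moderately, and cost lightly.
    return 100 * m["rebuffer_ratio"] + 50 * (1 - m["availability"]) + m["cost_per_gb"]

def pick_cdn(cdns: dict) -> str:
    return min(cdns, key=lambda name: score(cdns[name]))

print(pick_cdn(CDNS))  # -> "cdn_a": lowest rebuffering wins despite higher cost
```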
But don't expect to get bandwidth on demand from the CDNs for an event like the Super Bowl — reserved capacity, where you guarantee a payment floor in return for a guarantee on capacity, is still the only way to ensure that the path will truly be available to you. This may change in the future, but for now capacity reservation is critical and can be expensive. For Super Bowl LIV we reserved >80 Tbps of CDN capacity ⚡️
3. Low tolerance to latency
FOX believes, without question, that low latency is the future and that live streaming can surpass even the speeds of cable and satellite. 'Live' is simply no good if you're minutes behind the action, yet until recently latency has been somewhat accepted when it comes to streaming. Perhaps more real-time networks like Twitter, or instantaneous push notifications with breaking news headlines and scores, have helped highlight how far behind live video experiences are, regardless of the platform you're viewing on. And as 5G and new consumer offerings in betting and wagering begin to gain penetration, you'll see the expectation of low latency spread into other live consumer services.
When it comes to live streaming, this issue is compounded significantly by the digital advertising marketplaces. It's still astounding how little quality of service there is in ad-serving, and the fragmented nature of the ad marketplaces means this is a tough technical issue to fix. The Hulu team has probably invested the most in their own video advertising infrastructure to work around this point, and it shows in the experience.
At FOX, we've put a lot of energy into reducing latency throughout our production and distribution chain — using the latest codecs, eliminating unnecessary hops, and investing in cloud-native platforms from the camera through to the consumer. For large-scale sporting events like NFL Sunday we also integrated ads into the production stream itself, preserving the consumer experience and ensuring that our advertising clients gain maximum reach across both linear and digital platforms. On game day, end users experienced latency of roughly 8-12 seconds behind the feed from Master Control, and we're currently experimenting with streams that have less than 1 second of latency, so watch this space 🚀
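A rough latency budget shows where those seconds go in a classic segmented-streaming pipeline (the numbers below are illustrative assumptions, not measurements of our actual pipeline):

```python
# Back-of-the-envelope latency budget for a classic segmented stream.
# Numbers are illustrative assumptions, not measurements of FOX's pipeline.
encode_and_package = 2.0  # seconds: capture, encode, package
cdn_propagation    = 1.0  # seconds: origin to edge
segment_duration   = 3.0  # seconds per segment
player_buffer      = 3    # segments a typical player holds before playback

total = encode_and_package + cdn_propagation + segment_duration * player_buffer
print(f"~{total:.0f}s behind the live feed")  # ~12s, consistent with the 8-12s above
```

The player-buffer term dominates, which is why cutting segment duration and moving to chunked low-latency variants of HLS/DASH is the path to the sub-second experiments mentioned above.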
4. Additional overhead
Dynamic advertising insertion isn't the only requirement that complicates live media delivery. Encryption, authentication, and authorization services are critical to protecting the investment in content and sports rights, and if executed well they don't impact the consumer experience. But implemented poorly for massive-scale live events, they'll bring the entire service to a standstill.
Every large-scale live streaming operator has rip cords that are frequently pulled to work around these issues — the reality is that significant technology innovation is still required to improve robustness and reduce the friction consumers experience when logging in to live services. Existing technologies are chatty, impacting the performance of the network, and can be extremely unreliable at scale.
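One widely used way to cut that chattiness (a sketch of a general technique, not of FOX's or any partner's implementation) is to sign a short-lived token once at session start and let the CDN edge validate it locally, so no auth round trip is needed per segment request:

```python
import hashlib
import hmac
import time

# Sketch of a general technique (an assumption, not a specific vendor's
# design): sign a short-lived token once at session start; the CDN edge
# validates it locally, with no per-request auth round trip.
SECRET = b"rotate-me-regularly"  # hypothetical key shared with the edge

def issue_token(user_id: str, ttl_seconds: int = 300) -> str:
    expires = int(time.time()) + ttl_seconds
    payload = f"{user_id}:{expires}"
    sig = hmac.new(SECRET, payload.encode(), hashlib.sha256).hexdigest()
    return f"{payload}:{sig}"

def validate_token(token: str) -> bool:
    user_id, expires, sig = token.rsplit(":", 2)
    expected = hmac.new(SECRET, f"{user_id}:{expires}".encode(),
                        hashlib.sha256).hexdigest()
    return hmac.compare_digest(sig, expected) and time.time() < int(expires)

token = issue_token("viewer-123")
print(validate_token(token))  # True until the token expires
```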
We're embarking on two exciting projects at FOX to really leap forward in this particular area. First, we are working with an early-stage company called Eluv.io, leveraging blockchain to manage and grant access to live content across the network. So far, this work has proven to be far more resilient than traditional encryption and rights management techniques and has dramatically reduced latency.
Alongside this, we’re embarking on a strategic journey with Okta to transform our entire approach to consumer authentication. Better known for their enterprise offerings, they’re now turning their attention to the consumer journey. This project is still early days, but we’re excited about building the next generation of consumer authentication with the Okta team — if only to prevent their co-founder, Todd McKinnon, from complaining on Twitter (rightly so) 😜👇
Todd McKinnon, co-founder at Okta, speaks on behalf of many consumers when highlighting friction
5. Quality, notably delivering 4K HDR content for live premium sports streaming
All of the above is compounded further when you overlay the growing expectation from consumers that 4K HDR is table stakes for premium content. At FOX, our first 4K streaming event was the FIFA Women's World Cup in the summer of 2019. We used this event as the forerunner to Super Bowl LIV — stress-testing approaches and architectures that we needed to have confidence in for 2020.
4K was one of those experiments, and we quickly learned two things: consumer feedback was off-the-charts positive, and with careful planning the quality of service around a 4K stream held up well.
Since that experiment we’ve evolved from 3% of devices accessing the 4K stream to well over 20% in one of the most recent NFL games on FOX. And of course, we were the first to stream the Super Bowl in 4K in 2020 — where 14% of our streams were delivered in 4K. A major milestone when you consider the scalability challenges around delivering that event.
A sampling of events showing the 4K growth as a % of stream takers over the last 18 months.
Of course, technical purists will debate what resolution a 4K stream actually is, but almost all of the streaming services offering 4K deliver streams of around 20 Mbps, compared with HD streams of around 5 Mbps. Four times the bandwidth, delivered to 20% of your streaming audience, with all of the complications outlined above, still makes live 4K difficult - very difficult at scale. We don't doubt that this is one reason why CBS-Viacom has chosen not to stream Super Bowl LV in 4K later this week 😢
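Back-of-the-envelope math makes the point (the peak-concurrency figure below is an assumption for illustration; the bitrates and the 14% 4K share come from the numbers above):

```python
# Rough math on why live 4K is hard at Super Bowl scale. The peak
# concurrency is an assumed figure for illustration; the bitrates and
# the 14% 4K share come from the numbers above.
concurrent_streams = 3_000_000
share_4k = 0.14
mbps_4k, mbps_hd = 20, 5

avg_mbps = share_4k * mbps_4k + (1 - share_4k) * mbps_hd  # ~7.1 Mbps per stream
total_tbps = concurrent_streams * avg_mbps / 1_000_000
print(f"~{total_tbps:.1f} Tbps at peak")  # ~21.3 Tbps of concurrent egress
```

Seen against that, the >80 Tbps of reserved CDN capacity mentioned earlier looks less like extravagance and more like prudent headroom for failover and audience growth.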
On that note, we do wish our friends over at CBS-Viacom all the best this week when the stampeding herd hits that play button shortly before kick-off… we're sure it'll be another record-setting live streaming event for the US market 📈 🏈🙌
About FOX
To achieve seamless live streaming, we have a strong internal engineering team that oversees our streaming platforms, products, and services. We recently announced the appointment of Varun Narang to head up our efforts in this area. Varun spearheaded the largest-scale live streaming platform in the world, Hotstar, and joins us in Los Angeles later this year. For those of you who are interested, an IPL cricket final in India is about five times the scale of a Super Bowl in terms of concurrency 🏏
We have a multitude of partners that help us power our platform, but our approach has always been to invest heavily in the engineering and software talent to put these partners to work. Oh, and we’re hiring 😀