Why Network Capacity Isn’t Enough: Caching Strategies for Better Streaming

To keep up with the growing volume of media content, the Verizon Media, now Edgio Media, Platform has invested in expanding our global cache footprint. In 2019 alone, we added more than 25 Tbps of capacity, seven global PoPs, and close to 900 last-mile connections. While effective at improving performance, raw capacity is not enough, nor is it a sustainable business model for meeting the ever-growing global demand for streaming content.

To maximize the availability of our network capacity, we invest equally in technologies, processes, and tools that keep operational and infrastructure costs in check. Our research team continually pushes the boundary of caching technologies, applying and refining processes to give our network operators granular control over how, when, and where content is cached.

Modern Caching Strategies

The goal of any caching strategy is to keep the most popular content in the cache while quickly and efficiently removing the less popular content. Over the years, researchers and software developers have devised countless strategies intended to solve the caching challenge. These range from relatively simple to extremely complex. Some of the more popular strategies include:

  • Least recently used (LRU)
  • Least frequently used (LFU)
  • First in, first out (FIFO)

It would be convenient if there were a single caching strategy to rule all situations. However, such a solution has yet to be developed, and the effectiveness of a particular strategy can vary greatly depending on server and disk sizes, traffic patterns, and other factors. Based on extensive testing, we’ve found that LRU offers the best compromise between hit rate and disk I/O, providing 60% fewer writes than FIFO while maintaining high hit rates. Additionally, for the disk sizes used in our CDN, LRU performs on par with more complex policies like S4LRU (Quadruply-segmented LRU). You can get more detail in this paper we published last year at the Passive and Active Measurement Conference (PAM) held in Puerto Varas, Chile.
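To make the difference between LRU and FIFO concrete, here is a minimal sketch of both policies run against a tiny, made-up request trace. The class names, the trace, and the capacity of two items are assumptions chosen for illustration; this is not the CDN's implementation, and a toy trace says nothing about production hit rates.

```python
from collections import OrderedDict


class LRUCache:
    """Toy cache that evicts the item accessed least recently."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = OrderedDict()  # oldest entry first

    def request(self, key):
        """Return True on a hit; on a miss, admit the item and return False."""
        if key in self.items:
            self.items.move_to_end(key)  # a hit refreshes recency
            return True
        if len(self.items) >= self.capacity:
            self.items.popitem(last=False)  # evict the least recently used item
        self.items[key] = True  # on a real server, a miss implies a disk write
        return False


class FIFOCache:
    """Toy cache that evicts the item admitted first, ignoring later hits."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = OrderedDict()  # oldest admission first

    def request(self, key):
        """Return True on a hit; on a miss, admit the item and return False."""
        if key in self.items:
            return True  # a hit does not change the eviction order
        if len(self.items) >= self.capacity:
            self.items.popitem(last=False)  # evict the oldest admission
        self.items[key] = True
        return False


def count_hits(cache, trace):
    """Replay a request trace and count the cache hits."""
    return sum(cache.request(key) for key in trace)


# Hypothetical trace: "a" is popular, everything else is requested once.
trace = ["a", "b", "a", "c", "a", "d", "a", "e", "a"]
lru_hits = count_hits(LRUCache(2), trace)    # 4 hits, 5 misses
fifo_hits = count_hits(FIFOCache(2), trace)  # 2 hits, 7 misses
```

On this trace, LRU keeps the popular item resident because every hit refreshes it, while FIFO repeatedly evicts and re-admits it, which costs extra misses and therefore extra writes.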

Evolving Caching Strategy with Hybrid LRU

Even though LRU works very well for our environment, we’re always looking for ways to drive innovation and improve customer performance. This has led to a new capability we recently added to our platform called Hybrid LRU. It’s called hybrid because it adds a layer of abstraction on top of LRU. If we don’t use the hybrid functionality, the system continues to operate normally, so it’s very easy to understand and activate or deactivate.

What we’re doing with the hybrid approach is tweaking the LRU system to give us more control over specific pieces of content. By control, we mean having the ability to explicitly store some content for a longer or shorter duration based on predefined settings.

This is important due to changes occurring across the video streaming landscape, particularly the rapid growth in live streaming. Our network alone has hosted hundreds of thousands of live events, many of which are delivered to millions of concurrent viewers. Despite the massive popularity of such events, once a live event stream is completed, it’s not likely to be re-streamed at any significant volume. With Hybrid LRU, we can specify a shorter cache period, freeing up valuable cache resources for other media and content.

We are also experimenting with locking down certain content and providing a best-effort assurance that it will remain in our cache. This can be particularly useful for live video streams that have a limited shelf life but may still be in high demand for a few hours after the live event ends, at which point the stream becomes a normal piece of video-on-demand content. This functionality can also be used when a content provider explicitly wants to lock some content for a specific period of time so that requests for it do not hit their origin servers.

Hybrid LRU also allows us to store some content for a longer duration. This is useful if, for example, the origin is located in a remote part of the world, which can lead to poor QoE when the CDN does not have the requested content in its cache. In such cases, a new client request would trigger a cache miss that the origin needs to fill, potentially resulting in rebuffering. Aging this content more slowly keeps it in the cache longer and reduces the number of such origin fills.

Hybrid LRU Usage Parameters

Hybrid LRU consists of two tunable parameters that give us the ability to either delay or speed up the eviction or removal of specific content from our caches:

  • Aging Rate
  • Time to Live (TTL)

The aging rate defines the rate of increase in eviction score over time. It’s a scaling function that operators can use to make a piece of content age faster or slower. The default value for the aging rate is 10, so changing this value to 200, for example, will accelerate the aging of the video file by 20 times (200/10 = 20). The value could also be changed to five to age a piece of content at half the default speed.
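The arithmetic can be sketched in a few lines. The function name and the assumption that the eviction score grows linearly with idle time are ours, for illustration only; the article states only the default of 10 and the scaling ratios.

```python
DEFAULT_AGING_RATE = 10  # default value stated in the article


def eviction_score(idle_seconds, aging_rate=DEFAULT_AGING_RATE):
    """Hypothetical linear score: idle time scaled relative to the default rate."""
    return idle_seconds * aging_rate / DEFAULT_AGING_RATE


default_score = eviction_score(60)               # 60.0  (plain LRU aging)
fast_score = eviction_score(60, aging_rate=200)  # 1200.0 (20 times faster)
slow_score = eviction_score(60, aging_rate=5)    # 30.0  (half the default speed)
```

After one idle minute, the item with an aging rate of 200 looks as old to the eviction logic as a default item that has been idle for 20 minutes.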

The Time to Live (TTL) parameter reduces the age of an item by a specified amount. It works by giving an item an extremely low eviction score for the duration set by this variable. This forces the item to stay in the cache for the specified duration after it was last accessed. The default is 0 seconds, which means no special preference.

The charts below show how these tunable parameters adjust how long content stays in the cache. It is useful to think of these parameters as knobs or dials that can be precisely adjusted to match content demands. The charts show how objects age over time on server caches while waiting to be accessed.

First, let’s look at Aging Rate. With traditional LRU, all objects age at the same rate over time. But as we turn up the aging rate dial, items age faster. Similarly, when we turn the dial in the opposite direction, items age more slowly than with LRU. Turn the dial enough, and the slow-aging items never exceed the “eviction threshold,” as figure one shows. With this control, we can either remove items sooner to free up space or keep items on disk longer as needed to reduce origin pulls or for other reasons.

In contrast to Aging Rate, TTL lets us change the cache-ability of a particular item. For the duration set using the TTL function, an item does not age while it’s on the disk, so it is less likely (even very unlikely) to get evicted. After the TTL expires, the item can begin to age either in the traditional LRU manner or with fast aging or slow aging (depending on how the operator configures it). In the figure below, TTL with slow aging kept an item on disk to the point where it didn’t exceed the cache eviction threshold. At the opposite end, TTL ensured that a live video stream was cached for at least the duration of the event, but after that was quickly removed from the disk using fast aging.
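The interaction between the two dials can be approximated with a short sketch. It assumes a linear score that stays at zero during the TTL window and a fixed eviction threshold of one idle hour; both are simplifications made for illustration, since in practice the threshold depends on how much traffic a disk is handling. All names and values below are hypothetical.

```python
DEFAULT_AGING_RATE = 10  # default value stated in the article


def eviction_score(idle_seconds, aging_rate=DEFAULT_AGING_RATE, ttl_seconds=0):
    """Hypothetical score: zero during the TTL window, then linear aging."""
    aging_seconds = max(0, idle_seconds - ttl_seconds)
    return aging_seconds * aging_rate / DEFAULT_AGING_RATE


def seconds_until_eligible(threshold, aging_rate=DEFAULT_AGING_RATE, ttl_seconds=0):
    """Idle time after which the score reaches the assumed eviction threshold."""
    return ttl_seconds + threshold * DEFAULT_AGING_RATE / aging_rate


THRESHOLD = 3600  # assumption: a default item becomes evictable after one idle hour

plain_lru = seconds_until_eligible(THRESHOLD)  # 3600.0
# Live event: protected for two hours, then aged 20 times faster.
live_event = seconds_until_eligible(THRESHOLD, aging_rate=200, ttl_seconds=7200)  # 7380.0
# Remote origin: no TTL, aged at half the default speed.
remote_origin = seconds_until_eligible(THRESHOLD, aging_rate=5)  # 7200.0
```

In this sketch the live-event item is untouchable for its two-hour TTL and then crosses the threshold only three minutes later, which mirrors the "cached for the event, then quickly removed" behavior described above.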

In most cases, changing the aging rate value is the preferred method for adjusting the timing for when content is evicted from the cache because it can easily adapt to the amount of traffic on a disk. TTL, on the other hand, is more aggressive and can effectively lock out a portion of a disk until the content is released. However, as these examples illustrate, the two controls can be used together to reliably achieve the desired effect.
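One way to picture both controls working inside a cache is a toy model that, when full, evicts whichever item currently has the highest eviction score. The class, its method signature, and the per-request `aging_rate` and `ttl_seconds` arguments are assumptions for illustration, not the platform's API. With default settings, the model reduces to plain LRU.

```python
DEFAULT_AGING_RATE = 10  # default value stated in the article


class HybridLRUCache:
    """Toy cache that evicts the item with the highest eviction score."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = {}  # key -> (last_access, aging_rate, ttl_seconds)

    def _score(self, key, now):
        last_access, aging_rate, ttl_seconds = self.items[key]
        # No aging during the TTL window, then linear aging scaled by the rate.
        aging_seconds = max(0, now - last_access - ttl_seconds)
        return aging_seconds * aging_rate / DEFAULT_AGING_RATE

    def request(self, key, now, aging_rate=DEFAULT_AGING_RATE, ttl_seconds=0):
        """Return True on a hit; on a miss, admit the item, evicting if full."""
        hit = key in self.items
        if not hit and len(self.items) >= self.capacity:
            victim = max(self.items, key=lambda k: self._score(k, now))
            del self.items[victim]
        self.items[key] = (now, aging_rate, ttl_seconds)  # record or refresh access
        return hit


cache = HybridLRUCache(capacity=2)
cache.request("vod-segment", now=0)
cache.request("live-segment", now=10, aging_rate=200)  # finished live event
cache.request("new-segment", now=20)                   # cache is full: evict one
remaining = sorted(cache.items)  # ['new-segment', 'vod-segment']
```

Plain LRU would have evicted the older video-on-demand segment here. Because the live segment ages 20 times faster, it is removed first even though it was accessed more recently.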

Forward-Looking Caching Strategies

A broad caching strategy such as LRU is like a big hammer, treating all content equally, regardless of type or file size. If a file doesn’t get a hit within a certain time, it gets deleted from the cache. Meanwhile, other files (like one-time live video streams/events) that are unlikely to get hits in the future sit in the cache, taking up space. Hybrid LRU adds a level of refinement intended to reduce unnecessary cache footprint and improve the cache hit ratio. It’s like using a small hammer or screwdriver to more accurately control which files should stay in the cache and which should be removed.

Currently, Hybrid LRU is experimental and requires an operator to adjust the eviction time frames for content. Looking forward, we’re researching whether request profiles and other factors can be leveraged to make adjustments automatically. Live events, for instance, have very different profiles (thousands of requests for the same file segments coming in at the same time) than video-on-demand files. We’re also looking at making adjustments based on file size: do you want to keep large files on disk to minimize network traffic, or keep smaller files on hand to optimize for cache hit ratio?

Even though we’re confident in the performance and maturity of our caching system and strategies, the need to optimize finite resources remains an important and ongoing effort.