Multi-CDN High Level Architecture with CMCD
What is CMCD?
Common Media Client Data (CMCD) refers to the open specification CTA-5004, which was released in 2020. What is an open specification? It is a set of documented requirements and standards that is publicly available to anyone. The big headline here is that the CMCD open specification defines how media players (known as clients) generate video streaming data and share it with CDNs on each media request. Before this specification was defined, it was the wild west as far as how client data was sent to, received by, and processed by CDNs. CMCD at its core is just a set of keys with valuable data. Using CTA-5004, I summarized those keys and added insights to create a helpful cheat sheet for you.

Description | Key | Definition | *Use |
---|---|---|---|
Encoded Bitrate | br | Encoded bitrate of the audio or video object being requested. | Shows actual bitrate delivered and can be used by CDN to infer the object size. |
Buffer Length | bl | Player Buffer length at the time of request. | Can be used by CDN to infer the health of the playback. |
Buffer Starvation | bs | Buffer starvation event. Indicates rebuffering/stalling in playback. | Rebuffer percent is calculated as the number of sessions with at least one rebuffer event divided by the total number of sessions over a given period of time. |
Content ID | cid | Unique string identifying the current content. | Useful for tracking down problematic content but is rarely used by players. |
Object Duration | d | Playback duration in milliseconds of the object being requested. | When aggregated it can be used as an estimate of hours watched. Can be used to determine if the content is in an ad break and type of video being watched. Can be used to determine the chunk size. |
Deadline | dl | Deadline from the request time until the first sample of this Segment/Object needs to be available in order to not create a buffer underrun or any other playback problems | |
Measured Throughput | mtp | Throughput between the client and server, as measured by the client. | Estimated throughput between the CDN and the player. Useful for comparing CDN and external metrics. |
Next Object Request | nor | Relative path of the next object to be requested. | Used for prefetching. |
Next Range Request | nrr | If the next request will be a partial object request, then this string denotes the byte range to be requested. If the ‘nor’ field is not set, then the object is assumed to match the object currently being requested. | Used for prefetching. |
Object Type | ot | Media type of the current object being requested: m = text file, such as a manifest or playlist; a = audio only; v = video only; av = muxed audio and video; i = init segment; c = caption or subtitle; tt = ISOBMFF timed text track; k = cryptographic key, license or certificate; o = other. If the object type being requested is unknown, then this key MUST NOT be used. | Used for troubleshooting. Can be used to determine encoding and DRM issues. DRM providers have limited visibility on what versions/browsers are currently used/supported, so “k” is extra helpful. “c” can be used to alert on potential compliance issues. |
Playback Rate | pr | 1 if real-time, 2 if double speed, 0 if not playing. SHOULD only be sent if not equal to 1. | Can be used to infer if the player is adjusting the playback rate to make up for other issues (such as the origin or CDN failing to deliver segments quickly enough). |
Requested Maximum Throughput | rtp | Requested maximum throughput that the client considers sufficient for delivery of the asset. ***Throughput refers to the amount of data that is transmitted. | Will tell you player and CDN performance. This can be used for more efficient data management and in effect, save resources. This can benefit clients by preventing buffer saturation through over-delivery and can also deliver a community benefit through fair-share delivery. The concept is that each client receives the throughput necessary for great performance, but no more. |
Streaming Format | sf | d = MPEG DASH; h = HTTP Live Streaming (HLS); s = Smooth Streaming; o = other | Helps determine stream related issues for players that support DASH/HLS. Can compare performance based on streaming format for players that have multi-format support. |
Session ID | sid | GUID identifying the current playback session. A playback session typically ties together segments belonging to a single media asset. Maximum length is 64 characters. It is RECOMMENDED to conform to the UUID specification. | This key is always recommended to be included in CMCD logging. It is arguably the most useful key as it is used for aligning logs together. Can be helpful troubleshooting for caching issues. Same content ID with two session IDs strongly indicates a caching issue. |
Stream Type | st | v = all segments are available – e.g., VOD; l = segments become available over time – e.g., LIVE | Invaluable key for troubleshooting. |
Startup | su | Signals startup of content. | Removes the need for beaconing. CDNs knowing the startup of content can be helpful for optimizing subsequent playback. This flag is also sent after buffer flag (bs). |
Top Bitrate | tb | Highest bitrate rendition in the manifest or playlist that the client is allowed to play. | Used to determine bitrate or bitrate laddering issues. Shows the top bitrate that the player could play at that time. Can be used for comparing to available bitrates. |
CMCD Version | v | Version of CMCD specification used. | This key allows for version control and indicates that there will be future CMCD versions released. |
Custom Key | | Custom keys require a “cmcd-” prefix. Fictional example: cmcd-edgio | Allows for unique CMCD keys to be sent, which extends CMCD to be fully customizable. |
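
To make the rebuffer calculation from the bs row concrete, here is a minimal sketch. It assumes the CMCD log rows have already been parsed into dictionaries with a `sid` key and an optional boolean `bs` key; the function name and the sample rows are hypothetical and only for illustration.

```python
from collections import defaultdict


def rebuffer_percent(records):
    """Percent of sessions that reported at least one buffer starvation (bs) event.

    `records` is an iterable of dicts with a "sid" key and an optional
    boolean "bs" key (hypothetical, already-parsed CMCD log rows).
    """
    starved = defaultdict(bool)  # sid -> True once any request carried bs
    for rec in records:
        sid = rec.get("sid")
        if sid is None:
            continue  # the request cannot be attributed to a session
        starved[sid] = starved[sid] or bool(rec.get("bs"))
    if not starved:
        return 0.0
    return 100.0 * sum(starved.values()) / len(starved)


# Illustrative rows: sessions "a" and "c" each rebuffered at least once.
records = [
    {"sid": "a", "bs": True},
    {"sid": "a"},
    {"sid": "b"},
    {"sid": "c", "bs": True},
    {"sid": "d"},
]
print(rebuffer_percent(records))  # 2 of 4 sessions -> 50.0
```

Grouping by `sid` first is what keeps a single bad session with many stalled requests from inflating the number.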
At present, Edgio is the only CDN that publicly states support for all CMCD keys. The defined keys can be transmitted in three delivery modes from players to CDNs.
- Custom HTTP header in each request. The keys can be used with four header names.
  - CMCD-Request: keys whose values vary with each request.
  - CMCD-Object: keys whose values vary with the object being requested.
  - CMCD-Status: keys whose values do not vary with every request or object.
  - CMCD-Session: keys whose values are expected to be invariant over the life of the session.
- HTTP query arguments.
- JSON object independent of each HTTP request.
Buffer starvation and startup in CMCD format via three delivery modes.
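
As a rough illustration of that caption, the sketch below serializes the buffer starvation (bs) and startup (su) keys in each of the three delivery modes. The helper names are my own, and the serializer only handles booleans and numbers, which is enough for these two keys; string and token values are out of scope here. The header assignment (bs in CMCD-Status, su in CMCD-Request) follows CTA-5004.

```python
import json
from urllib.parse import quote

# Header assignment for the two keys used in this example.
HEADER_FOR_KEY = {"bs": "CMCD-Status", "su": "CMCD-Request"}


def encode_pairs(data):
    """Join keys alphabetically; a boolean true key is sent without a value."""
    parts = []
    for key in sorted(data):
        value = data[key]
        if value is True:
            parts.append(key)
        elif value is False:
            continue  # false booleans are simply not sent
        else:
            parts.append(f"{key}={value}")  # numbers only in this sketch
    return ",".join(parts)


def as_headers(data):
    """Mode 1: one custom HTTP header per header name."""
    grouped = {}
    for key in sorted(data):
        grouped.setdefault(HEADER_FOR_KEY[key], {})[key] = data[key]
    return {name: encode_pairs(keys) for name, keys in grouped.items()}


def as_query(data):
    """Mode 2: a single URL-encoded CMCD query argument."""
    return "CMCD=" + quote(encode_pairs(data), safe="")


def as_json(data):
    """Mode 3: a JSON object sent independently of the media request."""
    return json.dumps(data, sort_keys=True)


payload = {"bs": True, "su": True}
print(as_headers(payload))  # {'CMCD-Status': 'bs', 'CMCD-Request': 'su'}
print(as_query(payload))    # CMCD=bs%2Csu
print(as_json(payload))     # {"bs": true, "su": true}
```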
Who Should be Using CMCD?
Streaming services of any size should be using CMCD as soon as possible. However, since its release, knowledge and implementation of CMCD have been limited. The trends that we are seeing indicate that there is growing adoption and support. That being said, we would love to talk to you about your use case and how CMCD can elevate your business.

Why is CMCD Needed?
CMCD is all about making sure users get the best possible streaming experience every time they hit play. As a Solutions Engineer at Edgio, my job is to review a content provider’s technical architecture to make sure our CDN works smoothly and efficiently for the best user experience. A key component of any technical architecture is the data gathered from logs, metrics, and traces (also known as the three pillars of observability) to accomplish this. Below are a few more reasons why CMCD is worth incorporating if you aren’t already sold on it.

Standardization
Data is collected and analyzed by a blend of third-party and internal proprietary tools for each piece of architecture. The difficulty is that each tool's data follows its own format. Non-standard data can create side effects such as inconsistencies, reduced quality, limited scalability, and increased maintenance efforts. I challenge you to find an engineer who hasn’t faced some or all of these issues before. To address this, CMCD standardizes how client-side (player) data is passed to the server side (CDN), ensuring interoperability. This is significant because it is the closest thing a CDN can get to client-side Real User Measurement (RUM) data.

Enhanced Customization
Custom keys extend CMCD's utility well beyond the keys defined in the specification.

Use Case: Status Code Customization
Here is a real-world custom status code that Edgio uses. “000 – A Edgio-specific status code returned when the origin sends no response, so there is no status code to log (for example when the client disconnects before the origin delivers the response).” This definition is from the Log File Fields section of our Configuring Log Delivery Service documentation.
Edgio could create a cmcd-000 custom key.
Simplified Workflows
In recent years, tech companies have been cutting costs, consolidating tools, and reducing overall complexity. The free data that CMCD yields can replace internal and external log analysis services. The result is a simplified and optimized workflow.

Use Case: Player Workflow
Old Process
Third-party analytics are widely used by streaming companies today. However, these tools require an extensive and continual process for integration shown below.
- Procurement
- Each third-party analytics vendor has its own software development kit (SDK) for beacon integration with each player. Android’s ExoPlayer (DASH), Apple’s player (HLS), Microsoft’s PlayReady (Xbox/Windows), web based (websites, televisions), and more are all examples of players that require separate SDKs.
- Testing and validation. Metadata is required to conform to the vendor’s guidelines.
It is worth noting that this full process can easily take a year or more.
New Process
Enabling CMCD is simple. Edgio can capture your CMCD data in near real-time via our Log Delivery Service. That means that player beaconing or third-party integration isn’t needed to unlock next-generation data analytics. For our customers with a large user base, this translates to millions of dollars saved annually.
Security
Information security might not be the first thing you think of when it comes to streaming, but it is still an important component.
- CMCD does not have access to login data or contain PII (Personally Identifiable Information) because it keeps user experience data separate from user information. It only contains generic performance information.
- CMCD data is not reliant on intermediary services. This limits data sharing with third parties and keeps data internal.
Observability
CDNs can now see precise session data on how the customer is experiencing their service. This shared visibility can be used by CDNs and streaming providers to:
- Set up more accurate monitoring and alerting.
- Give more lead time on major incidents.
- Diagnose and resolve issues more effectively.
Use Case: Outage Handling
Old Process
This is the typical process flow of how a streaming company’s outage is resolved with separate player and CDN logs.
- An outage occurs. Unfortunately, human communication or intervention from third party analytics isn’t typically offered.
- An internal alert is triggered by dropping performance indicators, error codes, dashboards, etc.
- The alert is triaged internally and escalated based on factors such as who is on-call, subject matter experts, and department delegation.
- Runbooks are used if this is an issue that has been previously encountered and documented.
- A decision is made on whether the issue should be escalated externally to partners such as the CDN.
- Resolution is reached.
- Internal Root Cause Analysis (RCA) is performed and communicated to customers.
Due to the lengthy escalation path, this outage takes a long time to reach a resolution.
New Process
This is the process flow of how an outage is resolved with CMCD.
- An outage occurs. Access to native CMCD data allows everyone to have full visibility. Our Managed Services team proactively reaches out to you for mitigation.
- The resolution is reached by Edgio collaborating in real-time to resolve outages while looking at the same data as your engineers.
- Edgio can assist with RCA and strategize with you as a partner to prevent future outages.
The outage is resolved faster and requires fewer internal engineering resources.
Performance Gains
The primary purpose of a CDN is to increase the performance of content served over the internet.

Prefetching
Our customers most often request CMCD to unlock universal prefetching support. Prefetching is loading content into cache before it is needed to speed up the delivery of content.
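
To show how the next object request (nor) key could drive prefetching, here is a minimal sketch. CTA-5004 defines nor as a URL-encoded path relative to the current request, so the sketch decodes it and resolves it against the current object's URL. The function names, the `cache` dictionary, the `fetch` callback, and the example URLs are hypothetical and do not describe how any particular CDN implements prefetching.

```python
from urllib.parse import unquote, urljoin


def prefetch_url(current_url, nor):
    """Resolve the URL-encoded, relative `nor` value against the current object's URL."""
    return urljoin(current_url, unquote(nor))


def warm_cache(cache, current_url, nor, fetch):
    """Fetch the next object into `cache` if it is not already there."""
    url = prefetch_url(current_url, nor)
    if url not in cache:
        cache[url] = fetch(url)  # `fetch` stands in for an origin request
    return url


current = "https://cdn.example.com/video/1080p/seg_0041.m4s"
print(prefetch_url(current, "seg_0042.m4s"))
# https://cdn.example.com/video/1080p/seg_0042.m4s
print(prefetch_url(current, "..%2F720p%2Fseg_0042.m4s"))
# https://cdn.example.com/video/720p/seg_0042.m4s
```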
Next Generation Intelligent Load Balancing
Our CDN automatically delivers content from the most optimal Point of Presence (POP) within our private global network. We handle internal network load balancing for you, but when multiple CDNs are used, the external load balancing is more complicated and opaque for content providers to manage. The problem becomes how streaming companies can leverage the CDNs in their stack intelligently. Each company attempts to solve this problem in a unique way, but for most, traffic is still split manually by percentage or by availability, as delegated by the player. CMCD-aware load balancing is the answer. Having CDN and player data in unified logs allows greater insight into each CDN’s performance, which is required for intelligent load balancing. This is in its nascent stages, but it is also the most promising feature. Streaming providers aren’t using this method in their production environments yet, which means its application is still theoretical. More to come on this use case as we see CMCD evolve and grow with other emerging streaming technologies such as AI/ML.
Use Case: Intelligent Geographic Load Balancing
A streaming provider has exclusive streaming rights to the largest live sports event of the year in Latin America. To prepare for the event, CMCD-aware load balancing can be employed. During the game, data is used in real-time to automatically distribute traffic to the CDN with the best performance per region, locale, and/or POP out of the CDN stack. This removes the need for manual failover if a CDN in the stack is underperforming, at capacity, or runs into technical issues. In this scenario, our CDN would handle the largest traffic share because Edgio typically outperforms other CDNs in Latin America due to our priority of investing heavily in emerging markets.
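
Since this use case is still theoretical, the following is only a sketch of what a CMCD-aware selection step might look like. It assumes unified log samples that carry a region, a CDN name, and the CMCD keys sid, bs, and mtp. The scoring rule (lowest share of sessions with a rebuffer, then highest average measured throughput) is my own assumption, and the CDN names and regions are placeholders.

```python
from collections import defaultdict


def pick_cdn_per_region(samples):
    """Choose, per region, the CDN with the lowest rebuffer ratio.

    Ties are broken by the highest average measured throughput (mtp, kbps).
    `samples` are hypothetical unified log rows: dicts with "region", "cdn",
    "sid", and optional "bs" (bool) and "mtp" (number).
    """
    stats = defaultdict(lambda: {"sessions": set(), "starved": set(), "mtp": []})
    for s in samples:
        entry = stats[(s["region"], s["cdn"])]
        entry["sessions"].add(s["sid"])
        if s.get("bs"):
            entry["starved"].add(s["sid"])
        if "mtp" in s:
            entry["mtp"].append(s["mtp"])

    best = {}
    for (region, cdn), entry in stats.items():
        rebuffer_ratio = len(entry["starved"]) / len(entry["sessions"])
        avg_mtp = sum(entry["mtp"]) / len(entry["mtp"]) if entry["mtp"] else 0.0
        score = (rebuffer_ratio, -avg_mtp)  # lower tuple is better
        if region not in best or score < best[region][0]:
            best[region] = (score, cdn)
    return {region: cdn for region, (_, cdn) in best.items()}


samples = [
    {"region": "BR", "cdn": "cdn-a", "sid": "1", "mtp": 9000},
    {"region": "BR", "cdn": "cdn-a", "sid": "2", "mtp": 8000},
    {"region": "BR", "cdn": "cdn-b", "sid": "3", "bs": True, "mtp": 3000},
    {"region": "MX", "cdn": "cdn-a", "sid": "4", "bs": True, "mtp": 2500},
    {"region": "MX", "cdn": "cdn-b", "sid": "5", "mtp": 7000},
]
print(pick_cdn_per_region(samples))  # {'BR': 'cdn-a', 'MX': 'cdn-b'}
```

A production system would also need time windows, minimum sample sizes, and capacity limits before shifting traffic; none of that is modeled here.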
Adaptive Traffic Optimization
CMCD data can be used in real-time for adaptive traffic optimizations such as steadying playback bitrates across streaming viewers.
How to Enable CMCD?
Before you proceed, you should know that there are two technical requirements: the player and the CDN both need to support CMCD. Most players and CDNs today have CMCD support.

Player | Headers | Query | JSON |
---|---|---|---|
THEOplayer | | | |
Bitmovin | | | |
Shaka Player | | | |
Android ExoPlayer | | | |
hls.js | | | |
dash.js | | | |
*It should be noted that player support was determined by public documentation availability and this list is not exhaustive. Feel free to reach out with CMCD player support and we can update to include that info.
? = Not public information at time of writing. Feel free to reach out with CMCD CDN support and we can update to include that info.
There are other considerations too:
- For security, HTTPS is strongly recommended over HTTP for all CMCD data transmissions over the web.
- CORS and content protection introduce extra complexity into CDN configurations. CORS needs to be configured to allow the headers CMCD-Request, CMCD-Object, CMCD-Status, and CMCD-Session to be sent from the player to the CDN.
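
As an illustration of the CORS point, the sketch below builds the response headers for a preflight (OPTIONS) request so that a browser-based player is allowed to send the four CMCD headers. The `Access-Control-*` header names are standard CORS response headers; the function name, the allowed-origin list, and the method list are assumptions for this example, and a real CDN would express the same thing in its own configuration language.

```python
CMCD_HEADERS = ["CMCD-Request", "CMCD-Object", "CMCD-Status", "CMCD-Session"]


def preflight_response_headers(origin, allowed_origins):
    """Build CORS headers for a preflight that asks to send CMCD request headers.

    Returns an empty dict when the origin is not allowed, in which case the
    browser blocks the cross-origin request.
    """
    if origin not in allowed_origins:
        return {}
    return {
        "Access-Control-Allow-Origin": origin,
        "Access-Control-Allow-Methods": "GET, HEAD, OPTIONS",
        "Access-Control-Allow-Headers": ", ".join(CMCD_HEADERS),
        "Vary": "Origin",  # responses differ per origin, so caches must key on it
    }


# Hypothetical player origin used only for this example.
headers = preflight_response_headers(
    "https://player.example.com", {"https://player.example.com"}
)
print(headers["Access-Control-Allow-Headers"])
# CMCD-Request, CMCD-Object, CMCD-Status, CMCD-Session
```

Sending CMCD as a query argument avoids the preflight entirely, which is one reason some players default to that mode.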
Enable for Edgio CDN
See our CMCD Report documentation’s section titled Enabling CMCD Logging for instructions. If you aren’t already a customer, reach out to us for a performance trial to work with our team of Solutions Engineers and see our CDN in action.
Enable for Uplynk
Edgio’s Uplynk is a perfected streaming platform with integrated CDN for broadcasters and OTT providers. Coupling CMCD with Uplynk augments its end-to-end visibility, which no other streaming platform has. If you aren’t already a customer, reach out to us for an Uplynk demo and to speak directly with our Managed Services team about enabling it. You can read more on the future of Uplynk in Edgio and Bitmovin Join Forces to Optimize User Experience.
How to View CMCD?
CMCD data is located within your CDN logs, where it can be inspected. The data can be pulled into log repositories, dashboards, and reports. Edgio became an early adopter when we announced our CMCD Report, which is powered by EdgeQuery.
Example of Edgio’s CMCD Report that is available in the CDN Control Portal
The CMCD Report gives a visual overview along with the ability to see detailed information at your fingertips. The CMCD data itself is sent to your log storage solution whether that is located on-premises, shared with a third-party, or kept in our network on Edgio’s Origin Storage.
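
If you want to inspect CMCD data in raw logs yourself, a minimal parsing sketch follows. It assumes the log row exposes the full request URL and that the player used the query argument delivery mode; the function name and the example URL are hypothetical, and the simple comma split would need refinement for quoted string values that themselves contain commas.

```python
from urllib.parse import parse_qs, urlsplit


def parse_cmcd(url):
    """Extract the CMCD query argument from a logged request URL into a dict."""
    query = parse_qs(urlsplit(url).query)  # parse_qs also percent-decodes values
    raw = query.get("CMCD", [""])[0]
    data = {}
    for pair in raw.split(","):  # naive split: assumes no commas inside strings
        if not pair:
            continue
        key, sep, value = pair.partition("=")
        if not sep:
            data[key] = True  # valueless key, such as bs or su
        elif value.startswith('"') and value.endswith('"'):
            data[key] = value[1:-1]  # quoted string, such as sid or cid
        else:
            try:
                data[key] = float(value) if "." in value else int(value)
            except ValueError:
                data[key] = value  # token, such as ot=v or sf=d
    return data


logged_url = (
    "https://cdn.example.com/video/seg_0041.m4s"
    "?CMCD=bl%3D21300%2Cbr%3D3200%2Cbs%2Cot%3Dv"
    "%2Csid%3D%226e2fb550-c457-11e9-bb97-0800200c9a66%22"
)
print(parse_cmcd(logged_url))
# {'bl': 21300, 'br': 3200, 'bs': True, 'ot': 'v',
#  'sid': '6e2fb550-c457-11e9-bb97-0800200c9a66'}
```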
Looking Forward
If you weren’t already, you should now feel well acquainted with CMCD and how it can transform streaming today. As an industry leader with CMCD, Edgio will continue taking it to the full extent of what is possible. Stay tuned!
Acknowledgments
Special thanks to our Program Manager Anthony Karr for his invaluable feedback and suggestions. Thanks to our Software Architect Yuri Nepyyvoda for his outstanding work leading the development of our cutting edge CMCD implementation.
Terminology
- CORS: Cross-Origin Resource Sharing
- Interoperability: The ability for systems to work together.
- Open Specification: A set of documented requirements and standards publicly available for anyone.
- OTT: Over-the-top which refers to delivering content on top of internet services.
- Prefetching: Loading data into cache before it is needed to speed up the delivery of content. Prefetching is also known as warming or pre-warming cache.
- RUM: Real User Measurement data
- SDK: Software Development Kit
- Uplynk: A perfected streaming platform with integrated CDN for broadcasters and OTT providers.