Content Delivery Networks and Video Streaming

by Ben Shapiro 

Copyright 2005, Faulkner Information Services. All Rights Reserved.

Preview

Content Delivery Networks (CDNs) help ensure that popular multimedia content can be accessed quickly and delivered efficiently. CDN providers such as Akamai and Speedera have successfully deployed huge networks for just this purpose, but CDNs do not have to be as massive as Akamai's. This report discusses the market-leading CDNs and looks at smaller CDNs that can be built in-house.

Executive Summary

Content is king, on the World Wide Web and elsewhere. If users do not have high-quality content delivered quickly to them, they will leave and seek the sites of the competition. In the early days of the Web, big content providers like news organizations and streaming media providers built large hosting facilities to store and serve content from a centralized location. Pretty quickly, those companies discovered that they were not getting good response times serving content to all their users. Because of the architecture of the Internet, if a backbone or major ISP went down, content could be forced to take a slower route to the user. Hosting content centrally also placed a burden on the company to ensure it had enough redundant links through several ISPs to protect against a network outage.

To meet this need, Content Delivery Networks (CDNs) started sprouting up. A CDN is a network of servers hosted by a third party that places content geographically closer to the users who request that content. When a company uses a CDN to host content, it usually uses special URLs in HTML tags for media assets like images and streaming video. These URLs refer to a special DNS server that determines the user’s location and redirects the request to a cache server located nearby. If that cache server does not have a copy of the requested file, it retrieves the file automatically from the central site and caches it until it is set to expire. Many CDNs are operated in an ASP (Application Service Provider) model, in which the company operating the CDN provides the service and the customer’s only investment is the monthly hosting and usage fees.

Hosted CDN vendors include:

  • Akamai--The best-known CDN, providing geographic caching and content distribution.
  • Globix--Provides content caching services to the edges of their network.
  • Speedera--Provides a wide array of content caching and failover capabilities.

It is also possible to build a custom CDN using open source tools in order to save on the costs involved.

Description

A CDN is a network of servers hosted by a third party that places content geographically closer to the users who request that content. When a company uses a CDN to host content, it usually uses special URLs in HTML tags for media assets like images and streaming video. These URLs refer to a special DNS server that determines the user’s location and redirects the request to a cache server located nearby. If that cache server does not have a copy of the requested file, it retrieves the file automatically from the central site and caches it until it is set to expire. Most CDNs maintain hundreds of cache servers all over the globe. CDNs are now offering products that take advantage of ESI (Edge Side Includes) to dynamically create content at the edge of the network. This allows companies to offload the processing of dynamic pages to the cache servers on the edge of the network closest to the users. Large application server vendors like BEA Systems and IBM have built-in support for ESI.
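The retrieve-on-miss and cache-until-expiry behavior described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how any particular CDN implements its cache servers: the class name EdgeCache, the fetch_from_origin callable, and the fixed time-to-live are all hypothetical.

```python
import time
from typing import Callable, Dict, Tuple


class EdgeCache:
    """Pull-through cache: serve local copies, fetch from the origin on a miss."""

    def __init__(self, fetch_from_origin: Callable[[str], bytes],
                 ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._fetch = fetch_from_origin  # hypothetical function that reads the central site
        self._ttl = ttl_seconds          # assumed: one fixed lifetime for every file
        self._clock = clock              # injectable so expiry can be exercised in tests
        self._store: Dict[str, Tuple[bytes, float]] = {}  # path -> (body, expiry time)

    def get(self, path: str) -> bytes:
        now = self._clock()
        entry = self._store.get(path)
        if entry is not None and entry[1] > now:
            return entry[0]              # hit: served locally, no origin traffic
        body = self._fetch(path)         # miss or expired copy: go back to the origin
        self._store[path] = (body, now + self._ttl)
        return body
```

With a cache built this way, the first request for a file costs one trip to the origin, and every later request inside the expiry window is answered from the local copy.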

System Architecture of a CDN

Figure 1 shows a typical CDN in action. In this example, two users are requesting the same movie file from a website.

Figure 1. Typical CDN Architecture

One of the users in the example is located in the US and the other in the UK. The HTML page that both users have requested has had all of its media links changed to point to a specialized, location-aware DNS server. The DNS server determines where each user is coming from and redirects the user to the proper content cache. The user’s browser or media player requests the file from the nearest cache, and if the file is not present on that cache, the cache downloads it directly from the source server. Content can usually be cached on demand as users request it, or pre-cached along with all the other content. Both users end up having content efficiently streamed to them by a local server with minimal requests back to the main server. The US user gets his content from the US caching server, and the UK user gets her content from the UK caching server.
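The decision made by the location-aware DNS server in this example can be approximated with a lookup table of client address ranges. The sketch below assumes such a table exists; the address ranges are reserved documentation ranges and the hostnames are placeholders, so none of these values come from a real CDN.

```python
import ipaddress

# Hypothetical table: client address range -> cache server for that region.
# The ranges are documentation-only networks, not real US or UK allocations.
REGION_TABLE = [
    (ipaddress.ip_network("192.0.2.0/24"), "cache-us.example.com"),
    (ipaddress.ip_network("198.51.100.0/24"), "cache-uk.example.com"),
]
DEFAULT_CACHE = "origin.example.com"  # assumed fallback: serve from the source server


def resolve_cache(client_ip: str) -> str:
    """Return the cache hostname for the client's address, else the fallback."""
    address = ipaddress.ip_address(client_ip)
    for network, cache_host in REGION_TABLE:
        if address in network:
            return cache_host
    return DEFAULT_CACHE
```

A production resolver would answer DNS queries rather than function calls, and it would see the address of the user's DNS resolver rather than the user's own address; the table lookup is the part this sketch is meant to show.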

Enabling Content for CDN Hosting

Typically, firms host only resource-intensive files on a CDN. These include images, movies, and streaming video. HTML files are usually small and can be served off the source Web server. In most cases, it is simply necessary to modify links to these resources to take advantage of the CDN. Some vendors offer more advanced options that host Web applications at the edge of the network rather than on centrally managed servers. In that case, the vendor supplies the packaging format for the application and management tools to monitor the application’s performance.
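As an illustration of the link modification step, the following sketch rewrites site-relative media links in an HTML page so that they point at a CDN hostname, while leaving ordinary page links alone. The hostname media.cdn.example.com and the list of file extensions are assumptions made for the example; a real vendor supplies its own URL scheme.

```python
import re

CDN_HOST = "media.cdn.example.com"  # hypothetical hostname assigned by the CDN vendor
MEDIA_EXTENSIONS = ("jpg", "jpeg", "gif", "png", "mov", "wmv", "rm", "mp3")  # assumed list

# Matches src="..." or href="..." where the value is a site-relative path
# ending in one of the media extensions above.
_MEDIA_LINK = re.compile(
    r'(?P<attr>\b(?:src|href)=")'
    r'(?P<path>/[^"]+\.(?:%s))"' % "|".join(MEDIA_EXTENSIONS),
    re.IGNORECASE,
)


def rewrite_media_links(html: str) -> str:
    """Point site-relative media URLs at the CDN; leave all other links alone."""
    return _MEDIA_LINK.sub(
        lambda m: '%shttp://%s%s"' % (m.group("attr"), CDN_HOST, m.group("path")),
        html,
    )
```

A regular expression is enough for a sketch, but it only handles double-quoted attribute values; a template-level change or a proper HTML parser would be a sturdier choice for a real site.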

Building a CDN with Open Source Tools

If the budget is not available to work with one of the big CDN providers, a simpler version of a CDN can be built using open source software and some custom software development.

To build a CDN, the following are needed:

  • Linux machines to host the software.
  • Squid--An open source proxy/cache server.
  • BIND/DJBDNS--For DNS resolution.
  • Custom code on the DNS server to check the user’s location and return the closest cache server.

The general process for building a CDN is to determine where to locate the cache machines, build them, install and configure Squid to point to the origin server, and write custom DNS code that redirects users to the proper cache based on geographic information determined from a reverse host lookup.
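The custom lookup code might start out as something like the sketch below, which performs a reverse host lookup and guesses a region from the country-code suffix of the resulting hostname. The suffix table and the cache hostnames are hypothetical, and the approach is deliberately crude: many addresses have no reverse DNS entry, and generic suffixes such as .com or .net say nothing about location.

```python
import socket
from typing import Optional

# Hypothetical table: hostname suffix -> cache server for that region.
CACHE_BY_SUFFIX = {
    ".uk": "cache-uk.example.com",
    ".de": "cache-eu.example.com",
}
FALLBACK_CACHE = "cache-us.example.com"  # assumed default when nothing matches


def cache_for_hostname(hostname: Optional[str]) -> str:
    """Pick a cache from the country-code suffix of a client's hostname."""
    if hostname:
        name = hostname.lower().rstrip(".")
        for suffix, cache_host in CACHE_BY_SUFFIX.items():
            if name.endswith(suffix):
                return cache_host
    return FALLBACK_CACHE


def reverse_lookup(client_ip: str) -> Optional[str]:
    """Return the reverse DNS hostname for an address, or None if there is none."""
    try:
        return socket.gethostbyaddr(client_ip)[0]
    except OSError:
        return None


def cache_for_client(client_ip: str) -> str:
    """Combine the two steps: reverse lookup, then suffix-based selection."""
    return cache_for_hostname(reverse_lookup(client_ip))
```

Keeping the suffix matching separate from the network lookup makes the selection rule easy to test without DNS access, and easy to replace later with a better location source.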

The hardest part of building a CDN comes with creating the code to determine the closest network cache. Simple redirection based on host information can be accomplished easily, but more complex algorithms are extremely difficult to implement and the cost to implement them can be prohibitive.

Current View

There are currently a number of vendors operating CDNs and providing caching services. Some of the major vendors are listed below.

Akamai

Akamai provides one of the largest CDNs on the Internet. It has three product offerings: Edge Platform, Edge Control, and NOCC. Edge Platform is the CDN itself. This network is made up of 14,000 nodes in over 70 countries. Edge Platform can handle both flat content and dynamic applications, allowing companies to dynamically scale Web applications to meet the demands of end users.

Edge Control provides a management console over the applications and content running on Edge Platform. Edge Control features include:

  • Visibility Tools--Edge Control provides dashboard-like views of traffic, activity, and site performance parameters to reveal usage trends and patterns. Customers can learn how much traffic a campaign has generated, where most users are coming from, how many hits a program has generated, top downloads and URLs, and other valuable data. Automatic or on-request reports help customers make sounder IT investment and resource decisions, and gauge the effectiveness of e-business initiatives.
  • Control Tools--Provides real-time information and application management and visibility into delivery, traffic trends, activity patterns, and site performance. Alert functionality monitors the origin infrastructure, notifying customers of performance variances or origin failure. Diagnostic tools add the ability to discover and diagnose performance issues.

Globix

Globix provides a CDN service called EarthCache. EarthCache leverages more than 1,200 peering agreements to create a highly available, high-speed network. EarthCache features include:

  • Scalability--Caching nodes are located at more than half of Globix's network PoPs and leverage the peering connections with other Tier 1 backbones. Each EarthCache node has a minimum of 5 Gbps of connectivity and 1 TB of storage space, enabling it to handle even the largest content events.
  • Intelligence and Efficiency--EarthCache divides the Globix network into logical zones, continuously probes end users' connections, and directs users to the node that will best serve them. It also uses "pull caching" and content "freshness" to ensure that users receive the best content available.
  • Flexibility--EarthCache supports both Web objects (HTTP, JPEG, etc.) and a full array of streaming media file sets, including Microsoft Windows Media, RealNetworks RealMedia, and Apple QuickTime Progressive Download.

Speedera

Speedera offers a suite of products called SpeedSuite, a combination of services covering performance, streaming media, availability, security, and management. Similar to Akamai, Speedera offers both media file hosting and the ability to host entire applications throughout its network. Speedera provides a management console called SpeedEye that gives visibility into the network and allows customers to manage site assets. Speedera features include:

  • Content Delivery Service--Provides a Web site with a performance boost and added scalability to handle peak loads. The service uses a dedicated, worldwide network of caching servers that are deployed at PoPs in the Americas, Europe, and the Asia-Pacific region.
  • Speedera Download Service--Provides the performance and scalability for faster and more reliable digital goods delivery, worldwide. When a customer requests digital goods, the Global Traffic Management system routes the request to the fastest, most available server for each download.
  • Speedera Streaming Services--Provide support for On-Demand, Live, and Secure Streaming that ensure high quality, scalability, and secure delivery of Windows Media, RealMedia, Flash, QuickTime, MPEG-4, MP3, and 3GPP streams to users worldwide. Speedera provides several options in choosing how large a global streaming edge network of servers is needed: dual site, regional (Americas, Europe, Asia-Pacific), or global. 
  • Speedera's SpeedSuite for Site Assurance, Traffic Balancer and Failover Services--Guarantee site availability for both distributed and single location sites. Traffic Balancer provides global load balancing across multiple sites. Failover provides a static, mirrored site for single location sites.

Speedera offers Web-based tools for managing a site through the SpeedEye management portal. SpeedEye Access Manager controls a Web site's security by assigning access rights to each individual (IT staff, marketing content contributors, outside contractors, etc.) working on the site. Site Analyzer provides performance and availability monitoring, plus the ability to troubleshoot the site.

Recommendations

CDNs are a tremendous asset to content-rich Web sites with large user followings. Without dispersing its content, a site can easily get swamped with hits, effectively shutting it down. Another benefit of hosting applications and content on a CDN is that it is much harder to be shut down by a DDoS attack. If someone tries to attack the site, the CDN will just shift resources to other nodes in the network while working to block access from the offending machines. The real trick with moving to a CDN is determining when a company is large enough to make the switch. The costs involved in hosting with an ASP model provider, as well as bandwidth usage costs, must be evaluated carefully. It is not easy to quantify all the benefits that a CDN can provide in the short run. In the long run, increased satisfaction by users should lead to higher hit numbers and eventually higher sales.

Unless operating as an ISP, it is probably a good idea to use one of the ASP model vendors rather than building a CDN internally. An exception would be if only a small cache is needed in just a few places. For this purpose, Squid could be a very effective and low-cost option.

About the Author

Ben Shapiro is the president of ObjectArts, a New York City-based consulting company. He writes frequently for Faulkner Information Services and sits on Faulkner's Advisory Panel. Mr. Shapiro is also a contributing author to the Apache Cocoon Developer's Handbook published by SAMS and a co-author of Oracle 10g Developer published by Wrox. ObjectArts has worked with many large companies to develop XML-based publishing tools and Web-based applications. Mr. Shapiro is an open-source software advocate, and ObjectArts leverages open-source software whenever possible to design and implement cost-effective solutions to technology challenges.

Web Links

Akamai: http://www.akamai.com/
Globix: http://www.globix.com/
Speedera: http://www.speedera.com/
Squid: http://www.squid-cache.org/
