Content Delivery Networks
and Video Streaming
by Ben Shapiro
Copyright 2005, Faulkner Information Services. All Rights Reserved.
Preview
Content Delivery Networks (CDNs) help ensure that popular multimedia content can be accessed quickly and delivered
efficiently. CDN providers such as Akamai and Speedera have successfully
deployed huge networks for just this purpose, but CDNs do not have to be as
massive as Akamai's. This report discusses the market-leading CDNs and looks at smaller CDNs that can be
built in-house.
Executive Summary
Content is king, on the World Wide Web and elsewhere.
If users do not have high-quality content delivered quickly to them, they
will leave and seek the sites of the competition. In the early days of the Web, big
content providers like news organizations and streaming media providers built
large hosting facilities to store and serve content from a centralized location.
Pretty quickly, those companies discovered that they were not getting good
response times serving content to all their users. Because of the architecture
of the Internet, if a backbone or major ISP went down, content could be forced
to take a less speedy route to the user. Hosting content centrally also placed a
burden on the company to ensure it had enough redundant links through several
ISPs to protect against a network outage.
To meet this need, Content Delivery Networks (CDNs) started sprouting up. A CDN is a network of servers
hosted by a third party that places content geographically closer to the users
who request it. When a company uses a CDN to host content, it
usually uses special URLs in HTML tags for media assets like images and streaming
video. These URLs refer to a special DNS server that determines the user’s
location and directs the request to a cache server located nearby. If
that cache server does not have a copy of the requested file, it retrieves
it automatically from the central site and caches it until it is set to expire.
Many CDNs are operated in an ASP (Application Service Provider) model, in which
the company operating the CDN provides the service and the customer’s only
investment is the monthly hosting and usage fees.
Hosted CDN vendors include:
- Akamai--The best-known CDN, providing geographic caching and
content distribution.
- Globix--Provides content caching services to the edges of their
network.
- Speedera--Provides a wide array of content caching and failover
capabilities.
It is also possible to build a custom CDN using open source tools in order to
save on the costs involved.
A CDN is a network of servers
hosted by a third party that places content geographically closer to the users
who request it. When a company uses a CDN to host content, it
usually uses special URLs in HTML tags for media assets like images and streaming
video. These URLs refer to a special DNS server that determines the user’s
location and directs the request to a cache server located nearby. If
that cache server does not have a copy of the requested file, it retrieves
it automatically from the central site and caches it until it is set to expire.
Most CDNs maintain hundreds of cache servers all over the globe. CDNs are now
offering products that take advantage of ESI (Edge Side Includes) to dynamically
create content at the edge of the network. This allows companies to offload the
processing of dynamic pages to the cache servers on the edge of the network
closest to the users. Large application server vendors like BEA Systems and IBM have
built in support for ESI.
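The idea behind ESI can be illustrated with a minimal Python sketch. It assumes a page template that contains self-closing `esi:include` tags and a caller-supplied function that returns the markup for each fragment. The fragment paths and the `assemble_page` helper are hypothetical, and a real edge server would implement far more of the ESI language (caching rules, error handling, conditional processing) than this substitution step.

```python
import re

# Matches a self-closing ESI include tag such as <esi:include src="/fragments/x"/>.
ESI_INCLUDE = re.compile(r'<esi:include\s+src="([^"]+)"\s*/>')


def assemble_page(template, fetch_fragment):
    """Replace each esi:include tag with the fragment named by its src attribute."""
    return ESI_INCLUDE.sub(lambda match: fetch_fragment(match.group(1)), template)


# Hypothetical fragments; an edge server would fetch and cache these from the origin.
FRAGMENTS = {
    "/fragments/greeting": "<p>Welcome back</p>",
    "/fragments/headlines": "<ul><li>Top story</li></ul>",
}

TEMPLATE = (
    "<html><body>"
    '<esi:include src="/fragments/greeting"/>'
    '<esi:include src="/fragments/headlines"/>'
    "</body></html>"
)

page = assemble_page(TEMPLATE, FRAGMENTS.__getitem__)
print(page)
```

Because the page is assembled at the edge, the origin only has to supply the individual fragments, each of which can be cached separately.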
System Architecture of a CDN
Figure 1 shows a typical CDN in action. In this
example, two users are requesting the same movie file from a website.
Figure 1. Typical CDN Architecture
One of the users in the example is located in the
US and the other in the UK. The HTML page that both users have requested has had
all the media links changed to point to a specialized location-aware DNS server.
The DNS server determines where the user is coming from and redirects the
user to the proper content cache. The user’s browser or media player requests
the file from the nearest cache, and if the file is not present on that cache,
it is downloaded by the content cache directly from the source server. The
content can usually be cached based on user demand or pre-cached with all the
other content. Both users end up having content efficiently streamed to them by
a local server with minimal requests back to the main server. The USA user gets
his content from the USA caching server, and the UK user gets her content from
the UK caching server.
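The request flow described above can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than any vendor's implementation: the `EdgeCache` class, the `resolve` function standing in for the location-aware DNS server, the one-hour expiry, and the file path are all hypothetical.

```python
import time


class EdgeCache:
    """A pull-through cache: fetch from the origin on a miss, then serve locally."""

    def __init__(self, name, fetch_from_origin, ttl_seconds=3600, clock=time.time):
        self.name = name
        self.fetch_from_origin = fetch_from_origin
        self.ttl_seconds = ttl_seconds
        self.clock = clock
        self.store = {}  # path -> (content, expiry timestamp)
        self.origin_fetches = 0

    def get(self, path):
        entry = self.store.get(path)
        if entry is not None and entry[1] > self.clock():
            return entry[0]  # cache hit that has not yet expired
        # Miss or expired entry: pull from the origin and cache until expiry.
        content = self.fetch_from_origin(path)
        self.origin_fetches += 1
        self.store[path] = (content, self.clock() + self.ttl_seconds)
        return content


def origin(path):
    """Stand-in for the source server."""
    return f"bytes of {path}"


CACHES = {"US": EdgeCache("us-cache", origin), "UK": EdgeCache("uk-cache", origin)}


def resolve(user_country):
    """Stand-in for the location-aware DNS server; unknown countries fall back to the US cache."""
    return CACHES.get(user_country, CACHES["US"])


for country in ("US", "UK", "US"):
    cache = resolve(country)
    cache.get("/media/movie.mpg")
    print(country, "->", cache.name, "origin fetches so far:", cache.origin_fetches)
```

The second US request is served from the US cache without contacting the origin, which is the behavior that keeps requests back to the main server to a minimum.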
Enabling Content for CDN Hosting
Typically, firms host only resource-intensive
files on a CDN. These include images, movies, and streaming video. HTML files
are usually small and can be served off the source Web server. In most cases, it
is simply necessary to modify links to these resources to take advantage of the
CDN. Some vendors offer more advanced options that host Web applications at the
edge of the network rather than from centrally managed servers. The vendor will
supply the packaging format for the application and management tools to monitor
your application’s performance.
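As a rough illustration of the link modification step, the Python sketch below rewrites site-relative links to large media files so that they point at a CDN hostname, while leaving HTML links on the source server. The hostname `media.cdn.example.net`, the list of file extensions, and the `enable_for_cdn` helper are assumptions made for the example; each vendor defines its own URL scheme.

```python
import re

CDN_HOST = "media.cdn.example.net"  # hypothetical CDN hostname
HEAVY_EXTENSIONS = ("jpg", "gif", "png", "mpg", "mov", "wmv")

# Matches src="/path/file.ext" or href="/path/file.ext" when the site-relative
# path ends in one of the resource-intensive extensions listed above.
ASSET_LINK = re.compile(
    r'(?P<attr>src|href)="(?P<path>/[^"]+\.(?:' + "|".join(HEAVY_EXTENSIONS) + r'))"',
    re.IGNORECASE,
)


def enable_for_cdn(html):
    """Point heavy media links at the CDN and leave every other link untouched."""
    return ASSET_LINK.sub(
        lambda m: f'{m.group("attr")}="http://{CDN_HOST}{m.group("path")}"', html
    )


html = (
    '<a href="/about.html">About</a>'
    '<img src="/images/logo.gif">'
    '<a href="/video/launch.mov">Watch</a>'
)
rewritten = enable_for_cdn(html)
print(rewritten)
```

In practice this rewriting is often done once in the page templates or at publishing time rather than on every request.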
Building a CDN with Open Source Tools
If the budget is not available to work with one of the big CDN providers, a simpler version
of a CDN can be built using open source software and some
custom software development.
To build a CDN, the following are needed:
- Linux machines to host the software.
- Squid--An open source proxy/cache server.
- BIND or djbdns--For DNS resolution.
- Custom code on the DNS server to check the user’s location and return
the closest cache server.
The general process for building a CDN would be to determine where to
locate the cache machines strategically, build them, install and
configure Squid to point to the origin server, and write some custom DNS
code to redirect users to the proper cache based on geographic information
determined from a reverse host lookup.
The hardest part of building a CDN comes with creating the code to
determine the closest network cache. Simple redirection based on host
information can be accomplished easily, but more complex algorithms are
extremely difficult to implement and the cost to implement them can be
prohibitive.
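The simple, host-based form of redirection might look like the following Python sketch. It assumes the selection logic is handed the client's IP address, performs a reverse host lookup, and maps the country-code suffix of the resulting hostname to a region. The cache addresses, the region table, and the function names are hypothetical, and hooking the logic into BIND or djbdns is outside the scope of the sketch.

```python
import socket

# Hypothetical cache servers, keyed by region (documentation-only addresses).
CACHE_SERVERS = {
    "europe": "192.0.2.10",
    "asia": "192.0.2.20",
    "americas": "192.0.2.30",
}
DEFAULT_REGION = "americas"

# Small illustrative sample of country-code suffixes; a real table would be much larger.
TLD_REGIONS = {
    "uk": "europe", "de": "europe", "fr": "europe",
    "jp": "asia", "kr": "asia",
    "ca": "americas",
}


def region_for_hostname(hostname):
    """Guess a region from the last label of a hostname."""
    tld = hostname.rstrip(".").rsplit(".", 1)[-1].lower()
    return TLD_REGIONS.get(tld, DEFAULT_REGION)


def closest_cache(client_ip, reverse_lookup=socket.gethostbyaddr):
    """Return the cache address for a client, falling back to the default region."""
    try:
        hostname = reverse_lookup(client_ip)[0]
    except OSError:  # no reverse DNS entry, or the lookup failed
        return CACHE_SERVERS[DEFAULT_REGION]
    return CACHE_SERVERS[region_for_hostname(hostname)]


# Usage with a stubbed lookup so the example runs without network access.
def stub_lookup(ip):
    return ("dsl-42.example-isp.co.uk", [], [ip])


print(closest_cache("198.51.100.7", stub_lookup))
```

The sketch also shows why this approach is only a starting point: clients whose hostnames end in generic suffixes such as .com or .net, or that have no reverse DNS entry at all, simply fall through to the default cache.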
There are currently a number of vendors operating
CDNs and providing caching services. Some of the major vendors are listed below.
Akamai
Akamai operates one of the largest CDNs on the Internet. It has three
product offerings: Edge Platform, Edge Control, and NOCC. Edge Platform is the
CDN itself, a network made up of 14,000 nodes in over 70 countries. Edge Platform
can handle both flat content and dynamic applications, allowing companies
to dynamically scale Web applications to meet the demands of end users.
Edge Control provides a management console for the applications and content
running on Edge Platform. Edge Control features include:
- Visibility Tools--EdgeControl provides dashboard-like views of traffic, activity, and site
performance parameters to reveal usage trends and patterns. Customers can now
learn how much traffic a campaign has generated, where most users are coming
from, how many hits a program has generated, top downloads and URLs, and other
valuable data. User-friendly automatic or on-request reports help you make
sounder IT investment and resource decisions, and gauge the effectiveness of
e-business initiatives.
- Control Tools--Provides real-time information and application management and
visibility into delivery, traffic trends, activity patterns, and site
performance. Alert functionality monitors the origin
infrastructure, notifying customers of performance variances or origin failure.
Diagnostic tools add the ability to discover and diagnose performance
issues.
Globix
Globix provides a CDN service called EarthCache. EarthCache leverages more
than 1,200 peering agreements to create a highly available, high-speed network. EarthCache
features include:
- Scalability--Caching nodes are located at more than half of Globix's
network PoPs and leverage the peering connections with other Tier 1 backbones.
Each EarthCache node has a minimum of 5 Gbps of connectivity and 1 TB of
storage space, enabling it to handle even the largest content events.
- Intelligence and Efficiency--EarthCache divides the Globix network
into logical zones, continuously probes end users' connections, and directs users
to the node that will best serve them. It also uses "pull caching" and
content "freshness" to ensure that users receive the best content
available.
- Flexibility--EarthCache supports both Web objects (HTTP, JPEG, etc.) and
a full array of streaming media file sets, including Microsoft Windows
Media, RealNetworks RealMedia, and Apple QuickTime Progressive Download.
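Directing users to the node that will best serve them, based on probe data, can be illustrated generically in Python. The sketch below is not Globix's algorithm, which is not described here. It assumes that round-trip times in milliseconds have already been collected for each node and picks the node with the lowest median. The node names and timings are made up.

```python
from statistics import median


def best_node(probe_results):
    """Return the node with the lowest median round-trip time in milliseconds."""
    # Ignore nodes that returned no samples, for example because they are unreachable.
    usable = {node: samples for node, samples in probe_results.items() if samples}
    if not usable:
        raise ValueError("no probe data available")
    return min(usable, key=lambda node: median(usable[node]))


# Hypothetical probe samples for one end user.
probes = {
    "new-york": [82.0, 79.5, 240.0],
    "london": [18.2, 21.0, 19.4],
    "tokyo": [],
}
print(best_node(probes))  # london
```

Using the median rather than the mean keeps a single slow probe, such as the 240 ms sample above, from distorting the choice.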
Speedera
Speedera offers a suite of products called SpeedSuite, a
combination of services covering performance, streaming media, availability,
security, and management. Like Akamai, Speedera offers both media file
hosting and the ability to host entire applications throughout its network.
Speedera provides a management console called SpeedEye that gives
visibility into the network and allows customers to manage site assets.
Speedera features include:
- Content Delivery Service--Provides a Web site with a performance boost
and added scalability to handle peak loads. The service uses a
dedicated, worldwide network of caching servers that are deployed at PoPs in the Americas, Europe, and the Asia-Pacific region.
- Speedera Download Service--Provides the performance and scalability
for faster and more reliable digital goods delivery, worldwide. When a customer
requests digital goods, the Global Traffic Management system routes the request to the fastest, most available server for each
download.
- Speedera Streaming Services--Provide support for On-Demand, Live, and Secure
Streaming to ensure high-quality, scalable, and secure delivery of Windows Media,
RealMedia, Flash, QuickTime, MPEG-4, MP3, and 3GPP streams
to users worldwide. Speedera offers several options for the size of the
streaming edge network: dual site, regional
(Americas, Europe, Asia-Pacific), or global.
- Speedera's SpeedSuite for Site Assurance, Traffic Balancer and Failover
Services--Guarantee site availability for both distributed and single
location sites. Traffic Balancer provides global load balancing across multiple
sites. Failover provides a static, mirrored site for single location sites.
Speedera offers Web-based tools for managing a site
through the SpeedEye management portal. SpeedEye Access Manager controls a Web site's
security by assigning access
rights to each individual (IT staff, marketing content contributors, outside
contractors, etc.) working on the site. Site Analyzer
provides performance and availability monitoring, plus
the ability to troubleshoot the site.
Recommendations
CDNs are a tremendous asset
to content-rich Web sites with large user followings. Unless its content is
dispersed, a site can easily get swamped with hits, effectively shutting it down.
Another benefit of hosting applications and content on a CDN is that it is much
harder to be shut down by a DDoS attack. If someone tries to attack the site,
the CDN will just shift resources to other nodes in the network while working to
block access from the offending machines. The real trick with moving to a CDN is
determining when a company is large enough to make the switch. The costs
involved in hosting with an ASP model provider, as well as bandwidth usage costs,
must be evaluated carefully. It is not easy to quantify all the benefits that a CDN
can provide in the short run. In the long run, increased user satisfaction should
lead to higher hit numbers and eventually higher sales.
Unless a company operates as an ISP, it is probably a good idea to use one of the ASP
model vendors rather than building a CDN internally. An exception to this would
be if only a small cache is needed in just a few places. For this purpose, Squid could be
a very effective and low-cost option.
About the Author
Ben Shapiro is the president of
ObjectArts, a New York City-based consulting company. He writes frequently for
Faulkner Information Services and sits on Faulkner's Advisory Panel. Mr. Shapiro
is also a contributing author to the Apache Cocoon Developer's Handbook
published by SAMS and a co-author of Oracle 10g Developer published by
Wrox. ObjectArts has worked with many large companies to develop XML-based
publishing tools and web-based applications. Mr. Shapiro is an open-source
software advocate, and ObjectArts leverages open-source software whenever
possible to design and implement cost-effective solutions to technology
challenges.
Web Links
Akamai: http://www.akamai.com/
Globix: http://www.globix.com/
Speedera: http://www.speedera.com/
Squid: http://www.squid-cache.org/