Cache creation in Amazon EC2

In this topic

Choosing an instance size and price
Manual scaling and auto scaling
Deciding where to place the cache

Creating an ArcGIS map, image, or globe service cache in the Amazon Elastic Compute Cloud (EC2) differs from caching outside the cloud in several ways:

You have a number of different instance sizes and prices at your disposal.
You have a variety of locations in the cloud where you can place the cache.

This topic discusses the above factors in more detail.

Choosing an instance size and price

Amazon EC2 offers a variety of instance sizes and specifications. Each has its own price per hour of usage. The larger instances, especially those with a lot of memory, can generate tiles very quickly. The smaller instances generate tiles more slowly but have a lower cost.

You can create your cache on an attached Amazon Elastic Block Store (EBS) volume using a powerful instance. When the caching completes, you can detach the EBS volume and attach it to your regular instance (which may be smaller and less expensive). You can then terminate the powerful instance that you used for caching. In this way, you can use the power of the cloud to cache while not committing to a relatively expensive instance for any longer than necessary.

You may need to make a decision between economy and speed. Using a low power instance with a low cost per hour is not always the most economical choice, as the total cost of the cache is dependent on the number of hours spent creating tiles. On the other hand, the most powerful instances may also yield a higher total cost of the cache: even though you spend fewer hours caching, you pay a higher price per hour.

Using a small test cache (perhaps the size of a medium-sized city) as well as a custom Amazon Machine Image (AMI) or site template, you can perform relatively inexpensive tests with different instance types to find out which is most economical for your cache.

Powerful EC2 instance types are well suited to scheduled cache updates, since many update workflows are time sensitive.

Choosing the number of map service instances to use when caching

Each EC2 instance has a certain number of virtual CPU cores. This number is visible when you choose the instance type from the Launch Instance wizard. The number of cores can help you determine how many instances of the CachingTools geoprocessing service to devote toward your caching. Using too many service instances will overwork your CPUs, while too few service instances will leave your CPUs underutilized.

Although the best number may be reached with some trial and error, a good starting point is to allow a maximum of n + 1 instances of the CachingTools service, where n is the number of virtual cores on a single EC2 instance in your site.

Manual scaling and auto scaling

When building a large cache, you might be tempted to set up auto scaling triggers that automatically increase the number of EC2 instances working on the cache as the CPU usage increases. However, auto scaling is better suited to handling unexpected spikes in traffic. When creating caches, you already know that you will need a great amount of computing power; therefore, it makes more sense to launch all your needed instances before you build the cache, rather than waiting for them to launch sequentially via auto scaling triggers.

Deciding where to place the cache

As described in Strategies for data transfer to Amazon Web Services, there are several types of locations where you can place your data. When you first create the cache, you'll write it to an EBS volume that's attached to your EC2 instance. This volume is attached at the time you build your site, and it's a good place to put the cache if the volume is large enough. If the volume is too small, you need to create and attach another volume and register a server cache directory on it.

Do not build a cache on the C drive of your EC2 instance. If the instance is ever terminated, the cache will be lost.

Feedback on this topic?