Creating an ArcGIS map or image service cache in the Amazon Elastic Compute Cloud (EC2) differs from caching outside the cloud in the following ways:
- You have a number of different instance sizes and prices at your disposal.
- You can add volumes to your instance where you can place the cache.
This topic discusses the above factors in more detail.
Choosing an instance size and price
Amazon EC2 offers a variety of instance sizes and specifications. Each has its own price per hour of usage. The larger instances, especially those with a lot of memory, can generate tiles very quickly. The smaller instances generate tiles more slowly but have a lower cost.
You can create your cache on an attached Amazon Elastic Block Store (EBS) volume using a powerful instance. When the caching completes, you can detach the EBS volume and attach it to your regular instance (which may be smaller and less expensive). You can then terminate the powerful instance that you used for caching. In this way, you can use the power of the cloud to cache while not committing to a relatively expensive instance for any longer than necessary.
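If you prefer to script this handoff, the detach and attach steps map directly to EC2 API calls. The following is a minimal sketch using the boto3 Python SDK; the region, volume and instance IDs, and device name are placeholders for your own values.

```python
# Minimal boto3 sketch: move the cache volume from the caching instance to the
# smaller permanent instance. All IDs and the region are hypothetical placeholders.
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

CACHE_VOLUME = "vol-0123456789abcdef0"      # EBS volume holding the finished cache
CACHING_INSTANCE = "i-0aaaaaaaaaaaaaaaa"    # large instance used to build the cache
PERMANENT_INSTANCE = "i-0bbbbbbbbbbbbbbbb"  # smaller instance that will serve it

# Detach the volume from the caching instance and wait until it is free.
ec2.detach_volume(VolumeId=CACHE_VOLUME, InstanceId=CACHING_INSTANCE)
ec2.get_waiter("volume_available").wait(VolumeIds=[CACHE_VOLUME])

# Attach it to the permanent instance on an unused device name.
ec2.attach_volume(VolumeId=CACHE_VOLUME, InstanceId=PERMANENT_INSTANCE, Device="/dev/sdf")
ec2.get_waiter("volume_in_use").wait(VolumeIds=[CACHE_VOLUME])

# The caching instance can now be terminated so it stops incurring charges.
ec2.terminate_instances(InstanceIds=[CACHING_INSTANCE])
```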
You may need to weigh economy against speed. A low-powered instance with a low hourly rate is not always the most economical choice, because the total cost of the cache depends on both the hourly rate and the number of hours spent creating tiles. Conversely, the most powerful instances may also yield a higher total cost: even though you spend fewer hours caching, you pay a higher price per hour.
Using a small test cache (perhaps the size of a medium-sized city) as well as a custom Amazon Machine Image (AMI) or site template, you can perform relatively inexpensive tests with different instance types to find out which is most economical for your cache.
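One way to structure such a test is to time the test cache on each candidate instance type and extrapolate to the full cache. The following back-of-the-envelope sketch uses hypothetical hourly rates, timings, and test fraction; substitute measurements from your own test runs.

```python
# Back-of-the-envelope cost comparison. Hourly rates, test timings, and the
# test fraction are hypothetical placeholders, not published AWS prices.
TEST_FRACTION = 0.02  # the test cache covered roughly 2% of the full caching job

candidates = {
    # instance type: (hourly rate in USD, hours the test cache took)
    "m5.xlarge":  (0.19, 1.5),
    "c5.4xlarge": (0.68, 0.4),
}

for instance_type, (rate, test_hours) in candidates.items():
    full_hours = test_hours / TEST_FRACTION   # extrapolate to the full cache
    total_cost = full_hours * rate
    print(f"{instance_type}: ~{full_hours:.0f} hours, ~${total_cost:.2f} total")
```

In this hypothetical example, the more powerful instance finishes so much sooner that its total cost is lower despite the higher hourly rate; your own measurements may point the other way.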
Powerful EC2 instance types are well suited to scheduled cache updates, since many update workflows are time sensitive.
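Scheduled updates are typically scripted with the Manage Map Server Cache Tiles geoprocessing tool and run from a scheduler such as Windows Task Scheduler. The sketch below assumes a hypothetical service path, scale list, and caching instance count; verify the tool's parameter names and the form of the service path against your arcpy release.

```python
# Sketch of a scheduled cache update using the Manage Map Server Cache Tiles tool.
# The connection path, service name, scales, and instance count are hypothetical.
import arcpy

# Typically an ArcGIS Server connection (.ags) file plus the service name;
# the exact form depends on how you connect to your site.
input_service = r"C:\connections\arcgis_on_myserver.ags\Topographic.MapServer"

arcpy.server.ManageMapServerCacheTiles(
    input_service,
    scales=[9027.977411, 4513.988705, 2256.994353],  # scale levels to rebuild
    update_mode="RECREATE_ALL_TILES",
    num_of_caching_service_instances=9,              # for example, n + 1 on an 8-core instance
)
```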
Choosing the number of map service instances to use when caching
Each EC2 instance has a certain number of virtual CPU cores. This number is visible when you choose the instance type in the AWS Management Console. The number of cores can help you determine how many instances of the CachingTools geoprocessing service to dedicate to caching. Too many service instances will overwork your CPUs, while too few will leave them underutilized.
Although the best number may be reached with some trial and error, a good starting point is to allow a maximum of n + 1 instances of the CachingTools service, where n is the number of virtual cores on a single EC2 instance in your site.
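For reference, that starting point reduces to a one-line calculation; run something like the following on one of the EC2 instances in your site.

```python
# Starting point for the maximum number of CachingTools service instances:
# n + 1, where n is the number of virtual cores on a single EC2 instance.
import os

virtual_cores = os.cpu_count() or 1
max_caching_tools_instances = virtual_cores + 1
print(f"Suggested maximum CachingTools instances: {max_caching_tools_instances}")
```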
Auto scaling
When building a large cache, you may be tempted to set up auto scaling triggers that automatically increase the number of EC2 instances working on the cache as the CPU usage increases. However, auto scaling is better suited to handling unexpected spikes in traffic. When creating caches, you already know that you will need a great amount of computing power; therefore, it makes more sense to launch all your needed instances before you build the cache, rather than waiting for them to launch sequentially via auto scaling triggers.
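If you script your deployment, those instances can be launched in a single call before the caching job starts. The following minimal boto3 sketch uses a hypothetical AMI ID, instance type, and count; newly launched instances must still join your ArcGIS Server site through your normal deployment process before they can take part in caching.

```python
# Minimal boto3 sketch: launch the extra EC2 instances up front rather than
# waiting on auto scaling triggers. AMI ID, instance type, and count are
# hypothetical placeholders.
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

response = ec2.run_instances(
    ImageId="ami-0123456789abcdef0",   # your custom ArcGIS Server AMI
    InstanceType="c5.4xlarge",
    MinCount=3,
    MaxCount=3,
)
instance_ids = [i["InstanceId"] for i in response["Instances"]]

# Wait until all instances are running before kicking off the caching job.
ec2.get_waiter("instance_running").wait(InstanceIds=instance_ids)
print("Launched:", instance_ids)
```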
Deciding where to place the cache
As described in Strategies for data transfer to Amazon Web Services, there are several types of locations where you can place your data. When you first create the cache, you'll write it to an EBS volume that's attached to your EC2 instance. This volume is attached at the time you build your site, and it's a good place to put the cache if the volume is large enough. If the volume is too small, you can replace it with a larger volume created from a snapshot of the existing data volume, then register a server cache directory on the new volume.
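The snapshot-and-replace workflow can also be scripted. The sketch below uses boto3 with placeholder IDs, size, and Availability Zone; after the new volume is attached, register a server cache directory on it in ArcGIS Server Manager.

```python
# Sketch: create a larger replacement volume from a snapshot of the existing
# data volume. The volume ID, size, volume type, and Availability Zone are
# hypothetical placeholders.
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

OLD_VOLUME = "vol-0123456789abcdef0"

# Snapshot the existing data volume and wait for the snapshot to complete.
snapshot = ec2.create_snapshot(VolumeId=OLD_VOLUME,
                               Description="Cache data volume before resize")
ec2.get_waiter("snapshot_completed").wait(SnapshotIds=[snapshot["SnapshotId"]])

# Create a larger volume from the snapshot in the same Availability Zone as
# the EC2 instance it will be attached to.
new_volume = ec2.create_volume(
    SnapshotId=snapshot["SnapshotId"],
    AvailabilityZone="us-east-1a",
    Size=500,              # GiB; must be at least the snapshot size
    VolumeType="gp3",
)
ec2.get_waiter("volume_available").wait(VolumeIds=[new_volume["VolumeId"]])
print("Replacement volume ready:", new_volume["VolumeId"])
```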
Do not build a cache on the C or root drive of your EC2 instance. If the instance is ever terminated, the cache will be lost.
If you have existing caches on a local disk and are comfortable using Amazon Simple Storage Service (S3) buckets, you can copy your CompactV2 caches to a bucket in Amazon S3 and serve them from there. Cloud stores registered as cache directories can only be used to serve caches; they cannot be used to create or manage caches. The cache must use the CompactV2 storage format to be consumed from cloud storage, as this format is optimized for performance. If your existing cache uses an older storage format, run the Upgrade Map Server Cache Storage Format geoprocessing tool to upgrade it to CompactV2. To move a cache to Amazon S3, follow these steps:
1. Create a cache in CompactV2 format, either by publishing a new cached map or image service that uses this format or by converting an existing cache with the Upgrade Map Server Cache Storage Format geoprocessing tool.
2. Create an Amazon S3 bucket in the same region as your ArcGIS Server site on AWS.
3. Copy the service caches from the drive local to your ArcGIS Server site on AWS into a folder named arcgiscache inside your Amazon S3 bucket. See the AWS documentation for examples of copying content to an S3 bucket, or the scripted sketch after this list. If your caches are very large (for example, terabytes in size), you may need to ship them to Amazon on physical storage devices and have Amazon upload them for you.
4. Sign in to ArcGIS Server Manager for the site on which the cached service is running, and register the S3 bucket as a cloud store and a cache directory with your ArcGIS Server site on AWS.
5. While you are still signed in to ArcGIS Server Manager, do one of the following:
   - Stop the existing service and change its cache directory to point to the new cloud store cache directory in your S3 bucket.
   - Publish a new service that will serve the caches you placed in the S3 bucket in step 3.
6. Restart the service.
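The copy in step 3 can also be scripted. The following minimal boto3 sketch walks a local cache folder and uploads it under an arcgiscache/ prefix; the bucket name, local path, and service folder name are placeholders, and for very large caches a bulk transfer option will be faster than uploading file by file.

```python
# Minimal boto3 sketch: upload a local cache folder to an S3 bucket under the
# arcgiscache/ prefix. Bucket name, local path, and service folder are hypothetical.
import os
import boto3

s3 = boto3.client("s3")

BUCKET = "my-arcgis-cache-bucket"
LOCAL_CACHE = r"D:\arcgisserver\directories\arcgiscache\Topographic_MapServer"
PREFIX = "arcgiscache/Topographic_MapServer"

for root, _, files in os.walk(LOCAL_CACHE):
    for name in files:
        local_path = os.path.join(root, name)
        # Preserve the folder structure relative to the cache root.
        rel_key = os.path.relpath(local_path, LOCAL_CACHE).replace(os.sep, "/")
        s3.upload_file(local_path, BUCKET, f"{PREFIX}/{rel_key}")
```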
Note:
If you update the map service's cache, the updated cache is generated on a drive local to the ArcGIS Server site. You must recopy the cache to the S3 bucket so users of the service can consume the updated cache content.