High availability in ArcGIS Enterprise

Organizations often require a certain level of system uptime for their ArcGIS Enterprise deployments, such as 99 percent of the time or higher. For these organizations, implementing a strategy to ensure high availability is crucial. This strategy should comprise both infrastructure elements and employee practices; neither can guarantee high availability alone. For more information about high availability considerations, design patterns, and recommendations, refer to the Architecture Center.

The infrastructure component of a high-availability strategy involves maintaining at least two active copies of your deployment and implementing failover mechanisms that automatically switch from the primary to the standby as soon as possible after a machine failure. The standby deployment continually receives the same content and settings updates as the primary; this distinguishes highly available systems from replicated systems, which rely on regular backups to minimize data loss and do not automatically fail over. All mission-critical or business-critical elements of a deployment should be addressed when implementing high availability.

The human component of a high-availability strategy consists of organizational practices that ensure failover will always be successful and efficient. For example, machine maintenance or system updates should never be applied to both the primary and standby deployments in a highly available system at the same time, and a system administrator should always be available to respond in the event of a failure.

The topics in this section explain how to configure and maintain a highly available ArcGIS Enterprise deployment.

When high availability should be used

A highly available ArcGIS Enterprise deployment is complex and requires time, effort, and cost to configure and maintain. It's important to determine whether high availability is required for your organization. Organizations considering high availability should ask questions such as the following:

  • Does your organization have a mandated service-level agreement?
    • What percentage of uptime is required by the service-level agreement?
    • How many minutes or hours of downtime are permitted per year?
    • How is the service-level agreement enforced?
  • Does your organization have a contractual mandate for high availability?
    • What are the terms of that mandate?
  • Will this ArcGIS Enterprise deployment be involved in mission-critical or business-critical operations?
  • Does your organization have the proper licensing from Esri to implement a highly available deployment?
  • Is your organization able to provide the hardware necessary to support a highly available deployment?
    • Do you have the hardware resources to duplicate each component of your deployment?
    • Are you able to configure and maintain a third-party load balancer capable of performing failover?
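
As a rough aid for the uptime questions above, the following Python sketch converts an uptime percentage into a yearly downtime budget. It assumes a 365-day year and counts all downtime, planned or unplanned, against the budget; your service-level agreement may define this differently. The function name is illustrative only.

```python
HOURS_PER_YEAR = 365 * 24  # 8,760 hours; assumes a non-leap year


def allowed_downtime_hours(uptime_percent: float) -> float:
    """Return the yearly downtime budget, in hours, for an uptime target."""
    if not 0 <= uptime_percent <= 100:
        raise ValueError("uptime_percent must be between 0 and 100")
    return HOURS_PER_YEAR * (100 - uptime_percent) / 100


# Compare a few common uptime targets.
for target in (99.0, 99.9, 99.99):
    hours = allowed_downtime_hours(target)
    print(f"{target}% uptime: {hours:.2f} hours ({hours * 60:.0f} minutes) per year")
```

Under these assumptions, a 99 percent target allows about 87.6 hours of downtime per year, while a 99.9 percent target allows about 8.76 hours.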

Important concepts in high availability

The following sections define and discuss key terms used in highly available systems.

Load balancer

A load balancer acts as a reverse proxy and distributes traffic to back-end servers. At least one third-party load balancer is required in a highly available ArcGIS Enterprise deployment to improve the capacity and reliability of the software. Load balancers handle client traffic to your portal and server sites, as well as internal traffic between the software components.

Though ArcGIS Web Adaptor is considered a load balancer, it’s inadequate to serve as the lone load balancer in a highly available deployment. You can configure ArcGIS Web Adaptor instances with each ArcGIS Server site for an added layer of security and anonymity, or to set up web-tier authentication. In these cases, the third-party load balancer sends traffic through the Web Adaptor rather than directly to ArcGIS Server machines.

Load balancers need to be able to send HTTP health checks to the server health check or portal health check endpoints. A load balancer creates and manages the URLs used for the deployment, which are described in the next section.
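
The health check behavior described above can be approximated with a short Python sketch. This is an illustration of the idea, not how any particular load balancer is implemented. The host names, the port, and the `/arcgis/rest/info/healthcheck` path in `BACKENDS` are assumptions made for the example; confirm the exact health check endpoints for your version in the ArcGIS REST API documentation. The sketch treats only an HTTP 200 response as healthy.

```python
import http.client
import urllib.request

# Hypothetical back-end machines; replace with your own health check URLs.
BACKENDS = [
    "https://gis1.example.com:6443/arcgis/rest/info/healthcheck",
    "https://gis2.example.com:6443/arcgis/rest/info/healthcheck",
]


def probe(url: str, timeout: float = 5.0) -> bool:
    """Return True only if the URL answers with HTTP 200 within the timeout."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (OSError, http.client.HTTPException):
        # Covers refused connections, timeouts, TLS errors, and HTTP errors.
        return False


def healthy_backends(urls, check=probe):
    """Return the subset of urls that currently pass the health check."""
    return [url for url in urls if check(url)]
```

Calling `healthy_backends(BACKENDS)` returns the machines that should keep receiving traffic. Note that a machine using a certificate the client does not trust fails the probe in this sketch, because certificate verification errors are reported as unhealthy.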

URLs used in federation

Several different URLs are used in a highly available ArcGIS Enterprise deployment.

Services URL

This is the URL used by external users and client applications to access ArcGIS Server sites. It’s the URL for the load balancer that handles ArcGIS Server traffic and passes requests either to the site’s Web Adaptor or directly to the ArcGIS Server machines.

Administrative URL

This URL is used by administrators, and internally by the portal, to access an ArcGIS Server site when performing administrative operations. It must point to a load balancer; if the administrative URL points to a single machine in the ArcGIS Server site and that machine is offline, federation will not work. Depending on the architecture of your system, this can be the same URL as the services URL, or it can point to a second load balancer.

Private portal URL

This is an internal URL used by your server sites to communicate with the portal. This must also direct to a load balancer and should be defined prior to federating. If you federate ArcGIS Server sites prior to setting the privatePortalURL, follow steps 8 and 9 in Configure an existing deployment for high availability to update the URL in the deployment. Similar to the administrative URL, this can be the same as the public URL for the portal, or it can be a second load balancer.
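
Because the administrative URL and the private portal URL must each direct to a load balancer, it can be useful to check that neither one names an individual machine. The following Python sketch shows one way to do that by comparing host names. The function name and all host names are hypothetical, and the check cannot tell whether the remaining host really is a load balancer.

```python
from urllib.parse import urlsplit


def points_at_single_machine(url: str, site_machines) -> bool:
    """Return True if the URL's host name is one of the site's own machines."""
    host = (urlsplit(url).hostname or "").lower()
    return host in {machine.lower() for machine in site_machines}


# Hypothetical machines in one ArcGIS Server site.
machines = ["gis1.example.com", "gis2.example.com"]

# A URL that names one machine directly should be flagged.
print(points_at_single_machine("https://gis1.example.com:6443/arcgis", machines))
# A URL that names a separate load balancer host should not.
print(points_at_single_machine("https://gis-lb.example.com/server", machines))
```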

Monitoring

Each ArcGIS Enterprise component provides the ability to handle machine-level failures within a deployment. In a highly available component, when one machine goes offline, the other machine will continue to function with little to no disruption. However, the deployment now has a single point of failure and is at risk. It’s important that the deployment and individual machines be monitored to quickly detect failures and notify administrators when one or more machines go offline. This can be achieved using ArcGIS Monitor or third-party monitoring software.
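
The monitoring goal described above, detecting when a component is down to its last machine, can be sketched in Python as follows. This is a simplified illustration, not how ArcGIS Monitor works. The `MachineStatus` type, the status labels, and the host names are all hypothetical, and the online flag is assumed to come from whatever monitoring tool you use.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MachineStatus:
    component: str  # for example "portal" or "server"
    host: str
    online: bool


def assess(statuses):
    """Map each component to "ok", "at risk" (one machine left), or "down"."""
    online_count = {}
    for status in statuses:
        online_count.setdefault(status.component, 0)
        if status.online:
            online_count[status.component] += 1

    result = {}
    for component, count in online_count.items():
        if count == 0:
            result[component] = "down"
        elif count == 1:
            result[component] = "at risk"  # single point of failure
        else:
            result[component] = "ok"
    return result


statuses = [
    MachineStatus("portal", "portal1.example.com", True),
    MachineStatus("portal", "portal2.example.com", False),
    MachineStatus("server", "gis1.example.com", True),
    MachineStatus("server", "gis2.example.com", True),
]
for component, state in assess(statuses).items():
    print(component, state)
```

In this example the portal is reported as at risk because only one of its two machines is online, which is the point at which an administrator should be notified.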

People and practices

To create and maintain a highly available deployment, your organization needs to make sure people and practices are also highly available. If you have only one administrator and that administrator is not available during an outage, the environment is not highly available.

Equally important are your organizational practices. If you are using virtual machines, do not put all components of a single software tier on a single host. For example, two virtual machines running a highly available portal shouldn’t be on the same virtual machine host, as that host is a single point of failure.
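
The virtual machine placement rule above can be checked with a small Python sketch. It assumes you can export a list of (tier, virtual machine, host) entries from your virtualization platform; the function name and every name in the sample data are hypothetical.

```python
from collections import defaultdict


def tiers_on_single_host(placements):
    """Return the tiers whose virtual machines all run on one physical host."""
    hosts_by_tier = defaultdict(set)
    for tier, _vm_name, host in placements:
        hosts_by_tier[tier].add(host)
    return sorted(tier for tier, hosts in hosts_by_tier.items() if len(hosts) == 1)


placements = [
    ("portal", "portal-vm1", "hypervisor-a"),
    ("portal", "portal-vm2", "hypervisor-a"),  # same host as portal-vm1
    ("server", "server-vm1", "hypervisor-a"),
    ("server", "server-vm2", "hypervisor-b"),
]
print(tiers_on_single_host(placements))
```

Here only the portal tier is flagged, because both of its virtual machines share one host while the server tier is spread across two.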

Ensure that there is always at least one component running at each software tier to maintain high availability. If you need to stop or restart a component, make sure that the other machine running the same component is accessible and functioning correctly.

Do not schedule simultaneous backups or maintenance for all machines in a highly available component. If the patch or backup causes all machines to fail, no machine is left to take over. See Apply patches and updates to highly available components for more guidance.
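
The one-machine-at-a-time practice described above can be sketched in Python as follows. This is a minimal illustration of the ordering logic only. The `is_healthy` and `maintain` callables are hypothetical placeholders for your own health check and patching procedures, and the sketch is not a substitute for the documented patching workflow.

```python
def rolling_maintenance(machines, is_healthy, maintain):
    """Maintain machines one at a time and return those completed successfully.

    Stops before touching a machine if no other machine is healthy, and stops
    after a machine that does not come back healthy, so that the remaining
    machines are left untouched and can keep serving requests.
    """
    completed = []
    for machine in machines:
        peers = [other for other in machines if other != machine]
        if not any(is_healthy(peer) for peer in peers):
            break  # taking this machine down would leave no machine running
        maintain(machine)
        if not is_healthy(machine):
            break  # maintenance failed; do not continue to the next machine
        completed.append(machine)
    return completed
```

For example, `rolling_maintenance(["portal1", "portal2"], is_healthy, maintain)` patches `portal2` only after `portal1` has been patched and reports healthy again.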

Storage for configuration files and data

One of the challenges facing customers deploying ArcGIS Enterprise on-premises is acquiring and maintaining a highly available storage device. Because ArcGIS Server and Portal for ArcGIS both require shared storage to set up high availability, the shared storage can be a single point of failure. In an on-premises deployment, use a NAS device or RAID to ensure that the storage of data and configuration files for ArcGIS Server and Portal for ArcGIS is highly available.

Cloud deployments offer the option of storing data and configuration files in a location that’s already highly available: Amazon Simple Storage Service (S3) buckets within Amazon Web Services (AWS) or Blob Storage containers in Microsoft Azure. These storage locations and availability are managed by the cloud provider. Visit the documentation for each respective cloud provider for more information.

Colocate components

Place all components and storage locations in a highly available ArcGIS Enterprise deployment in the same data center or cloud region to provide low-latency connectivity between each component. Do not split the primary and standby machines in a highly available deployment across separate data centers.

To safeguard against loss of a single data center, you can create a secondary deployment in a separate data center or cloud region. See Disaster recovery and replication for more information.