ArcGIS Enterprise is a collection of software components that can be deployed in many configurations based on an organization's requirements and desired workflows. At each varied capacity and configuration, the software requires stability from the underlying operating systems and infrastructure to operate at its highest potential. You can use the following recommendations and techniques to plan for and deliver stable, optimized, and highly functional production sites at every scale.
Planning for deployment
Prior to deploying ArcGIS Enterprise, consider these recommendations.
With routine use, ArcGIS Enterprise requires additional storage space over time to support workflows such as publishing services and sharing content. The amount of storage space needed varies with several factors, such as the following:
- Types of published services
- Amount and types of content and items uploaded to the organization
- Data storage to support published services (when data is copied to the server during publishing as opposed to referenced directly in a shared location)
As disk usage grows over time to support these workflows, the size of scheduled backups grows proportionally. To plan for sufficient backup storage, it is recommended that you apply retention policies to stored backups. For example, because a collection of incremental backups is only useful when coupled with the last full backup in its chain, it is recommended that you remove those incremental backups once the chain falls outside of retention requirements. Retention policies must align with organizational requirements and any applicable regulatory requirements to prevent data loss.
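The chain-based retention rule described above can be sketched in Python. This is a minimal illustration only, not an ArcGIS Enterprise tool: the `Backup` record, the `expired_backups` function, and the backup names are hypothetical, and the sketch assumes that a chain (a full backup plus the incremental backups that follow it) is treated as a unit and becomes removable only when its newest member is older than the retention window.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Backup:
    # Hypothetical record describing one stored backup.
    name: str
    created: datetime
    kind: str  # "full" or "incremental"


def expired_backups(backups, now, retention):
    """Return the backups whose entire chain falls outside the retention window."""
    chains = []
    for backup in sorted(backups, key=lambda b: b.created):
        # A full backup starts a new chain. An incremental backup with no
        # preceding full backup is also placed in its own chain.
        if backup.kind == "full" or not chains:
            chains.append([backup])
        else:
            chains[-1].append(backup)

    cutoff = now - retention
    expired = []
    for chain in chains:
        # Keep the whole chain while its newest member is still inside the window.
        if chain[-1].created < cutoff:
            expired.extend(chain)
    return expired


if __name__ == "__main__":
    sample = [
        Backup("full-1", datetime(2024, 1, 1), "full"),
        Backup("inc-1a", datetime(2024, 1, 2), "incremental"),
        Backup("full-2", datetime(2024, 2, 1), "full"),
        Backup("inc-2a", datetime(2024, 2, 20), "incremental"),
    ]
    for item in expired_backups(sample, datetime(2024, 3, 1), timedelta(days=30)):
        print("remove:", item.name)
```

With a 30-day window, only the first chain is reported. The second chain is kept as a whole because its newest incremental backup is still inside the window, even when the full backup it depends on is older.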
Prior to deploying ArcGIS Enterprise, it is recommended that administrators determine proper outage requirements and business dependencies for the ArcGIS Enterprise organization. Considerations should include operating system (OS) and software level patching, as well as configuration changes that require downtime to the organization. Planning for regularly scheduled maintenance windows will allow administrators to set availability expectations across the organization and reduce the frequency of unexpected downtime for emergency change requests. Administrators can customize a notification banner to alert the organization's members with details on when these maintenance windows occur.
Implementation of tiered environments
Properly implementing multiple tiered environments (for example, development, staging, and production), and comparing them on an ongoing basis, has proven valuable in reducing production-down incidents. The tiers often drift out of sync in system-level configuration, so carrying out proper change management increases the overall stability of the production tier while allowing configuration changes to be vetted in the lower environments. In addition to using these tiered environments for acceptance and load testing of content, it is recommended that you install OS and ArcGIS Enterprise patches following the tiered approach to minimize the potential for disruption of the production environment.
Once the deployment is in use across the organization, consider the following recommendations.
Each ArcGIS Enterprise component generates log files that can be used to identify and troubleshoot issues.
It is recommended that administrators review generated logs for all entries logged at SEVERE or WARNING level, as they may indicate software functionality issues that must be addressed.
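A log review of this kind can be partly automated. The following Python sketch filters log lines for SEVERE and WARNING entries and counts them by level. It assumes each log entry sits on one line and carries its level in a `type="..."` attribute; that format, the sample lines, and the function names are assumptions made for illustration, so adjust the pattern to match the logs your components actually produce.

```python
import re
from collections import Counter

# Assumed entry format: the level appears as type="SEVERE" or type="WARNING".
LEVEL_PATTERN = re.compile(r'type="(SEVERE|WARNING)"')


def flag_entries(lines):
    """Return (level, line) pairs for entries logged at SEVERE or WARNING."""
    flagged = []
    for line in lines:
        match = LEVEL_PATTERN.search(line)
        if match:
            flagged.append((match.group(1), line.strip()))
    return flagged


def summarize(lines):
    """Count flagged entries per level."""
    return Counter(level for level, _ in flag_entries(lines))


if __name__ == "__main__":
    # Hypothetical sample entries, not real product output.
    sample = [
        '<Msg time="2024-01-01T10:00:00" type="INFO">Service started</Msg>',
        '<Msg time="2024-01-01T10:05:00" type="WARNING">Slow response</Msg>',
        '<Msg time="2024-01-01T10:06:00" type="SEVERE">Connection failed</Msg>',
    ]
    for level, entry in flag_entries(sample):
        print(level, entry)
```

In practice, the lines would come from reading each component's log files rather than from an in-memory list.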
Updates and patches
Regular patching is an important part of maintaining secure, up-to-date, stable environments. When scheduling patches, proper backup procedures help the organization recover if a patch introduces adverse behavior or a failure occurs. After installation, validate that services on the patched machines are operating normally. Patching of systems is split into two broad categories: operating system patches and software patches. Operating system patches are typically released monthly for Windows and more frequently for individual packages on Linux, while the release of ArcGIS Enterprise patches depends on the version's maturity. Administrators should regularly check for updates using the included patchnotification utility and plan to install them during regular maintenance windows.
Patch installation times vary but typically range from 1 to 2 hours for Portal for ArcGIS and 15 to 30 minutes for ArcGIS Server and ArcGIS Data Store. Checking the CPU utilization of the running patch process (typically msiexec.exe on Windows or a bash process running the applyPatch script on Linux) can be a good way to assess its progress. Organizations that have established operating system patching cycles can use those schedules to determine when maintenance windows are needed and when to apply ArcGIS Enterprise patches as well.
When patching highly available deployments, it is recommended that you do not patch two or more machines within a site simultaneously. When ArcGIS Enterprise components are installed on separate machines, the general rule is to patch machines in this order:
- ArcGIS Data Store
- ArcGIS Server
- Portal for ArcGIS
This recommended order is due to dependencies between published services and the availability of back-end data stores.
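The ordering rule above can be expressed as a small planning helper. This Python sketch is illustrative only: the host names and the `patch_plan` function are hypothetical, and it assumes each machine hosts exactly one component. It returns the machines in the recommended component order, one machine per step, so that no two machines are patched at the same time.

```python
# Recommended component order from the text above.
PATCH_ORDER = ["ArcGIS Data Store", "ArcGIS Server", "Portal for ArcGIS"]


def patch_plan(machines):
    """Order (hostname, component) pairs for sequential, one-at-a-time patching."""
    rank = {component: index for index, component in enumerate(PATCH_ORDER)}
    unknown = sorted({component for _, component in machines if component not in rank})
    if unknown:
        raise ValueError(f"Unrecognized components: {unknown}")
    # sorted() is stable, so machines of the same component keep their input order.
    return sorted(machines, key=lambda machine: rank[machine[1]])


if __name__ == "__main__":
    # Hypothetical host names.
    inventory = [
        ("portal1", "Portal for ArcGIS"),
        ("server1", "ArcGIS Server"),
        ("datastore1", "ArcGIS Data Store"),
        ("server2", "ArcGIS Server"),
    ]
    for step, (host, component) in enumerate(patch_plan(inventory), start=1):
        print(f"Step {step}: patch {host} ({component})")
```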
When an organization's TLS/SSL certificate expires, members may lose connection and the organization may become inaccessible. To prevent this, administrators can maintain a schedule for requesting and applying replacement certificates. Typically, certificates issued by a public certificate authority expire annually or every two years, while internal certificates can last much longer but will eventually need to be rotated. Proper planning allows the renewal to occur before the expiration date and the new certificate to be installed on the web servers during a maintenance window, avoiding interruption of service.
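As one way to support such a schedule, the following Python sketch computes how many days remain before a certificate expires. It uses the standard library function `ssl.cert_time_to_seconds`, which parses the `notAfter` string format returned by `ssl.SSLSocket.getpeercert()`. The function names and the 30-day warning threshold are assumptions chosen for illustration.

```python
import ssl
from datetime import datetime, timezone


def days_until_expiry(not_after, now=None):
    """Return whole days until the certificate's notAfter time (negative if expired)."""
    expires = datetime.fromtimestamp(ssl.cert_time_to_seconds(not_after), tz=timezone.utc)
    if now is None:
        now = datetime.now(timezone.utc)
    return (expires - now).days


def needs_renewal(not_after, warn_days=30, now=None):
    """True when the certificate expires within warn_days (an assumed threshold)."""
    return days_until_expiry(not_after, now) <= warn_days


if __name__ == "__main__":
    # Hypothetical notAfter value in the format used by getpeercert().
    not_after = "Mar 31 00:00:00 2025 GMT"
    today = datetime(2025, 3, 1, tzinfo=timezone.utc)
    print(days_until_expiry(not_after, today), needs_renewal(not_after, now=today))
```

Running a check like this on a schedule, with the `notAfter` value read from each web server's certificate, gives administrators advance notice to request a replacement before the next maintenance window.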
When employing Security Assertion Markup Language (SAML) authentication, it is important to track the renewal and rotation requirements for certificates used by the identity provider (IdP) and service provider (SP). Examples include those used to validate the signed SAML request, encrypt the SAML assertion, and validate the signed SAML response. Maintaining a record of those certificate renewal and rotation requirements will prevent disruption to the organization. Learn more about best practices for SAML security.
The authentication method your organization uses may impose additional considerations for administering your ArcGIS Enterprise deployment. When the deployment is connected to Windows Active Directory or Lightweight Directory Access Protocol (LDAP), a user account and password are used to establish the connection to the identity store. If this password expires, the identity store connection can fail and prevent users from authenticating. Manage this account by proactively rotating the user and password during a maintenance window.
Administrators must also be aware of when database user passwords may be expiring, as this can directly affect the availability of the ArcGIS Server services. Database user passwords can be updated by importing a new database connection file to the existing registered database connection in ArcGIS Server Manager, ArcGIS Desktop, or ArcGIS Pro. Stop services that depend on those database connections prior to updating the database user's password to avoid the potential for locked accounts due to authentication requests containing the expired password.
To ensure that adequate machine resources are available to your ArcGIS Enterprise deployment, it is important that you perform ongoing monitoring of CPU, RAM, and disk usage. By identifying typical usage, you can observe trends, detect anomalies, and fine-tune the resources on each participating machine over time. Administrators can configure alerts to notify IT and GIS administrators when resources are at risk, such as when free disk space falls below a certain threshold or when CPU utilization remains elevated beyond an expected duration. These alerts provide an added safety measure that allows administrators to proactively implement corrective actions and prevent outages due to increases in demand.
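A free-space alert of the kind described above can be sketched with Python's standard library function `shutil.disk_usage`. The function names and the 15 percent threshold are assumptions for illustration; in practice, a dedicated monitoring product would typically provide this check along with CPU and RAM monitoring.

```python
import shutil


def low_disk_message(path, total, free, min_free_fraction):
    """Return an alert message when free space is below the threshold, else None."""
    fraction = free / total
    if fraction < min_free_fraction:
        return f"{path}: only {fraction:.0%} free (threshold {min_free_fraction:.0%})"
    return None


def check_paths(paths, min_free_fraction=0.15):
    """Check each path's volume; 0.15 is an assumed threshold, not a product default."""
    alerts = []
    for path in paths:
        usage = shutil.disk_usage(path)
        message = low_disk_message(path, usage.total, usage.free, min_free_fraction)
        if message:
            alerts.append(message)
    return alerts


if __name__ == "__main__":
    # Check the volume that holds the current directory.
    for alert in check_paths(["."]):
        print(alert)
```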
Data source management
ArcGIS Server service performance is dependent on the underlying database instances that host the referenced data in published services. To optimize map and feature services for query efficiency, confirm there are no bottlenecks on the data tier that would negatively impact response times. The same rules outlined above for machine resources apply to the back-end DBMS instances and can be supplemented with tools such as database tracing. Some database management system instances have a maximum number of connections allowed, which could become important during scaling events of your ArcGIS Server services or the addition of a machine to the site. Some organizations assign this level of monitoring to their database administration team while in other organizations the responsibility may fall to the IT team that oversees the infrastructure.
Antivirus and antimalware exclusions
While exclusions defined for a security product are typically static, it is useful to check with the team responsible for those products on a regular basis, annually for example, to confirm that the exclusions are still in place and do not negatively impact performance of ArcGIS Enterprise software. Having the checkup conversation can also translate into an understanding of future updates or software replacements that may affect the operation of the existing deployments.
File system permissions
During installation, each ArcGIS Enterprise component is configured with permissions to run with a specified service account. If these permissions are modified, outages or software malfunctions may occur. Updates to these permissions should be rare and controlled, both under a least-privilege access model and when patches are applied within tiered environments. If file permissions are observed to change frequently, enable auditing of those changes to prevent interruptions in the software.
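Where operating system auditing is not yet in place, permission drift can be spotted by comparing periodic snapshots. The following Python sketch records the mode bits and numeric owner and group of each entry under a directory and reports the paths that differ between two snapshots. The function names are hypothetical, and the sketch assumes POSIX-style permissions; it does not capture Windows access control lists, which would require platform-specific tooling.

```python
import os
import stat


def snapshot(root):
    """Map each relative path under root to (permission bits, uid, gid)."""
    result = {}
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            info = os.lstat(path)  # lstat does not follow symbolic links
            key = os.path.relpath(path, root)
            result[key] = (stat.S_IMODE(info.st_mode), info.st_uid, info.st_gid)
    return result


def changed(before, after):
    """Return sorted paths that were added, removed, or whose permissions differ."""
    return sorted(
        path for path in before.keys() | after.keys()
        if before.get(path) != after.get(path)
    )


if __name__ == "__main__":
    # Two snapshots taken back to back normally report no changes.
    first = snapshot(".")
    second = snapshot(".")
    print(changed(first, second))
```

Storing one snapshot per maintenance window and comparing it with the next makes unexpected changes visible before they cause an outage.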
Scan for operational health issues
The operation of your ArcGIS Enterprise deployment can be adversely affected by various architecture and configuration issues. To check for these issues, you can use the Python script, operationalHealth.py, that comes with the ArcGIS Enterprise portal. The tool analyzes many criteria and configuration properties and generates a report in HTML format that lists all operational health issues affecting your portal. Learn more about scanning your portal for operational health issues.