ArcGIS Monitor can be configured to use the data it collects to detect changes in the health of your Enterprise GIS. The following sections describe how that data can help you resolve issues quickly when system measurements fall outside the targets in your organization's Service Level Agreement (SLA). ArcGIS Monitor data is commonly used to analyze the following:
- System availability
- System performance
- User load and infrastructure utilization
ArcGIS Monitor provides performance alerts in several categories. These alerts can help you identify how system performance issues correlate with, and affect, system availability. When troubleshooting a performance-related issue, it's recommended that you also check other performance counters to help identify the root cause. The following table lists key performance categories and counters available in ArcGIS Monitor:
| Category | Counter type | Performance counter name | User load counter name | Error counter name | Required extension |
Response Time (sec)
Response Time (sec)
Response Time (sec)
Busy Time per TR (sec)
ArcGIS Server Log
User defined (for example, connections)
Disk % Idle, Disk % Used
Available Memory GB
Network Received, Network Sent
% Processor Time
ELB or IIS Logs
Response Time (sec) (ExcelReport)
Tr/Interval Response Time Histogram
ELB or IIS Logs
Alerts in ArcGIS Monitor occur when collected data values equal or exceed predefined thresholds. All alerts are sorted by the date and time of the occurrence in descending order. Columns in the alerts table can also be sorted and filtered to help administrators focus on specific collections, counter types, and counter names. The following is a list of best practices for managing alerts:
- Alert categories and thresholds should be set and adjusted as necessary to match your organization's target SLA.
- Critical and warning alert thresholds should be adjusted as necessary to reduce excessive alert email notifications.
- Alert conditions should be addressed in a timely manner to ensure the health of your systems and to reduce alert email notification volumes.
- The Current Status view in the ArcGIS Monitor Server application should be reviewed regularly to ensure that your organization's SLA and recovery time objectives are being met.
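The threshold rule described above (an alert occurs when a collected value equals or exceeds a predefined threshold) can be sketched as follows. This is a minimal illustration and not ArcGIS Monitor's implementation. The function name `classify_sample` and the threshold values are hypothetical, and the sketch assumes a counter for which higher values are worse.

```python
def classify_sample(value, warning_threshold, critical_threshold):
    """Return 'critical', 'warning', or 'ok' for one collected value.

    Hypothetical helper: assumes higher values are worse and that the
    critical threshold is at or above the warning threshold.
    """
    if value >= critical_threshold:
        return "critical"
    if value >= warning_threshold:
        return "warning"
    return "ok"


# Example: response times in seconds with assumed thresholds of 2 s and 5 s.
samples = [0.8, 2.0, 4.9, 5.0, 7.3]
statuses = [
    classify_sample(v, warning_threshold=2.0, critical_threshold=5.0)
    for v in samples
]
print(statuses)
```

Raising the assumed warning threshold in a sketch like this shows how threshold tuning reduces the number of samples that would count as alerts, which in turn reduces notification volume.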
Alert email notifications
The Monitor Server application regularly queries the database for counters with alert conditions. An email notification is sent based on the following counter sample intervals:
- 1 minute—An email notification will be sent after three consecutive alerts.
- 5 minutes—An email notification will be sent after two consecutive alerts.
- 15 minutes—An email notification will be sent on the first alert.
- 1 hour—An email notification will be sent on the first alert.
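The consecutive-alert rule in the list above can be sketched as follows. This is an illustration of the rule only. The names `REQUIRED_CONSECUTIVE_ALERTS` and `notification_indexes` are hypothetical, and the sketch assumes one notification per unbroken run of alerts, which the text above does not specify.

```python
# Consecutive alerts required before an email is sent, keyed by the
# counter sample interval in minutes (values taken from the list above).
REQUIRED_CONSECUTIVE_ALERTS = {1: 3, 5: 2, 15: 1, 60: 1}


def notification_indexes(alert_flags, sample_interval_minutes):
    """Return the sample indexes at which the consecutive-alert rule is met.

    alert_flags holds one boolean per sample (True = alert condition).
    Assumption: only one notification is counted per unbroken run of alerts.
    """
    required = REQUIRED_CONSECUTIVE_ALERTS[sample_interval_minutes]
    indexes = []
    run_length = 0
    for i, in_alert in enumerate(alert_flags):
        # Extend the current run of alerts, or reset it on a healthy sample.
        run_length = run_length + 1 if in_alert else 0
        if run_length == required:
            indexes.append(i)
    return indexes


flags = [True, True, False, True, True, True, True]
print(notification_indexes(flags, 1))   # [5]
print(notification_indexes(flags, 5))   # [1, 4]
print(notification_indexes(flags, 15))  # [0, 3]
```

With the same sequence of alert flags, the 1-minute counter triggers once, while the 15-minute counter triggers on the first alert of each run.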
Availability is the amount of uptime during a given time span—such as a month or a year—and is expressed as a percentage of time. The following table lists availability percentages and the calculated amount of downtime:
The basic formula used to calculate the percentage of availability is:
Availability = (Total time - Downtime) / Total time * 100
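As a worked illustration of the formula, the following sketch computes availability from downtime and, in the other direction, the downtime implied by an availability percentage. The function names and the chosen percentages are assumptions for illustration, and a year is taken as 365 days.

```python
def availability_percent(total_time, downtime):
    """Availability = (Total time - Downtime) / Total time * 100."""
    return (total_time - downtime) / total_time * 100


def downtime_hours(availability, total_hours):
    """Downtime (in hours) implied by an availability percentage."""
    return total_hours * (1 - availability / 100)


HOURS_PER_YEAR = 365 * 24  # 8760, ignoring leap years

# Illustrative availability targets, not values taken from ArcGIS Monitor.
for pct in (99.0, 99.9, 99.99):
    hours = downtime_hours(pct, HOURS_PER_YEAR)
    print(f"{pct}% availability -> {hours:.2f} hours of downtime per year")
```

For example, 99.9 percent availability over a 365-day year corresponds to about 8.76 hours of downtime.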
The following should be considered when analyzing availability in ArcGIS Monitor:
- In most cases, scheduled maintenance should not be considered downtime and should be excluded when calculating availability.
- ArcGIS Monitor calculates availability based on critical alerts (warning alerts are excluded) and uses the counter sample interval as the downtime duration.
- It's recommended that you not use sample intervals greater than 5 minutes for critical alerts as this can skew the availability statistics. For example, if you configure a counter with a 1-hour sample interval and there is only one outage per day, the calculated availability would be 95.83 percent. If you configure the same counter with a 1-minute sample interval, the calculated availability would be 99.93 percent, which produces a more realistic availability assessment.
- In most cases, 1- and 5-minute sample intervals should give a reasonable estimation of availability.
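The effect of the sample interval on calculated availability can be reproduced with a short sketch. It assumes, as described above, that each critical alert counts as one full sample interval of downtime. The function name `interval_based_availability` is hypothetical.

```python
def interval_based_availability(critical_alert_count, sample_interval_minutes,
                                period_minutes):
    """Availability percentage when each critical alert counts as one
    full sample interval of downtime (hypothetical helper)."""
    downtime = critical_alert_count * sample_interval_minutes
    return (period_minutes - downtime) / period_minutes * 100


MINUTES_PER_DAY = 24 * 60

# One outage per day, detected with a 1-hour and a 1-minute sample interval.
hourly = interval_based_availability(1, 60, MINUTES_PER_DAY)
per_minute = interval_based_availability(1, 1, MINUTES_PER_DAY)
print(f"{hourly:.2f}")      # 95.83
print(f"{per_minute:.2f}")  # 99.93
```

A single critical alert costs 60 minutes of calculated downtime at the 1-hour interval but only 1 minute at the 1-minute interval, which is why shorter intervals give a more realistic estimate.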
Coverage is the amount of time ArcGIS Monitor services have been fully operational and is expressed as a percentage of time. Coverage is an important confidence-level metric that helps you determine the availability of your monitor services. When ArcGIS Monitor services are restarted, the coverage level during that time period will be less than 100 percent. Coverage below 100 percent directly affects the reported availability of your collections, so data from a time period with less than 100 percent coverage should not be used to analyze collection availability.
ArcGIS Monitor makes full use of statistics; therefore, administrators should be familiar with the following basic statistics: minimum, maximum, average, and percentile. For deployments with many counters and large amounts of historical data, analyzing tabular statistics is more effective than analyzing charts. When the time span of a report is less than 12 hours, charts display real-time data values at the collection interval. When the time span of a report is greater than 12 hours, charts display hourly averages; as a result, peaks are flattened and maximum values are not shown. Table statistics always display actual values for minimum, maximum, percentile, and so on, regardless of the time span. Each of the following percentile values in ArcGIS Monitor is the value at or below which the given percentage of the collected samples falls:
- P5—The fifth percentile.
- P50—The fiftieth percentile.
- P90—The ninetieth percentile.
- P95—The ninety-fifth percentile.
- P99—The ninety-ninth percentile.
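The following sketch shows why table statistics can reveal what an hourly-averaged chart hides. The sample values are made up, and the nearest-rank percentile method used here is an assumption; the method ArcGIS Monitor uses to calculate percentiles may differ.

```python
import math
from statistics import mean


def percentile(values, p):
    """Nearest-rank percentile: the smallest sample such that at least
    p percent of the samples are equal to or below it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(p / 100 * len(ordered)))
    return ordered[rank - 1]


# Made-up response times (seconds) for one hour at a 5-minute interval,
# including a single spike.
samples = [0.4, 0.5, 0.4, 0.6, 0.5, 9.0, 0.5, 0.4, 0.6, 0.5, 0.4, 0.5]

hourly_average = mean(samples)  # the single value an hourly-averaged chart plots
summary = {
    "min": min(samples),
    "max": max(samples),
    "avg": hourly_average,
    "p50": percentile(samples, 50),
    "p90": percentile(samples, 90),
    "p99": percentile(samples, 99),
}
print(summary)
```

In this data, the hourly average is about 1.19 seconds, while the maximum and P99 are both 9.0 seconds. The spike is visible in the statistics but would be flattened in a chart of hourly averages.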