
Lesson 1: Deploy data quality services

Overview

ArcGIS Data Reviewer for Server uses a map service, a geoprocessing service, and a server object extension to provide automated validation services to distributed clients. These services run either on a recurring schedule (for example, changed features are validated daily at 9:00 pm) or on an ad hoc, on-demand basis (for example, a web application is used to validate your own edits).

In this lesson, you will deploy and configure the services required to automate validation of your data against business rules implemented as automated checks in Data Reviewer. These services include a geoprocessing service that executes automated validation and a map service that manages the scheduling, execution, and storage of validation results. You will use these services through a web application, Batch Validation Manager, which schedules recurring automated validation or runs data validation on an ad hoc, on-demand basis. For example, to validate changed features daily at 9:00 pm, you would schedule automated validation; to validate your own edits from a web application, you would run validation on demand.

Learn more about data validation using Data Reviewer checks

Prerequisites

The following prerequisites are required to successfully configure and deploy data quality services.

  • The Data Reviewer workspace has the same spatial reference as the data production workspace.
  • The ArcGIS Server account has access to the connection file with read/write permissions to the Data Reviewer workspace.
  • The ArcGIS Server account has access to the connection file with read permissions to the data production workspaces.
  • The map services required to configure the batch validation manager web application are available.
  • Preconfigured batch jobs for validating the data production workspace are available.

Deploy services

To deploy data quality services you will first configure and test the Data Reviewer batch validation service and the Data Reviewer results service.

Deploy the Data Reviewer batch validation service


Data Reviewer for Server includes a service definition (*.sd) file for its batch validation geoprocessing services. You will create this service using the publishing tool found in the ArcGIS Server Manager app.

  1. Log in to ArcGIS Server Manager by opening the manager URL in a web browser.

    Your manager URL is https://<server name>:6443/arcgis/manager.

  2. Click Services on the top banner.
  3. Click Publish Service.
  4. On the Publish Service dialog box, click Choose File.
  5. Browse to the ExecuteBatchJob.sd file and click Open.

    The file is located in <ArcGIS Server installation folder>\ArcGISDataReviewerServer\Server<version>\Service Definitions.

  6. Click Next on the Publish Service dialog box.
    • Optionally click the Folder drop-down arrow and choose a folder name.
    • Optionally click the Cluster drop-down arrow and choose a cluster name.
  7. Click the check box to start the service immediately.
    • Optionally click the check box to share the service on your portal.
  8. Click Next.
  9. Click Publish.
    Note:

    The service security settings for the ExecuteBatchJob geoprocessing service must be set to Public, available to everyone, to ensure that scheduled batch validations complete as expected.

Deploy the Data Reviewer results service

Data Reviewer for Server includes a service definition (*.sd) file for its map service used in managing and reporting data quality results. You will create this service using the publishing tool found in the ArcGIS Server Manager app.

  1. Log in to ArcGIS Server Manager by opening the manager URL in a web browser.

    Your manager URL is https://<server name>:6443/arcgis/manager.

  2. Click Services on the top banner.
  3. Click Publish Service.
  4. On the Publish Service dialog box, click Choose File.
  5. Browse to the reviewer.sd file and click Open.

    The file is located in <ArcGIS Server installation folder>\ArcGISDataReviewerServer\Server<version>\Service Definitions.

  6. Click Next on the Publish Service dialog box.
    1. Optionally click the Folder drop-down arrow and choose a folder name.
    2. Optionally click the Cluster drop-down arrow and choose a cluster name.
  7. Leave the check box for starting the service immediately unchecked.
    1. Optionally click the check box to share the service on your portal.
  8. Click Next.
  9. Leave all service capabilities unchecked.
  10. Click Publish.

Configure the Data Reviewer results service

Before running the Data Reviewer results service, you must configure it to store the results of your automated validation. Configure the Data Reviewer results service with the DRS Configuration Utility.

  1. Click Start > All Programs > ArcGIS > ArcGIS Data Reviewer for Server > DRS Configuration Utility.
  2. Verify that the URL listed in the URL text box points to the ArcGIS Server Administrator Directory.

    The Administrator Directory URL has the format http://localhost:6080/arcgis/admin.

    Note:

    If you have installed Data Reviewer for Server in a cluster, running the DRS Configuration Utility against one machine in the cluster is sufficient. The configuration changes will automatically be applied to each machine in the cluster.

  3. In the Username and Password text boxes, type your ArcGIS Server primary site administrator user name and password.

    This is the account you use to log in to ArcGIS Server Manager.

  4. Click Connect.
  5. Optionally add the Data Reviewer server object extension (.soe) to your server if it has not been previously installed.
    1. Click Browse in the DRS extension area.
    2. Browse to the location that contains the ESRI.ReviewerServer.soe file and click Open.

      The file is located in <ArcGIS Server installation folder>\ArcGISDataReviewerServer\Server<version>\Bin.

    3. Click Add.
  6. Click the Select Map Service drop-down arrow and choose reviewer.MapServer from the list.
  7. Click Browse next to the Select New Reviewer Workspace text box.
  8. Browse to the location of the Reviewer workspace and click OK.

    The ArcGIS Server account must have at least READ access to the folder containing the .sde connection file. Do not choose a file geodatabase when deploying automated validation capabilities.

    If you have installed Data Reviewer for Server in a multi-machine deployment, the Reviewer Workspace path must be a UNC path and accessible to all machines in the deployment.

  9. Click Apply.

    If the Reviewer map service fails to start, you can restart the service from within the ArcGIS Server Manager.

  10. Click Close to close the DRS Configuration Utility.
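
The workspace path rules in step 8 can be checked before you click Apply. The following is a minimal Python sketch and is not part of Data Reviewer for Server: the function name `check_reviewer_workspace_path` and the sample paths are hypothetical, and it assumes that an enterprise geodatabase connection file ends in .sde and a file geodatabase folder ends in .gdb.

```python
def check_reviewer_workspace_path(path, multi_machine=False):
    """Return a list of problems found with a Reviewer workspace path.

    Hypothetical helper for illustration only. It inspects the path
    string and never touches the file system.
    """
    problems = []
    lowered = path.lower()

    # A file geodatabase should not be used for automated validation.
    if lowered.endswith(".gdb"):
        problems.append("file geodatabase chosen; use an .sde connection file")
    elif not lowered.endswith(".sde"):
        problems.append("path does not point to an .sde connection file")

    # Multi-machine deployments need a UNC path, which starts with two
    # backslashes, so that every machine can reach the connection file.
    if multi_machine and not path.startswith("\\\\"):
        problems.append("multi-machine deployment requires a UNC path")

    return problems


# Placeholder paths; substitute your own connection file location.
print(check_reviewer_workspace_path(r"\\fileserver\connections\reviewer.sde", multi_machine=True))
print(check_reviewer_workspace_path(r"C:\connections\reviewer.sde", multi_machine=True))
```

The check is deliberately string-based. It does not confirm that the ArcGIS Server account can actually read the folder, which still has to be verified on the server.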

Test the Data Reviewer services

Before proceeding, test your configuration by browsing to the Data Reviewer for Server Services Directory.

  1. Do one of the following from a supported web browser:

    For a single-machine deployment

    Browse to the Data Reviewer for Server Services Directory URL (http://<server name>:6080/arcgis/rest/services/reviewer/MapServer/exts/DataReviewerServer).

    For a multi-machine deployment

    Browse to the Data Reviewer for Server Services Directory URL through the ArcGIS Web Adaptor (https://<web adaptor url>/arcgis/rest/services/reviewer/MapServer/exts/DataReviewerServer).

  2. Verify that the Data Reviewer for Server Services Directory displays the location and spatialReference of the Reviewer workspace and four Child Resources: Batch Validation, Dashboard, ReviewerResults and Utilities.
  3. The services component is now ready to configure web clients to enable web-based data quality workflows.
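
The two URL patterns in step 1 differ only in scheme, host, and port. As a rough illustration, the Python sketch below builds either form. The function name `services_directory_url` and the host names are hypothetical, and the sketch assumes the reviewer map service was published to the root folder under its default name.

```python
SOE_PATH = "/arcgis/rest/services/reviewer/MapServer/exts/DataReviewerServer"

# Child resources that the Services Directory page is expected to list.
EXPECTED_CHILD_RESOURCES = ["Batch Validation", "Dashboard", "ReviewerResults", "Utilities"]


def services_directory_url(host, multi_machine=False):
    """Build the Data Reviewer for Server Services Directory URL.

    host is the server name for a single-machine deployment, or the
    ArcGIS Web Adaptor host for a multi-machine deployment.
    """
    if multi_machine:
        # Multi-machine deployments are reached through the web adaptor.
        return f"https://{host}{SOE_PATH}"
    # Single-machine deployments use port 6080 on the server itself.
    return f"http://{host}:6080{SOE_PATH}"


# Placeholder host names for illustration.
print(services_directory_url("gisserver"))
print(services_directory_url("maps.example.com", multi_machine=True))
```

If the service was published into a folder, the folder name would appear before `reviewer` in the path, so the constant would need adjusting.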

Deploy web applications

A client application is required in order to use your automated validation services. In this section, you will deploy the Batch Validation Manager web application to manage recurring validations of your data.

Batch Validation Manager is a web application you can use to schedule Reviewer batch jobs using capabilities provided by Data Reviewer for Server. The application can be configured to run batch jobs either on a recurring basis (daily, weekly, monthly, or yearly) or once at a future date. A scheduled job identifies the data to be validated, the extent of the validation (the full database or a spatial extent), and whether validation should run on all features or only on changed features for enterprise workspaces. The application uses the Data Reviewer for Server batch validation capabilities to schedule and manage batch jobs and stores the results in the Reviewer workspace designated in the DRS Configuration Utility.

Host Batch Validation Manager on your web server

Batch Validation Manager can be hosted on your organization’s Microsoft Internet Information Services (IIS) web server. To host Batch Validation Manager on your web server, complete the following steps.

  1. Download and unzip the Data Reviewer Batch Validation Manager application.
  2. Copy the contents to your web server so it can be accessed as a website or virtual directory. In Microsoft Internet Information Services (IIS), the default web server directory is <your directory>\Inetpub\wwwroot\.
    Note:

    You may need to set up and use a proxy page to support sharing and secure services. See Using the proxy for details on installing and configuring a proxy page. If your site needs a proxy, the one that comes with the project will likely be sufficient after you have converted the site to an IIS application.

  3. Open the configuration file (\BatchValidationManager\settings.js) and use the parameter table below to configure the application.

    The settings file for BVM configuration

    Parameter and use

    restReviewerMapServer

    URL of the Reviewer map service.

    The Reviewer map service is the default service shipped with ArcGIS Data Reviewer for Server.

    Example: http://<ArcGIS Server Host Machine Name>:6080/arcgis/rest/services/reviewer/MapServer/

    drsSoeUrl

    URL to the DataReviewerServer server object extension (SOE).

    Example: http://<ArcGIS Server Host Machine Name>:6080/arcgis/rest/services/reviewer/MapServer/exts/DataReviewerServer

    clientTimeUTC

    Set to true to use UTC time when scheduling job execution.

    Set to false to automatically convert client local time to UTC time for scheduling job execution.

    jobExecutionListRefreshInterval

    The amount of time between refreshes in the job executions, in milliseconds, when the Auto Refresh check box is enabled.

    The default is 15,000, or 15 seconds.

    alwaysUseProxy

    Set to true when using a proxy for requests; the default is set to false.

    proxyURL

    The URL of the proxy used to upload batch jobs when batch validation is scheduled.

    The proxy must reside on the same domain as the application. See Using the proxy for more information.

    mapServices

    The map services displayed when specifying an area of interest in scheduling batch jobs.

    The first map service added to the configuration file is the basemap. There are four parameters that may be set for each map service; the last two are optional.

    • serviceType—The map service type: Tiled or Dynamic. This value is required.
    • serviceURL—The URL of the map service. This value is required.
    • initialExtent—The initial extent of the map service in basemap units. This value is optional.
    • spatialReference—The well-known ID (WKID) for the spatial reference of the map. This value is optional.

    dataWorkspaces

    These workspaces are displayed in the Data Workspace drop-down list in the Schedule Batch Validation window. Each production workspace accepts the parameters listed below; only the name and path are required.

    • name—Identifier for the workspace to display in the Data Workspace drop-down list. This value is required.
    • path—Path to the file geodatabase workspace or an enterprise geodatabase connection file. This value is required.
    • spatialReference—Spatial reference well known ID (WKID) used to project a job's analysis area. This value is optional; it is only required if the job's analysis area needs to be projected.

    geometryServiceURL

    The URL for the geometry service used to project the analysis area to match the spatial reference of a data workspace. This value is optional.

    There can be situations when the basemap is not the same spatial reference as the data workspace. To overcome this problem, the Batch Validation Manager application provides a way to project the analysis area on the fly to match the spatial reference of the data workspace.

    publishJobUsername

    The user name assigned to scheduled jobs displayed in the Schedules tab.

    This parameter will be ignored if your service is secured.

  4. Save and close the file.
  5. Type the URL http://<yourServer>/<yourSite>/index.html into your web browser with the appropriate substitutions. This opens a fully configured version of Batch Validation Manager and confirms that the application is properly set up on the web server.
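
The required and optional values in the parameter table can be checked before the application is opened. The sketch below is an illustration in Python, not the format of settings.js itself: the dictionary only mirrors the parameter names from the table, the actual layout of settings.js may differ, and the function name, host names, and paths are hypothetical.

```python
REQUIRED_MAP_SERVICE_KEYS = ("serviceType", "serviceURL")
REQUIRED_WORKSPACE_KEYS = ("name", "path")
MAP_SERVICE_TYPES = ("Tiled", "Dynamic")


def find_settings_problems(settings):
    """Return a list of problems; an empty list means none were found."""
    problems = []

    # Each map service needs a type and a URL; the type must be known.
    for index, service in enumerate(settings.get("mapServices", [])):
        for key in REQUIRED_MAP_SERVICE_KEYS:
            if not service.get(key):
                problems.append(f"mapServices[{index}] is missing {key}")
        service_type = service.get("serviceType")
        if service_type and service_type not in MAP_SERVICE_TYPES:
            problems.append(f"mapServices[{index}] has unknown serviceType {service_type!r}")

    # Each data workspace needs a display name and a path.
    for index, workspace in enumerate(settings.get("dataWorkspaces", [])):
        for key in REQUIRED_WORKSPACE_KEYS:
            if not workspace.get(key):
                problems.append(f"dataWorkspaces[{index}] is missing {key}")

    return problems


# Sample values only; host names and paths are placeholders.
example_settings = {
    "restReviewerMapServer": "http://gisserver:6080/arcgis/rest/services/reviewer/MapServer/",
    "drsSoeUrl": "http://gisserver:6080/arcgis/rest/services/reviewer/MapServer/exts/DataReviewerServer",
    "clientTimeUTC": False,
    "jobExecutionListRefreshInterval": 15000,  # milliseconds (15 seconds)
    "alwaysUseProxy": False,
    "mapServices": [
        {"serviceType": "Tiled",
         "serviceURL": "http://gisserver:6080/arcgis/rest/services/basemap/MapServer"},
    ],
    "dataWorkspaces": [
        {"name": "Production", "path": r"\\fileserver\connections\production.sde"},
    ],
}

print(find_settings_problems(example_settings))
```

The first entry in `mapServices` plays the role of the basemap, as described in the table, so it is worth checking that entry first when the map does not draw.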

Use the Batch Validation Manager

Batch jobs are groups of configured Data Reviewer checks that validate your data against certain conditions or business rules. This ensures that the data is compliant with the product specifications or other rules used to determine the validity of your data.

Learn more about automated validation using Data Reviewer batch jobs

You can use Batch Validation Manager to schedule the validation of data on a regular basis, that is, annually, monthly, weekly, or daily. This allows you to validate data to ensure that results are being resolved and verified.

The process of scheduling a batch job includes the following steps.

  1. Specify a name for the scheduled job.
  2. Choose the batch job to run.
  3. Choose a session to store the batch job results.
  4. Optionally, specify the data workspace to validate.
  5. Set the recurrence of the batch job execution.
  6. Set the starting time for the batch job.
  7. Set the extent for the batch job to analyze.
  8. Indicate whether the batch job is only going to run on changed features.

Schedule a new automated validation

The process of scheduling a new automated validation includes the following steps.

  1. From a web browser, open the Batch Validation Manager application at http://<servername>/batchvalidationmanager.
  2. Click the Schedule Batch Validation button to schedule a new job.
  3. In the Schedule Batch Validation dialog box, type a name for the job in the Title text box.
    Note:

    It is recommended that the name of the batch job be meaningful so you know the purpose of the scheduled task. For instance, the name could be the name of the batch job you are running or the dataset you are validating. The value specified as the title appears on the Schedules and Executions tabs in the Job Title column.

  4. Click the Browse button next to the Batch Job text box.
  5. Browse to the batch job to run and click Open.
    Note:

    The batch job contains both the business rules to run against the data workspace and the location of the data workspace. Checks in the batch job ideally run on one data workspace, but it is possible to have checks pointing to multiple data sources. However, if you click the Data Workspace drop-down arrow and choose a data workspace, the checks are resourced to the selected data workspace.

  6. Click the Sessions drop-down arrow and choose the Reviewer session that will store the batch job results. You can only choose Reviewer sessions that are stored in the workspace designated in the Data Reviewer configuration.
  7. Optionally click the Data Workspace drop-down arrow and choose the data workspace to be validated by the batch job selected on the Batch Job parameter.
  8. Choose an option for when the batch job is executed:

    To run the batch job only once

    Choose the once option.

    To schedule the batch job to run on a regular basis

    Choose the recurring option and indicate the interval and the frequency.

    The batch job can be run daily, weekly, or monthly. You can also choose to stop running the batch job after a specific number of recurrences.

  9. Choose a starting time for the batch job.

    To start the job immediately

    Choose the now option.

    To run the job at a specific date and time

    • Choose once as the Run option and select at a specified date/time to set the date and time when the batch job will run.

    • Choose recurring as the Run option and select to run the batch job at a specified daily, weekly, or monthly time.
  10. Choose an option for the extent.

    To run the batch job on the entire database

    Choose the whole database option.

    To run the batch job on a specific area of the data

    Choose the spatial selection option. Click Draw Area, and draw an extent to validate.

  11. If the data workspace to validate is an enterprise geodatabase, you can check the Changed features only check box. This allows you to limit validation to only those features that have changed from the parent to child version. These changes include the following:
    • Features inserted in the child version but not the parent
    • Features changed in the child version and unchanged in the parent
    • Features changed in both child and parent versions
    • Features changed in the child version and deleted in the parent
  12. Click Submit.
    Tip:

    The job title and schedule information appear on the Schedules tab. If you have scheduled the job to run immediately, using the now option, the job begins to run.

  13. Once a batch job is scheduled, you can do one of the following:

    To disable an active job

    Uncheck the check box next to the job title on the Schedules tab.

    Jobs that have finished their scheduled run cannot be disabled.

    To modify an active job

    Select the name of the job and click Modify.

    To delete an active or finished job

    Select the name of the job and click Delete.

    Jobs that have finished their scheduled run are automatically removed from the system after 24 hours.
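
The four cases listed under the Changed features only option in step 11 reduce to a simple rule: a feature qualifies when it was inserted or changed in the child version, whatever happened to it in the parent. The Python sketch below encodes that reading. The function name and the state labels ("inserted", "changed", "unchanged", "deleted", "absent") are hypothetical and are not Data Reviewer or geodatabase API values.

```python
def is_changed_feature(child_state, parent_state):
    """Return True when a feature falls into one of the four listed cases.

    child_state and parent_state are illustrative labels describing what
    happened to the feature in each version.
    """
    # Case 1: inserted in the child version but not present in the parent.
    if child_state == "inserted":
        return parent_state == "absent"
    # Cases 2-4: changed in the child; the parent may be unchanged,
    # changed as well, or may have deleted the feature.
    if child_state == "changed":
        return parent_state in ("unchanged", "changed", "deleted")
    # Anything not inserted or changed in the child is not validated.
    return False


print(is_changed_feature("changed", "deleted"))
print(is_changed_feature("unchanged", "changed"))
```

Under this reading, a feature edited only in the parent version is not picked up by the option.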

View job schedules

The Batch Validation Manager application provides a way to view summary and detailed information on batch job executions and their results. Information about batch job runs is available from the Schedules tab. Here you can view:

  • the name of the scheduled job
  • the batch job to run
  • the Reviewer session to which results are written
  • the recurrence schedule
  • who scheduled the job
  • the data workspace to be validated

To view the details of your scheduled jobs, follow these steps.

  1. From a web browser, open the Batch Validation Manager application at http://<servername>/batchvalidationmanager.
  2. Click the Schedules tab to show a summary of scheduled jobs. This includes the job's name and the recurrence: daily, weekly, monthly, or yearly.
    Note:

    Jobs that are not scheduled to run on a recurring schedule have a schedule of yearly.

  3. To filter the list of scheduled jobs, choose one of these options:
    • Choose All to display all scheduled jobs.
    • Choose Active to only display currently active jobs.
    • Click the By drop-down arrow to only show jobs scheduled by a specific user.
  4. Click a job name to view schedule details.

Detailed schedule information appears in a pane on the right side of the browser. It includes the name of the batch job run, the Reviewer session storing the results, the frequency of the recurrence, who scheduled the job, and the location of the data workspace.

Note:

If the batch job schedule does not appear in the list, click Refresh. You can also check the Auto Refresh check box so the list automatically refreshes based on a duration set by your administrator.

View job executions

The Executions tab displays information about successful and failed batch job runs. Information about the execution is divided into three sections: Status, Summary, and Properties. The Status section indicates whether the batch job has executed successfully or failed, as does the icon to the right of the item (green check mark or red X). The Summary section shows how long it took to run the batch job in hours, minutes, and seconds; the number of features validated; and the number of results written to the Reviewer session. The Properties section shows the batch job, session name, schedule, and data workspace validated by the job.

  1. From a web browser, open the Batch Validation Manager application at http://<servername>/batchvalidationmanager.
  2. Click the Executions tab to show a summary of job executions.
    Note:

    By default, job executions are listed in descending order by start time. You can click the Start Time, End Time, Job Title, or Schedule headings to sort the records based on different field values.

  3. To filter the list of complete or running jobs, choose one of these options:
    • Choose All to display all the jobs that have run.
    • Choose Only errors to only display those jobs that have failed to run.
    • Click the drop-down arrow to filter the list of job executions by time. You can filter the results based on what has run today, yesterday, within the past seven days, or within the past 30 days.
  4. Click an item in the list to view detailed information about the execution.

    The pane on the right side of the browser shows detailed information for the batch job execution.

    Note:

    If the batch job execution does not appear in the list, click Refresh. You can also check the Auto Refresh check box so the list automatically refreshes based on a duration set by your administrator.
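
The filtering and default sort order described for the Executions tab can be sketched in a few lines of Python. This is an illustration of the behavior, not the application's code: the record keys (`job_title`, `start_time`, `succeeded`), the function name, and the exact boundaries of each time window are assumptions.

```python
from datetime import datetime, timedelta

# Assumed lengths, in days, of the rolling time windows.
ROLLING_WINDOWS = {"past 7 days": 7, "past 30 days": 30}


def filter_executions(executions, now, window="all", only_errors=False):
    """Filter execution records and sort them by start time, newest first."""
    midnight = now.replace(hour=0, minute=0, second=0, microsecond=0)
    earliest, latest = None, None
    if window == "today":
        earliest = midnight
    elif window == "yesterday":
        earliest, latest = midnight - timedelta(days=1), midnight
    elif window in ROLLING_WINDOWS:
        earliest = now - timedelta(days=ROLLING_WINDOWS[window])
    elif window != "all":
        raise ValueError(f"unknown window: {window}")

    selected = []
    for run in executions:
        started = run["start_time"]
        if earliest is not None and started < earliest:
            continue
        if latest is not None and started >= latest:
            continue
        if only_errors and run["succeeded"]:
            continue
        selected.append(run)

    # Default order on the tab: descending by start time.
    return sorted(selected, key=lambda run: run["start_time"], reverse=True)


# Sample records only.
now = datetime(2024, 3, 10, 12, 0)
runs = [
    {"job_title": "C", "start_time": datetime(2024, 2, 20, 21, 0), "succeeded": True},
    {"job_title": "A", "start_time": datetime(2024, 3, 10, 9, 0), "succeeded": True},
    {"job_title": "B", "start_time": datetime(2024, 3, 9, 21, 0), "succeeded": False},
]
print([run["job_title"] for run in filter_executions(runs, now, window="past 7 days")])
```

Passing `now` in as an argument, rather than reading the clock inside the function, keeps the sketch easy to test with fixed dates.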