Big data file shares are registered as a data store through ArcGIS Server Manager. Each one requires a manifest that outlines the schema of the data and identifies the fields that represent geometry and time in a dataset. The manifest is generated automatically when you register a big data file share. You may need to modify it if your data changes or if manifest generation could not determine all the information needed, for example, if the automatically generated manifest did not select the correct field for the geometry or time.
You can view and edit the datasets and manifest information through ArcGIS Server Manager. The manifest viewer comprises the following three components:
- Dataset selector
- Information about the dataset selected
- Information about the field in the selected dataset
There are also advanced options in the big data file share manifest editor. These are discussed in the Advanced section below. If manifest generation did not correctly determine field names, encoding, field delimiters, or quote characters, it is recommended that you use a hints file before editing individual datasets.
Dataset selector
A manifest is composed of one or more datasets. The number of datasets is dependent on the number of folders in your big data file share location. When you open the manifest manager, you can view the datasets that have been successfully registered in your big data file share. When you select a dataset from the drop-down menu, the dataset parameters will be populated with the dataset information.
If you expected to find more datasets in your manifest or are missing any, do the following:
- Verify that you correctly registered the top-level folder. For more information, see Register a data store through ArcGIS Server Manager.
- Check that your input data is in an allowable format such as a collection of delimited files or shapefiles.
- Ensure that the schema of your input dataset of interest is consistent for a collection of files (all files in a single folder must have the same fields).
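The last check above can be sketched in a few lines of Python. This is a minimal illustration, not part of ArcGIS: the function names, the `*.csv` pattern, and the assumption that every file has a header row are all hypothetical choices made for the example.

```python
import csv
from pathlib import Path


def read_header(path, delimiter=","):
    """Return the first row of a delimited file as a tuple of field names."""
    with open(path, newline="", encoding="utf-8") as handle:
        return tuple(next(csv.reader(handle, delimiter=delimiter), ()))


def find_schema_mismatches(folder, pattern="*.csv", delimiter=","):
    """Return {file name: header} for files whose header differs from the first file's.

    Assumes each file starts with a header row; files are compared in sorted order.
    """
    files = sorted(Path(folder).glob(pattern))
    if not files:
        return {}
    expected = read_header(files[0], delimiter)
    mismatches = {}
    for path in files[1:]:
        header = read_header(path, delimiter)
        if header != expected:
            mismatches[path.name] = header
    return mismatches
```

An empty result suggests that the files in the folder share one set of field names. It does not confirm that field types match, which this sketch does not inspect.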
Dataset
The dataset field describes the format of the selected dataset. The options that you can change depend on the source of the dataset, which is indicated by the file extension. The source can be a shapefile (.shp) or a delimited file (for example, .csv or .tsv). If the input files for a dataset are shapefiles, the following options are available:
Parameter | Description |
---|---|
File extension | Lists the file type extension on the input dataset. For a shapefile, this will always be shp and cannot be modified. |
Geometry | Determines the geometry type of a shapefile. This cannot be modified for a shapefile dataset. |
Spatial reference (WKID/WKT) | Determines the spatial reference of a shapefile. This cannot be modified for a shapefile dataset. |
Time | The time type of the input shapefile dataset. Options include Instant and Interval. |
Time zone | Denotes the time zone of the time fields. If time type is Instant or Interval, you can specify the time zone. |
If the input dataset is a delimited file, there will be multiple parameters that can be modified in the manifest in Manager. These are outlined in the following table:
Parameter | Description |
---|---|
File extension | Lists the file type extension on the input dataset. Common formats are .csv and .txt. This information can be included in the hints file. |
Field delimiter | Determines the delimiter for each field. Common formats are , and ;. This information can be included in the hints file. |
Record terminator | Determines the terminator for each row of data. Common formats are \n and \t. This information can be included in the hints file. |
Has header row | A Boolean that determines whether the input table includes a header row. If a header row is included, the headers will be used for the field names. Field name information is used to predict the geometry and time fields. Headers can be set using the hints file. |
Geometry | Determines the geometry type of an input dataset. The type of geometry can be modified, and the fields and formatting representing the geometry are set in the fields section. |
Spatial reference (WKID/WKT) | Determines the spatial reference of a dataset. This can be modified to a WKID or WKT string. |
Time | The time type of the input dataset. Options are as follows:
|
Time zone | Denotes the time zone of the time fields. If time type is Instant or Interval, you can specify the time zone. |
Fields
The fields section lists all of the fields in a dataset. When you select a field you will be able to see the following:
- The name of the field.
- The field type.
- If the field contains any temporal or geometry related attributes. If a field contains these attributes, you can define the format.
Parameter | Description |
---|---|
Name | The name of the field. This can be modified for delimited files. It is recommended that you modify this using a hints file for delimited datasets without header names. You cannot modify the field name of a shapefile. |
Type | The type of the field. This can be modified for delimited files. You cannot modify the field type of a shapefile. |
Geometry related attributes | A Boolean to denote if this field contains geometry information. This is only applicable to delimited files that have a geometry specified. If this is selected, an additional parameter will become available to set the geometry format. |
Format (geometry) | The format of the geometry field. |
Time related attributes | A Boolean to denote if this field contains temporal information. This is only applicable to delimited files that have a time specified. If this is selected, an additional parameter will become available to set the temporal format. |
Format (time) | The format of the temporal field. Temporal formatting is described below. |
Role | Intervals require that a role be set on time. The role can be Start or End. This option is not available for instants. |
Time formats
The following table outlines how to represent time when you edit a big data file share through ArcGIS Server Manager or directly in a manifest. The examples show how to represent the time January 2nd, 2016 at 9:45:02.05 PM.
Symbol | Meaning | Example |
---|---|---|
yy | The year, represented by two digits. | 16 |
yyyy | The year, represented by four digits. | 2016 |
MM | The month, represented numerically. | 01 or 1 |
MMM | The month, represented using three letters. | Jan |
MMMM | The month, represented using the complete spelling. | January |
dd | The day. | 02 or 2 |
HH | The hour, using a 24 hour day, values from 0-23. | 21 |
hh | The hour, using a 12 hour day, values from 1-12. | 9 |
mm | The minute, values range from 0-59. | 45 |
ss | The second, values range from 0-59. | 02 |
SSS | The milliseconds, values range from 0-999. | 050 |
a | AM/PM marker. | PM |
The following table shows several formats for the same date and time, January 2nd, 2016 at 9:45:02.05 PM:
Input date | Date format |
---|---|
01/02/2016 9:45:02PM | MM/dd/yyyy hh:mm:ssa |
Jan02-16 21:45:02 | MMMdd-yy HH:mm:ss |
January 02 2016 9:45:02.050PM | MMMM dd yyyy hh:mm:ss.SSSa |
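To check a format string against sample values before entering it in the editor, the symbols in the table can be translated into another date parser's notation. The following sketch maps them to Python `strptime` directives. The mapping and the function names are assumptions made for illustration, not how ArcGIS parses time, and the month-name and AM/PM directives depend on an English locale.

```python
import re
from datetime import datetime

# Manifest time symbol -> Python strptime directive (assumed mapping for illustration).
SYMBOL_TO_DIRECTIVE = {
    "yyyy": "%Y", "yy": "%y",
    "MMMM": "%B", "MMM": "%b", "MM": "%m",
    "dd": "%d",
    "HH": "%H", "hh": "%I",
    "mm": "%M", "ss": "%S",
    "SSS": "%f",  # strptime reads 1-6 fractional digits, so "050" is 50 milliseconds
    "a": "%p",
}

# Longer symbols come first so that "yyyy" is not read as "yy" twice.
_SYMBOL_RE = re.compile("yyyy|yy|MMMM|MMM|MM|dd|HH|hh|mm|ss|SSS|a")


def to_strptime(pattern):
    """Translate a manifest-style time format into a strptime format string.

    Any literal letter "a" in the pattern is treated as the AM/PM marker.
    """
    escaped = pattern.replace("%", "%%")
    return _SYMBOL_RE.sub(lambda m: SYMBOL_TO_DIRECTIVE[m.group(0)], escaped)


def parse_time(value, pattern):
    """Parse a sample value using a manifest-style time format."""
    return datetime.strptime(value, to_strptime(pattern))
```

For example, `parse_time("Jan02-16 21:45:02", "MMMdd-yy HH:mm:ss")` can be used to confirm that a sample value from the dataset matches the format you plan to enter. A `ValueError` indicates that the value and the format disagree.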
Advanced
The following two advanced options are available in the big data file share editor:
- Manifest—Download and upload a big data file share manifest.
- Hints—Download and upload a hints file to assist in generating a big data file share manifest.
Hints help manifest generation by supplying delimited file parameters such as field names, encoding, field delimiter, and quote characters. It is recommended that you upload a hints file before editing individual datasets if:
- You have a CSV without headers and want to apply field names to your data.
- The quote and delimiter characters were not recognized when the manifest was first generated.
- The encoding of your dataset was not recognized.
If you upload a hints file, you need to regenerate the manifest. Only datasets with hints provided or new datasets will be updated, and changes made to any other datasets not in the hints file will remain the same. To learn more about hints files, see Understanding the hints file. You can also download and change your manifest in a text editor. If you upload a manifest, it will overwrite any changes you have made to your big data file share manifest in the editor and replace the current manifest. To learn more about big data file share manifests, see Understanding the big data file share manifest.
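Before writing a hints file, it can help to confirm what the delimiter, quote character, and header row of a file actually are. The sketch below uses Python's standard `csv.Sniffer` on a text sample. It is independent of ArcGIS and does not reproduce the logic of manifest generation; the function name and the candidate delimiter list are assumptions for the example.

```python
import csv


def sniff_delimited_sample(sample, candidates=",;\t|"):
    """Guess delimiter, quote character, and header presence from a text sample.

    Raises csv.Error if no consistent delimiter can be found among the candidates.
    """
    sniffer = csv.Sniffer()
    dialect = sniffer.sniff(sample, delimiters=candidates)
    return {
        "delimiter": dialect.delimiter,
        "quotechar": dialect.quotechar,
        "has_header": sniffer.has_header(sample),
    }
```

The result is a heuristic guess from a small sample, so treat it as a prompt for inspecting the file rather than as a definitive answer.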
Edit a big data file share
Once you've registered a big data file share, you can view and edit attributes and settings for that item's registered datasets by opening the big data file share manifest editor.
For example, you may want to verify the number of datasets within a registered file share. If, in doing so, you do not see the expected number of datasets in the registered file share, you should check whether the registered location contains valid datasets.
You may also want to review dataset schemas for a registered big data file share. You can modify a selected dataset's schema by updating its geometry, time definition, and field names in its associated manifest resource.
On the advanced tab of the big data file share manifest editor, you can upload a hints file to provide information about a dataset such as presence or absence of a header row, encoding, field delimiter, or record terminator. Regenerating the manifest after uploading a hints file will use the information provided to generate the manifest.
Optionally, you can download the manifest, edit it, and upload the edited manifest file.
Edit big data file share datasets
In the big data file share manifest editor, you can view a selected big data file share and datasets that have been successfully registered within it. When selecting a dataset from the editor drop-down menu, the corresponding parameters are populated. For details about each option on this dialog box, see about big data file share manifest. To edit dataset parameters, do the following:
- On the Registered Data Stores dialog box, locate the big data file share you want to edit.
- Click the Edit pencil to expose details and options for corresponding datasets.
- Click the Datasets tab to expose the registered datasets and their corresponding parameters.
- Select a dataset from the drop-down menu to view the information represented in its manifest. You can hover over the information icon next to the Geometry and Time properties to view detailed settings for the selected dataset. Make updates to your dataset properties as needed.
The next section contains example cases to edit detailed settings for a registered big data file share dataset.
- When you have finished editing dataset properties, click Save.
Example workflows to edit big data file share datasets
The following example workflows can be conducted within the big data file share manifest editor.
- Update geometry type and fields set for a .csv dataset.
- On the Registered Data Stores dialog box, locate the big data file share you want to edit.
- Click the Edit pencil to expose details and options for corresponding datasets.
- On the Datasets tab, select a dataset from the drop-down menu.
- Hover over the Geometry information icon to view a detailed description of the attributes.
- Remove any existing Geometry fields by selecting each field in the Fields section and disabling the This field contains: Geometry related attributes check box. Click Save.
- Click the Geometry type drop-down button and select the desired type.
- Specify the Spatial Reference for the geometry as a well-known ID (WKID) or well-known text (WKT). For a list of supported WKID and WKT entries, see the spatial reference topic in the ArcGIS REST API documentation.
- In the Fields section, select the desired field, enable the This field contains: Geometry related attributes check box, and specify the format (indicate whether the field represents an X or Y value of a POINT geometry or a custom geometry definition).
- Repeat the previous step for additional fields as necessary.
- Click Save.
- Update time reference for a shapefile dataset.
- On the Registered Data Stores dialog box, locate the big data file share you want to edit.
- Click the Edit pencil to expose details and options for corresponding datasets.
- On the Datasets tab, select a dataset from the drop-down menu.
- Hover over the Time information icon to view a detailed description of the attributes.
- Remove any existing Time fields by selecting each field in the Fields section and disabling the This field contains: Time related attributes check box. Click Save.
- Click the Time drop-down button and select the desired type (Instant or Interval).
- Specify the Time zone for the dataset. The default value is UTC. The time zone value should be specified as it is in the TZ column shown here.
- In the Fields section, select the desired time field, enable the This field contains: Time related attributes check box, and specify the time format and role if applicable.
- Repeat the previous step for additional fields as necessary.
- Click Save.
- Change a field name or field type for a .csv dataset.
- On the Registered Data Stores dialog box, locate the big data file share you want to edit.
- Click the Edit pencil to expose details and options for corresponding datasets.
- On the Datasets tab, select a dataset from the drop-down menu.
- In the Fields section, use the drop-down menu to select the desired field.
- With the desired field selected, enter the new field name on the Name dialog box and/or use the drop-down menu to select a different Type.
- Click Save.
Tip:
For advanced workflows such as specifying a different field delimiter, record terminator, or modifying multiple field names, upload a new hints file with desired rules and regenerate the manifest.
Edit a big data file share manifest or hints file
On the Advanced tab of the big data file share editor, you can edit the associated manifest or hints file by choosing its respective tab. If you upload a manifest, it will overwrite any changes you have made to your big data file share manifest in the editor and replace the current manifest. To learn more, see Understanding a big data file share manifest. To edit a big data file share manifest or hints file, do the following:
- On the Registered Data Stores dialog box, locate the big data file share you want to modify.
- Click the Edit pencil to expose options for modifying the manifest resource.
- Click the Advanced tab.
- From the Advanced tab, choose the Manifest or Hints tab depending on which you are modifying.
- To download the manifest file, click Manifest > Download.
- To download the hints file, click Hints > Download.
- Use a text editor to modify and save changes locally to the downloaded .json manifest file or .dat hints file.
Tip:
The default file format for the hints file is .dat. Once you've downloaded the file, you can change its extension to .txt and edit the file.
- To upload an edited file, click the Edit pencil for the big data file share you want to modify.
- To edit the manifest, click Advanced > Manifest > Upload and browse to the updated .json file.
- To edit the hints file, click Advanced > Hints > Upload and browse to the updated .txt file.
- Click Upload.
If you upload a hints file, be sure to regenerate the manifest.
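Because the manifest is edited by hand in a text editor, a stray comma or quote can leave the .json file unreadable. A minimal sketch for checking that an edited file is still well-formed JSON before uploading is shown here. The function name is hypothetical, and the check covers JSON syntax only, not whether the content is a valid manifest.

```python
import json


def check_manifest_json(path):
    """Return (True, parsed data) if the file is valid JSON, else (False, error message)."""
    try:
        with open(path, encoding="utf-8") as handle:
            return True, json.load(handle)
    except json.JSONDecodeError as err:
        # Report where parsing stopped so the edit can be located in the text editor.
        return False, f"line {err.lineno}, column {err.colno}: {err.msg}"
```

If the first element of the result is False, the message points to the position in the file where the parser stopped.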
Regenerate the manifest for a big data file share
After a big data file share is created and a manifest has been generated, a regenerate manifest button appears for each entry on the Registered Data Stores dialog box.
You can regenerate a manifest if you have added new data or if you have uploaded a hints file through the big data file share editor. The hints file provides specifications that are used when regenerating the manifest.