Note:
At ArcGIS Enterprise 10.9.1 or later, it is recommended that you add or edit big data file shares through your portal contents page instead of ArcGIS Server Manager.
Use ArcGIS Server Manager only if you meet either of these conditions:
- You want to register Azure Data Lake, which can only be registered through ArcGIS Server Manager.
- You are editing or viewing a big data file share created through ArcGIS Server Manager at a previous release.
If you don't meet either of these conditions, register your big data file share through your portal contents page, and edit and review it using the big data file share item.
When registering big data file shares through ArcGIS Server Manager, use ArcGIS GeoAnalytics Server.
A big data file share requires a manifest to outline the schema of the input data, as well as the fields and formats that represent geometry and time in a dataset. The manifest is automatically generated when you register a big data file share. You may need to make modifications if there are any changes to your data, or if the manifest generation was unable to determine all the required information, for example, if the automatically generated manifest did not select the correct field for the geometry or time.
A big data file share may optionally have output templates used to outline the format of results written to the big data file share. The output templates are generated when you register a big data file share and select to use it as an output location. You may need to modify one or more templates, such as the format of the time and geometry fields, or you may want to add or delete a template.
You can view and edit the datasets and manifest information, as well as the output templates, through ArcGIS Server Manager on your ArcGIS GeoAnalytics Server installation.
Edit a big data file share
Once you have registered a big data file share through ArcGIS Server Manager, you can view and edit attributes and settings for that item's registered datasets by opening the big data file share manifest editor in ArcGIS Server Manager. You can also edit attributes and settings for the optional output templates, which outline how output results will be written to the big data file share.
Note:
If you have registered a big data file share through your portal, edit the big data file share through the portal item page.
For example, for input data, you may want to verify the number of datasets in a registered file share. If you do not see the expected number of datasets in the registered file share, you should check whether the registered location contains valid datasets.
For an output template, you may want to format a delimited file output to write a tab-delimited file and use WKT to store the geometry.
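As a rough illustration of what such an output might look like, the sketch below writes two point records as a tab-delimited file with a single WKT geometry column. The field names, coordinates, and helper names (`point_wkt`, `write_tab_delimited_wkt`) are hypothetical. GeoAnalytics writes the real file according to the template, so this only mimics the resulting layout.

```python
import csv
import io

def point_wkt(x, y):
    """Format a point as well-known text (WKT)."""
    return f"POINT ({x} {y})"

def write_tab_delimited_wkt(rows, out):
    """Write (name, x, y) rows with a header row and a WKT Geometry column."""
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    writer.writerow(["name", "Geometry"])  # hypothetical field names
    for name, x, y in rows:
        writer.writerow([name, point_wkt(x, y)])

buffer = io.StringIO()
write_tab_delimited_wkt([("a", -117.19, 34.05), ("b", 2.35, 48.85)], buffer)
print(buffer.getvalue())
```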
You may also want to review dataset schemas for a registered big data file share. You can modify a selected dataset's schema by updating its geometry, time definition, and field names in its associated manifest resource.
On the Advanced tab of the big data file share manifest editor, you can upload a hints file to provide information about a dataset, such as the presence or absence of a header row, encoding, field delimiter, or record terminator. When you regenerate the manifest after uploading a hints file, the information in the hints file is used to generate the manifest.
Optionally, you can download the manifest, edit it, and upload the edited file.
Edit big data file share input datasets
In the big data file share manifest editor, you can view a selected big data file share and the datasets that have been successfully registered in it. When you select a dataset from the editor drop-down menu, the corresponding parameters are populated. For details about each option on this dialog box, see editing parameters in big data file shares. To edit dataset parameters, do the following:
- On the Registered Data Stores dialog box, locate the big data file share you want to edit.
- Click the Edit button to see details and options for corresponding datasets.
- Click the Datasets tab to show the registered datasets and their corresponding parameters.
- Select a dataset from the drop-down menu to view the information represented in its manifest. Make updates to your dataset properties as needed.
- When you have finished editing dataset properties, click Save.
Edit a big data file share manifest or hints file
On the Advanced tab of the big data file share editor, you can edit the associated manifest or hints file by choosing its respective tab. If you upload a manifest, it will overwrite any changes you have made to your big data file share manifest in the editor and replace the current manifest. To learn more about the big data file share manifest, see Big data file share manifest. To learn more about using a hints file, see Hints file. To edit a big data file share manifest or hints file, do the following:
- On the Registered Data Stores dialog box, locate the big data file share you want to modify.
- Click the Edit button to see options for modifying the manifest resource.
- Click the Advanced tab.
- From the Advanced tab, choose the Manifest or Hints tab, depending on which you are modifying.
- To download the manifest file, click Manifest and click Download.
- To download the hints file, click Hints and click Download.
- Use a text editor to modify and save changes locally to the downloaded .json manifest file or .dat hints file.
Tip:
The default file format for the hints file is .dat. Once you've downloaded the file, you can change its extension to .txt and edit the file.
- To upload an edited file, click the Edit button for the big data file share you want to modify.
- To edit the manifest, click Advanced > Manifest > Upload and browse to the updated .json file.
- To edit the hints file, click Advanced > Hints > Upload and browse to the updated .txt file.
- Click Upload.
If you upload a hints file, be sure to regenerate the manifest. When you regenerate a manifest, only datasets with hints or new datasets are updated, and changes made to any other datasets that are not in the hints file remain the same.
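Before uploading an edited manifest, it can help to check what it contains. The following is a minimal sketch that summarizes each dataset in a manifest. The key names (`datasets`, `name`, `geometry.geometryType`, `time.timeType`) and the sample content are assumptions for illustration and are not confirmed by this page, so compare them against your own downloaded .json file.

```python
import json

def summarize_manifest(manifest):
    """Return one (name, geometry type, time type) tuple per dataset."""
    summary = []
    for dataset in manifest.get("datasets", []):  # assumed top-level key
        geometry = dataset.get("geometry") or {}
        time = dataset.get("time") or {}
        summary.append((
            dataset.get("name"),
            geometry.get("geometryType", "none"),
            time.get("timeType", "none"),
        ))
    return summary

# Hypothetical manifest content; a real one comes from the Download button.
sample = json.loads("""
{"datasets": [
  {"name": "ships",
   "geometry": {"geometryType": "esriGeometryPoint"},
   "time": {"timeType": "instant"}},
  {"name": "lookup_table"}
]}
""")

for row in summarize_manifest(sample):
    print(row)
```

For a downloaded file, load it with `json.load` on the opened file instead of using the inline sample.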
Regenerate the manifest for a big data file share
After a big data file share is created and a manifest is generated, a regenerate manifest button appears for each entry on the Registered Data Stores dialog box.
You can regenerate a manifest if you add new data or if you upload a hints file on the Advanced tab of the editor. The hints file provides specifications that are used when regenerating the manifest.
Note:
When a manifest is regenerated, it updates the manifest for new datasets and for existing datasets that have a hints file. Any edits you have made to the manifest for those datasets are overwritten with the rules defined in the hints file.
Big data file share editing parameters
The big data file share editor comprises the following five sections:
- Dataset selector
- Fields
- Geometry
- Time
- Dataset format
It is recommended that you use a hints file before editing your data if manifest generation did not correctly determine field names, encoding, field delimiters, or quote characters.
Dataset selector
A manifest is composed of one or more datasets. The number of datasets is dependent on the number of folders in your big data file share location. When you open the manifest manager, you can view the datasets that have been successfully registered in your big data file share. When you select a dataset from the drop-down menu, the dataset parameters are populated with the dataset information.
If you expected to find more datasets in your manifest or are missing any, do the following:
- Verify that you correctly registered the top-level folder. For more information, see Register your data with ArcGIS Server Manager.
- Confirm that your input data is in an allowable format, such as a collection of delimited files, shapefiles, Parquet, or Optimized Row Columnar (ORC).
- Ensure that the schema of your input dataset of interest is consistent for a collection of files (all files in a single dataset must have the same fields).
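The schema consistency check in the last item can be approximated before registration. The sketch below compares the header row of every delimited file in one dataset folder against the first file. It assumes comma-delimited UTF-8 files that all have a header row. The function name and the sample files are hypothetical.

```python
import csv
import tempfile
from pathlib import Path

def find_inconsistent_headers(folder, pattern="*.csv", delimiter=","):
    """Return names of files whose header row differs from the first file's."""
    expected = None
    mismatches = []
    for path in sorted(Path(folder).glob(pattern)):
        with path.open(newline="", encoding="utf-8") as handle:
            header = next(csv.reader(handle, delimiter=delimiter), [])
        if expected is None:
            expected = header  # the first file defines the reference schema
        elif header != expected:
            mismatches.append(path.name)
    return mismatches

# Demonstration with throwaway files standing in for one dataset folder.
with tempfile.TemporaryDirectory() as folder:
    Path(folder, "a.csv").write_text("id,x,y\n1,2,3\n", encoding="utf-8")
    Path(folder, "b.csv").write_text("id,x,y\n4,5,6\n", encoding="utf-8")
    Path(folder, "c.csv").write_text("id,lon,lat\n7,8,9\n", encoding="utf-8")
    result = find_inconsistent_headers(folder)
    print(result)
```

This only compares field names. It does not check field types, which would need a pass over the data rows.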
Fields
The fields section lists all of the fields in a dataset. When you select a dataset, you can see the following for each field:
- The name of the field
- The field type
The field name and type can be modified for delimited files. If you are modifying more than one field name, it is recommended that you use a hints file.
If the input dataset is a delimited file, there are multiple parameters that can be modified in the manifest in ArcGIS Server Manager.
Geometry
The geometry section lists the type of geometry and how it is represented. The following table outlines the available options, with notes for changes you can make, depending on the input dataset type:
Geometry parameters
Parameter | Description | Delimited files | Shapefiles | ORC files | Parquet files |
---|---|---|---|---|---|
Geometry | The geometry type. Options are Point, Polyline, Polygon, or None. If there is no geometry, the input is a table. | Editable | Cannot be modified | Editable | Editable |
Spatial reference (WKID/WKT) | The spatial reference of the dataset. This option is only shown if the dataset is not a table. | This can be modified. By default, it is set to 4326, WGS 1984. | Cannot be modified | Editable | Editable |
Geometry formatting type | How the geometry is formatted for each feature. Options are XYZ (fields that represent x-, y-, and optionally z-values—XYZ is only applicable to points), WKT (well-known text), GeoJson, EsriJson, and shape. This option is only available if the dataset is not a table and not a shapefile. | Editable | Not available | Editable | Editable |
Time
The time section outlines how time is represented. The following table outlines the available options, with notes for changes you can make, depending on the input dataset type. Time options are the same for all data types, except where noted.
Time parameters
Parameter | Description | Example |
---|---|---|
Time type | The type of the input time. Options are Instant (a single moment in time), Interval (a span of time with a start and end time), and None. | Instant |
Time zone | The time zone of the input time. This option is only available if Time Type is not None. | UTC |
Name and formatting table for time | This table selects the time field or fields and outlines how time is defined. Time can use one or more fields to define time and can use one or more formats for a single field. By default, the first field with the name time is used as the time field, with an estimate of the time format. If there is a shapefile, the first field of type date is used. If time is of type Interval, a start and end time must be specified. The time formatting table is only available if Time Type is not None. | A single field used to represent time with two different formats, or two fields used to represent time |
Time formats
The following table outlines how to represent time when you edit a big data file share through ArcGIS Server Manager or directly in a manifest. The examples show how to represent the time 9:45:02.05 PM on January 2, 2016.
Time formats in big data file shares
Symbol | Meaning | Example |
---|---|---|
yy | The year, represented by two digits. | 16 |
yyyy | The year, represented by four digits. | 2016 |
MM | The month, represented numerically. | 01 or 1 |
MMM | The month, represented using three letters. | Jan |
MMMM | The month, represented using the complete spelling. | January |
dd | The day of the month. | 02 or 2 |
HH | The hour when using a 24-hour day; values range from 0 to 23. | 21 |
hh | The hour when using a 12-hour day; values range from 1 to 12. | 9 |
mm | The minute; values range from 0 to 59. | 45 |
ss | The second; values range from 0 to 59. | 02 |
SSS | The millisecond; values range from 0 to 999. | 050 |
a | The AM/PM marker. | PM |
epoch_millis | The time in milliseconds from epoch. | 1509581781000 |
epoch_seconds | The time in seconds from epoch. | 1509747601 |
Z | The time zone offset expressed in hours. | -0100 or -01:00 |
ZZZ | The time zone offset expressed using IDs. | America/Los_Angeles |
'' | Use single quotes to add text that doesn't represent a value outlined in this table. | 'T' |
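To sanity-check values stored with `epoch_millis` or `epoch_seconds`, you can convert them to a readable timestamp. This sketch assumes the epoch is the Unix epoch (1970-01-01 00:00:00 UTC), which this page does not state explicitly. The function names are hypothetical.

```python
from datetime import datetime, timezone

def from_epoch_millis(value):
    """Interpret value as milliseconds since the Unix epoch (assumed), in UTC."""
    return datetime.fromtimestamp(value / 1000, tz=timezone.utc)

def from_epoch_seconds(value):
    """Interpret value as seconds since the Unix epoch (assumed), in UTC."""
    return datetime.fromtimestamp(value, tz=timezone.utc)

# The two example values from the table above.
print(from_epoch_millis(1509581781000).isoformat())
print(from_epoch_seconds(1509747601).isoformat())
```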
The following table shows examples of different formats for the same date, January 2, 2016, at 9:45:02.05 PM:
Time format examples
Input date | Date format |
---|---|
01/02/2016 9:45:02PM | MM/dd/yyyy hh:mm:ssa |
Jan02-16 21:45:02 | MMMdd-yy HH:mm:ss |
January 02 2016 9:45:02.050PM | MMMM dd yyyy hh:mm:ss.SSSa |
01/02/2016T21:45:02-0000 | MM/dd/yyyy'T'HH:mm:ssZ |
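One way to test a format string against sample values before saving it is to translate it to Python `strptime` directives. The sketch below covers only part of the symbol table (`epoch_millis`, `epoch_seconds`, and `ZZZ` are not handled) and assumes an English locale for month names and the AM/PM marker. It approximates the formats described here and is not how ArcGIS parses them. The names `SYMBOLS` and `to_strptime` are hypothetical.

```python
import re
from datetime import datetime

# Partial mapping from the format symbols to strptime directives.
SYMBOLS = {
    "yyyy": "%Y", "yy": "%y",
    "MMMM": "%B", "MMM": "%b", "MM": "%m",
    "dd": "%d", "HH": "%H", "hh": "%I",
    "mm": "%M", "ss": "%S", "SSS": "%f",
    "a": "%p", "Z": "%z",
}

# A quoted literal, a run of one repeated letter, or any other character.
TOKEN = re.compile(
    r"'(?P<text>[^']*)'|(?P<run>(?P<ch>[A-Za-z])(?P=ch)*)|(?P<other>.)",
    re.DOTALL,
)

def to_strptime(pattern):
    """Translate a format such as MM/dd/yyyy into a strptime format string."""
    parts = []
    for match in TOKEN.finditer(pattern):
        run = match.group("run")
        if run:
            if run not in SYMBOLS:
                raise ValueError(f"unsupported symbol: {run}")
            parts.append(SYMBOLS[run])
        else:
            text = match.group("text")
            literal = text if text is not None else match.group("other")
            parts.append(literal.replace("%", "%%"))
    return "".join(parts)

examples = [
    ("01/02/2016 9:45:02PM", "MM/dd/yyyy hh:mm:ssa"),
    ("Jan02-16 21:45:02", "MMMdd-yy HH:mm:ss"),
    ("January 02 2016 9:45:02.050PM", "MMMM dd yyyy hh:mm:ss.SSSa"),
]
for value, pattern in examples:
    print(datetime.strptime(value, to_strptime(pattern)))
```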
Dataset format
The dataset format section outlines the format the data is in. Data may be in one of the following formats:
- Shapefile (.shp)
- Delimited file (for example, .csv)
- Parquet file
- ORC file
The available parameters differ depending on the dataset. For shapefiles, ORC, and Parquet files, the only parameter is the file type, which cannot be modified. If the input dataset is a delimited file, multiple parameters can be modified. To modify values for a delimited file, use a hints file and regenerate the manifest. These parameters are outlined in the following table:
Dataset formats
Parameter | Description |
---|---|
File extension | Lists the file type extension on the input dataset. Common formats are .csv and .txt. Modify this information for a delimited file using a hints file. |
Field delimiter | Determines the delimiter for each field. Common formats are , and ;. Modify this information for a delimited file using a hints file. |
Record terminator | Determines the terminator for each row of data. Common formats are \n and \r\n. Modify this information for a delimited file using a hints file. |
Quote character | Determines the character used for quotes. Modify this information for a delimited file using a hints file. |
Has header row | A Boolean value that determines whether the input table includes a header row. If a header row is included, the headers will be used for the field names. Field names are used to predict the geometry and time fields. Set header rows using the hints file. |
Encoding | The type of encoding used on the file. By default, this is UTF-8. This is set with a hints file. |
Big data file share output template editing parameters
The big data file share output template editor comprises the following four sections:
- Output template selector
- Geometry
- Time
- Dataset format
Output template selector
A big data file share is optionally composed of one or more templates. The number of templates is determined by the different formats to which you want to write results. When you open the output template manager, you can view the templates that have been successfully registered in your big data file share. When you select a template from the drop-down menu, the template parameters are populated with the output formatting information. If you want to add a new template, select the Add template option, and select the type and name of the new template. If you want to delete a template, select it from the template selector, and select Delete template. You can modify an existing template by selecting it and modifying any of the sections below as needed.
Anotação:
The input big data file shares have a fields section. The output templates do not have a fields section, since the resulting fields are determined by the GeoAnalytics Tools creating the result. ORC only supports field names that include the basic Latin alphabet and numeric characters. All other characters in a field name are replaced with an underscore.
Geometry
The geometry section lists how you want the output geometry to be formatted for each geometry type (point, line, polygon). There are two parts to determining the output geometry:
- The spatial reference—You can leave it empty, and it will use the tool results (default). Optionally, provide a WKID or WKT string, and all results will be projected to that spatial reference. This value is shared across all output geometries.
- The geometry formatting type and fields—This is described in more detail below.
Output geometry formats
Geometry type | Output fields | Delimited files | Shapefiles | ORC files | Parquet files |
---|---|---|---|---|---|
XYZ—An X, Y, and optionally Z field. This option is only available for points. | By default, three new fields are created, named X, Y, and Z. You can optionally change these field names. | ||||
WKT | By default, one new field named Geometry is created. You can optionally change the output field names. | ||||
GeoJSON | By default, one new field named Geometry is created. You can optionally change the output field names. | ||||
EsriJSON | By default, one new field named Geometry is created. You can optionally change the output field names. | ||||
SHP | By default, one new field named Geometry is created. You can optionally change the output field names. | ||||
WKB | By default, one new field named Geometry is created. You can optionally change the output field names. | ||||
Shape Buffer | By default, one new field named Geometry is created. You can optionally change the output field names. |
Time
The time section outlines how output time is represented. Formatting time requires the following information:
- Formatting for both instants and intervals.
- The field names to which time is written.
- The format (String or Date) in which time is written. Note that delimited files can only write time as String.
- For intervals, which fields represent the start and end time.
Time formatting is the same as for input big data files. See Time formats in big data file shares.
Dataset format
The dataset format section outlines the output format to which the data is written. Data may be in one of the following formats:
- Shapefile (.shp)
- Delimited file (for example, .csv)
- Parquet file
- ORC file
The available parameters differ depending on the dataset. For shapefiles, ORC, and Parquet files, the only parameter is the file type, which cannot be modified. If the output dataset is a delimited file, multiple parameters can be modified in ArcGIS Server Manager. These parameters are outlined in the following table:
Dataset formats
Parameter | Description |
---|---|
File extension | Extensions are never applied to an output dataset. |
Field delimiter | Determines the delimiter for each field. Common formats are , and ;. |
Record terminator | The terminator for each row of data cannot be set. For Windows, the terminator is \r\n; for Linux, it is \n. |
Quote character | Determines the character used for quotes. |
Has header row | A Boolean value that determines whether the output table includes a header row representing the field names. By default, this is true. |
Encoding | This is always UTF-8. |