The Subscribe to a Kafka Topic for JSON input connector can be used to retrieve and adapt event data records, as generic JSON, from a Kafka Topic. For more information about getting started with Apache Kafka, see Apache Kafka Introduction.
Usage notes
Keep the following in mind when working with the Subscribe to a Kafka Topic for JSON input connector:
- Use this input connector to consume JSON data from a Kafka Topic. This input connector is a consumer of Kafka.
- This input connector pairs the Generic-JSON inbound adapter with the Kafka inbound transport.
- The adapter interprets generic JSON as opposed to feature JSON or GeoJSON.
- A generic JSON record does not have to contain data that represents a geometry.
- The adapter handles both single JSON records and JSON records organized in an array.
- The adapter supports constructing a point geometry from x, y, and z attribute values (see the sample record after this list).
- This input connector includes a Learning Mode parameter that allows the input connector to modify a GeoEvent Definition it has constructed. The purpose of this parameter is to temporarily accept that received event data has a variable schema or data structure. The input connector uses a sample of received data records to learn more about the variable data structure and appends new, previously unobserved attribute fields to an existing GeoEvent Definition.
- Allowing a GeoEvent Definition to be changed on the fly can negatively impact the design of real-time analysis in a GeoEvent Service. As a best practice, if schema variance is expected in your inbound event data, use the learning mode for as brief a period as possible to produce a GeoEvent Definition that supports all expected variants of your inbound data. The learning mode can then be turned off and the autogenerated GeoEvent Definition copied and tailored for production deployment.
- The Kafka inbound transport supports TLS 1.2 and SASL security protocols for authenticating with a Kafka cluster or broker.
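For reference, below is a minimal sketch of a generic JSON record this connector might consume. The field names (assetId, reportTime, x, y, z) are hypothetical and not required by the adapter; the x, y, and z attributes are the kind of values that can be mapped through the Construct Geometry from Fields parameters described below.

```json
{
  "assetId": "TRUCK-042",
  "reportTime": "2021-06-15T10:30:00Z",
  "x": -117.196,
  "y": 34.056,
  "z": 412.0
}
```

The same record could also arrive as an element of a JSON array; the adapter adapts each element of the array as a separate event.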
Parameters
The following are the parameters for the Subscribe to a Kafka Topic for JSON input connector:
| Parameter | Description |
|---|---|
| Name | A descriptive name for the input connector used for reference in GeoEvent Manager. |
| Override with Custom Kafka Properties | Specifies whether the default GeoEvent Server Kafka client properties are overridden. The default is No. |
| Kafka Bootstrap Servers (Conditional) | A list of hostname:port pairs used to establish the initial connection to the Kafka cluster. Hostname:port pairs must be comma separated, for example, broker0.example.com:9092,broker1.example.com:9092,broker2.example.com:9092. This parameter is shown when Override with Custom Kafka Properties is set to No. |
| Topic Name(s) | The name of a Kafka topic, or a list of Kafka topics, to consume data of interest from. Separate multiple topic names with a semicolon, for example, topic1;topic2;topic3. Note: Specifying multiple Kafka topics is supported in ArcGIS GeoEvent Server 10.8 and later. |
| Number of Consumers | The number of consumers for each consumer group. The default number of consumers is 1. Note: The number of consumers is limited by the number of partitions on the Kafka topic. Refer to Apache Kafka Introduction for more information about consumer instances. |
| Consumer Group ID (Conditional) | An optional string that uniquely identifies the consumer group for a set of consumers. This is also known as the consumer group name. If no Consumer Group ID is specified, GeoEvent Server assigns a static consumer group ID called geoevent-consumer. This static consumer group ID is shared across all instances of the Kafka connector where the Consumer Group ID is not specified. It is recommended that you specify a custom Consumer Group ID. Refer to Apache Kafka Introduction for more information about consumer groups. This parameter is shown when Override with Custom Kafka Properties is set to No. |
| Registered Folder for the Kafka Properties File (Conditional) | The folder registered with GeoEvent Server that contains the Kafka .properties file. The Kafka .properties file defines the custom Kafka properties when Override with Custom Kafka Properties is set to Yes. Ensure that the folder registered with GeoEvent Server is the full path to where the Kafka .properties file is located. This parameter is shown when Override with Custom Kafka Properties is set to Yes. A sample .properties file is shown after this table. |
| Kafka Properties File Name (Conditional) | The name of the Kafka .properties file that contains the custom Kafka properties for client configuration. Specify the name of the file without the .properties extension. This parameter is shown when Override with Custom Kafka Properties is set to Yes. |
| Start from Beginning | Specifies whether records are consumed from the topic starting at the beginning offset or from the last offset committed for the consumer. The default is Yes. Note: For more information about offsets, refer to Apache Kafka Configuration. |
| JSON Object Name | The name of a JSON element to use as the root node of a substructure within the received JSON data. When a JSON element is specified by name, the adapter searches for substructures whose object name matches the specified element name, and only data within the identified substructure is considered. When left blank (the default), the uppermost JSON object is used as the root of the entire JSON structure. See the nested-record example after this table. |
| Create GeoEvent Definition | Specifies whether a new or an existing GeoEvent Definition is used for the inbound event data. A GeoEvent Definition is required for GeoEvent Server to interpret the inbound event data attribute fields and data types. |
| GeoEvent Definition Name (New) (Conditional) | The name assigned to a new GeoEvent Definition. If a GeoEvent Definition with the specified name already exists, the existing GeoEvent Definition is used. Otherwise, the first data record received is used to determine the expected schema of subsequent data records, and a new GeoEvent Definition is created based on that first record's schema. This parameter is shown when Create GeoEvent Definition is set to Yes and is hidden when it's set to No. |
| GeoEvent Definition Name (Existing) (Conditional) | The name of an existing GeoEvent Definition to use when adapting received data to create event data for processing by a GeoEvent Service. This parameter is shown when Create GeoEvent Definition is set to No and is hidden when it's set to Yes. |
| Construct Geometry from Fields | Specifies whether the input connector will construct a point geometry using coordinate values received as attributes. The default is No. |
| X Geometry Field (Conditional) | The attribute field in the inbound event data containing the x-coordinate part (for example, horizontal or longitude) of a point location. This parameter is shown when Construct Geometry from Fields is set to Yes and is hidden when it's set to No. |
| Y Geometry Field (Conditional) | The attribute field in the inbound event data containing the y-coordinate part (for example, vertical or latitude) of a point location. This parameter is shown when Construct Geometry from Fields is set to Yes and is hidden when it's set to No. |
| Z Geometry Field (Conditional) | The attribute field in the inbound event data containing the z-coordinate part (for example, depth or altitude) of a point location. If no value is provided, the z-value is omitted and a 2D point geometry is constructed. This parameter is shown when Construct Geometry from Fields is set to Yes and is hidden when it's set to No. |
| Learning Mode | Specifies whether learning mode is active. When set to Yes, the inbound adapter appends new fields to a GeoEvent Definition it has created and is maintaining. Learning mode is useful when you need to allow the input connector to modify a GeoEvent Definition it has constructed, to temporarily accept event data whose schema or data structure varies. The input connector uses a sample of received data records to learn more about the variable data structure and appends new, previously unobserved attribute fields to the existing GeoEvent Definition. Allowing a GeoEvent Definition to be changed on the fly can adversely impact the design of real-time analytics in a GeoEvent Service. If schema variance is expected in the inbound event data, it is recommended that you use learning mode for as brief a period as possible to produce a GeoEvent Definition that supports all expected variants of the inbound data. Learning mode can then be turned off and the automatically generated GeoEvent Definition copied and tailored for production deployment. |
| Default Spatial Reference | The well-known ID (WKID) of a spatial reference to be used when a geometry is constructed from attribute field values whose coordinates are not latitude and longitude values for an assumed WGS84 geographic coordinate system, or when geometry strings are received that do not include a spatial reference. A well-known text (WKT) value or the name of an attribute field containing the WKID or WKT can also be specified. |
| Expected Date Format | The pattern used to match expected string representations of date/time values and convert them to Java Date values. The pattern's format follows the Java SimpleDateFormat class convention, for example, MM/dd/yyyy hh:mm:ss a. While the preferred pattern for date/time values in GeoEvent Server is the ISO 8601 standard, several string representations of date/time values commonly recognized as date values can be converted to Java Date values without specifying an Expected Date Format value. If the date/time values received use a different convention, you must specify an expected date format pattern so GeoEvent Server can adapt the date/time values. |
| As GeoJSON | Specifies whether incoming geometry is parsed as a GeoJSON geometry object rather than as feature JSON. By default, when a geometry is received as a string, the value is assumed to be feature JSON as described in Geometry objects. |
| Authentication Required | Specifies whether the connection to the Kafka cluster, or Kafka broker, requires authentication. The default is No. |
| Authenticate Using (Conditional) | Specifies the security protocol used to secure the Kafka cluster. The options are TLS 1.2 and SASL. Note: When using Kerberos, ensure that the operating system user account running ArcGIS GeoEvent Server has read access to the keytab file in the Kerberos setup and configuration. This parameter is shown when Authentication Required is set to Yes. |
| Registered Folder for Credential File (Conditional) | The folder registered with GeoEvent Server that contains the Kafka cluster's PKI file (x509 certificate). Ensure that the folder registered with GeoEvent Server is the full path to where the Kafka cluster's certificate is located. This parameter is shown when Authentication Required is set to Yes. It is applicable to TLS 1.2 only. |
| Credential Configuration File (Conditional) | The name of the Kafka cluster's PKI file (x509 certificate). The certificate and its associated private key must be stored in PKCS#12 format, represented by a file with either the .p12 or .pfx extension. Provide the name of the file including the extension. Note: Only the certificate file name and extension are supported for this parameter. Do not specify a relative path to the certificate. Register the full path to the certificate file using the Registered Folder for Credential File parameter. This parameter is shown when Authentication Required is set to Yes. It is applicable to TLS 1.2 only. |
| Keystore Password (Conditional) | The password for the Kafka cluster's PKI file (x509 certificate). This is also known as the certificate's private key password. This parameter is shown when Authentication Required is set to Yes. It is applicable to TLS 1.2 only. |
| SASL Authentication Type (Conditional) | Specifies the type of SASL authentication mechanism supported by the Kafka cluster. The options are SASL/GSSAPI (Kerberos) and SASL/PLAIN. This parameter is shown when Authentication Required is set to Yes. It is applicable to SASL only. |
| Kerberos Principal (Conditional) | The Kerberos principal for the specific user, for example, GeoEventKafkaClient1@example.com. This parameter is shown when Authentication Required is set to Yes. It is applicable to SASL/GSSAPI (Kerberos) only. |
| Use Key Tab (Conditional) | Specifies whether the keytab in the Kerberos settings is used. The default is Yes. This parameter is shown when Authentication Required is set to Yes. It is applicable to SASL/GSSAPI (Kerberos) only. |
| Store Key (Conditional) | Specifies whether the key is stored in the Kerberos settings. The default is Yes. This parameter is shown when Authentication Required is set to Yes. It is applicable to SASL/GSSAPI (Kerberos) only. |
| Username (Conditional) | The username used to authenticate with the Kafka cluster. This is also known as a connection string with certain cloud providers. Refer to the documentation of the chosen cloud provider for the correct syntax. This parameter is shown when Authentication Required is set to Yes. It is applicable to SASL/PLAIN only. |
| Password (Conditional) | The password used to authenticate with the Kafka cluster. Refer to the documentation of the chosen cloud provider for the correct syntax. This parameter is shown when Authentication Required is set to Yes. It is applicable to SASL/PLAIN only. |
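To illustrate the JSON Object Name parameter, consider the hypothetical payload below. Setting JSON Object Name to observations directs the adapter to treat that element as the root and adapt only the two records inside it, ignoring the sibling metadata element. All element and field names here are illustrative assumptions, not names the connector requires.

```json
{
  "metadata": { "source": "sensor-network", "count": 2 },
  "observations": [
    { "assetId": "TRUCK-042", "x": -117.196, "y": 34.056 },
    { "assetId": "TRUCK-043", "x": -117.201, "y": 34.061 }
  ]
}
```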
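When Override with Custom Kafka Properties is set to Yes, the connector reads its client configuration from the registered .properties file rather than from the parameters above. The sketch below shows what such a file might contain. The property keys are standard Apache Kafka consumer configuration keys; the file name, broker addresses, group ID, credentials, and the choice of SASL_SSL with the PLAIN mechanism are all assumptions to be replaced with your cluster's actual settings.

```properties
# Hypothetical file: geoevent-kafka-consumer.properties
# Initial connection and consumer group (placeholder values)
bootstrap.servers=broker0.example.com:9093,broker1.example.com:9093
group.id=geoevent-json-consumers

# Security settings, assuming a cluster secured with SASL over TLS
security.protocol=SASL_SSL
sasl.mechanism=PLAIN
sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule required \
  username="geoevent-client" \
  password="replace-with-secret";
```

With this file in a registered folder, the Kafka Properties File Name parameter would be set to geoevent-kafka-consumer, without the .properties extension.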
Considerations and limitations
There are several considerations to keep in mind when using the Subscribe to a Kafka Topic for JSON input connector:
- If consumers are not managed and balanced carefully, some instances of the Subscribe to a Kafka Topic for JSON input connector will not retrieve data. The number of consumers in a consumer group is limited by the number of partitions on a Kafka topic. If the number of consumers in a consumer group exceeds the number of partitions available on a Kafka topic, the excess consumers will not consume data. To avoid this, align the number of consumers with the number of partitions on the Kafka topic, or use a different consumer group for each connector; a sketch for checking the partition count follows this list. For more information on consumers and consumer groups, see the Kafka Documentation.
- The Subscribe to a Kafka Topic for JSON input connector is a client consumer of Kafka. Apply the same considerations to this input connector as you would to any other Kafka client consumer. For example, if this input connector is not receiving data from a Kafka topic but a separate client consumer is, consider the factors involved in having two client consumers, including but not limited to the configured consumer group ID, the number of partitions available on the topic, and the number of existing consumers. Similarly, if the input connector is stopped and started in quick succession, consider the implications for Kafka from a consumer point of view; a rebalancing of the Kafka topic's partitions may prevent the input connector from immediately rejoining as a consumer under the same consumer group.
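Because idle consumers are easy to create by oversizing Number of Consumers, it can help to confirm a topic's partition count before configuring the connector. The following is a minimal sketch using the standard Kafka AdminClient; the broker address and topic name are assumptions, and this code is independent of GeoEvent Server.

```java
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.TopicDescription;

public class PartitionCheck {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        // Hypothetical broker address; use your cluster's bootstrap servers
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "broker0.example.com:9092");

        try (AdminClient admin = AdminClient.create(props)) {
            // Hypothetical topic name; describeTopics returns per-topic futures
            TopicDescription description = admin
                .describeTopics(Collections.singleton("vehicle-positions"))
                .values()
                .get("vehicle-positions")
                .get();
            // Consumers in a group beyond this count will sit idle
            System.out.println("Partitions: " + description.partitions().size());
        }
    }
}
```

If the partition count is lower than the connector's configured Number of Consumers, reduce the consumer count or add partitions to the topic.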