Setup tab

Pentaho Data Integration

Version: 9.3.x
Part Number: MK-95PDIA003-15

Kafka consumer Setup tab

In this tab, define the connection used to receive messages, the topics to which you want to subscribe, and the consumer group for those topics.
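For orientation, the sketch below shows how the three settings in this tab might correspond to standard Kafka consumer configuration. The property names `bootstrap.servers` and `group.id` are standard Kafka consumer settings. The host names, topic names, group name, and the `to_consumer_properties` helper are hypothetical and are not part of PDI, and the sketch makes no claim about how the step builds its configuration internally.

```python
# Hypothetical Setup tab values; replace them with your own brokers, topics, and group.
setup_tab = {
    "connection": "Direct",
    "bootstrap_servers": ["broker1.example.com:9092", "broker2.example.com:9092"],
    "topics": ["orders", "payments"],
    "consumer_group": "pdi-etl",
}


def to_consumer_properties(setup):
    """Map the Setup tab fields to standard Kafka consumer property names."""
    return {
        # Kafka expects a comma-separated list of host:port pairs.
        "bootstrap.servers": ",".join(setup["bootstrap_servers"]),
        # Consumers that share a group.id split the topic partitions among themselves.
        "group.id": setup["consumer_group"],
    }


props = to_consumer_properties(setup_tab)
# Topics are passed to the consumer's subscription, not set as a property.
topics = list(setup_tab["topics"])

print(props)
print(topics)
```

Topics are kept separate from the properties because a Kafka consumer subscribes to topics rather than reading them from its configuration.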

Option Description
Connection
Select a connection type:
- Direct: Specify the Bootstrap servers from which you want to receive the Kafka streaming data.
- Cluster: Specify the Hadoop cluster configuration from which you want to retrieve the Kafka streaming data. In a Hadoop cluster configuration, you can specify information such as host names and ports for HDFS, Job Tracker, security, and other big data cluster components. You can specify multiple servers if they are part of the same cluster. For information on Hadoop clusters, see Connecting to a Hadoop cluster with the PDI client.
Topics
Enter the name of each Kafka topic from which you want to consume streaming data (messages). You must include all topics that you want to consume.
Consumer group
Enter the name of the consumer group that you want this consumer to join.

When part of a consumer group, each consumer is assigned a subset of the partitions from the topics it has subscribed to. Those partitions are locked to that consumer, so no other member of the group reads them. Each instance of a Kafka consumer step runs only a single consumer thread.
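To illustrate how partitions are divided within a consumer group, here is a minimal sketch that uses a simplified round-robin scheme. It assumes two hypothetical topics and three hypothetical step instances in the same group. Kafka's actual assignment strategy is configurable and may distribute partitions differently, so treat this as a picture of the idea rather than of the broker's behavior.

```python
from collections import defaultdict


def assign_partitions(partitions, consumers):
    """Give each partition to exactly one consumer, in round-robin order."""
    assignment = defaultdict(list)
    for i, partition in enumerate(sorted(partitions)):
        owner = consumers[i % len(consumers)]
        assignment[owner].append(partition)
    return dict(assignment)


# Hypothetical topics: "orders" has 4 partitions, "payments" has 2.
partitions = [("orders", p) for p in range(4)] + [("payments", p) for p in range(2)]
# Hypothetical names for three Kafka consumer step instances in one group.
consumers = ["step-copy-1", "step-copy-2", "step-copy-3"]

assignment = assign_partitions(partitions, consumers)
for consumer, owned in assignment.items():
    print(consumer, owned)
```

Because each step instance runs a single consumer thread, adding instances to the same group is the way to read partitions in parallel. Instances beyond the total number of partitions would receive no partitions.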