Job settings tab

Pentaho Data Integration

Version
9.3.x
Audience
anonymous
Part Number
MK-95PDIA003-15


Job settings tab, Amazon Hive Job Executor

This tab includes the following options:

Option Description
Hive job flow name Specify the name of the Hive job flow to execute.
S3 staging directory Specify the Amazon Simple Storage Service (S3) address for the container of stored objects (the bucket) in which your job flow logs will be stored. Artifacts required for execution (for example, Hive Script) will also be stored in this bucket before execution.
Hive script Specify the address of the Hive script to execute within Amazon S3 or on your local file system.
Command line arguments Enter any command-line arguments you want to pass to the specified Hive script. Use spaces to separate multiple arguments.
Keep job flow alive Select if you want to keep your job flow active after the PDI entry finishes. If this option is not selected, the job flow will terminate when the PDI entry finishes.
Enable blocking Select if you want the PDI entry to wait until the EMR Hive job completes. Blocking is the only way for PDI to track the status of a Hive job; selecting this option also enables proper error handling and routing. If you clear this option, the Hive job is started without monitoring and PDI moves on to the next entry.

Logging interval If Enable blocking is selected, specify the number of seconds between status log messages.
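To illustrate how Enable blocking and Logging interval work together, the following is a minimal Python sketch of blocking-style status polling. It is an assumption-laden illustration, not PDI's actual implementation: the function name `poll_until_complete`, the status strings, and the stubbed status source are all hypothetical stand-ins for the EMR status API.

```python
import time

# Hypothetical sketch (not PDI's real code): poll a Hive job's status
# until it reaches a terminal state, logging once per interval, the way
# a blocking PDI entry reports progress.

TERMINAL_STATES = {"COMPLETED", "FAILED", "CANCELLED"}  # illustrative state names

def poll_until_complete(get_status, interval_seconds, log=print, sleep=time.sleep):
    """Poll get_status() until a terminal state is returned.

    get_status       -- callable returning the job's current state string
                        (stands in for the real EMR status API)
    interval_seconds -- seconds between status log messages
    """
    while True:
        status = get_status()
        log(f"Hive job status: {status}")
        if status in TERMINAL_STATES:
            return status
        sleep(interval_seconds)

# Simulated status sequence standing in for a real EMR job.
statuses = iter(["PENDING", "RUNNING", "RUNNING", "COMPLETED"])
final_state = poll_until_complete(lambda: next(statuses), interval_seconds=0)
print(final_state)  # COMPLETED
```

With blocking cleared, the equivalent behavior would be calling `get_status` once (or not at all) and returning immediately, which is why PDI cannot route on success or failure in that mode.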