Run configurations

Pentaho Data Integration

Version: 9.3.x
Part Number: MK-95PDIA003-15

Some ETL activities are lightweight, such as loading a small text file into a database or filtering a few rows to trim down your results. For these activities, you can run your transformation locally using the default Pentaho engine.

Some ETL activities are more demanding, containing many steps that call other steps or a network of transformation modules. For these activities, you can set up a separate Pentaho Server dedicated to running transformations with the Pentaho engine.

Other ETL activities involve large amounts of data on network clusters and require greater scalability and shorter execution times. For these activities, you can run your transformation using the Spark engine in a Hadoop cluster.
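The three cases above amount to a simple decision rule. The following Python sketch illustrates that rule only; it is not part of Pentaho. The names `Target` and `suggest_target`, and the numeric thresholds, are hypothetical and chosen purely for illustration.

```python
from enum import Enum


class Target(Enum):
    """Where a transformation runs, following the three cases described above."""
    PENTAHO_LOCAL = "Pentaho engine, local machine"
    PENTAHO_SERVER = "Pentaho engine, dedicated Pentaho Server"
    SPARK_CLUSTER = "Spark engine, Hadoop cluster"


def suggest_target(step_count: int, input_gb: float, on_cluster: bool) -> Target:
    """Suggest a run target from rough workload characteristics.

    The thresholds (50 GB, 30 steps) are illustrative assumptions,
    not Pentaho guidance.
    """
    # Large data that already lives on a cluster: favor Spark for scalability.
    if on_cluster and input_gb >= 50:
        return Target.SPARK_CLUSTER
    # Many steps or nested transformation modules: offload to a dedicated server.
    if step_count > 30:
        return Target.PENTAHO_SERVER
    # Small files and light filtering: run locally with the default engine.
    return Target.PENTAHO_LOCAL


if __name__ == "__main__":
    print(suggest_target(step_count=5, input_gb=0.01, on_cluster=False).value)
```

In practice the choice depends on your environment and data volumes, so treat the thresholds as placeholders to tune.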

Run configurations let you choose whether a transformation runs on the Pentaho (Kettle) engine or the Spark engine. You can create or edit these configurations through the Run configurations folder in the View tab, as shown below:


Figure: Run Configurations Folder

To create a new run configuration, right-click the Run configurations folder and select New. To edit or delete a run configuration, right-click the existing configuration.

Note: Pentaho local is the default run configuration. It runs transformations with the Pentaho engine on your local machine. You cannot edit this default configuration.

Selecting New or Edit opens the Run configuration dialog box, which contains the following fields:

Field | Description
Name | Specify the name of the run configuration.
Description | Optionally, specify details of your configuration.
Engine | Select the type of engine for running a transformation. You can run a transformation with either a Pentaho or a Spark engine. The fields displayed in the Settings section of the dialog box depend on which engine you select.
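To make the dialog fields and the read-only default concrete, the sketch below models them as a small Python data structure. It is an illustration only, not the Pentaho API. The names `RunConfiguration`, `RunConfigurationStore`, and `Engine` are hypothetical. Refusing to delete the default configuration is an assumption; the text above states only that it cannot be edited.

```python
from dataclasses import dataclass, replace
from enum import Enum


class Engine(Enum):
    """The two engine choices offered in the Engine field."""
    PENTAHO = "Pentaho"
    SPARK = "Spark"


@dataclass(frozen=True)
class RunConfiguration:
    """Mirrors the dialog fields: Name, Description (optional), Engine."""
    name: str
    engine: Engine = Engine.PENTAHO
    description: str = ""


DEFAULT_NAME = "Pentaho local"


class RunConfigurationStore:
    """Hypothetical in-memory stand-in for the Run configurations folder."""

    def __init__(self) -> None:
        # The default configuration always exists and uses the Pentaho engine.
        self._configs = {DEFAULT_NAME: RunConfiguration(DEFAULT_NAME)}

    def get(self, name: str) -> RunConfiguration:
        return self._configs[name]

    def create(self, config: RunConfiguration) -> None:
        if not config.name.strip():
            raise ValueError("Name is required")
        if config.name in self._configs:
            raise ValueError(f"'{config.name}' already exists")
        self._configs[config.name] = config

    def edit(self, name: str, **changes) -> RunConfiguration:
        # The default run configuration cannot be edited.
        if name == DEFAULT_NAME:
            raise PermissionError("The default run configuration is read-only")
        if "name" in changes:
            raise ValueError("Renaming is not modeled in this sketch")
        updated = replace(self._configs[name], **changes)
        self._configs[name] = updated
        return updated

    def delete(self, name: str) -> None:
        # Assumption: the default is also protected from deletion.
        if name == DEFAULT_NAME:
            raise PermissionError("The default run configuration is read-only")
        del self._configs[name]


if __name__ == "__main__":
    store = RunConfigurationStore()
    store.create(RunConfiguration("Spark cluster", Engine.SPARK, "Nightly load"))
    print(store.get("Spark cluster").engine.value)
```

The engine-specific Settings fields are deliberately left out, because they vary with the engine selected.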