Quick Setup mode

Pentaho Data Integration

Version
9.3.x
Part Number
MK-95PDIA003-15

Sqoop Import step Quick Setup mode

Source

The Source refers to the database from which you want to pull your data into a cluster.

Option Definition
Database Connection

Click Choose Available to select an existing database connection that contains the data for import.

If you do not have an existing connection, click New. If you need to modify an existing connection, click Edit.

Edit Click Edit to open the Database Connection dialog box, where you can modify an existing database connection.
New Click New to open the Database Connection dialog box, where you can add a new database connection. See Define data connections.
Table The name of the source table. If the source database requires a schema, you must supply it in the format: SCHEMA.TABLE_NAME. This table must exist in the source database.
Browse Click Browse to open the Database Explorer and explore configured database connections.
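
The SCHEMA.TABLE_NAME format described for the Table option can be illustrated with a short Python sketch. This is only an illustration of the naming convention, not how the step parses the field; the function name split_table_name and the sample names SALES and ORDERS are hypothetical.

```python
def split_table_name(qualified):
    """Split 'SCHEMA.TABLE_NAME' into (schema, table).

    The schema is None when the name carries no schema prefix.
    Hypothetical helper for illustration only.
    """
    # rpartition splits on the last dot, so the table name is the final part.
    schema, sep, table = qualified.rpartition(".")
    if not table or (sep and not schema):
        raise ValueError(f"invalid table name: {qualified!r}")
    return (schema or None, table)


# Sample values (assumed, not taken from the product documentation).
print(split_table_name("SALES.ORDERS"))
print(split_table_name("ORDERS"))
```

A name without a dot is treated as a bare table name, which matches the case where the source database does not require a schema.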

Target

The Target refers to the Hadoop cluster where you want to put your data.

Option Definition
Hadoop Cluster

The name of the Hadoop cluster into which you want to import the data. Use Advanced Options to specify configuration information for the host names and ports for HDFS, Job Tracker, and other big data cluster components (default).

Click Choose Available to select an existing cluster to use. If you do not have any existing cluster connections, click New.

Information on Hadoop can be found in Use Hadoop with Pentaho.

Target Directory Path of the HDFS directory into which you want to import the data.
Browse

Click Browse to display the Open File dialog box, which shows the file system of the cluster. Select the directory for your Sqoop data.

Note: Browse only works when you have a valid cluster connection configured.
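
For orientation, the Quick Setup fields correspond roughly to the main arguments of a Sqoop import run from the command line. The Python sketch below assembles such a command from a connection URL, a table name, and a target directory. The --connect, --table, and --target-dir arguments are standard Sqoop import arguments; the helper name build_sqoop_import, the JDBC URL, and the directory are hypothetical, and this is not necessarily the exact command the step issues.

```python
import shlex


def build_sqoop_import(jdbc_url, table, target_dir):
    """Return a 'sqoop import' argument list for the given settings.

    Hypothetical helper: it only mirrors the Source and Target fields
    of Quick Setup mode as command-line arguments.
    """
    return [
        "sqoop", "import",
        "--connect", jdbc_url,       # Source: database connection
        "--table", table,            # Source: table (optionally SCHEMA.TABLE_NAME)
        "--target-dir", target_dir,  # Target: HDFS directory for the imported data
    ]


# Assumed sample values for illustration.
cmd = build_sqoop_import(
    "jdbc:mysql://dbhost:3306/sales", "ORDERS", "/user/pentaho/orders"
)
print(shlex.join(cmd))
```

Building the command as a list and joining it with shlex.join keeps any values that contain spaces or shell metacharacters safely quoted.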

Open File dialog box

When you have a valid cluster connection, click Browse to display the Open File dialog box to view the cluster files.

Option Definition
Open from Folder Indicates the path and name of the HDFS directory you want to browse. This directory becomes the active directory.
Up One Level Displays the parent directory of the active directory shown in the Open from Folder field.
Delete Deletes a folder from the active directory.
Create Folder Creates a new folder in the active directory.
Active Directory Contents (no label) Displays the contents of the active directory, which is the directory listed in the Open from Folder field.
Filter Applies a filter to the results displayed in the active directory contents.
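
The behavior of Up One Level and Filter can be modeled with a small Python sketch. It treats HDFS paths as POSIX-style strings and assumes glob-style filter patterns; the documentation does not state the filter syntax, so that part is an assumption, and the function names and sample entries are hypothetical.

```python
import fnmatch
import posixpath


def up_one_level(active_dir):
    """Return the parent of the active directory; the root is its own parent."""
    return posixpath.dirname(active_dir.rstrip("/")) or "/"


def apply_filter(entries, pattern):
    """Keep only entries whose names match a glob-style pattern (assumed syntax)."""
    return [name for name in entries if fnmatch.fnmatchcase(name, pattern)]


# Assumed sample directory and contents for illustration.
active = "/user/pentaho/orders"
contents = ["part-m-00000", "part-m-00001", "_SUCCESS"]

print(up_one_level(active))
print(apply_filter(contents, "part-*"))
```

fnmatchcase is used instead of fnmatch so that matching is case-sensitive on every platform.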