Input tab

Pentaho Data Integration

Version
9.3.x
Part Number
MK-95PDIA003-15

Input tab, Copybook Input step

The Input tab has the following sections.

Source

These options specify the location of the binary data.
Predefined file
Select this option to specify the path to a binary data file that contains the data you want to read into the PDI stream. You can type any VFS path, including variables, directly into the File field, or click Browse to locate the binary data file.

File defined in a field
Select this option to read the names of the binary files from a field in the previous step. Select the name of the field from the drop-down list.

Data already loaded in a binary field
Select this option if the binary data is passed into the step from a binary field on the PDI stream. Select the step generating the binary field from the drop-down list. You can use this option to process records output by another Copybook Input step. With this method, you can selectively process fields and avoid conversion errors in definition files that include REDEFINES. See the Store record as a binary field option in the Options tab.

Schema

These options define the location of the copybook definition file and include mapping options for the binary data files.

COBOL Copybook file path
Specify the file path to the copybook definition file. You can enter any VFS or SFTP file path, or click Browse to open the system file browser. After selecting a file, click Validate to verify that the definition file can be accessed and parsed.

COBOL Copybook line structure
Specify the line structure of the definition file:
  • Standard columns (6 to 72): Select this option when the definition file contains line numbers. The first 6 columns of text from each line are ignored. Any data beyond column 72 is ignored.
  • Full line: Select this option when the definition file does not contain line numbers.
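
The effect of the two line structure settings can be sketched in Python. This is only an illustration of the column rule described above, not the step's actual implementation; the function name `copybook_text` and its `standard_columns` flag are hypothetical.

```python
def copybook_text(line: str, standard_columns: bool = True) -> str:
    """Return the part of a copybook line that is treated as definition text.

    standard_columns=True mimics "Standard columns (6 to 72)";
    standard_columns=False mimics "Full line". Both names are hypothetical.
    """
    line = line.rstrip("\r\n")
    if standard_columns:
        # Ignore the first 6 columns (line numbers) and anything past column 72.
        return line[6:72]
    # Full line: the whole line is definition text.
    return line


# A typical 80-column line: line number, definition text, trailing identification.
numbered = "000100" + " 01  CUSTOMER-RECORD.".ljust(66) + "IDENT001"
print(repr(copybook_text(numbered).rstrip()))
```

With Full line selected, the same input would be passed through unchanged, so the line number would be read as part of the definition.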

Binary Format

Use these options to describe the binary format of the selected file.

Source architecture
Select the machine architecture of the binary data source files. The values are:
  • Big endian (mainframe): The most significant byte is first and the least significant byte is last.
  • Little endian: The least significant byte is first and the most significant byte is last.

Source charset name
Select the character encoding of the binary data file. Mainframe EBCDIC data is typically encoded using the IBM037 or cp1047 character set. For more information about character sets and their aliases, see Supported Encodings in the Oracle® documentation.

Packed decimal (COMP-3) sign convention
Select the sign convention used to parse COMP-3 packed decimals from the binary data. If validation is performed on a field and fails, a conversion error occurs at runtime. See Use Error Handling for details.
  • Strict: The data must follow the IBM S370FPD specification to avoid validation errors. Validation verifies that all nibbles (half-bytes), except the sign nibble, are decimal digits (0-9). This is the default value.
      • For signed packed decimals, the sign nibble must be C (positive) or D (negative).
      • For unsigned packed decimals, the sign nibble must be F.
  • Lenient: Validation verifies that all nibbles except the sign nibble contain decimal digits and that the sign nibble contains a hexadecimal value of A-F. The number is interpreted as negative only if the sign nibble is D.
  • Lenient - unchecked: No validation is performed on the source bytes. The sign nibble (the last nibble) may contain any hexadecimal value 0-F and is not included in the result. The number is interpreted as negative only if the sign nibble is D.
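
The three Binary Format options can be illustrated with a short Python sketch. It is not PDI's implementation. The function names and the mode strings ("strict", "lenient", "unchecked") are hypothetical, `cp037` is Python's codec name for IBM037, and the way non-decimal digit nibbles are accumulated in the unchecked mode is an assumption, because the description above does not say how they are treated.

```python
def decode_text(data: bytes, charset: str = "cp037") -> str:
    """Decode a character field; cp037 is Python's codec name for IBM037."""
    return data.decode(charset)


def decode_binary(data: bytes, architecture: str = "big") -> int:
    """Decode an unsigned binary integer; architecture is "big" or "little"."""
    return int.from_bytes(data, byteorder=architecture)


def unpack_comp3(data: bytes, mode: str = "strict", signed: bool = True) -> int:
    """Decode a COMP-3 (packed decimal) field into an integer.

    mode names are hypothetical stand-ins for the three sign conventions.
    """
    if not data:
        raise ValueError("empty field")
    # Split every byte into its high and low nibble (half-byte).
    nibbles = []
    for byte in data:
        nibbles.append(byte >> 4)
        nibbles.append(byte & 0x0F)
    digits, sign = nibbles[:-1], nibbles[-1]

    # Strict and Lenient both require decimal digits outside the sign nibble.
    if mode in ("strict", "lenient") and any(d > 9 for d in digits):
        raise ValueError("non-decimal digit nibble")
    if mode == "strict":
        # Signed fields need C or D; unsigned fields need F.
        allowed = (0xC, 0xD) if signed else (0xF,)
        if sign not in allowed:
            raise ValueError("invalid sign nibble for Strict")
    elif mode == "lenient":
        # Any sign nibble from A to F is accepted.
        if sign < 0xA:
            raise ValueError("invalid sign nibble for Lenient")
    elif mode != "unchecked":
        raise ValueError("unknown mode: " + mode)

    # Assumption: in unchecked mode, digit nibbles are accumulated as-is.
    value = 0
    for d in digits:
        value = value * 10 + d
    # Only a sign nibble of D marks the number as negative.
    return -value if sign == 0xD else value


print(decode_text(bytes.fromhex("C8C5D3D3D6")))
print(decode_binary(b"\x01\x02", "big"), decode_binary(b"\x01\x02", "little"))
print(unpack_comp3(bytes.fromhex("12345D")))
```

In this sketch, the bytes 12 34 5C decode to 12345 in every mode, while 12 34 5F is rejected by the strict mode for a signed field but accepted by the lenient mode.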