CSV File Input

Pentaho Data Integration

Version: 9.3.x
Part Number: MK-95PDIA003-15

The CSV File Input step reads data from delimited text files into a PDI transformation. Although the step is named CSV File Input, it also works with many other separator types, such as pipes, tabs, and semicolons.
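
As a rough illustration of what a separator and an enclosure mean for a delimited file, the following sketch parses the same kind of data with three different separators. It uses Python's standard csv module rather than PDI itself; the function name read_delimited and the sample strings are hypothetical and exist only for this example.

```python
import csv
import io


def read_delimited(text, delimiter=";", enclosure='"'):
    """Split delimited text into rows of string fields.

    Hypothetical helper for illustration; this is not how PDI parses files.
    """
    reader = csv.reader(io.StringIO(text), delimiter=delimiter, quotechar=enclosure)
    return [row for row in reader]


# The enclosure lets a field contain the separator character itself.
sample_semicolon = 'id;name\n1;"Smith; John"\n'
sample_pipe = "id|name\n2|Jane Doe\n"
sample_tab = "id\tname\n3|x\tAlex Roe\n"

rows_semicolon = read_delimited(sample_semicolon)
rows_pipe = read_delimited(sample_pipe, delimiter="|")
rows_tab = read_delimited(sample_tab, delimiter="\t")
```

Only the delimiter argument changes between the three calls. In the step itself, the equivalent choice is made through the separator option.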

Note: The semicolon (;) is the default separator type for this step.
Note: The options for this step are a subset of the options for the Text File Input step. This step differs from the Text File Input step in the following ways:
NIO
The step reads the file through NIO (the Java New I/O API, often described as non-blocking I/O), which uses native system calls to read the file faster but is limited to local files. It does not support VFS.
Parallel Running
If you configure this step to run in multiple copies (or in clustered mode) and you enable parallel running, each copy reads a separate block of the same file. This lets you distribute the reading of one file across several threads, or even across several slave nodes in a clustered transformation.
Lazy Conversion
If you read many fields from a file and many of those fields are not manipulated but merely passed through the transformation to another text file or a database, lazy conversion can prevent PDI from performing unnecessary work on those fields, such as converting them into objects like strings, dates, or numbers.
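
To make the Parallel Running entry above more concrete, here is a minimal sketch of one way several copies could each read a separate block of a single file. It is an assumption-laden illustration, not PDI's actual implementation: the names read_block and demo are hypothetical, the file is split into equal byte ranges, each copy moves forward to the next line boundary before reading, and header rows and enclosures are ignored.

```python
import os
import tempfile


def read_block(path, copy_nr, total_copies, delimiter=";"):
    """Return the rows owned by one hypothetical step copy (0-based copy_nr)."""
    size = os.path.getsize(path)
    block = size // total_copies
    start = copy_nr * block
    # The last copy also takes whatever remains after the integer division.
    end = size if copy_nr == total_copies - 1 else start + block

    rows = []
    with open(path, "rb") as f:
        if start > 0:
            # Step back one byte and discard through the next newline so that
            # reading always begins at the start of a line.
            f.seek(start - 1)
            f.readline()
        # A line belongs to this copy if it starts before the end of its block,
        # even when the line itself runs past that boundary.
        while f.tell() < end:
            line = f.readline()
            if not line:
                break
            rows.append(line.rstrip(b"\r\n").decode("utf-8").split(delimiter))
    return rows


def demo(total_copies=3):
    """Write ten sample rows to a temporary file and read them block by block."""
    lines = [f"{i};customer_{i}" for i in range(1, 11)]
    with tempfile.NamedTemporaryFile(
        "w", suffix=".csv", delete=False, newline="", encoding="utf-8"
    ) as tmp:
        tmp.write("\n".join(lines) + "\n")
        path = tmp.name
    try:
        return [read_block(path, n, total_copies) for n in range(total_copies)]
    finally:
        os.remove(path)
```

Calling demo(3) returns one list of rows per copy. Concatenating the lists in copy order gives back every row exactly once, which is the property a block-splitting scheme needs so that no line is lost or read twice.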

An example of a simple CSV input transformation (CSV Input - Reading customer data.ktr) can be found in the data-integration/samples/transformations directory.