Option | Description | Default Value/Data Type |
---|---|---|
Step name | Specifies the unique name of the XML Input Stream (StAX) step on the canvas. A transformation step can be placed on the canvas several times; however, it represents the same transformation step. You can customize the name or leave it as the default. | |
Filename | File name of the input XML file. Specify your file name by entering its path or clicking Browse. If you connect to a step that precedes the XML Input Stream step, the Browse button is hidden, and the text box becomes a drop-down menu that is populated with the fields from the preceding step. Select a value from the drop-down menu to use as the path to an XML file. You can use internal variables to specify the path. | |
Source is from a previous step | Accept data from a field in a previous step. | |
Source field name | Selects a field from the previous step to use as XML data. | |
Add filename to result? | Adds the processed XML filename to the result of this transformation by passing the filename of the XML input file as a value on each result row. You can then use it in subsequent steps where you want to use the filename as a value. | No |
Skip (Elements/Attributes) | Number of elements or attributes that should be skipped. Use this field for starting the processing at a specific location in a file. The file will still be loaded by the parser, but the rows will not be produced. | 0 |
Limit (Elements/Attributes) | Limits the number of elements or attributes to process. With the Skip and Limit properties, you can enable chunk loading that is defined in an outer loop. | 0 |
Default String Length | The default string length for the XML data name and value fields. | 1024 |
Encoding | Specifies the character encoding used to read the XML file data. | UTF-8 |
Add Namespace information? | Adds the XML data type NAMESPACE to the stream. You can add an optional prefix (defined in the XML data name) and URI information (defined in the XML data value). This option adds a defined prefix in the ELEMENT data type to the XML data name, for example, prefix:product. Due to the extra namespace handling, this option slows down the processing throughput. | No |
Trim strings? | Trims all name/value elements and attributes. It eliminates white spaces, tabs, carriage returns, and line feed characters at the beginning and end of the string. | Yes |
Include filename in output? / Fieldname | Adds the processed file name to the specified field name. | xml_filename (String 256) |
Row number in output? / Fieldname | Adds the processed row number (starting with 1) to the specified field name. | xml_row_number (Integer) |
XML data type (numeric) in output? / Fieldname | Adds the processed data type in numeric format to the specified field name. The following data types are defined: 0 - "UNKNOWN" (Reserved); 1 - "START_ELEMENT"; 2 - "END_ELEMENT"; 3 - "PROCESSING_INSTRUCTION" (Reserved); 4 - "CHARACTERS"; 5 - "COMMENT" (Reserved); 6 - "SPACE" (Reserved); 7 - "START_DOCUMENT"; 8 - "END_DOCUMENT"; 9 - "ENTITY_REFERENCE" (Reserved); 10 - "ATTRIBUTE"; 11 - "DTD" (Reserved); 12 - "CDATA" (Reserved); 13 - "NAMESPACE" (when namespace information is selected); 14 - "NOTATION_DECLARATION" (Reserved); 15 - "ENTITY_DECLARATION" (Reserved). | xml_data_type_numeric (Integer) |
XML data type (description) in output? / Fieldname | Adds the processed data type in text format to the specified field name. Use this option instead of the numeric data type when readability of the transformation matters. See the XML data type (numeric) description above for a list of values. Because text values can slow down string processing and consume extra memory, the numeric data type format is recommended for big data loads. | xml_data_type_description (String 25) |
XML location line in output? / Fieldname | Adds the processed source XML location line to the specified field name. | xml_location_line (Integer) |
XML location column in output? / Fieldname | Adds the processed source XML location column to the specified field name. | xml_location_column (Integer) |
XML element ID in output? / Fieldname | Adds the processed element number (starting with '0') to the specified field name. In contrast to adding the Row number, this field number is incremented by the count of each new element and not the row number. This numbering ensures that the nesting between levels is correct. | xml_element_id (Integer) |
XML parent element ID in output? / Fieldname | Adds the parent element number to the specified field name. When you use the XML element ID with the XML parent element ID, a complete XML element tree is available for later usage. | xml_parent_element_id (Integer) |
XML element level in output? / Fieldname | Adds the processed element level to the specified field name, starting with '0' for the root START_ and END_DOCUMENT. | xml_element_level (Integer) |
XML path in output? / Fieldname | Adds the processed XML path to the specified field name. | xml_path (String 1024) |
XML parent path in output? / Fieldname | Adds the processed XML parent path to the specified field name. | xml_parent_path (String 1024) |
XML data name in output? / Fieldname | Adds the processed data name of elements, attributes, and optional namespace prefixes to the specified field name. | xml_data_name (String 1024 or Default String Length) |
XML data value in output? / Fieldname | Adds the processed data value of elements, attributes and optional namespace URIs to the specified field name. | xml_data_value (String 1024 or Default String Length) |

If you need Set/Reset functionality, you can create it with the Modified Java Script Value scripting step or the User Defined Java Class step. The User Defined Java Class step is recommended because it is faster.
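
The table notes that the XML element ID and XML parent element ID fields together make a complete element tree available for later use. The following is a minimal Python sketch of that idea, run outside of PDI. It assumes the step's output rows have been exported as dictionaries keyed by the default field names. The sample rows are hypothetical and only illustrate the shape of the data, and the helpers `build_tree` and `render` are illustrative names, not part of Pentaho.

```python
from collections import defaultdict

# Numeric XML data type codes, as listed in the options table.
XML_DATA_TYPES = {
    0: "UNKNOWN", 1: "START_ELEMENT", 2: "END_ELEMENT",
    3: "PROCESSING_INSTRUCTION", 4: "CHARACTERS", 5: "COMMENT",
    6: "SPACE", 7: "START_DOCUMENT", 8: "END_DOCUMENT",
    9: "ENTITY_REFERENCE", 10: "ATTRIBUTE", 11: "DTD", 12: "CDATA",
    13: "NAMESPACE", 14: "NOTATION_DECLARATION", 15: "ENTITY_DECLARATION",
}


def build_tree(rows):
    """Return (names, children) built from START_ELEMENT rows only.

    names maps element id -> element name;
    children maps parent element id -> list of child element ids.
    """
    names = {}
    children = defaultdict(list)
    for row in rows:
        if XML_DATA_TYPES.get(row["xml_data_type_numeric"]) != "START_ELEMENT":
            continue  # attributes, text, and end events add no new tree node
        element_id = row["xml_element_id"]
        names[element_id] = row["xml_data_name"]
        children[row["xml_parent_element_id"]].append(element_id)
    return names, dict(children)


def render(names, children, element_id, depth=0):
    """Return the subtree under element_id as indented lines."""
    lines = ["  " * depth + names[element_id]]
    for child in children.get(element_id, []):
        lines.extend(render(names, children, child, depth + 1))
    return lines


def _row(data_type, element_id, parent_id, name):
    """Build one hypothetical output row using the default field names."""
    return {
        "xml_data_type_numeric": data_type,
        "xml_element_id": element_id,
        "xml_parent_element_id": parent_id,
        "xml_data_name": name,
    }


# Hypothetical rows for a small document with two <product> elements.
SAMPLE_ROWS = [
    _row(7, 0, None, None),        # START_DOCUMENT
    _row(1, 1, 0, "products"),     # START_ELEMENT
    _row(1, 2, 1, "product"),      # START_ELEMENT
    _row(10, 2, 1, "id"),          # ATTRIBUTE of the first product
    _row(2, 2, 1, "product"),      # END_ELEMENT
    _row(1, 3, 1, "product"),      # START_ELEMENT
    _row(2, 3, 1, "product"),      # END_ELEMENT
    _row(2, 1, 0, "products"),     # END_ELEMENT
    _row(8, 0, None, None),        # END_DOCUMENT
]

if __name__ == "__main__":
    names, children = build_tree(SAMPLE_ROWS)
    print("\n".join(render(names, children, 1)))
```

Filtering on the numeric data type rather than the text description follows the table's recommendation for large loads. The `XML_DATA_TYPES` lookup is there only to keep the sketch readable.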