Content tab

Pentaho Data Integration

Version
9.3.x
Part Number
MK-95PDIA003-15


Content tab in Regex Evaluation

The Content tab contains the following options:

Option: Description
Ignore differences in Unicode encodings: Select to ignore differences between Unicode character encodings. This option may improve performance, but your data can contain only US-ASCII characters.
Enables case-insensitive matching: Select to use case-insensitive matching. By default, only characters in the US-ASCII charset are matched case-insensitively. To enable Unicode-aware case-insensitive matching, select the 'Unicode-aware case...' flag together with this flag.

The execution flag is (?i).

Permit whitespace and comments in pattern: Select to ignore whitespace in the pattern, as well as embedded comments that start with # and run through the end of the line. In this mode, you must use the \s token to match whitespace. If this option is not enabled, whitespace characters in the regular expression are matched as-is.

The execution flag is (?x).

Enable dotall mode: Select to make the dot (.) expression match any character, including line terminators. By default, the dot does not match line terminators.

The execution flag is (?s).

Enable multiline mode: Select to make '^' and '$' match at the start and end of each line of the input sequence. By default, these expressions only match at the beginning and the end of the entire input sequence.

The execution flag is (?m).

Enable Unicode-aware case folding: Select this option together with the Enables case-insensitive matching option to perform case-insensitive matching consistent with the Unicode standard.

The execution flag is (?u).

Enables Unix lines mode: Select to recognize only the '\n' line terminator in the behavior of '.', '^', and '$'.

The execution flag is (?d).
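
The execution flags listed above can also be tried outside PDI. The following is a minimal sketch, assuming the step's patterns follow the java.util.regex syntax (PDI is Java-based, and the flags above match Java's embedded flag expressions). The RegexFlagDemo class and its found helper are hypothetical names used only for illustration and are not part of PDI.

```java
import java.util.regex.Pattern;

// Hypothetical demo class, not part of PDI: shows the effect of each execution flag.
class RegexFlagDemo {

    // Returns true when the regex matches somewhere in the input.
    static boolean found(String regex, String input) {
        return Pattern.compile(regex).matcher(input).find();
    }

    public static void main(String[] args) {
        // (?i): case-insensitive matching for US-ASCII letters.
        System.out.println(found("(?i)pentaho", "PENTAHO"));             // true
        // (?i) alone does not fold non-ASCII letters; adding (?u) does.
        System.out.println(found("(?i)\u00e9", "\u00c9"));               // false
        System.out.println(found("(?iu)\u00e9", "\u00c9"));              // true
        // (?x): whitespace and # comments inside the pattern are ignored.
        System.out.println(found("(?x) a b c  # three letters", "abc")); // true
        // (?s): the dot also matches line terminators.
        System.out.println(found("a.b", "a\nb"));                        // false
        System.out.println(found("(?s)a.b", "a\nb"));                    // true
        // (?m): ^ and $ also match at line boundaries.
        System.out.println(found("^second$", "first\nsecond"));          // false
        System.out.println(found("(?m)^second$", "first\nsecond"));      // true
        // (?d): only \n counts as a line terminator, so the dot matches \r.
        System.out.println(found("a.b", "a\rb"));                        // false
        System.out.println(found("(?d)a.b", "a\rb"));                    // true
    }
}
```

Flags can be combined in one group, as in (?iu), so typing them at the start of the regular expression should behave the same as selecting the corresponding options on the Content tab, under the assumption stated above.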