Using Merge rows (diff) on the Pentaho engine

Pentaho Data Integration

Version: 9.3.x
Part Number: MK-95PDIA003-15

If you are running your transformation on the Pentaho engine, use the following instructions to set up the Merge rows (diff) step.

Important: When using the Pentaho transformation engine, the reference rows and compare rows must be sorted on the specified keys. If you sort the rows within the PDI transformation, for example with the Sort rows step, the Merge rows (diff) step works correctly. However, if the data is sorted outside of PDI, such as in a SQL query, you may run into issues because the external sort order may not match the internal case-sensitive/insensitive flag or may follow a different collation. If you are using the Merge rows (diff) step with the Spark engine, see Using Merge rows (diff) on the Spark engine.
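
To illustrate why the sort order matters, the following Python sketch runs a simplified single-pass diff over two key-sorted row lists. It is not PDI's implementation; the function name `merge_diff`, the sample rows, and the use of a case-insensitive sort to stand in for a database collation are all assumptions made for this example. The flag names follow the values the step is understood to emit (identical, changed, new, deleted).

```python
def merge_diff(reference, compare):
    """Single-pass diff of two (key, value) lists, each assumed sorted by key.

    Hypothetical helper for illustration only; keys are compared
    case-sensitively using Python's default string ordering.
    """
    result = []
    i = j = 0
    while i < len(reference) and j < len(compare):
        ref_key, ref_val = reference[i]
        cmp_key, cmp_val = compare[j]
        if ref_key == cmp_key:
            flag = "identical" if ref_val == cmp_val else "changed"
            result.append((ref_key, flag))
            i += 1
            j += 1
        elif ref_key < cmp_key:
            # Key exists only in the reference stream
            result.append((ref_key, "deleted"))
            i += 1
        else:
            # Key exists only in the compare stream
            result.append((cmp_key, "new"))
            j += 1
    # Whatever remains on either side has no counterpart
    result.extend((key, "deleted") for key, _ in reference[i:])
    result.extend((key, "new") for key, _ in compare[j:])
    return result


rows = [("apple", 1), ("Banana", 2), ("cherry", 3)]

# Case-insensitive order, as an external database collation might produce
externally_sorted = sorted(rows, key=lambda row: row[0].lower())
# Case-sensitive order, matching the comparison used inside merge_diff
internally_sorted = sorted(rows)

consistent = merge_diff(internally_sorted, internally_sorted)
mismatched = merge_diff(externally_sorted, internally_sorted)

print(consistent)
print(mismatched)
```

In this sketch, both inputs hold the same rows, yet the mismatched run reports the key "Banana" as both new and deleted, because the reference stream was ordered by a different rule than the one the comparison uses. Sorting both streams with the same rule, as the consistent run does, avoids the problem.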