Dictionaries and data patterns

Get started with Pentaho Data Catalog

Version: 10.1.x
Part Number: MK-95PDC001-02

Data identification uses two data discovery methods: dictionaries and data pattern analysis. Data Catalog installs a set of preconfigured dictionaries and patterns, and you can define custom dictionaries and patterns when your requirements call for them.

Dictionaries
Dictionaries are word or term lists used to create bitsets and data patterns that you can then use to match column data.
Data patterns
You can use data patterns for a variety of purposes, such as regular expression (RegEx) generation, data identification, and data quality checking.
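
The two discovery methods described above can be illustrated with a minimal Python sketch. It is not Data Catalog's implementation or API. The term list, the function names, and the "a"/"n" shape notation are all hypothetical, chosen only to show the general idea of matching column values against a dictionary and deriving a regular expression from a value's shape.

```python
import re

# Hypothetical term list; not a dictionary shipped with Data Catalog.
COUNTRY_DICTIONARY = {"france", "germany", "japan"}


def dictionary_match_ratio(values, dictionary):
    """Return the fraction of non-empty values that appear in the term list."""
    cleaned = [v.strip().lower() for v in values if v and v.strip()]
    if not cleaned:
        return 0.0
    hits = sum(1 for v in cleaned if v in dictionary)
    return hits / len(cleaned)


def to_pattern(value):
    """Reduce a value to a shape string (assumed notation for this sketch):
    ASCII letters become 'a', ASCII digits become 'n', and every other
    character is kept as-is."""
    shape = []
    for ch in value:
        if ch.isascii() and ch.isalpha():
            shape.append("a")
        elif ch.isascii() and ch.isdigit():
            shape.append("n")
        else:
            shape.append(ch)
    return "".join(shape)


def pattern_to_regex(pattern):
    """Turn a shape string produced by to_pattern into an anchored regex."""
    parts = []
    for ch in pattern:
        if ch == "a":
            parts.append("[A-Za-z]")
        elif ch == "n":
            parts.append("[0-9]")
        else:
            parts.append(re.escape(ch))
    return "^" + "".join(parts) + "$"
```

For example, `to_pattern("AB-1234")` returns `"aa-nnnn"`, and the regular expression generated from that shape matches other values of the same form, such as `"XY-9876"`. A data quality check in this style could flag column values whose shape differs from the dominant one.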