Pentaho Data Catalog provides two kinds of built-in data identification methods: dictionaries and data patterns. A dictionary is a list of words that is used to create bitsets, HyperLogLogs (HLLs), and data patterns, which are then used to match column data through bitset matching. A pattern defines the data pattern, regular expression, column alias, and tags used to identify a data column. You can also import custom dictionary and data pattern configuration files that better suit your organization's specific needs.
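
To illustrate the difference between the two identification methods, here is a minimal Python sketch. It is not Pentaho Data Catalog code and does not reproduce its configuration file format. The dictionary contents, the pattern field names (`regex`, `column_alias`, `tags`), and both function names are hypothetical, and a plain Python `set` stands in for the bitset that the product builds from a dictionary.

```python
import re

# Hypothetical dictionary: a list of known words (not a Pentaho built-in).
# A Python set stands in for the bitset built from the word list.
COUNTRY_DICTIONARY = {"france", "germany", "japan", "brazil"}

# Hypothetical pattern definition; the field names are illustrative only.
EMAIL_PATTERN = {
    "regex": re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+"),
    "column_alias": "email",
    "tags": ["PII", "Email"],
}


def dictionary_match_ratio(values, dictionary):
    """Return the fraction of non-empty values found in the dictionary."""
    cleaned = [v.strip().lower() for v in values if v and v.strip()]
    if not cleaned:
        return 0.0
    hits = sum(1 for v in cleaned if v in dictionary)
    return hits / len(cleaned)


def pattern_match_ratio(values, pattern):
    """Return the fraction of non-empty values fully matching the regex."""
    cleaned = [v.strip() for v in values if v and v.strip()]
    if not cleaned:
        return 0.0
    hits = sum(1 for v in cleaned if pattern["regex"].fullmatch(v))
    return hits / len(cleaned)


if __name__ == "__main__":
    country_column = ["France", "Japan", "Atlantis", ""]
    email_column = ["a@b.com", "not-an-email"]
    print(dictionary_match_ratio(country_column, COUNTRY_DICTIONARY))
    print(pattern_match_ratio(email_column, EMAIL_PATTERN))
```

In this sketch, a column would be treated as a candidate match when its ratio exceeds a threshold of your choosing. The threshold, like everything else here, is an assumption made for illustration and not a documented product setting.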