View dictionaries and data patterns

Administer Pentaho Data Catalog

Version
10.0.x
Part Number
MK-95PDC002-00

In Data Catalog, perform the following steps to view the dictionaries and data patterns that are available as data identification methods.

  1. Click Management in the left navigation menu.
    The Manage Your Environment page opens.
  2. Click View Methods.
    You can view the list of dictionaries under the Dictionaries tab and the list of data patterns under the Patterns tab.
  3. Locate the dictionary or data pattern you want to view in the table and select the View Details button (>) in its row.
    For dictionaries, click the Rules tab to view the JSON file details.

    The Rules tab shows the logic the dictionary uses to apply the tags listed in the JSON file, such as conditions and confidence scores. Based on these factors, you can apply dictionaries or patterns to a dataset.

    For example, in the following JSON file, the dictionary rule specifies that the type is "Dictionary". The confidence score is the weighted sum of "similarity" (weight 0.9) and "metadataScore" (weight 0.1). The condition is met when the confidence score is greater than or equal to 0.7 and the column cardinality is greater than or equal to 1. When the condition is met, the action applies the tag "General" to the dataset. This shows how the rule logic determines which tags are applied to a dataset.

    [ 
        { 
            "__typename": "dictionariesRules", 
            "type": "Dictionary", 
            "minSamples": 200, 
            "confidenceScore": { 
                "+": [ 
                    { 
                        "*": [ 
                            { 
                                "var": "similarity" 
                            }, 
                            0.9 
                        ] 
                    }, 
                    { 
                        "*": [ 
                            { 
                                "var": "metadataScore" 
                            }, 
                            0.1 
                        ] 
                    } 
                ] 
            }, 
            "condition": { 
                "and": [ 
                    { 
                        ">=": [ 
                            { 
                                "var": "confidenceScore" 
                            }, 
                            "0.7" 
                        ] 
                    }, 
                    { 
                        ">=": [ 
                            { 
                                "var": "columnCardinality" 
                            }, 
                            "1" 
                        ] 
                    } 
                ] 
            }, 
            "actions": [ 
                { 
                    "applyTags": [ 
                        { 
                            "k": "General" 
                        } 
                    ] 
                } 
            ] 
        } 
    ]
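
To make the rule logic concrete, the following Python sketch evaluates a trimmed copy of the rule above against some made-up column statistics. It is an illustration only, not Data Catalog's actual rule engine: the helper names `evaluate` and `tags_for` are hypothetical, the input values are invented, and the sketch assumes the operators ("+", "*", ">=", "and", "var") behave the way their names suggest. The "minSamples" field is left out because this page does not describe how it is used.

```python
import json

# Trimmed copy of the rule shown above (assumption: only these fields
# matter for deciding whether the tag is applied).
RULE = json.loads("""
{
  "confidenceScore": {"+": [{"*": [{"var": "similarity"}, 0.9]},
                            {"*": [{"var": "metadataScore"}, 0.1]}]},
  "condition": {"and": [{">=": [{"var": "confidenceScore"}, "0.7"]},
                        {">=": [{"var": "columnCardinality"}, "1"]}]},
  "actions": [{"applyTags": [{"k": "General"}]}]
}
""")


def evaluate(expr, variables):
    """Evaluate the small set of operators that appear in the rule."""
    if isinstance(expr, dict):
        (op, args), = expr.items()  # each node has exactly one operator key
        if op == "var":
            return variables[args]
        values = [evaluate(arg, variables) for arg in args]
        if op == "+":
            return sum(float(v) for v in values)
        if op == "*":
            product = 1.0
            for v in values:
                product *= float(v)
            return product
        if op == ">=":
            # Thresholds are stored as strings ("0.7", "1") in the rule.
            return float(values[0]) >= float(values[1])
        if op == "and":
            return all(values)
        raise ValueError(f"unsupported operator: {op}")
    return expr  # literal number or string


def tags_for(rule, column_stats):
    """Return the tags the rule would apply for one column's statistics."""
    variables = dict(column_stats)
    variables["confidenceScore"] = evaluate(rule["confidenceScore"], variables)
    if not evaluate(rule["condition"], variables):
        return []
    return [tag["k"]
            for action in rule["actions"]
            for tag in action.get("applyTags", [])]


# Hypothetical inputs: 0.8 * 0.9 + 0.5 * 0.1 = 0.77, which passes the 0.7 threshold.
print(tags_for(RULE, {"similarity": 0.8, "metadataScore": 0.5, "columnCardinality": 12}))
# Hypothetical inputs: 0.6 * 0.9 + 1.0 * 0.1 = 0.64, which does not.
print(tags_for(RULE, {"similarity": 0.6, "metadataScore": 1.0, "columnCardinality": 12}))
```

Because "similarity" carries a weight of 0.9, a perfect "metadataScore" alone contributes at most 0.1 and cannot reach the 0.7 threshold, so in this rule the tag depends mainly on the similarity value.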