Processing structured data

Use Pentaho Data Catalog

Version
10.0.x
Part Number
MK-95PDC000-00

To process structured data, you must perform Metadata Ingest, Data Profiling, and Data Identification.

Perform the following steps:
  1. Select the structured resource you want to investigate in Data Canvas.
    This can be a table or column.
  2. Click Process.
    The Choose Process pane opens with Metadata Ingest, Data Profiling, and Data Identification options.
  3. In the Metadata Ingest tile, click Start to begin the metadata ingest process.
    You can view the status of metadata ingest on the Manage Workers page.
  4. To perform data profiling, click the Data Profiling tile.
    The Profiling page opens with an option to configure data profiling. You can use Skip Recent (days) to skip profiling for recently profiled tables. For example, if the days field is set to 7, any table profiled within the last 7 days will be skipped.
    Note: When configuring data profiling, use the default settings where possible; they are suitable for most situations.
  5. To perform data identification, click the Data Identification tile.
    Note: You must perform data profiling before proceeding with data identification.
    If data profiling has not been performed, Data Catalog highlights it as Required. You can start data profiling from the Data Identification pane by clicking Start.
  6. Click Select Methods, select Dictionaries and Patterns, click Apply, and then click Start.
    You can view the status of data identification on the Manage Workers page.
  7. Go to Data Canvas to view tags.
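
The Skip Recent (days) setting described in step 4 can be illustrated with a short sketch. This is not Data Catalog code or its API; the function name `tables_to_profile`, the table names, and the dates are hypothetical, and treating never-profiled tables as always selected is an assumption. It only shows the rule stated above: a table profiled within the last N days is skipped.

```python
from datetime import datetime, timedelta


def tables_to_profile(last_profiled, skip_recent_days, now):
    """Return the names of tables that would still be profiled.

    last_profiled maps a table name to the datetime of its last profiling
    run, or None if it has never been profiled (hypothetical structure).
    """
    # Anything profiled on or after this moment counts as "recent".
    cutoff = now - timedelta(days=skip_recent_days)
    selected = []
    for name, profiled_at in last_profiled.items():
        # Assumption: never-profiled tables are always selected.
        if profiled_at is None or profiled_at < cutoff:
            selected.append(name)
    return selected


now = datetime(2024, 6, 15)
history = {
    "customers": datetime(2024, 6, 12),  # 3 days ago: skipped
    "orders": datetime(2024, 5, 30),     # 16 days ago: profiled
    "invoices": None,                    # never profiled: profiled
}
print(tables_to_profile(history, 7, now))  # ['orders', 'invoices']
```

With the days field set to 7, only `orders` and `invoices` are selected; setting it to 0 in this sketch selects every table.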