Adding data sources is the first step in building your Data Catalog. Data sources are the building blocks of your catalog configuration. You can connect the data sources in your data lake, whether they are on premises or hosted in the cloud. As part of this step, you test the data source connections and ingest schemas. You should have already planned your data sources, as described in Planning your Data Catalog.
For the steps to add data sources, see the Administer Pentaho Data Catalog document.
- Test connections
- Before you can save newly configured data sources, you must test their connections. The test checks the data source configuration and connectivity, and returns information to help you troubleshoot if there is an issue.
- Ingest schemas
- Before you can save newly configured data sources, you must also load the basic database schemas and their associated metadata into Data Catalog.