Local File System data source

Administer Pentaho Data Catalog

Version
10.0.x
Part Number
MK-95PDC002-00
You can add data to Data Catalog from your local file system by adding Local File System as a data source.
To make files on your local system available to Data Catalog, mount them into the ws_default container by editing the vendor/docker-compose.yml file:
  1. Open the vendor/docker-compose.yml file and add the following lines under the ws_default service. The volume entry maps a path on the host (left of the colon) to a path inside the container (right of the colon).
    services:
      ws_default:
        volumes:
          - /my/path/to/file:/tmp/my-path

    You can also mount a remote file share and use it as a Local File System. For example, the following snippet adds a CIFS share named cifs-share.

    services:
      ws_default:
        volumes:
          # Optional: add a CIFS share to the Local File System
          - cifs-share:/cifs-share  # Remote file share
    volumes:
      cifs-share:
        driver_opts:
          type: cifs
          o: "username=<user1>,password=<password>,file_mode=0777,dir_mode=0777"
          device: "<IP Address>"
    
  2. Save changes.
  3. Restart the ws_default container for the changes to take effect.
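
Before restarting, it can help to confirm that the host path used in the volume entry exists and is readable. With the short volume syntax, Docker Compose typically creates a missing host directory instead of reporting an error, which leaves Data Catalog with an empty mount. The following Python sketch is one way to run that check on the host. The function name check_mount_source is hypothetical and not part of Data Catalog, and /my/path/to/file is the placeholder path from step 1.

```python
import os


def check_mount_source(host_path):
    """Return a list of problems found with a bind-mount source path."""
    problems = []
    if not os.path.isabs(host_path):
        problems.append("path is not absolute")
    if not os.path.exists(host_path):
        problems.append("path does not exist on the host")
    elif not os.access(host_path, os.R_OK):
        problems.append("path is not readable by the current user")
    return problems


if __name__ == "__main__":
    # Placeholder host path from the volume entry in step 1 (an assumption).
    for problem in check_mount_source("/my/path/to/file"):
        print("Problem:", problem)
```

This only checks the path from the point of view of the user who runs the script. The user inside the ws_default container may have different permissions, so treat a clean result as a first check rather than a guarantee.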

Perform the following steps to identify your data source within Data Catalog:
  1. Click Management in the left toolbar.
    The Manage Your Environment page opens.
  2. In the Resources tile, click Add Data Source.
    The Create Data Source page opens.
  3. Specify the following basic information for the connection to your data source:
    Field: Description
    Data Source Name: Specify the name of your data source. This name is used in the Data Catalog interface. It should be something your Data Catalog users recognize.
      Note: Names must start with a letter, and must contain only letters, digits, and underscores. White spaces in names are not supported.
    Data Source ID: (Optional) Specify a permanent identifier for your data source. If you leave this field blank, Data Catalog generates a permanent identifier for you.
      Note: You cannot modify Data Source ID for this data source after you specify or generate it.
    Description: (Optional) Specify a description of your data source.
    Data Source Type: Select the type of your data source (for this procedure, Local File System). You are then prompted to specify additional connection information based on the file system or database type you are trying to access.
  4. In the Path field, specify the path to your local file system.
  5. Click Test Connection to test your connection to the specified data source.
  6. (Optional) Enter a Note for any information you need to share with others who might access this data source.
  7. Click Create Data Source to establish your data source connection.
  8. Click Scan Files to load the files and folders from the data source into Data Catalog.
    You can monitor the status of the file scan on the Workers page.
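
The naming rule for Data Source Name in step 3 (start with a letter, then only letters, digits, and underscores) can be checked before you fill in the form. The following Python sketch is a minimal illustration of that rule. The function name is_valid_data_source_name is hypothetical, and restricting names to ASCII letters is an assumption, since the rule does not say whether other alphabets are accepted.

```python
import re

# Rule from the note in step 3: start with a letter, then only letters,
# digits, and underscores. Limiting to ASCII letters is an assumption.
NAME_PATTERN = re.compile(r"[A-Za-z][A-Za-z0-9_]*")


def is_valid_data_source_name(name):
    """Return True if name satisfies the documented naming rule."""
    return NAME_PATTERN.fullmatch(name) is not None


if __name__ == "__main__":
    for candidate in ["local_files_01", "1st_source", "my source"]:
        print(candidate, is_valid_data_source_name(candidate))
```

Of the three sample names, only local_files_01 passes: 1st_source starts with a digit and "my source" contains a space.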