SMB/CIFS data source

Administer Pentaho Data Catalog

Version: 10.0.x
Part Number: MK-95PDC002-00

Server Message Block (SMB) and Common Internet File System (CIFS) are Windows file-sharing protocols used in storage systems. You can add data to Data Catalog from a CIFS or SMB file share that is made available to the remote agent or local agent, and then create a CIFS or SMB data source that uses the local file system path.

These protocols use a client-server model in which the server provides the shared file system and the client mounts it to access the shared files as if they were on a local disk. You can add data to Data Catalog from any file-sharing network system, provided the data is transferable over SMB or CIFS.
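
The "as if they were on a local disk" behavior can be illustrated with a minimal Python sketch. It assumes the share has already been mounted on the agent host by the operating system. The function name `list_shared_files` and the mount point used in the usage note are hypothetical and are not part of Data Catalog.

```python
from pathlib import Path


def list_shared_files(mount_point: str) -> list[str]:
    """List files under a mounted share, relative to the mount point.

    Assumption: the SMB/CIFS share is already mounted at mount_point,
    so ordinary local file APIs can read it.
    """
    root = Path(mount_point)
    # rglob("*") walks the tree recursively; keep regular files only.
    return sorted(
        p.relative_to(root).as_posix()
        for p in root.rglob("*")
        if p.is_file()
    )
```

For example, `list_shared_files("/mnt/finance_share")` would return the relative paths of the files under that hypothetical mount point. A quick listing like this can help confirm that the path is readable before you create the data source.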

Perform the following steps to add SMB/CIFS as a data source:
  1. Click Management in the left toolbar.
    The Manage Your Environment page opens.
  2. In the Resources tile, click Add Data Source.
    The Create Data Source page opens.
  3. Specify the following basic information for the connection to your data source:
    • Data Source Name: Specify the name of your data source. This name is used in the Data Catalog interface. It should be something your Data Catalog users recognize.
      Note: Names must start with a letter and must contain only letters, digits, and underscores. White spaces in names are not supported.
    • Data Source ID (Optional): Specify a permanent identifier for your data source. If you leave this field blank, Data Catalog generates a permanent identifier for you.
      Note: You cannot modify the Data Source ID for this data source after you specify or generate it.
    • Description (Optional): Specify a description of your data source.
    • Data Source Type: Select the type of your data source. You are then prompted to specify additional connection information based on the file system or database type you are trying to access.
  4. After you have specified the basic information, specify the following additional connection information based on the file system or database type you are trying to access.
    • Affinity: This default setting specifies which agents should be associated with the data source in a multi-agent deployment.
    • Configuration method: The default is URI.
      • URI: URIs are used to identify and locate resources on the internet or within a network. For example, the URI would look like smb/cifs://server.example.com
      • Path: The path used to access the data source. For example, the path would look like smb/cifs://server:/path/to/resource

    After you have specified the detailed information according to your data source type, test the connection to the data source and add the data source.

  5. Click Test Connection to test your connection to the specified data source.
  6. (Optional) Enter a Note for any information you need to share with others who might access this data source.
  7. Click Create Data Source to establish your data source connection.
  8. Click Scan Files. This process loads the files and folders into the system.
    You can monitor the status of the file scan on the Workers page.
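
The naming rule for Data Source Name in step 3 can be checked before you fill in the form. The following Python sketch is illustrative only. It assumes that "letters" means ASCII letters, which the source does not state, and the helper name `is_valid_data_source_name` is hypothetical rather than part of any Data Catalog API.

```python
import re

# Assumption: "letters" means ASCII A-Z and a-z; Data Catalog itself
# may accept a wider set of characters.
_NAME_PATTERN = re.compile(r"[A-Za-z][A-Za-z0-9_]*")


def is_valid_data_source_name(name: str) -> bool:
    """Return True if name starts with a letter and contains only
    letters, digits, and underscores (so no white space)."""
    return _NAME_PATTERN.fullmatch(name) is not None
```

Under these assumptions, a name such as `Finance_Share_01` passes the check, while `finance share` (contains a space) and `1share` (starts with a digit) do not.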