These steps require VFS connections.
To use the Read Metadata or Write Metadata steps:
- Set up a VFS connection to a stand-alone instance of Data Catalog and provide your role access credentials. For more information, see Access to Pentaho Data Catalog.
To use the Catalog Input and Catalog Output steps:
- Set up a VFS connection to a stand-alone instance of Data Catalog and provide your role access credentials. For more information, see Access to Pentaho Data Catalog.
- To access S3 storage, configure S3 as the Default S3 Connection in VFS Connections. For details, see Connecting to Virtual File Systems.
- Establish a PDI connection to each cluster you plan to use. For example, to access HDFS, a Hadoop driver must be configured as a named connection for your distribution. For information on named connections, see Connecting to a Hadoop cluster with the PDI client.