How to tier a file system path to HCP via a repository

Ops Center Protector User Guide

Version: 7.7.x
Part Number: MK-99PRT002-08
Last edition: 2023-10-26
Note: Tiering repository data to HCP is only available when using a generation 1 repository node and a generation 1 HCP node.
Note: To tier data from a repository store to HCP, a new tiering data flow that uses a new, unpopulated repository store is required. Adding a tiering mover and HCP node to an existing repository data flow will not work.
It is assumed that the following tasks have been performed:
  • The Protector Master software has been installed and licensed on a dedicated node. See Installation Tasks and License Tasks.
  • The Protector Client software has been installed on the source node where the file system path resides.
  • The Protector Client software has been installed on the destination node where the Repository will reside.
  • An HCP generation 1 node has been set up as per the Protector requirements and prerequisites. Refer to Generation 1 Hitachi Content Platform prerequisites.
  • Permissions have been granted to enable the Protector UI, required activities and participating nodes to be accessed. In this example all nodes will be left in the default resource group, so there is no need to allocate nodes to user defined resource groups. Refer to How to configure basic role based access control.

This task describes how to tier data that resides on a file system to HCP. The files are first ingested by a repository, using batch mode backup, and then immediately moved from the repository to an associated namespace within a tenant on HCP. Once data is tiered to HCP, the repository retains only the metadata describing the backed-up file system. The data tiered to HCP can be located and restored by following the same workflow as that used for restoring repository data (see How to restore a repository snapshot of a file system path). The data flow and policy are as follows:

Figure. Tiering Data Flow

If you need frequent backups available in a local repository as well as long term backups on HCP, then implement a repo-to-repo data flow (see How to backup an onsite repository to an offsite repository) and tier to HCP from the second repository. The first repository will hold frequent local backups while the second manages long term retention on HCP.

Table. Path Backup Policy
Classification Type    Parameters     Value
Path                   Include        C:\testdata

Operation Type    Parameters     Value         Assigned Nodes
Backup            RPO            1 Week        Repository (this controls the retention of data on HCP)
                  Retention      5 Years
                  Run Options    Run on RPO
Tier              None           N/A           Hitachi Content Platform
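
As a rough illustration of what the RPO and Retention values in the table imply, the following sketch estimates how many backup snapshots are held once the retention period has been reached. It is not part of Protector. The function name is hypothetical, and the calculation assumes that each RPO interval produces exactly one snapshot and that a year is 365 days. Actual snapshot counts may differ.

```python
from datetime import timedelta


def steady_state_snapshots(rpo: timedelta, retention: timedelta) -> int:
    """Approximate number of snapshots retained at steady state.

    Hypothetical helper: assumes one snapshot per RPO interval and that
    snapshots are removed as soon as they exceed the retention period.
    """
    if rpo <= timedelta(0):
        raise ValueError("rpo must be positive")
    # Dividing one timedelta by another with // yields a whole number.
    return retention // rpo


# Values taken from the policy table: RPO 1 Week, Retention 5 Years.
rpo = timedelta(weeks=1)
retention = timedelta(days=5 * 365)  # assumption: 365-day years

print(steady_state_snapshots(rpo, retention))
```

Under these assumptions the policy holds roughly 260 weekly snapshots once it has been running for five years.
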
  1. Locate the source node in the Nodes Inventory and check that it is authorized and online. This node is where the production data to be backed up resides.
    For a file system backup using a Path classification, a basic OS Host node is required. It is not necessary to create the source node in this case since all Protector client nodes default to this type when installed. See How to authorize a node.
  2. Locate the intermediate node in the Nodes Inventory and check that it is authorized and online.
    This node is where the repository will be hosted and is identified as the Proxy Node when creating the repository in the next step.
  3. Create a new generation 1 Repository node using the Repository Storage Node Wizard and check that it is authorized and online.
    The Repository node type is grouped under Storage in the Node Type Wizard. You can direct data from multiple nodes to a single repository so there is no need to create a new repository if a suitable one already exists. See How to add a node, and How to authorize a node.
  4. Create a new generation 1 Hitachi Content Platform node using the Hitachi Content Platform Storage Node Wizard and check that it is authorized and online.
    The Hitachi Content Platform node type is grouped under Storage in the Node Type Wizard. You can direct data from multiple repository stores to a single HCP node (each repository store maps to a separate namespace within the HCP tenant), so there is no need to create a new HCP node if a suitable one already exists.
    Note: Each HCP node in Protector represents a single tenant, so if you need to strictly segregate repository data then create separate HCP nodes for each tenant and consider placing them in separate RBAC resource groups.
    See How to add a node, and How to authorize a node.
  5. Define a policy as shown in the table above using the Policy Wizard, Path Classification Wizard, Backup Operation Wizard and Tier Operation Wizard.
    The Path classification is grouped under Physical in the Policy Wizard. See How to create a policy.
  6. Draw a data flow as shown in the figure above, using the Data Flow Wizard, that shows the OS Host source node connected to the Repository intermediate node via a Batch mover, then to the Hitachi Content Platform destination node via a second Batch mover.
  7. Assign the Path-Backup-Tier policy to the OS Host source node, the Backup operation to the Repository node and the Tier operation to the HCP node on the data flow.
    Select the Standard Store Template when assigning the operation to the repository. There is no value in selecting a template that performs source-side or repository-side deduplication in this situation. See How to apply a policy to nodes on a data flow.
  8. Compile and activate the data flow, checking carefully that there are no errors or warnings.
    See How to activate a data flow.
    Note: Do not deactivate tiering data flows unless they are no longer required. Subsequent reactivation will force the source and repository to undergo resynchronization, leading to all files being re-tiered to HCP.
  9. Locate the active data flow in the Monitor Inventory and open its Monitor Details.
    The policy will be invoked automatically to create an initial backup and then repeatedly according to the RPO specified in the policy. The policy can also be manually triggered from the source node in the monitor data flow. See How to trigger an operation from an active data flow.
  10. Watch the active data flow via the Monitor Details to ensure the policy is operating as expected.
    For a healthy data flow you may periodically see:
    • An animated resynchronization icon appear above the batch mover into the repository each time the RPO is reached.
    • An animated tiering icon appear above the batch mover into the HCP node each time the repository tiers data.
    • Repository Statistics - Queues - Tier values changing, indicating objects queued and actively being tiered to HCP.
      Tip: Check the tier queue if RPO is not being met for tiered data flows.
    • Transient Node Status icons appear over nodes and associated information messages displayed to the right of the data flow area.
    • Network/Cache Utilization fluctuations within normal limits if large amounts of data are being backed up to the repository.
    • Backup jobs appearing in the Jobs area below the data flow that cycle through stages, ending in Progress - Completed. Note that there is no Tiering job since this process takes place on an ad hoc basis.
    • Information messages appearing in the Logs area below the data flow indicating rules activation, HCP namespace creation, resynchronization and ingestion throttling events.
    For a problematic data flow you may see:
    • Permanent Node Status icons appear over nodes and associated warning messages displayed to the right of the data flow area.
    • Local/Remote Memory Cache constantly at excessively high levels if large amounts of data are being backed up, indicating data transfer issues.
    • Backup jobs appearing in the Jobs area below the data flow that cycle through stages and terminating in Progress - Failed.
    • Warning and error messages appearing in the Logs area below the data flow indicating failed events.
  11. Review the status of the Repository via the relevant Generation 1 Repository Details and the stores via the relevant Gen1 Repository Store Details, to ensure backup snapshots are being created. Also monitor the health of HCP via its Tenant Management Console, especially Namespaces - Usage.
    The UUID of the repository store (used to name the corresponding HCP namespace) can be found in the Gen1 Repository Store Details.

    Repositories require ongoing surveillance to ensure that they are operating correctly and sufficient resources are available to store your data securely. See How to view the status of a repository. A repository store that has been completely tiered to HCP will report a zero size.

    The space that HCP reports for a tiered repository may appear much larger than that reported by the source node's file system. HCP allocates a minimum of 8KB per object, and each file requires at least 2 HCP objects (one per stream), so when many small files are tiered the reported HCP usage will be larger than expected. In addition, HCP creates 2 or more replicas depending on the DPL setting.

    New snapshots will appear in the Gen1 Repository Store Details periodically as dictated by the RPO of the policy. Old snapshots will be removed periodically as dictated by the Retention Period of the policy. The retention period of individual snapshots can be modified here if required.
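
The note in step 11 about HCP usage appearing larger than the source file system can be illustrated with a small estimate. This is a sketch only, not a Protector or HCP tool. The function name is hypothetical. It assumes that a file's data occupies one object of at least 8KB, that the second object per file occupies exactly the 8KB minimum, and that the number of stored copies equals the DPL value. Real HCP accounting may differ.

```python
ALLOCATION_BYTES = 8 * 1024  # minimum allocation per HCP object (from the text)


def estimate_hcp_bytes(file_sizes, dpl=2):
    """Rough estimate of HCP space consumed by tiering the given files.

    Assumptions (not taken from HCP documentation):
      - the data object for a file uses max(file size, 8KB)
      - the second object for a file uses exactly the 8KB minimum
      - HCP stores `dpl` copies of every object
    """
    if dpl < 1:
        raise ValueError("dpl must be at least 1")
    per_copy = sum(
        max(size, ALLOCATION_BYTES) + ALLOCATION_BYTES for size in file_sizes
    )
    return per_copy * dpl


# Example: 10,000 files of 1 KiB each, tiered with DPL 2.
files = [1024] * 10_000
source_bytes = sum(files)
hcp_bytes = estimate_hcp_bytes(files, dpl=2)

print(f"source file system: {source_bytes} bytes")
print(f"estimated HCP usage: {hcp_bytes} bytes")
print(f"ratio: {hcp_bytes / source_bytes:.1f}x")
```

With these assumptions, small files dominate the overhead: each 1 KiB file accounts for 16KB per copy, so the estimated HCP usage is many times the source size. For large files the per-object minimum becomes negligible and the ratio approaches the DPL value.
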