This section describes settings for fault tolerance of user data and management functions in a Single-AZ configuration.
The following table shows combinations of settings that must be specified to create a plus-1 or plus-2 redundant configuration. Settings other than those listed in the table are not allowed.
If a failure occurs in the storage cluster while the write back mode with cache protection is disabled, data on the snapshot volume might be lost. When the write back mode with cache protection is enabled, data on the snapshot volume is protected.
However, even when the write back mode with cache protection is enabled, data on the snapshot volume is not protected if the number of failures exceeds the redundancy of the storage controller.
| Configuration | User data protection method | Redundancy of the storage controller¹ | Number of cluster master nodes | Number of fault domains |
|---|---|---|---|---|
| 1. Plus-1 redundant configuration² | Mirroring Duplication | OneRedundantStorageNode (degree = 2) | 3 nodes | 1 |
| 2. Plus-2 redundant configuration² | HPEC 4D+2P | TwoRedundantStorageNodes (degree = 3) | 5 nodes | 1 |
1. Do not explicitly specify the degree of redundancy of the storage controller, because it is automatically determined depending on the user data protection method.
2. In the following cases, only one failure is assumed to have occurred, irrespective of the number of failures:
   - One or more drive failures occurred on a faulty storage node.
   - Drive failures occurred on a single storage node.
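As a minimal sketch of this constraint, the following Python snippet encodes the two allowed combinations from the table and rejects any other mix of settings. The dictionary keys and values are descriptive labels chosen for this illustration; they are not VSP One SDS Block API parameter names.

```python
# Illustrative only: the labels below mirror the table above,
# not actual VSP One SDS Block API parameter names.
ALLOWED_CONFIGURATIONS = {
    "plus-1": {
        "user_data_protection": "Mirroring Duplication",
        "storage_controller_redundancy": "OneRedundantStorageNode (degree = 2)",
        "cluster_master_nodes": 3,
        "fault_domains": 1,
    },
    "plus-2": {
        "user_data_protection": "HPEC 4D+2P",
        "storage_controller_redundancy": "TwoRedundantStorageNodes (degree = 3)",
        "cluster_master_nodes": 5,
        "fault_domains": 1,
    },
}

def validate(settings: dict) -> str:
    """Return the matching redundant configuration, or raise if not allowed."""
    for name, allowed in ALLOWED_CONFIGURATIONS.items():
        if settings == allowed:
            return name
    raise ValueError("Settings other than the combinations in the table are not allowed.")
```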
The following provides an overview, features, and notes for each function.
User data protection methods
VSP One SDS Block supports HPEC (Hitachi Polyphase Erasure Coding) and Mirroring to protect user data. HPEC is a Hitachi proprietary data protection method developed for SDS systems in which the network bandwidth between storage nodes is low; it stores user data on a local drive. Mirroring is a data protection method that stores a copy of user data on another storage node.
Configure the protection method by selecting 4D+2P for HPEC or Duplication for Mirroring. Note, however, that the selected option might restrict the combination of functions that can be used.
- HPEC 4D+2P (4 data + 2 parity): Select this method if the number of failures allowed is the priority. The number of storage node or drive failures allowed is two.
- Mirroring Duplication (1 data + 1 copy data): Select this method if performance is the priority. The Mirroring method is superior to the HPEC method in fault tolerance against storage node or drive failures, as well as in performance during normal operation. The number of storage node or drive failures allowed is one.
For HPEC 4D+2P
- User data and its parities are stored on six or more different storage nodes for redundancy.
- At least six storage nodes are required.
- You can use a maximum of 50% to 65% of the physical capacity. However, if the rebuild capacity policy (rebuildCapacityPolicy) is set to "Fixed" (default), you can use a maximum of 50% to 65% of the physical capacity excluding the rebuild capacity on each storage node (a rough capacity calculation sketch follows the Mirroring Duplication notes below). For details about the rebuild capacity, see Rebuild capacity of a storage pool in the VSP One SDS Block System Administrator Operation Guide.
- The number of storage node or drive failures allowed is two. This number is the sum of the number of defective storage nodes and the number of defective drives. However, the failures are counted as one failure in the following cases (a counting sketch follows these HPEC notes):
  - One or more drive failures occurred on a faulty storage node.
  - Drive failures occurred on a single storage node.
The HPEC 4D+2P mechanism works as follows:
- (A) Data is stored locally, reducing network communication during reads.
- (B) Primary coding: coding reduces the data volume required for two degrees of redundancy.
- (C) Secondary coding: the data storage capacity is reduced to achieve capacity efficiency equivalent to EC (Erasure Coding).
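The failure-counting rule above can be sketched as follows: the count is the sum of defective storage nodes and defective drives, except that drive failures confined to a single storage node, or occurring on a node that is itself faulty, count as one failure. This is a minimal illustrative interpretation of the rule, not product code; the function and node names are hypothetical.

```python
def count_failures(faulty_nodes: set[str], faulty_drives: dict[str, int]) -> int:
    """Count failures as interpreted from the rule above (illustrative only).

    faulty_nodes:  storage nodes that have failed.
    faulty_drives: mapping of storage node -> number of failed drives on that node.
    """
    count = len(faulty_nodes)
    for node, drives in faulty_drives.items():
        if drives == 0:
            continue
        # Drive failures on a node that is itself faulty add nothing extra;
        # any number of drive failures on a single node counts as one failure.
        if node not in faulty_nodes:
            count += 1
    return count

# Example: two drives failed on node "SN03" and node "SN05" itself failed.
# This is counted as 2 failures, which HPEC 4D+2P can still tolerate.
print(count_failures({"SN05"}, {"SN03": 2}))  # 2
```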
For Mirroring Duplication
- User data and its copies are stored on two different storage nodes for redundancy.
- At least three storage nodes are required.
- You can use a maximum of 40% to 48% of the physical capacity. However, if the rebuild capacity policy (rebuildCapacityPolicy) is set to "Fixed" (default), you can use a maximum of 40% to 48% of the physical capacity excluding the rebuild capacity on each storage node (a rough capacity calculation sketch follows these notes). For details about the rebuild capacity, see Rebuild capacity of a storage pool in the VSP One SDS Block System Administrator Operation Guide.
- The read performance of this method is equivalent to that of the HPEC 4D+2P method, but the write performance is superior. Fault tolerance against storage node or drive failures is also superior to the HPEC method.
- The allowable number of defective storage nodes or drives is one. This number is the sum of the number of defective storage nodes and the number of defective drives. However, the failures are counted as one failure in the following cases:
  - One or more drive failures occurred on a faulty storage node.
  - Drive failures occurred on a single storage node.
  However, two or more failures can be tolerated except in the following cases:
  - Condition 1: Storage node or drive failures occur on both storage nodes that belong to redundant storage controllers. For details about storage controllers, see Redundancy of the storage controller.
    CAUTION: Failures might not be tolerated even when Condition 1 is not met in the following cases:
    - After storage nodes are added, until drive data relocation is completed
  - Condition 2: Failures occur on two or more cluster master nodes. For details about cluster master nodes, see Redundancy of the cluster master node.
For how to design the capacity for the HPEC 4D+2P or Mirroring Duplication method, see Capacity Design (for HPEC 4D+2P) or Capacity design (for Mirroring) in the VSP One SDS Block System Administrator Operation Guide.
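As a rough illustration of the usable-capacity figures above (50% to 65% of physical capacity for HPEC 4D+2P, 40% to 48% for Mirroring Duplication, excluding the rebuild capacity when rebuildCapacityPolicy is "Fixed"), the following sketch estimates the usable range. The function and its inputs are illustrative assumptions; actual sizing should follow the capacity design chapters referenced above.

```python
def usable_capacity_range(physical_tib: float,
                          rebuild_capacity_tib: float,
                          protection: str) -> tuple[float, float]:
    """Estimate the usable capacity range (TiB) from the percentages quoted above.

    Illustrative only: real sizing must follow the capacity design guides.
    rebuild_capacity_tib is the capacity reserved for rebuilds when
    rebuildCapacityPolicy is "Fixed" (the default).
    """
    ratios = {
        "HPEC 4D+2P": (0.50, 0.65),
        "Mirroring Duplication": (0.40, 0.48),
    }
    low, high = ratios[protection]
    base = physical_tib - rebuild_capacity_tib  # capacity excluding rebuild capacity
    return base * low, base * high

# Example: 600 TiB of physical capacity with 60 TiB reserved for rebuilds.
print(usable_capacity_range(600.0, 60.0, "HPEC 4D+2P"))             # (270.0, 351.0)
print(usable_capacity_range(600.0, 60.0, "Mirroring Duplication"))  # (216.0, 259.2)
```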
Redundancy of the storage controller
Storage controllers are part of VSP One SDS Block processes that manage storage node capacities and volumes.
There are as many storage controllers as storage nodes. Each storage controller manages the capacity and volumes of one storage node and, for redundancy, can also manage one or more other storage nodes in case a storage node failure occurs.
The following two settings determine the degree of redundancy of the storage controller. Do not explicitly specify the degree of redundancy, because it is automatically determined depending on the user data protection method.
OneRedundantStorageNode (degree = 2)
The system can continue to operate if a maximum of one storage node becomes faulty.
The following example shows how storage controllers are assigned to storage nodes when OneRedundantStorageNode is selected:
TwoRedundantStorageNodes (degree = 3)
The system can continue to operate if a maximum of two storage nodes become faulty.
The following example shows how storage controllers are assigned to storage nodes when TwoRedundantStorageNodes is selected:
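As a simplified sketch of the idea behind both assignment examples, the following snippet maps each storage controller to the storage node it is active on plus one standby node (degree = 2) or two standby nodes (degree = 3) in round-robin order. The real assignment is determined automatically by VSP One SDS Block, so this only illustrates the concept; the controller and node names are hypothetical.

```python
def assign_storage_controllers(nodes: list[str], degree: int) -> dict[str, list[str]]:
    """Map each storage controller to the nodes it runs on (active node first).

    Simplified round-robin illustration; the real assignment is decided
    automatically by the storage cluster.
    degree = 2 corresponds to OneRedundantStorageNode,
    degree = 3 corresponds to TwoRedundantStorageNodes.
    """
    assignment = {}
    for i, node in enumerate(nodes):
        controller = f"SC{i + 1:02d}"
        # Active on its own node, standby on the next (degree - 1) nodes.
        assignment[controller] = [nodes[(i + k) % len(nodes)] for k in range(degree)]
    return assignment

nodes = ["SN01", "SN02", "SN03", "SN04", "SN05"]
print(assign_storage_controllers(nodes, degree=2))
# e.g. {'SC01': ['SN01', 'SN02'], 'SC02': ['SN02', 'SN03'], ...}
```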
Redundancy of the cluster master node
Storage nodes are classified into cluster master nodes and cluster worker nodes. Cluster master nodes are further classified into primary and secondary nodes. Only one cluster master node (primary) in a storage cluster manages and controls the entire storage cluster. If a failure occurs on the cluster master node (primary), one of the cluster master nodes (secondary) becomes the primary node, so that the entire storage cluster can continue to operate.
The number of cluster master nodes is determined from the following two options. The determined number of storage nodes is automatically selected as cluster master nodes in order, beginning from storage node 1.
Among the selected storage nodes, which node becomes the primary node (and which become secondary nodes) is determined automatically within the storage cluster.
3 nodes
This option applies when the user data protection method is Mirroring Duplication. The system can continue to operate if a maximum of one cluster master node becomes faulty.
The following example shows the 3-node configuration:
5 nodes
This option applies when the user data protection method is HPEC 4D+2P. The system can continue to operate if a maximum of two cluster master nodes become faulty.
The following example shows the 5-node configuration:
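The selection rule itself can be sketched as follows: the required number of cluster master nodes (3 for Mirroring Duplication, 5 for HPEC 4D+2P) is taken in order from storage node 1, and the primary and secondary roles are then decided automatically within the cluster. This is an illustrative sketch only; the node names are hypothetical.

```python
def select_cluster_master_nodes(nodes: list[str], protection: str) -> list[str]:
    """Pick cluster master nodes in order from the first storage node.

    Illustrative only: 3 masters for Mirroring Duplication, 5 for HPEC 4D+2P.
    Which master becomes primary is determined automatically in the cluster.
    """
    required = {"Mirroring Duplication": 3, "HPEC 4D+2P": 5}[protection]
    return nodes[:required]

nodes = [f"SN{i:02d}" for i in range(1, 7)]
print(select_cluster_master_nodes(nodes, "HPEC 4D+2P"))  # ['SN01', ..., 'SN05']
```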
Spread placement group
- A spread placement group is a group of EC2 instances, each deployed on different hardware in the AWS data center.
- By defining a maximum of six storage nodes as a single spread placement group, the system can tolerate more failures than the redundant configuration otherwise allows when the failures span different spread placement groups. A grouping sketch appears at the end of this topic.
HPEC 4D+2P
- A single spread placement group is defined in units of six nodes.
- A configuration can comprise 6, 12, or 18 storage nodes.
- Storage nodes can be added in multiples of 6.
- The allowable number of storage node or drive failures is as follows:
  - If storage node or drive failures occur across different spread placement groups, the system can tolerate 3 or more failures.
  - If storage node or drive failures occur in a single spread placement group, the system can tolerate no more than 2 failures.
  The number is the sum of the number of defective storage nodes and the number of defective drives. However, the failures are counted as one failure in the following cases:
  - One or more drive failures occurred on a faulty storage node.
  - Drive failures occurred on a single storage node.
Mirroring Duplication
- A single spread placement group is defined in units of three nodes.
- A configuration can comprise 3, 6, 9, 12, 15, or 18 storage nodes.
- Storage nodes can be added in multiples of 3.
- The tolerable number of storage node or drive failures is as follows:
  - If storage node or drive failures occur across different spread placement groups, the system can tolerate 2 or more failures.
  - If storage node or drive failures occur in a single spread placement group, the failures can be tolerated unless either of the following conditions is met:
    - Condition 1: Storage node or drive failures occur on both storage nodes that belong to redundant storage controllers. For details about storage controllers, see Redundancy of the storage controller.
    - Condition 2: Failures occur on two or more cluster master nodes. For details about cluster master nodes, see Redundancy of the cluster master node.
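As a sketch of the grouping rules above, the following snippet partitions a storage node list into spread placement groups of six nodes (HPEC 4D+2P) or three nodes (Mirroring Duplication) and checks that the node count is one of the allowed configurations. The actual spread placement groups are defined in AWS; this only illustrates the grouping arithmetic, and the node names are hypothetical.

```python
def spread_placement_groups(nodes: list[str], protection: str) -> list[list[str]]:
    """Partition storage nodes into spread placement groups (illustrative only).

    Group size is 6 for HPEC 4D+2P and 3 for Mirroring Duplication; the node
    count must be a multiple of the group size, up to 18 nodes (for example
    6, 12, or 18 nodes for HPEC 4D+2P, and 3, 6, 9, 12, 15, or 18 nodes for
    Mirroring Duplication).
    """
    size = {"HPEC 4D+2P": 6, "Mirroring Duplication": 3}[protection]
    if len(nodes) % size != 0 or not 1 <= len(nodes) // size <= 18 // size:
        raise ValueError(f"Node count must be a multiple of {size}, up to 18 nodes.")
    return [nodes[i:i + size] for i in range(0, len(nodes), size)]

nodes = [f"SN{i:02d}" for i in range(1, 13)]
print(spread_placement_groups(nodes, "HPEC 4D+2P"))  # two groups of six nodes
```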