This section describes how to replace an initiator node that is blocked for some reason during data migration.
- Required role: Service and Storage
Note on the procedure: In the following procedure, long command lines are continued on a new line, delimited by a backslash ("\").
1. Log in to the controller node.
2. Obtain information about the storage cluster to verify the version of the storage software.
REST API: GET /v1/objects/storage
CLI: storage_show
3. Verify the version of the deployment constituting the storage cluster.
Run the following command.
az deployment group show \
    --name mainTemplate \
    --resource-group <resource-group-name> \
    --query "properties.outputs.templateVersion.value"
If the version of the deployment constituting the storage cluster is the same as the storage software version, go to the next step.
If the versions are different, update the version of the storage cluster resource according to Tasks required after the software is updated in the VSP One SDS Block and SDS Cloud System Administration, and then go to the next step.
Note: If the templateVersion cannot be confirmed because the mainTemplate deployment was deleted or for some other reason, also check the procedure in Tasks required after the software is updated.
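The version comparison in this step can be sketched as a small shell helper. This is only a minimal sketch and not part of the product: the function names normalize_version and versions_match are hypothetical, and the sketch assumes that both values are plain strings that can be compared for equality once surrounding quotation marks and whitespace are removed (the az command prints JSON strings with quotation marks by default).

```shell
# Hypothetical helpers (not provided by the product or by the Azure CLI).

# Remove double quotation marks and whitespace from a version string.
normalize_version() {
  printf '%s' "$1" | tr -d '"[:space:]'
}

# Return 0 if the two version strings are equal after normalization.
# $1: templateVersion value from the mainTemplate deployment
# $2: storage software version obtained in the previous step
versions_match() {
  [ "$(normalize_version "$1")" = "$(normalize_version "$2")" ]
}
```

For example, pass the output of the preceding az command as the first argument and the storage software version as the second. A nonzero exit status indicates that the versions differ and that the storage cluster resource must be updated before continuing.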
4. See Exporting configuration files (Cloud for Microsoft Azure) in the VSP One SDS Block and SDS Cloud System Administration to obtain configuration files from VSP One SDS Block for storage node replacement.
When exporting configuration files, specify "ReplaceStorageNode" for the exportFileType parameter, and specify the machineImageId parameter. This mandatory step ensures that you use the latest configuration files.
CAUTION:
- Do not decompress the exported ConfigurationFiles_<YYYYMMDD>_<hhmmss>.tar.gz file. If you decompress it and use it, the job might fail. If you specified an incorrect value for a parameter, specify the correct value and export the configuration files again to obtain the latest configuration files.
- For the machineImageId parameter, use the same VM image as that of the storage node to be replaced. For the VM image URN to use, see VM image URN (Cloud for Microsoft Azure) in the VSP One SDS Block and SDS Cloud System Administration.
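As a guard against accidentally passing a decompressed or renamed file, the exported file name can be checked before it is used. The following is a minimal sketch: the function name is_exported_archive_name is hypothetical, and the check assumes only the ConfigurationFiles_<YYYYMMDD>_<hhmmss>.tar.gz naming pattern given in this procedure. It inspects the name only, not the file contents.

```shell
# Hypothetical helper: return 0 if the file name matches
# ConfigurationFiles_<YYYYMMDD>_<hhmmss>.tar.gz, otherwise return 1.
is_exported_archive_name() {
  case "$(basename "$1")" in
    ConfigurationFiles_[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]_[0-9][0-9][0-9][0-9][0-9][0-9].tar.gz)
      return 0 ;;
    *)
      return 1 ;;
  esac
}
```

A directory name produced by decompression, or a .tar file without the .gz suffix, fails this check.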
5. Verify the ID and status of the initiator node to be replaced.
REST API: GET /v1/objects/storage-nodes
CLI: storage_node_list
Verify that the status of the initiator node to be replaced is "MaintenanceBlockage", "TemporaryBlockage", "PersistentBlockage", "InstallationFailed", "RemovalFailedAndTemporaryBlockage", "RemovalFailedAndMaintenanceBlockage", or "RemovalFailedAndPersistentBlockage", and then go to the next step.
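The status check in this step can be expressed as a simple lookup. The sketch below is illustrative only: the function name is_replaceable_status is hypothetical, and how the status value is extracted from the REST API or CLI output is left out because the output format is not described here.

```shell
# Hypothetical helper: return 0 if the given storage node status is one of
# the statuses listed in this step for which replacement can proceed.
is_replaceable_status() {
  case "$1" in
    MaintenanceBlockage | TemporaryBlockage | PersistentBlockage)
      return 0 ;;
    InstallationFailed)
      return 0 ;;
    RemovalFailedAndTemporaryBlockage | RemovalFailedAndMaintenanceBlockage | RemovalFailedAndPersistentBlockage)
      return 0 ;;
    *)
      return 1 ;;
  esac
}
```

Any other status, such as "Ready", returns a nonzero exit status, in which case the replacement should not be started.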
6. Replace the initiator node.
You can perform this only for the cluster master node (primary) or a load balancer.
REST API: POST /v1/objects/storage-nodes/<id>/actions/replace-with-configuration-file/invoke
CLI: storage_node_replace_with_configuration_file
When running the preceding command, specify the following file as a parameter. Specify the file as exported; do not decompress it.
- File exported in step 4: ConfigurationFiles_<YYYYMMDD>_<hhmmss>.tar.gz
Verify the job ID that is displayed after the command is run.
Note: If the error message file is displayed in the event log, additional information will be uploaded to the Azure Blob Storage container created during installation.
7. Verify the state of the job.
Run either of the following commands with the job ID specified.
REST API: GET /v1/objects/jobs/<jobid>
CLI: job_show
After running the command, if you receive a response indicating "Succeeded" as the state, the job is completed.
Until replacement is completed, it might not be possible to perform other operations. If any other operation is unsuccessful, take action according to the error message or event log for that operation.
Note:
- When you perform this procedure, event log KARS16018-W or KARS16143-W might be output. Also, when replacing a storage node with another one whose drives differ from those used at the time of blocking, event logs KARS05010-E, KARS07005-E, and KARS06596-E might be output for the drives before replacement. These event logs are not a problem at this point because the subsequent processing addresses them, so you can continue with this procedure.
- If an error occurs in the storage node replacement job for any reason, the error handling process powers off the initiator node VM. However, the power-off process might not be executed properly because of a network failure or for other reasons. If the storage node replacement job is terminated because of an error, take action according to the event logs or the VSP One SDS Block and SDS Cloud Troubleshooting Reference.
CAUTION:
- When the job is completed, event log KARS10965-W might be output. In such a case, take action according to Action to be taken when resource deletion locking is not correctly set (Cloud for Microsoft Azure) in the VSP One SDS Block and SDS Cloud Troubleshooting Reference.
- When the minimum memory capacity set for the protection domain is 128 GiB, replacement takes approximately 40 minutes to complete. When the minimum memory capacity is 256 GiB, it takes approximately 50 minutes to complete. In addition, when Rebuild (data rebuild) processing is in progress, it might take up to approximately five more minutes.
- Furthermore, the following external factors might increase or decrease the processing time. If the processing time increases, verify that there are no problems with the following items:
  - Load status of Microsoft Azure
  - Network communication status (mutual communication status between the controller node, storage node, and Microsoft Azure)
  - Performance status of Azure Virtual Machines
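The job check and the approximate durations in this step can be combined into a polling sketch. Everything below is an assumption for illustration: the function names estimate_replacement_minutes and wait_for_succeeded are hypothetical, the command that prints the job state is supplied by the caller, and only the "Succeeded" state is handled because no other job state is named in this procedure.

```shell
# Hypothetical helper: print the approximate replacement time in minutes.
# $1: minimum memory capacity of the protection domain in GiB (128 or 256)
# $2: "yes" if Rebuild processing is in progress (adds up to about 5 minutes)
estimate_replacement_minutes() {
  local minutes
  case "$1" in
    128) minutes=40 ;;
    256) minutes=50 ;;
    *) echo "no estimate is given for $1 GiB" >&2; return 1 ;;
  esac
  if [ "${2:-no}" = "yes" ]; then
    minutes=$((minutes + 5))
  fi
  echo "$minutes"
}

# Hypothetical helper: run a caller-supplied command that prints the job
# state, and return 0 as soon as it prints "Succeeded".
# $1: maximum number of attempts, $2: seconds between attempts,
# remaining arguments: the command to run.
wait_for_succeeded() {
  local max_attempts interval_s attempt state
  max_attempts="$1"
  interval_s="$2"
  shift 2
  attempt=1
  while [ "$attempt" -le "$max_attempts" ]; do
    state="$("$@")" || state=""
    if [ "$state" = "Succeeded" ]; then
      return 0
    fi
    if [ "$attempt" -lt "$max_attempts" ]; then
      sleep "$interval_s"
    fi
    attempt=$((attempt + 1))
  done
  return 1
}
```

A nonzero exit status from wait_for_succeeded means only that "Succeeded" was not observed within the allotted attempts. In that case, check the event logs and the external factors listed above rather than assuming that the job failed.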
8. Obtain a list of storage nodes, and then verify the status of the replaced initiator node.
REST API: GET /v1/objects/storage-nodes
CLI: storage_node_list
If the status is "Ready" or "RemovalFailed", replacement of the initiator node is complete.