Alerts

Content Platform for Cloud Scale Administration Guide

Version
2.6.x
Part Number
MK-HCPCS008-11

Alert messages notify you of situations that need attention. Alerts can have a severity of Info, Warning, Severe, or Critical. You can view system alerts through the Admin App, CLI, or REST API, and storage component alerts through the Object Storage Management app.

Each alert corresponds to a system event.

Tip: Before you deal with an alert, refresh the page. The underlying condition might have changed since the alert was raised.
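
When several alerts are open at once, it can help to work through them from most to least severe. The following is a minimal sketch of that triage order. It assumes that the four severities are listed above in ascending order (Info lowest, Critical highest), and the alert records and the function sort_by_severity are hypothetical illustrations, not the format returned by the Admin App, CLI, or REST API.

```python
from enum import IntEnum


class Severity(IntEnum):
    # Assumption: ascending order as listed in this section.
    INFO = 1
    WARNING = 2
    SEVERE = 3
    CRITICAL = 4


def sort_by_severity(alerts):
    """Return the alerts ordered from most to least severe."""
    return sorted(
        alerts,
        key=lambda alert: Severity[alert["severity"].upper()],
        reverse=True,
    )


# Hypothetical alert records; the messages are taken from the tables in this section.
alerts = [
    {"severity": "Info", "message": "Package installation in progress"},
    {"severity": "Critical", "message": "Failed to connect to KMS server"},
    {"severity": "Warning", "message": "Service under-protected"},
    {"severity": "Severe", "message": "Service is down"},
]

for alert in sort_by_severity(alerts):
    print(f'{alert["severity"]:<8} {alert["message"]}')
```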

System alerts

Severity Alert Description Action
Severe Instance ip-address disk usage severe threshold

The specified instance has less than 10% free disk space. Add storage to the instance.

Important: If an instance runs out of disk space, the system can become unresponsive.

Severe Master Instance ip-address is down

Do one of these:

  • Restart the instance hardware or virtual machine.
  • Restart the run script on the instance. This script is located in the bin folder of the installation folder.

Severe Service is down

Verify the health of the instances. If one is down, do one of these:

  • Restart the instance hardware or virtual machine.
  • Restart the run script on the instance. This script is located in the bin folder of the installation folder.

Otherwise, if the instances are healthy and the problem persists, contact Support.

Severe Worker Instance ip-address is down

Do one of these:

  • Restart the instance hardware or virtual machine.
  • Restart the run script on the instance. This script is located in the bin folder of the installation folder.

Warning Instance ip-address disk usage warning threshold

The specified instance has less than 25% free disk space. Add storage to the instance.

Important: If an instance runs out of disk space, the system can become unresponsive.

Warning Package installation failed

Your system failed to install a package that you uploaded.

Warning Service below recommendation

The service is currently running on fewer than the minimum number of instances. Configure this service to run on additional instances.

Warning Service under-protected

A service has lost redundancy; that is, one or more instances on which that service is running are unresponsive.

Verify the health of the instances. If one is down, do one of these:

  • Restart the instance hardware or virtual machine.
  • Restart the run script on the instance. This script is located in the bin folder of the installation folder.

Otherwise, if the instances are healthy and the problem persists, contact Support.

Warning SSL server certificate chain expires soon

A certificate in the SSL server certificate chain for this system expires soon. If the certificate chain expires, users can't access the system.

Warning SSL server certificate chain expired

The SSL server certificate chain for this system contains an expired certificate. Users cannot access the system until the certificate chain is replaced.

Warning The certificate for the storage component (storage-id) is about to expire in n days

Renew the storage component certificate.

Info Edited Service "service"

The configuration of the specified service has been edited.

Info Issue with Service service resolved

An issue with the specified service is resolved.

Info Package installation in progress

Your system is currently installing a package that you uploaded. Depending on the contents of the package, this might take a while.

Info The storage component (storage-id) is unavailable

Verify that the storage component ID is correct and valid and that the storage component is active.

Info Update migration in progress. Current state: state

Update migration from a version before v2.3 is in progress. The migration state is one of OLD, MIGRATING, or CLEANUP.

Info Updated Service "service"

The configuration of the specified service has been updated.
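
The two disk usage alerts in this table differ only in their free-space threshold: less than 10% free corresponds to Severe, and less than 25% free corresponds to Warning. As a rough illustration, the check could be sketched as follows. The function classify_free_space and the constant names are hypothetical, and the sketch inspects the local file system with the Python standard library rather than querying the product.

```python
import shutil

# Thresholds taken from the alert descriptions in this section.
SEVERE_FREE_PCT = 10.0
WARNING_FREE_PCT = 25.0


def classify_free_space(free_bytes, total_bytes):
    """Return "Severe", "Warning", or None for the given free disk space."""
    free_pct = 100.0 * free_bytes / total_bytes
    if free_pct < SEVERE_FREE_PCT:
        return "Severe"
    if free_pct < WARNING_FREE_PCT:
        return "Warning"
    return None


# Example: classify the file system that holds the root directory.
usage = shutil.disk_usage("/")
print(classify_free_space(usage.free, usage.total))
```

Checking the stricter threshold first means an instance with very little free space is reported as Severe rather than Warning.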

Storage component alerts

Severity Message Description
Warning Available capacity is below n {% | bytes} in the system for object data

The free capacity on the HCP for cloud scale system has fallen below the specified threshold (either a percentage of the total or a byte value).

Warning Certificate for Storage component id is about to expire in n days

The SSL certificate for the storage component id is set to expire in n days. If the certificate expires, HCP for cloud scale will not be able to read from or write to the storage component.

Warning Storage component id is now inaccessible

The storage component id is in the state INACCESSIBLE. HCP for cloud scale cannot read from or write to the storage component.

Severe Certificate for Storage component id expired

The SSL certificate for the storage component id has expired. HCP for cloud scale cannot read from or write to the storage component. Install a new certificate.

Severe Error communicating with a vault node. Node IP: ip_address

One of the vault nodes can't be reached. If other active nodes are available, service continues, but attend to this issue immediately.

Examine the vault instance logs to determine the cause of this issue.

Severe Failed to connect to KMS server

One of the vault nodes can't be reached. If other active nodes are available, service continues, but attend to this issue immediately.

If ingest is halted, then investigate why the KMS service is failing to run on all nodes. If ingest is still working, the original active node has failed over. Examine the vault instance logs to determine the cause of the failure.

Severe Failed to connect to KMS server as it is completely sealed

The vault service (Key Management Server service) is completely sealed.

Unseal it using the unseal keys you obtained when you turned on encryption.

Severe Service error: There is a critical issue with the Metadata Gateway database. Shutting down the Metadata Gateway Service.

A Metadata Gateway instance has encountered an issue and shut down. Use the System Management Services function Repair to restart it.

If restarting the service doesn't resolve the issue, contact Support.

Severe Vault node is sealed. Node IP: ip_address

One of the vault nodes is sealed. If other active nodes are available, service continues, but attend to this issue immediately.

Unseal it using the unseal keys you obtained when you turned on encryption.

Critical Available capacity is below n {% | bytes} in Storage component id

The free capacity on the named HCP S Series Node storage component has fallen below the specified threshold (either a percentage of the total or a byte value).

Critical Failed to connect to KMS server

The Key Management System service is not available. Until the service is available, data on encrypted storage components can't be read or written.

When the KMS service restarts, if there is only one active instance, log in to HCP for cloud scale on port 8200 and provide the unseal keys to reopen the vault.

Critical Failed to retrieve capacity usage from Storage component id

The system can't retrieve metrics from an HCP S Series Node storage component. Possible reasons are:

  • The storage component is not reachable
  • The system was upgraded from before v2.1
  • The management username or password is not valid
  • The HCP S Series Node version is not supported

Critical Failed verification for retrieved encryption key for StorageComponent_ID{uuid=uuid}

The encryption key returned from the Key Management System server doesn't match the key for the storage component uuid.

Verify that the KMS service is available. If the service is available, verify that you have provided the service with a quorum of unseal keys. If objects on the storage component still can't be read, contact Support.

Critical Metadata-Coordination cannot communicate with Sentinel service to get state information

The Sentinel service is not responding to requests for state information.

Using the System Management application, immediately review the health of the Metadata-Coordination and Sentinel services and ensure that the Sentinel container has adequate heap size for the configuration of the cluster.
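
The capacity alerts in this table state their threshold as n {% | bytes}, that is, either a percentage of the total capacity or an absolute byte value. The following sketch shows how one comparison can cover both units. The function below_threshold and its parameters are hypothetical names used only for illustration; they do not describe how the product evaluates capacity.

```python
def below_threshold(free_bytes, total_bytes, threshold, unit):
    """Return True if the free capacity is below the threshold.

    unit is "%" for a percentage of the total capacity,
    or "bytes" for an absolute byte value.
    """
    if unit == "%":
        return 100.0 * free_bytes / total_bytes < threshold
    if unit == "bytes":
        return free_bytes < threshold
    raise ValueError(f"unsupported unit: {unit!r}")


# The same capacity figures checked against each kind of threshold.
free, total = 50 * 1024**3, 1000 * 1024**3  # 50 GiB free of 1000 GiB
print(below_threshold(free, total, 10, "%"))
print(below_threshold(free, total, 20 * 1024**3, "bytes"))
```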

Client certificate alerts

Severity Message Description
Warning Certificate for SubjectDN dn will expire in n days

The SSL certificate for the specified client sync-to or sync-from target (specified by its Distinguished Name) is set to expire in n days. If the certificate expires, HCP for cloud scale will not be able to synchronize to or from the target. You might need to obtain a new client certificate.

Severe Certificate for SubjectDN dn expired on dd-mmm-yyyy

The SSL certificate for the specified client sync-to or sync-from target (specified by its Distinguished Name) has expired. HCP for cloud scale cannot synchronize to or from the target. You must obtain a new client certificate.
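
The certificate alerts in this section follow one pattern: Warning while the certificate is still valid but close to expiring, and Severe after it has expired. To check a certificate's remaining lifetime independently, something like the following sketch could be used. It relies only on the Python standard library. The function names and the 30-day warning window are assumptions chosen for illustration; this guide does not state how many days before expiry the Warning alert is raised.

```python
import ssl
import time

SECONDS_PER_DAY = 86400


def days_until_expiry(not_after, now=None):
    """Return the whole days remaining before a certificate expires.

    not_after uses the "notAfter" text format reported by the ssl module,
    for example "Jan  5 09:34:43 2018 GMT". The result is negative if the
    certificate has already expired.
    """
    now = time.time() if now is None else now
    return int((ssl.cert_time_to_seconds(not_after) - now) // SECONDS_PER_DAY)


def classify_certificate(not_after, now=None, warn_days=30):
    """Return "Severe", "Warning", or None for a certificate expiry date.

    warn_days is a hypothetical warning window, not a product setting.
    """
    now = time.time() if now is None else now
    seconds_left = ssl.cert_time_to_seconds(not_after) - now
    if seconds_left <= 0:
        return "Severe"
    if seconds_left <= warn_days * SECONDS_PER_DAY:
        return "Warning"
    return None


# Example with a fixed reference time so that the result is repeatable.
reference = ssl.cert_time_to_seconds("Jan  1 00:00:00 2030 GMT")
print(days_until_expiry("Jan 11 00:00:00 2030 GMT", now=reference))
print(classify_certificate("Jan 11 00:00:00 2030 GMT", now=reference))
```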