Management of drive retention policies

Content Software for File User Guide

Version
4.2.x
Part Number
MK-HCSF000-03

Because the Content Software for File system is a highly scalable data storage system whose clusters can contain billions of files, data storage policies in tiered configurations cannot be based on a cluster-wide FIFO methodology. Instead, drive retention is managed by time-stamping every piece of data, with the timestamp kept at the resolution of intervals that can range from minutes to weeks. The Content Software for File system records the interval in which each piece of data was created, accessed, or last modified.

Users specify only the drive Retention Period; each interval is one-quarter of that period. Data written, modified, or accessed prior to the last interval is always released, even if SSD space is available.

Note: The timestamp is maintained per piece of data in chunks of up to 1 MB, and not per file. Consequently, different parts of big files may have different tiering states.
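The effect described in the note can be illustrated with a minimal sketch. It is not the product's implementation: the fixed 1 MiB chunk size (the guide says only "up to 1 MB"), the dictionary of per-chunk interval stamps, and the names `touched_chunks` and `touch` are all assumptions made for illustration.

```python
CHUNK_SIZE = 1024 * 1024  # assumption: fixed 1 MiB chunks; the guide says "up to 1 MB"


def touched_chunks(offset, length):
    """Return the indices of the chunks overlapped by a byte range."""
    if length <= 0:
        return range(0)
    first = offset // CHUNK_SIZE
    last = (offset + length - 1) // CHUNK_SIZE
    return range(first, last + 1)


def touch(chunk_intervals, offset, length, interval):
    """Stamp every chunk overlapped by the byte range with the given interval."""
    for index in touched_chunks(offset, length):
        chunk_intervals[index] = interval


# Hypothetical 4 MiB file written in full during interval 1.
stamps = {}
touch(stamps, 0, 4 * CHUNK_SIZE, interval=1)

# Later, during interval 6, 100 bytes are accessed in the middle of the second chunk.
touch(stamps, CHUNK_SIZE + CHUNK_SIZE // 2, 100, interval=6)

print(stamps)  # {0: 1, 1: 6, 2: 1, 3: 1}
```

Only the second chunk carries the newer stamp, so under this model the other three chunks of the same file would become eligible for release earlier.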

For example: In a Content Software for File system that is configured with a drive Retention Period of 20 days, data is split into 7 interval groups, each spanning 5 days (25% of the 20-day drive Retention Period). If the system starts operating on January 1, data written, accessed, or modified between January 1 and January 5 belongs to interval 1, data written, accessed, or modified between January 6 and January 10 belongs to interval 2, and so on. In such a case, the 7 intervals are timestamped and divided as follows:


Seven Time-stamped Intervals
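The interval boundaries in this example can be reproduced with a short calculation. The sketch below assumes a Retention Period that is a whole number of days divisible by four and picks 2024 as an arbitrary year; the function name `interval_table` is hypothetical and not part of the product.

```python
from datetime import date, timedelta


def interval_table(start, retention_days, count=7):
    """Return (interval number, first day, last day) tuples, with inclusive dates."""
    # Each interval is one-quarter of the drive Retention Period.
    # Assumption: retention_days is divisible by 4.
    interval_days = retention_days // 4
    table = []
    for i in range(count):
        first = start + timedelta(days=i * interval_days)
        last = first + timedelta(days=interval_days - 1)
        table.append((i + 1, first, last))
    return table


# Retention Period of 20 days, system starts on January 1 (year chosen arbitrarily).
rows = interval_table(date(2024, 1, 1), retention_days=20)
for number, first, last in rows:
    print(f"interval {number}: {first.isoformat()} to {last.isoformat()}")
```

With these inputs, interval 1 covers January 1 to January 5 and interval 2 covers January 6 to January 10, matching the text, and interval 7 runs from January 31 to February 4.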

In the above scenario, there are seven data intervals on the SSDs (the last one is accumulating new/modified data). In addition, another interval is currently being released to the object-store. As a result, data can remain on the SSDs for almost twice as long as the drive Retention Period the user specifies (up to seven 5-day intervals, or 35 days, against the configured 20 days in this example), as long as there is sufficient space on the SSD. Keeping data longer when space allows provides better performance and reduces unnecessary release/promotion of data to/from the object-store if data is modified.