To determine the system size that you need:
- Determine how many documents you need to index.
- Based on the number of documents you want to index, use the following tables to determine:
- How many instances you need
- How much RAM each instance needs
- The Index service configuration needed to support indexing the number of documents you want
System configuration for up to 50 million documents:
Total instances required: 1 (b)
Instances running the Index service: 1
Index service configuration required:
- Shards per index: 1
- Index Protection Level per index: 1
- Container memory: 200 MB greater than the Heap setting
- Heap settings: Depends on instance RAM.

| Total documents to be indexed | Instance RAM needed (for each instance running the Index service) | Heap setting |
| --- | --- | --- |
| 15 million | 16 GB | 1800m |
| 25 million | 32 GB | 9800m |
| 50 million (a) | 64 GB | 25800m |

(a) Contact Hitachi Vantara for guidance before trying to index this many documents on this number of instances. At this scale, your documents and required configuration settings can greatly affect the number of documents you can index.
(b) Single-instance systems are suitable for testing and development, but not for production use.
System configuration for up to 150 million documents:
Total instances required: 4
Instances running the Index service: 3
Index service configuration required:
- Shards per index: 3
- Index Protection Level per index: 1
- Container memory: 200 MB greater than the Heap setting
- Heap settings: Depends on instance RAM.

| Total documents to be indexed | Instance RAM needed (for each instance running the Index service) | Heap setting |
| --- | --- | --- |
| 45 million | 16 GB | 1800m |
| 75 million | 32 GB | 9800m |
| 150 million (a) | 64 GB | 25800m |

(a) Contact Hitachi Vantara for guidance before trying to index this many documents on this number of instances. At this scale, your documents and required configuration settings can greatly affect the number of documents you can index.
System configuration for up to 250 million documents:
Total instances required: 8
Instances running the Index service: 5
Index service configuration required:
- Shards per index: 5
- Index Protection Level per index: 1
- Container memory: 200 MB greater than the Heap setting
- Heap settings (b): Depends on instance RAM.

| Total documents to be indexed | Instance RAM needed (for each instance running the Index service) | Heap setting |
| --- | --- | --- |
| 75 million | 16 GB | 7800m |
| 125 million | 32 GB | 15800m |
| 250 million (a) | 64 GB | 31000m |

(a) Contact Hitachi Vantara for guidance before trying to index this many documents on this number of instances. At this scale, your documents and required configuration settings can greatly affect the number of documents you can index.
(b) With an 8-instance system, the Index service should be the only service running on each of its 5 instances. With the Index service isolated this way, you can allocate more heap space to the service than you can on a single-instance or 4-instance system.
System configuration for up to 650 million documents:
Total instances required: 16
Instances running the Index service: 13
Index service configuration required:
- Shards per index: 13
- Index Protection Level per index: 1
- Container memory: 200 MB greater than the Heap setting
- Heap settings (b): Depends on instance RAM.

| Total documents to be indexed | Instance RAM needed (for each instance running the Index service) | Heap setting |
| --- | --- | --- |
| 195 million | 16 GB | 7800m |
| 325 million | 32 GB | 15800m |
| 650 million (a) | 64 GB | 31000m |

(a) Contact Hitachi Vantara for guidance before trying to index this many documents on this number of instances. At this scale, your documents and required configuration settings can greatly affect the number of documents you can index.
(b) With a 16-instance system, the Index service should be the only service running on each of its 13 instances. With the Index service isolated this way, you can allocate more heap space to the service than you can on a single-instance or 4-instance system.
For example, if you need to index up to 150 million documents, you need at minimum a 4-instance system with 64 GB RAM per instance.
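The table lookup in this step can be sketched in a few lines of Python. This is an illustrative helper only, not part of the product: the names `TIERS` and `smallest_configuration` are hypothetical, and the capacities are transcribed from the tables above. The sketch assumes you prefer fewer instances over less RAM per instance when more than one configuration fits.

```python
# Capacity tiers transcribed from the sizing tables above.
# max_docs maps instance RAM (in GB) to the largest document count listed for it.
TIERS = [
    {"instances": 1, "index_instances": 1,
     "max_docs": {16: 15_000_000, 32: 25_000_000, 64: 50_000_000}},
    {"instances": 4, "index_instances": 3,
     "max_docs": {16: 45_000_000, 32: 75_000_000, 64: 150_000_000}},
    {"instances": 8, "index_instances": 5,
     "max_docs": {16: 75_000_000, 32: 125_000_000, 64: 250_000_000}},
    {"instances": 16, "index_instances": 13,
     "max_docs": {16: 195_000_000, 32: 325_000_000, 64: 650_000_000}},
]


def smallest_configuration(documents):
    """Return (total instances, Index service instances, RAM in GB) or None.

    Assumption: tiers are tried from the smallest system to the largest,
    and within a tier from the least RAM to the most.
    """
    for tier in TIERS:
        for ram_gb in sorted(tier["max_docs"]):
            if documents <= tier["max_docs"][ram_gb]:
                return tier["instances"], tier["index_instances"], ram_gb
    return None  # larger than any configuration listed in the tables


print(smallest_configuration(150_000_000))  # (4, 3, 64)
```

Keep in mind that the sketch ignores the table footnotes: single-instance results are not suitable for production, and the largest document count in each table calls for guidance from Hitachi Vantara.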
- Determine how fast you need to index documents, in documents per second.
For example:
- To index 100 million documents in 2 days, you need an indexing rate of approximately 579 documents per second (100,000,000 documents / 172,800 seconds).
- To continuously index 1 million documents every day, you need an indexing rate of 12 documents per second.
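The arithmetic behind these rates is the document count divided by the available time in seconds. A minimal sketch, assuming you round up so that the target is actually met (the function name `required_rate` is hypothetical):

```python
import math

SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def required_rate(documents, days):
    """Documents per second needed to index `documents` within `days` days, rounded up."""
    return math.ceil(documents / (days * SECONDS_PER_DAY))


print(required_rate(100_000_000, 2))  # 579
print(required_rate(1_000_000, 1))    # 12
```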
- Determine the base indexing rate for your particular dataset and processing pipelines:
- Install a single-instance HCI system that has the minimum required hardware resources.
- Run a workflow that uses the pipelines you want on a representative subset of your data.
- Use the workflow task details to determine the rate of documents processed per second.
- To determine the number of cores you need per instance, replace Base rate in this table with the rate you determined in step 4.
| Number of instances you need | 4 cores per instance (minimum required) | 8 cores per instance (recommended) |
| --- | --- | --- |
| 1 | Base rate | 70% Base rate |
| 4 | 300% Base rate | 500% Base rate |
| 8 | 600% Base rate | 900% Base rate |
| More than 8 | Contact Hitachi Vantara for guidance | Contact Hitachi Vantara for guidance |

For example, if you had previously determined that:
- You need a 4-instance system.
- You need to process 500 documents per second.
- The base processing rate for your data and pipelines is 100 documents per second.
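Then you need 8 cores per instance: 500 documents per second is 500% of the base rate, which a 4-instance system reaches only with 8 cores per instance.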
- Multiply the number of instances you need by the number of cores per instance to determine the total number of cores that you need for your system.
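The core calculation from the last two steps can be sketched as follows. This is an illustration under stated assumptions, not a product tool: `RATE_MULTIPLIER` and `cores_needed` are hypothetical names, the multipliers are transcribed from the cores-per-instance table, and only the 4-instance and 8-instance rows are modeled, because single-instance systems are for testing and larger systems call for guidance from Hitachi Vantara.

```python
# Rate multipliers relative to the measured base rate, transcribed from the
# cores-per-instance table: RATE_MULTIPLIER[instances][cores_per_instance].
# Assumption: only the 4-instance and 8-instance rows are modeled here.
RATE_MULTIPLIER = {
    4: {4: 3.0, 8: 5.0},
    8: {4: 6.0, 8: 9.0},
}


def cores_needed(instances, required_rate, base_rate):
    """Return (cores per instance, total cores), or None if no listed option is fast enough."""
    options = RATE_MULTIPLIER.get(instances)
    if options is None:
        return None  # instance count not covered by this sketch
    for cores in sorted(options):  # try the smaller core count first
        if base_rate * options[cores] >= required_rate:
            return cores, cores * instances
    return None


# The worked example: 4 instances, 500 documents per second needed, base rate 100.
print(cores_needed(4, 500, 100))  # (8, 32)
```

A result of `None` means the table offers no matching configuration, in which case a larger system or a faster pipeline is needed.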
- After your system is installed, configure it with the index settings you determined in step 2.
For information on index shards, Index Protection Level, and moving the Index service, see the Administrator Help, which is available from the Admin App.