Before onboarding models, verify that iQ Studio is installed and that you meet the following requirements.
This release supports only GPU‑bound models running on the vLLM runtime. CPU‑only models and models served by non‑vLLM runtimes are not supported.
Cluster and network requirements
- The Kubernetes cluster is running, and the target namespace is available.
- The cluster has outbound internet connectivity to reach the Hugging Face Hub.
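
The two cluster and network checks above can be sketched in Python as follows. This is a minimal illustration, not part of iQ Studio: the function names `namespace_exists` and `hub_reachable` are hypothetical, and the script assumes that `kubectl` is on the PATH and already configured for the target cluster. The connectivity check only proves outbound access from the machine that runs it, so run it from a pod inside the cluster to validate the cluster's own connectivity.

```python
import shutil
import subprocess
import urllib.error
import urllib.request

HUB_URL = "https://huggingface.co"  # public Hugging Face Hub endpoint


def namespace_exists(namespace: str) -> bool:
    """Return True if `kubectl get namespace <namespace>` succeeds."""
    if shutil.which("kubectl") is None:
        return False  # kubectl is not on PATH, so nothing can be verified
    result = subprocess.run(
        ["kubectl", "get", "namespace", namespace, "--request-timeout=5s"],
        capture_output=True,
        text=True,
    )
    return result.returncode == 0


def hub_reachable(url: str = HUB_URL, timeout: float = 5.0) -> bool:
    """Return True if the Hub answers with any HTTP response."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        return True  # the server answered, so the network path works
    except (urllib.error.URLError, OSError):
        return False  # DNS failure, refused connection, timeout, and so on


def main(namespace: str) -> None:
    # Hypothetical entry point: pass the namespace you plan to deploy into.
    print("namespace available:", namespace_exists(namespace))
    print("Hugging Face Hub reachable:", hub_reachable())
```

Calling `main("my-namespace")` prints one line per check; `my-namespace` is a placeholder for your own target namespace.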
Model source
- The model to deploy is publicly available on the Hugging Face Hub.
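
One way to confirm that a model is publicly available is to query the Hub's model metadata endpoint without credentials. The sketch below is an assumption-laden illustration: the helper names are hypothetical, and it relies on the `private` and `gated` fields as they commonly appear in Hub model metadata, treating a gated model as not freely downloadable. Check the Hub API documentation before depending on these fields.

```python
import json
import urllib.error
import urllib.request
from typing import Optional

API_URL = "https://huggingface.co/api/models/{repo_id}"


def is_publicly_downloadable(info: dict) -> bool:
    """Interpret model metadata: public means neither private nor gated.

    Assumption: `private` is a boolean and `gated` is False or a string
    such as "auto" or "manual" when access must be requested.
    """
    return not info.get("private", False) and not info.get("gated", False)


def fetch_model_info(repo_id: str, timeout: float = 10.0) -> Optional[dict]:
    """Fetch metadata anonymously; return None if the Hub refuses the request."""
    url = API_URL.format(repo_id=repo_id)
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return json.load(response)
    except urllib.error.HTTPError:
        return None  # missing, private, or otherwise inaccessible repository


def main(repo_id: str) -> None:
    # Hypothetical entry point: repo_id looks like "organization/model-name".
    info = fetch_model_info(repo_id)
    if info is None:
        print(f"{repo_id}: not accessible without credentials")
    else:
        print(f"{repo_id}: public = {is_publicly_downloadable(info)}")
```

Because the request carries no token, a model that passes this check should also be retrievable by a cluster that has no Hub credentials configured.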
Model serving and compute resources
- KServe, the Kubernetes-native model serving system, is installed and operational in the cluster, and the vLLM 0.18.0 runtime is deployed and integrated with KServe.
- Sufficient CPU, memory, and GPU resources are available in the cluster for model deployment.
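
The KServe and GPU checks can be approximated with `kubectl` output, as in the following sketch. It makes several assumptions: `kubectl` is configured for the cluster, KServe registers the `inferenceservices.serving.kserve.io` custom resource definition, and GPUs are exposed under the `nvidia.com/gpu` resource name (the NVIDIA device plugin convention; other vendors use different names). The helper names are hypothetical. The script reports allocatable capacity, not what is currently free, and it does not verify the vLLM runtime version.

```python
import json
import subprocess

KSERVE_CRD = "inferenceservices.serving.kserve.io"
GPU_RESOURCE = "nvidia.com/gpu"  # assumption: NVIDIA device plugin naming


def kubectl_json(*args: str) -> dict:
    """Run a kubectl command with JSON output and return the parsed result."""
    completed = subprocess.run(
        ["kubectl", *args, "-o", "json"],
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if kubectl reports an error
    )
    return json.loads(completed.stdout)


def total_allocatable_gpus(node_list: dict, resource: str = GPU_RESOURCE) -> int:
    """Sum the allocatable GPU count across all nodes in a NodeList."""
    total = 0
    for node in node_list.get("items", []):
        allocatable = node.get("status", {}).get("allocatable", {})
        total += int(allocatable.get(resource, "0"))
    return total


def main() -> None:
    # Fails with CalledProcessError if the KServe CRD is not installed.
    kubectl_json("get", "crd", KSERVE_CRD)
    print("KServe InferenceService CRD found")
    gpus = total_allocatable_gpus(kubectl_json("get", "nodes"))
    print(f"allocatable GPUs across nodes: {gpus}")
```

Keeping the GPU count in a separate function that works on parsed JSON makes it possible to test the logic against a saved `kubectl get nodes -o json` dump without cluster access.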