Limits and Limitations

This reference lists the limits and limitations that apply to ModelZ.

Please contact us on Discord if you have any questions about the current limits.

General limits

The maximum number of replicas per deployment is 5.

Please contact us on Discord or via email if you need more.

Inference

Startup time

The maximum startup time is 600 seconds. Inference fails if startup takes longer than that.
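To estimate whether a model is likely to fit within the startup limit, it can help to time the loading step locally before deploying. The following is a minimal sketch, not a ModelZ API: `startup_timer` and `MAX_STARTUP_SECONDS` are hypothetical names introduced here, and local timing is only a rough proxy for startup time on the platform.

```python
import time
from contextlib import contextmanager

# Startup limit in seconds, taken from the limit stated above.
MAX_STARTUP_SECONDS = 600


@contextmanager
def startup_timer(limit: float = MAX_STARTUP_SECONDS):
    """Measure how long the wrapped block takes and compare it to the limit.

    Hypothetical helper for local experiments; it does not talk to ModelZ.
    """
    result = {"elapsed": None, "within_limit": None}
    start = time.monotonic()
    try:
        yield result
    finally:
        # Fill in the measurements once the wrapped block has finished.
        result["elapsed"] = time.monotonic() - start
        result["within_limit"] = result["elapsed"] <= limit
```

For example, wrap the model loading code in `with startup_timer() as t:` and inspect `t["within_limit"]` afterwards.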

Inference time

The maximum inference time per request is 60 seconds.
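A client can mirror this limit with its own timeout so that it does not wait longer than the platform allows. The sketch below uses only the Python standard library. The endpoint URL passed to `infer` and the JSON payload shape are assumptions, since this page does not specify them. Note that the `timeout` argument of `urllib.request.urlopen` applies to individual blocking socket operations, so it approximates a total deadline rather than enforcing one.

```python
import json
import urllib.request

# Per-request inference limit in seconds, taken from the limit stated above.
MAX_INFERENCE_SECONDS = 60


def build_request(url: str, payload: dict) -> urllib.request.Request:
    """Build a JSON POST request (the payload shape is an assumption)."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def infer(url: str, payload: dict, timeout: float = MAX_INFERENCE_SECONDS) -> bytes:
    """Send the request and give up if a socket operation exceeds the timeout."""
    request = build_request(url, payload)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read()
```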

Request body size

The maximum request body size is 5 MB.
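Checking the encoded size of a request before sending it avoids a round trip that would be rejected. The following is a minimal sketch under stated assumptions: the helper names are hypothetical, the body is assumed to be JSON, and because this page does not say whether the limit is counted in decimal or binary megabytes, the sketch uses the smaller (decimal) value to stay on the safe side.

```python
import json

# Assumption: decimal megabytes (5,000,000 bytes), the more conservative reading.
MAX_BODY_BYTES = 5 * 1000 * 1000


def encode_payload(payload: dict) -> bytes:
    """Serialize the payload compactly as UTF-8 JSON."""
    return json.dumps(payload, separators=(",", ":")).encode("utf-8")


def fits_body_limit(body: bytes, limit: int = MAX_BODY_BYTES) -> bool:
    """Return True if the encoded body is within the size limit."""
    return len(body) <= limit
```

If `fits_body_limit` returns `False`, the caller can shrink the input (for example, by downscaling an image or splitting a batch) before sending.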

Registry

Image

Currently, we only support public images. Private images are not supported yet.

Hugging Face

If you are using the Hugging Face Hub, only public models are supported.

Log

Currently, only the most recent 30 minutes of logs are displayed. We are working on extending this to a longer historical range.
