FAQ

Answers to frequently asked questions about Upbox.

What frameworks are supported?

PyTorch, TensorFlow, JAX, scikit-learn, XGBoost, ONNX, and HuggingFace Transformers. Custom runtimes are supported via Docker.

How fast are deployments?

Most deployments complete in under 10 seconds. Large models (10GB+) may take up to 60 seconds for initial deployment.

What's the pricing model?

Pay per inference request plus compute time. Scale-to-zero means you pay nothing when idle. Volume discounts available for enterprise.
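To make the per-request-plus-compute model concrete, here is a minimal sketch of the cost arithmetic. The rates used are hypothetical placeholders, not Upbox's actual prices (which this FAQ does not state):

```python
def monthly_cost(requests, avg_compute_seconds,
                 price_per_request=0.00001,
                 price_per_compute_second=0.0001):
    """Estimate monthly cost under a pay-per-request + compute-time model.

    The two rates are hypothetical placeholders, not Upbox's real pricing.
    With scale-to-zero, idle time adds nothing, so cost scales only with
    traffic actually served.
    """
    request_cost = requests * price_per_request
    compute_cost = requests * avg_compute_seconds * price_per_compute_second
    return request_cost + compute_cost

# Example: 1M requests/month, 50 ms of compute each (hypothetical rates).
estimate = monthly_cost(1_000_000, 0.05)
```

Under these placeholder rates, a million 50 ms requests would cost $10 in request fees plus $5 in compute time.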

Is there a free tier?

Yes: 100,000 free inference requests per month, forever. No credit card required to start.

What's the latency?

Sub-50ms P95 latency globally via our edge network. For latency-critical apps, always-on mode guarantees sub-10ms.
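P95 latency means 95% of requests complete at or below that time. A simple way to check the claim against your own endpoint is to time repeated calls and take the 95th percentile (nearest-rank method); the `call` argument below is a stand-in for your own HTTP request, not an Upbox API:

```python
import math
import time

def p95(samples):
    """95th-percentile of a list of latency samples (nearest-rank method)."""
    ordered = sorted(samples)
    idx = math.ceil(len(ordered) * 0.95) - 1
    return ordered[idx]

def measure(call, n=100):
    """Time `call` n times; return per-call latencies in milliseconds."""
    latencies = []
    for _ in range(n):
        start = time.perf_counter()
        call()  # substitute a real request to your deployed endpoint
        latencies.append((time.perf_counter() - start) * 1000)
    return latencies
```

Measuring from several regions gives a fairer picture of the "globally" part of the claim, since edge routing is what keeps the tail latency down.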

Can I deploy private models?

Absolutely. Models are deployed in isolated containers, and VPC deployment is available for additional security.

What's the max model size?

No hard limit. We've deployed models up to 175B parameters. Large models automatically use model parallelism across GPUs.
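To see why models at that scale need parallelism, a back-of-envelope estimate of weight memory helps: at fp16 precision, each parameter takes 2 bytes, so 175B parameters is roughly 350 GB of weights alone. The 80 GB card size below is an assumption (an A100/H100-class GPU), and the estimate ignores activations and serving overhead:

```python
import math

def gpus_needed(params_billions, bytes_per_param=2, gpu_memory_gb=80):
    """Minimum GPUs whose combined memory holds the model weights alone.

    fp16 = 2 bytes/param; 80 GB/GPU is an assumption (A100/H100-class).
    Ignores activations, KV cache, and serving overhead, so a real
    deployment needs headroom beyond this floor.
    """
    weight_gb = params_billions * bytes_per_param
    return math.ceil(weight_gb / gpu_memory_gb)

# 175B params -> 350 GB of fp16 weights, sharded across multiple GPUs.
minimum = gpus_needed(175)
```

This is why the weights of a 175B-parameter model cannot fit on any single current GPU, and sharding them across devices (model parallelism) is the only option.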

Do you support fine-tuning?

Upbox focuses on deployment. For training, use your preferred platform and deploy the resulting model to Upbox.
