AI Gateway
How QuickCloud routes AI inference through a tenant-isolated gateway — and how to connect from your environment.
Every QuickCloud product that uses AI routes inference through the QuickCloud AI Gateway — a tenant-isolated proxy that manages authentication, rate limiting, cost attribution, and model routing. You don't need your own Anthropic or OpenAI API keys.
How it works
Your QuickCloud product (Docker container)
│
│ HTTPS, mutual TLS
▼
QuickCloud AI Gateway (gateway.quickcloud.co)
│
├─── Anthropic Claude ──▶ Mainframe Migration, Database Migration,
│ Identity & Access (IAM) Migration, Modernization, Security & Cost Intelligence (AI)
│
└─── OpenAI GPT ────────▶ QA Automation (AI), Performance & Load Testing
The Gateway:
- Authenticates each request using your tenant API key (injected automatically into the Docker container)
- Enforces per-tenant rate limits so one customer's usage doesn't affect another
- Routes to the appropriate model based on the product
- Logs every request for usage reporting in your dashboard and the admin portal
- Retries on transient failures with exponential backoff
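To illustrate the last point, here is a minimal sketch of retrying with exponential backoff. It is not the Gateway's actual implementation: the `TransientError` class, the retry count, the base delay, the cap, and the use of full jitter are all assumptions chosen for illustration.

```python
import random
import time


class TransientError(Exception):
    """Hypothetical marker for a retryable failure (e.g. a timeout or HTTP 503)."""


def backoff_delays(max_retries=5, base=0.5, cap=30.0):
    """Return the un-jittered delay before each retry: base * 2**attempt, capped."""
    return [min(cap, base * (2 ** attempt)) for attempt in range(max_retries)]


def call_with_backoff(send, max_retries=5, base=0.5, cap=30.0, sleep=time.sleep):
    """Call send() and retry on TransientError with exponential backoff and full jitter."""
    delays = backoff_delays(max_retries, base, cap)
    for attempt in range(max_retries + 1):
        try:
            return send()
        except TransientError:
            if attempt == max_retries:
                raise  # retries exhausted; surface the last error
            # Full jitter: sleep a random amount up to the exponential delay.
            sleep(random.uniform(0, delays[attempt]))
```

The jitter spreads retries from many containers over time so they do not all hit the upstream provider at the same instant. The `sleep` parameter is injectable only to make the function easy to test.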
Connecting from your environment
Your Docker containers connect to the Gateway automatically — no configuration needed for standard deployments.
If you're running in a locked-down network (for example, a VPC with no default internet route or a strict egress firewall), allow outbound HTTPS to:
gateway.quickcloud.co:443
AWS — Security Group / NACL
Type: HTTPS (443)
Protocol: TCP
Destination: gateway.quickcloud.co
Resolve the Gateway's current IP addresses at deploy time:
dig +short gateway.quickcloud.co
Add those IPs to your egress rules or use a DNS-based firewall rule.
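If your deployment tooling is scripted, the same lookup can be done in code. The sketch below uses Python's standard `socket` module; the helper names `resolve_ipv4` and `to_cidrs` are hypothetical, and the `/32` format is an assumption about what your egress rules accept.

```python
import socket


def resolve_ipv4(hostname, port=443):
    """Return the sorted, de-duplicated IPv4 addresses hostname resolves to right now."""
    infos = socket.getaddrinfo(
        hostname, port, family=socket.AF_INET, type=socket.SOCK_STREAM
    )
    # Each entry is (family, type, proto, canonname, (address, port)).
    return sorted({info[4][0] for info in infos})


def to_cidrs(addresses):
    """Format single addresses as /32 CIDR blocks (assumed egress-rule format)."""
    return [f"{addr}/32" for addr in addresses]
```

Calling `to_cidrs(resolve_ipv4("gateway.quickcloud.co"))` at deploy time yields the list to feed into your rule definitions. Resolved addresses can change between deployments, so rerun the lookup whenever you redeploy rather than hard-coding the result.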
Azure — NSG outbound rule
Destination: Service Tag or FQDN gateway.quickcloud.co
Port: 443
Protocol: TCP
Action: Allow
On-premises / private cloud
Route outbound HTTPS through your existing internet proxy. Set the standard proxy environment variables in your container:
environment:
  HTTPS_PROXY: http://your-proxy.internal:3128
  NO_PROXY: localhost,127.0.0.1,.internal
The Gateway client respects standard proxy env vars and will use your proxy for all AI inference calls.
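To check how these variables are likely to be interpreted before you deploy, you can use Python's standard library, which follows the same conventions. This is only a sketch: `proxy_for` is a hypothetical helper, and it assumes the Gateway client applies matching rules similar to those in `urllib.request`.

```python
import os
import urllib.request

# Values taken from the example above; replace them with your own proxy.
os.environ["HTTPS_PROXY"] = "http://your-proxy.internal:3128"
os.environ["NO_PROXY"] = "localhost,127.0.0.1,.internal"


def proxy_for(host):
    """Return the HTTPS proxy URL used for host, or None if NO_PROXY bypasses it."""
    proxies = urllib.request.getproxies_environment()
    if urllib.request.proxy_bypass_environment(host, proxies):
        return None
    return proxies.get("https")
```

With the values above, `proxy_for("gateway.quickcloud.co")` returns the proxy URL, while hosts such as `localhost` or anything ending in `.internal` bypass it. Note that a lowercase `https_proxy` or `no_proxy` already present in the environment takes precedence over the uppercase form in `urllib.request`.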
Tenant API key
Your tenant API key is provisioned automatically when your subscription is activated. It's injected into the Docker container via an environment variable — you don't need to manage it manually.
You can view your current key and rotate it from the dashboard under AI Gateway → API Key.
Usage and costs
All AI inference costs are bundled into your subscription — there are no per-token charges. Usage is visible in your dashboard and in the admin portal.
Rate limits
| Plan | Requests / minute | Tokens / minute |
|---|---|---|
| Migration Bundle | 60 | 100,000 |
| Cloud Ops Bundle | 60 | 100,000 |
| Full Platform | 200 | 500,000 |
| Enterprise | Custom | Custom |
If you hit a rate limit, the Gateway queues the request and retries it automatically. Sustained overages alert the QuickCloud team, who may contact you to discuss a plan upgrade.
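Because the Gateway queues automatically, client-side pacing is optional. If you want to estimate whether a workload fits your plan, the sketch below tracks requests and tokens over a sliding 60-second window. The `MinuteBudget` class is hypothetical, and the windowing scheme is an assumption; the Gateway's own accounting may differ.

```python
import time
from dataclasses import dataclass, field

# Per-minute limits copied from the table above (Enterprise is custom, so omitted).
PLAN_LIMITS = {
    "Migration Bundle": (60, 100_000),
    "Cloud Ops Bundle": (60, 100_000),
    "Full Platform": (200, 500_000),
}


@dataclass
class MinuteBudget:
    """Sliding 60-second window over (timestamp, tokens) records."""

    max_requests: int
    max_tokens: int
    events: list = field(default_factory=list)

    def try_acquire(self, tokens, now=None):
        """Record the request and return True if it fits in the current window."""
        now = time.monotonic() if now is None else now
        # Drop records that are 60 seconds old or older.
        self.events = [(t, n) for t, n in self.events if now - t < 60.0]
        used_tokens = sum(n for _, n in self.events)
        if len(self.events) >= self.max_requests:
            return False
        if used_tokens + tokens > self.max_tokens:
            return False
        self.events.append((now, tokens))
        return True
```

For example, `MinuteBudget(*PLAN_LIMITS["Migration Bundle"])` accepts 60 requests within a minute and rejects the 61st until the oldest record ages out of the window.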