QuickCloud Docs

AI Gateway

How QuickCloud routes AI inference through a tenant-isolated gateway — and how to connect from your environment.

Every QuickCloud product that uses AI routes inference through the QuickCloud AI Gateway — a tenant-isolated proxy that manages authentication, rate limiting, cost attribution, and model routing. You don't need your own Anthropic or OpenAI API keys.

How it works

Your QuickCloud product (Docker container)
        │
        │  HTTPS, mutual TLS
        ▼
QuickCloud AI Gateway (gateway.quickcloud.co)
        │
        ├─── Anthropic Claude ──▶ Mainframe Migration, Database Migration,
        │                          Identity & Access (IAM) Migration, Modernization, Security & Cost Intelligence (AI)
        │
        └─── OpenAI GPT ────────▶ QA Automation (AI), Performance & Load Testing
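
The routing shown in the diagram can be written down as a plain lookup table. This is only an illustration of the mapping, not the Gateway's implementation: the names `ROUTES` and `provider_for` are hypothetical, and the sketch assumes each comma-separated name in the diagram is a separate product.

```python
# Product-to-provider routing, transcribed from the diagram above.
# ROUTES and provider_for are illustrative names, not part of any QuickCloud API.
ROUTES = {
    "Mainframe Migration": "Anthropic Claude",
    "Database Migration": "Anthropic Claude",
    "Identity & Access (IAM) Migration": "Anthropic Claude",
    "Modernization": "Anthropic Claude",
    "Security & Cost Intelligence (AI)": "Anthropic Claude",
    "QA Automation (AI)": "OpenAI GPT",
    "Performance & Load Testing": "OpenAI GPT",
}


def provider_for(product: str) -> str:
    """Return the model provider that a product's inference is routed to."""
    try:
        return ROUTES[product]
    except KeyError:
        raise ValueError(f"unknown product: {product!r}") from None
```

For example, `provider_for("QA Automation (AI)")` returns `"OpenAI GPT"`.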

The Gateway:

  • Authenticates each request using your tenant API key (injected automatically into the Docker container)
  • Enforces per-tenant rate limits so one customer's usage doesn't affect another
  • Routes to the appropriate model based on the product
  • Logs every request for usage reporting in your dashboard and the admin portal
  • Retries on transient failures with exponential backoff
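
The last bullet, retrying with exponential backoff, follows a common pattern that can be sketched generically. The Gateway's actual retry parameters are not documented here, so the base delay, growth factor, cap, and attempt count below are assumptions, and `TransientError` is a hypothetical stand-in for whatever the Gateway treats as retryable.

```python
import time
from typing import Callable, Iterator, TypeVar

T = TypeVar("T")


class TransientError(Exception):
    """Hypothetical stand-in for a retryable failure (e.g. a timeout)."""


def backoff_delays(base: float = 0.5, factor: float = 2.0,
                   cap: float = 8.0, attempts: int = 5) -> Iterator[float]:
    """Yield the wait before each retry: base, base*factor, ... capped at cap."""
    delay = base
    for _ in range(attempts):
        yield min(delay, cap)
        delay *= factor


def call_with_retry(fn: Callable[[], T],
                    sleep: Callable[[float], None] = time.sleep,
                    **backoff) -> T:
    """Call fn, retrying on TransientError with exponentially growing waits."""
    delays = backoff_delays(**backoff)
    while True:
        try:
            return fn()
        except TransientError:
            delay = next(delays, None)
            if delay is None:
                raise  # retries exhausted, surface the last error
            sleep(delay)
```

With the assumed defaults, the waits between attempts are 0.5, 1, 2, 4, and 8 seconds. Production implementations often add random jitter to each delay so that many clients do not retry in lockstep.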

Connecting from your environment

Your Docker containers connect to the Gateway automatically — no configuration needed for standard deployments.

If you're running in a locked-down network (restricted-egress VPC, strict egress firewall), you need to allow outbound HTTPS to:

gateway.quickcloud.co:443
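
To confirm that an egress rule is in effect, a quick TCP check from inside the network can help. The sketch below is a generic reachability test, not a QuickCloud tool, and `can_reach` is a hypothetical helper name. It only verifies that a TCP connection to the port can be opened. It does not perform the TLS handshake or authenticate with the Gateway.

```python
import socket


def can_reach(host: str, port: int = 443, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return False
```

Calling `can_reach("gateway.quickcloud.co")` from a container in the restricted network should return `True` once the egress rule is applied.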

AWS — Security Group / NACL

Type: HTTPS (443)
Protocol: TCP
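Destination: IP addresses (CIDR) resolved from gateway.quickcloud.co

Security groups and NACLs accept CIDR blocks rather than hostnames, so resolve the Gateway's current IP addresses at deploy time: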

dig +short gateway.quickcloud.co

Add those IPs to your egress rules, or use a DNS-based firewall rule, which matches on the hostname rather than on fixed addresses.
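
If the lookup is scripted as part of a deployment, the same resolution can be done in code. This is a minimal sketch using only the Python standard library. The helper names `resolve_ipv4` and `to_egress_cidrs` are hypothetical, and formatting each address as a /32 is an assumption about how you want to express the rule.

```python
import ipaddress
import socket
from typing import Iterable, List


def resolve_ipv4(host: str, port: int = 443) -> List[str]:
    """Resolve host to its current IPv4 addresses via the system resolver."""
    infos = socket.getaddrinfo(host, port, family=socket.AF_INET,
                               type=socket.SOCK_STREAM)
    return [info[4][0] for info in infos]


def to_egress_cidrs(addresses: Iterable[str]) -> List[str]:
    """Deduplicate IPv4 addresses and format each as a /32 CIDR block."""
    unique = {ipaddress.IPv4Address(a) for a in addresses}
    return [f"{ip}/32" for ip in sorted(unique)]
```

`to_egress_cidrs(resolve_ipv4("gateway.quickcloud.co"))` then yields the list to feed into your egress rules. Resolved addresses can change over time, so re-run the lookup on each deploy rather than hard-coding the result.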

Azure — NSG outbound rule

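Destination: IP addresses resolved from gateway.quickcloud.co (NSG rules match IP addresses or service tags, not FQDNs; FQDN filtering requires Azure Firewall)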
Port: 443
Protocol: TCP
Action: Allow

On-premises / private cloud

Route outbound HTTPS through your existing internet proxy. Set the standard proxy environment variables in your container:

environment:
  HTTPS_PROXY: http://your-proxy.internal:3128
  NO_PROXY: localhost,127.0.0.1,.internal

The Gateway client respects standard proxy env vars and will use your proxy for all AI inference calls.
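
To see how these variables are conventionally interpreted, the sketch below evaluates them with Python's standard library. It illustrates the usual `HTTPS_PROXY` / `NO_PROXY` semantics only. The Gateway client's own parsing is not documented here and may differ in detail, and `proxy_for` is a hypothetical helper name.

```python
import os
import urllib.request
from typing import Mapping, Optional
from unittest import mock


def proxy_for(host: str, env: Mapping[str, str]) -> Optional[str]:
    """Return the HTTPS proxy URL used for host, or None if it is bypassed."""
    # Evaluate against the given variables only, not the real environment.
    with mock.patch.dict(os.environ, dict(env), clear=True):
        proxies = urllib.request.getproxies_environment()
    if urllib.request.proxy_bypass_environment(host, proxies):
        return None  # host matches an entry in NO_PROXY
    return proxies.get("https")


# The values from the container configuration above.
ENV = {
    "HTTPS_PROXY": "http://your-proxy.internal:3128",
    "NO_PROXY": "localhost,127.0.0.1,.internal",
}
```

Under these settings, `proxy_for("gateway.quickcloud.co", ENV)` returns the proxy URL, while a host such as `registry.internal` matches the `.internal` suffix in `NO_PROXY` and returns `None`.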

Tenant API key

Your tenant API key is provisioned automatically when your subscription is activated. It's injected into the Docker container via an environment variable — you don't need to manage it manually.

You can view your current key and rotate it from the dashboard under AI Gateway → API Key.

Usage and costs

All AI inference costs are bundled into your subscription — there are no per-token charges. Usage is visible in your dashboard and in the admin portal.

Rate limits

Plan                Requests / minute    Tokens / minute
Migration Bundle    60                   100,000
Cloud Ops Bundle    60                   100,000
Full Platform       200                  500,000
Enterprise          Custom               Custom

If you hit rate limits, the Gateway queues and retries automatically. Sustained overages trigger an alert to the QuickCloud team to discuss a plan upgrade.
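
When sizing a workload, the limits in the table can be checked with a few lines of code. This is a planning aid only, not how the Gateway enforces limits. `PLAN_LIMITS` and `within_limits` are hypothetical names, and Enterprise is represented as `None` because its limits are custom.

```python
from typing import Dict, Optional, Tuple

# (requests per minute, tokens per minute) per plan, from the table above.
# Enterprise limits are custom, so they are represented as None here.
PLAN_LIMITS: Dict[str, Optional[Tuple[int, int]]] = {
    "Migration Bundle": (60, 100_000),
    "Cloud Ops Bundle": (60, 100_000),
    "Full Platform": (200, 500_000),
    "Enterprise": None,
}


def within_limits(plan: str, requests_per_min: int,
                  tokens_per_min: int) -> Optional[bool]:
    """Return True/False for fixed plans, or None when limits are custom."""
    limits = PLAN_LIMITS[plan]
    if limits is None:
        return None
    max_requests, max_tokens = limits
    return requests_per_min <= max_requests and tokens_per_min <= max_tokens
```

For example, a workload of 150 requests and 400,000 tokens per minute fits Full Platform but exceeds both bundle plans.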