PRICING
Match or beat what you pay today.
Consumption-based pricing. No seat fees. No platform tax. Pay for compute, get the fastest inference on earth.
Start Free: $500 in credits, no card required →
Custom Model Hosting
Have a fine-tuned or proprietary model?
Run your own weights on PolarGrid edge infrastructure. Same low latency, no infra to manage. Pricing based on model size and usage.
Starter
+ compute at standard rates
- ✓ $500 in free credits
- ✓ Access to all edge nodes
- ✓ All models (STT, LLM, TTS)
- ✓ Python SDK, TypeScript SDK, and CLI
- ✓ Playground access
- ✓ Pay-as-you-go after credits
Reserved Capacity
Volume-based rate · locked for contract duration
- ✓ Everything in Starter
- ✓ Reserved GPU allocation
- ✓ Priority capacity across all nodes
- ✓ Early-bird rate locked in
- ✓ Dedicated support
- ✓ SLA guarantees
- ✓ Custom contract terms
Enterprise
For organizations with high-volume inference needs
- ✓ Everything in Reserved
- ✓ Multi-region private deployment
- ✓ Dedicated hardware options
- ✓ HIPAA / SOC 2 compliance (on the roadmap)
- ✓ Private Slack support
- ✓ Volume discounts
- ✓ ML engineering services
Model Pricing
Pay per token or per minute. No platform fees, no data transfer charges, no storage fees.
Large Language Models
| Model | Input / 1M tokens | Output / 1M tokens |
|---|---|---|
| Qwen 3.5 27B | $0.20 | $0.75 |
| Qwen 3.5 9B | $0.055 | $0.085 |
Speech-to-Text
| Model | Price |
|---|---|
| Whisper Large V3 Turbo | $0.004 / min |
| Cohere Transcribe | $0.004 / min |
Text-to-Speech
| Model | Price |
|---|---|
| Hume AI TADA | $0.008 / min |
| Kokoro 82M | $0.008 / min |
Voice Pipeline
| Model | Price |
|---|---|
| PersonaPlex (STT + LLM + TTS) | $0.070 / min |
All prices are in USD. Volume discounts of 5–15% are available starting at $5K/month in committed spend. Contact hello@polargrid.ai for enterprise pricing.
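To estimate a monthly bill from the rates above, a minimal sketch follows. The rates are copied from the tables on this page. The function names are hypothetical and are not part of any PolarGrid SDK. The page gives only a 5–15% discount range, not a tier schedule, so the discount rate is an input you would get from your own quote.

```python
from decimal import Decimal

# Rates in USD, copied from the pricing tables on this page.
# LLM rates are (input, output) per 1M tokens.
LLM_RATES = {
    "Qwen 3.5 27B": (Decimal("0.20"), Decimal("0.75")),
    "Qwen 3.5 9B": (Decimal("0.055"), Decimal("0.085")),
}
# Speech and pipeline rates are per minute of audio.
PER_MINUTE_RATES = {
    "Whisper Large V3 Turbo": Decimal("0.004"),
    "Cohere Transcribe": Decimal("0.004"),
    "Hume AI TADA": Decimal("0.008"),
    "Kokoro 82M": Decimal("0.008"),
    "PersonaPlex (STT + LLM + TTS)": Decimal("0.070"),
}
MIN_COMMITTED_SPEND = Decimal("5000")  # discounts start at $5K/month
MILLION = Decimal(1_000_000)


def llm_cost(model, input_tokens, output_tokens):
    """Cost in USD for a given number of input and output tokens."""
    in_rate, out_rate = LLM_RATES[model]
    return in_rate * input_tokens / MILLION + out_rate * output_tokens / MILLION


def audio_cost(model, minutes):
    """Cost in USD for a given number of audio minutes."""
    return PER_MINUTE_RATES[model] * Decimal(str(minutes))


def apply_volume_discount(committed_monthly_spend, discount_rate):
    """Apply a quoted discount rate (assumption: supplied by the caller).

    Spend below $5K/month is returned unchanged. The rate must fall
    inside the 5-15% range stated on this page.
    """
    spend = Decimal(str(committed_monthly_spend))
    rate = Decimal(str(discount_rate))
    if spend < MIN_COMMITTED_SPEND:
        return spend
    if not (Decimal("0.05") <= rate <= Decimal("0.15")):
        raise ValueError("discount rate must be between 0.05 and 0.15")
    return spend * (1 - rate)
```

For example, `llm_cost("Qwen 3.5 27B", 10_000_000, 2_000_000)` adds $2.00 of input to $1.50 of output, and `audio_cost("PersonaPlex (STT + LLM + TTS)", 1000)` prices 1,000 minutes of the voice pipeline.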
Faster than the cloud.
PolarGrid runs open-weight models on owned GPU infrastructure. No hyperscaler markup. No API abstraction tax.
How PolarGrid compares
Lower latency. No unnecessary hops.
| Provider | P95 Latency | TTS price (per 1K chars) | LLM price (per 1M output tokens) | Edge Routing |
|---|---|---|---|---|
| PolarGrid | <30ms | $0.008 | $0.45 | ✓ Yes — auto-routed |
| OpenAI | 200–400ms | $0.015 | $2.00 | ✗ Centralized |
| Groq | 80–150ms | — | $0.90 | ✗ Centralized |
| Fireworks.ai | 100–200ms | — | $0.90 | ✗ Centralized |
| ElevenLabs | 300–600ms | $0.18 | — | ✗ Centralized |
Latency figures are approximate P95 estimates based on typical North American usage. TTS pricing reflects 1,000 characters. LLM pricing reflects output tokens. Contact us for a custom side-by-side with your actual workload.
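A rough side-by-side for your own output-token volume can be sketched from the LLM column of the table above. The figures are the approximate per-1M-output-token prices listed there, the function name is hypothetical, and ElevenLabs is omitted because the table lists no LLM price for it. This ignores input tokens and any negotiated rates, so treat it as a first pass only.

```python
from decimal import Decimal

# LLM output price per 1M tokens (USD), copied from the comparison table.
OUTPUT_PRICE_PER_1M = {
    "PolarGrid": Decimal("0.45"),
    "OpenAI": Decimal("2.00"),
    "Groq": Decimal("0.90"),
    "Fireworks.ai": Decimal("0.90"),
}


def side_by_side(output_tokens):
    """Return the output-token cost in USD per provider."""
    million = Decimal(1_000_000)
    return {
        provider: price * output_tokens / million
        for provider, price in OUTPUT_PRICE_PER_1M.items()
    }


if __name__ == "__main__":
    # Print providers from cheapest to most expensive for 50M output tokens.
    costs = side_by_side(50_000_000)
    for provider, cost in sorted(costs.items(), key=lambda item: item[1]):
        print(f"{provider:<14} ${cost:.2f}")
```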
Common questions
When do my free credits expire?
$500 in free credits are applied on signup and are valid for 6 months. No credit card required to start.
Can you match my current provider's pricing?
Yes — that's our commitment. Share your current inference bill and we'll build a side-by-side. If we can't match your rate, we'll tell you upfront. No migration required to test.
What's the minimum commitment for Reserved Capacity?
Minimum 3-month contract. We lock in your rate for the full contract duration. Early-access customers get our lowest-ever pricing.
Is there a free tier after credits are used?
No. Once your $500 in free credits is used, you pay only for compute at standard rates. There is no monthly platform fee, no seat fee, and no minimum.
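The credit-then-pay-as-you-go model described in these answers can be sketched as a small calculation. This is an illustration under stated assumptions: the function name is hypothetical, usage is measured in USD at standard rates, and the 6-month credit expiry is not modeled.

```python
from decimal import Decimal

FREE_CREDITS = Decimal("500")  # signup credits in USD, per this page


def credits_and_bill(usage_to_date, free_credits=FREE_CREDITS):
    """Return (credits_remaining, amount_billed) for cumulative usage in USD.

    Usage is drawn from free credits first. Anything beyond the credits
    is billed, with no platform fee, seat fee, or minimum added.
    """
    usage = Decimal(str(usage_to_date))
    covered = min(usage, free_credits)
    return free_credits - covered, usage - covered
```

With $320 of usage the bill is zero and $180 of credits remain. With $650 of usage the credits are exhausted and $150 is billed.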
Your first $500 is on us.
Sign up, open the playground, run your pipeline. See what PolarGrid can do for your latency — before you pay anything.
No credit card required · Cancel any time · $500 free credits