PRICING

Match or beat what you pay today.

Consumption-based pricing. No seat fees. No platform tax. Pay for compute, get the fastest inference on earth.

Start Free — $500 in credits, no card required →

Custom Model Hosting

Have a fine-tuned or proprietary model?

Run your own weights on PolarGrid edge infrastructure. Same low latency, no infra to manage. Pricing based on model size and usage.

Learn more →

FREE TIER

Starter

$0/month

+ compute at standard rates

  • $500 in free credits
  • Access to all edge nodes
  • All models (STT, LLM, TTS)
  • Python SDK, TypeScript SDK, and CLI
  • Playground access
  • Pay-as-you-go after credits
Get Started Free →

EARLY ACCESS

Reserved Capacity

Custom

Volume-based rate · locked for contract duration

  • Everything in Starter
  • Reserved GPU allocation
  • Priority capacity across all nodes
  • Early-bird rate locked in
  • Dedicated support
  • SLA guarantees
  • Custom contract terms
Talk to Us →

ENTERPRISE

Enterprise

Custom

For organizations with high-volume inference needs

  • Everything in Reserved
  • Multi-region private deployment
  • Dedicated hardware options
  • HIPAA / SOC 2 (roadmap)
  • Private Slack support
  • Volume discounts
  • ML engineering services
Contact Us →

Model Pricing

Pay per token or per minute. No platform fees, no data transfer charges, no storage fees.

Large Language Models

Model           Input / 1M tokens   Output / 1M tokens
Qwen 3.5 27B    $0.20               $0.75
Qwen 3.5 9B     $0.055              $0.085
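
To make the per-token rates concrete, here is a minimal Python sketch that estimates an LLM bill from token counts. The rates are copied from the table above; the function name `llm_cost` and the example workload (40M input tokens, 10M output tokens) are illustrative assumptions, not part of any PolarGrid SDK.

```python
# Rates in USD per 1M tokens, copied from the pricing table above.
LLM_RATES = {
    "Qwen 3.5 27B": {"input": 0.20, "output": 0.75},
    "Qwen 3.5 9B": {"input": 0.055, "output": 0.085},
}


def llm_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost in USD for a given number of input and output tokens."""
    rate = LLM_RATES[model]
    total = input_tokens * rate["input"] + output_tokens * rate["output"]
    return total / 1_000_000


if __name__ == "__main__":
    # Hypothetical monthly workload: 40M input tokens, 10M output tokens.
    for model in LLM_RATES:
        cost = llm_cost(model, 40_000_000, 10_000_000)
        print(f"{model}: ${cost:.2f}")
```

For that example workload the estimate is about $15.50 on Qwen 3.5 27B and about $3.05 on Qwen 3.5 9B, before any volume discount.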

Speech-to-Text

Model                    Price
Whisper Large V3 Turbo   $0.004 / min
Cohere Transcribe        $0.004 / min

Text-to-Speech

Model          Price
Hume AI TADA   $0.008 / min
Kokoro 82M     $0.008 / min

Voice Pipeline

Model                           Price
PersonaPlex (STT + LLM + TTS)   $0.070 / min
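
Per-minute pricing works the same way as per-token pricing: multiply the rate by usage. The sketch below assumes the per-minute rates listed in the speech and voice tables above; the function name `audio_cost` and the 10,000-minute workload are hypothetical.

```python
# Rates in USD per minute of audio, copied from the tables above.
PER_MINUTE_RATES = {
    "Whisper Large V3 Turbo": 0.004,
    "Cohere Transcribe": 0.004,
    "Hume AI TADA": 0.008,
    "Kokoro 82M": 0.008,
    "PersonaPlex": 0.070,
}


def audio_cost(model: str, minutes: float) -> float:
    """Estimate the cost in USD for a number of audio minutes."""
    return PER_MINUTE_RATES[model] * minutes


if __name__ == "__main__":
    # Hypothetical workload: 10,000 minutes of audio per month.
    for model in PER_MINUTE_RATES:
        print(f"{model}: ${audio_cost(model, 10_000):.2f}")
```

At 10,000 minutes a month, that is roughly $40 for speech-to-text, $80 for text-to-speech, or $700 for the full PersonaPlex voice pipeline.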

All prices are in USD. Volume discounts of 5–15% start at $5K/month in committed spend. Contact hello@polargrid.ai for enterprise pricing.

Faster than the cloud.

PolarGrid runs open-weight models on owned GPU infrastructure. No hyperscaler markup. No API abstraction tax.

Model          Provider    Input / 1M tokens   Output / 1M tokens
GPT-4o         OpenAI      $2.50               $10.00
GPT-4o mini    OpenAI      $0.15               $0.60
Qwen 3.5 9B    PolarGrid   $0.055              $0.085
Qwen 3.5 27B   PolarGrid   $0.20               $0.75
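
As a rough illustration of the gap in list prices, the sketch below computes the percentage difference between two rows of the table above. It compares list prices only and assumes the same token volume on both sides; it says nothing about model quality. The helper name `percent_saving` is an assumption for this example.

```python
def percent_saving(baseline: float, candidate: float) -> float:
    """Return how much cheaper `candidate` is than `baseline`, in percent."""
    if baseline <= 0:
        raise ValueError("baseline price must be positive")
    return (baseline - candidate) / baseline * 100


if __name__ == "__main__":
    # Output prices in USD per 1M tokens, copied from the table above.
    gpt4o_output = 10.00
    qwen27b_output = 0.75
    saving = percent_saving(gpt4o_output, qwen27b_output)
    print(f"Qwen 3.5 27B vs GPT-4o (output tokens): {saving:.1f}% lower")
```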

How PolarGrid compares

Lower latency. No unnecessary hops.

Provider       P95 Latency   TTS (1K chars)   LLM (1M out tokens)   Edge Routing
PolarGrid      <30ms         $0.008           $0.45                 ✓ Yes (auto-routed)
OpenAI         200–400ms     $0.015           $2.00                 ✗ Centralized
Groq           80–150ms      -                $0.90                 ✗ Centralized
Fireworks.ai   100–200ms     -                $0.90                 ✗ Centralized
ElevenLabs     300–600ms     $0.18            -                     ✗ Centralized

Latency figures are approximate P95 estimates based on typical North American usage. TTS pricing reflects 1,000 characters. LLM pricing reflects output tokens. Contact us for a custom side-by-side with your actual workload.

Credit grants

Building something big? We can help with more.

Early-Stage Startups

Pre-seed or seed stage? Get up to $5,000 in additional free compute credits.

Apply →

Academic Research

University labs and researchers can get up to $2,500 in free credits for AI research projects.

Apply →

Common questions

When do my free credits expire?

$500 in free credits are applied on signup and are valid for 6 months. No credit card required to start.
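
To get a feel for how far $500 goes, here is a small Python sketch that divides the credit balance by a unit price. It assumes credits are drawn down at the standard rates listed under Model Pricing; the function name `units_covered` is hypothetical.

```python
FREE_CREDITS_USD = 500.0


def units_covered(credits_usd: float, unit_price_usd: float) -> float:
    """Return how many billing units (minutes, or millions of tokens) the credits cover."""
    if unit_price_usd <= 0:
        raise ValueError("unit price must be positive")
    return credits_usd / unit_price_usd


if __name__ == "__main__":
    # Unit prices copied from the Model Pricing tables.
    stt_minutes = units_covered(FREE_CREDITS_USD, 0.004)      # Whisper Large V3 Turbo, per minute
    voice_minutes = units_covered(FREE_CREDITS_USD, 0.070)    # PersonaPlex pipeline, per minute
    output_mtokens = units_covered(FREE_CREDITS_USD, 0.75)    # Qwen 3.5 27B output, per 1M tokens
    print(f"Speech-to-text minutes: {stt_minutes:,.0f}")
    print(f"Voice pipeline minutes: {voice_minutes:,.0f}")
    print(f"Qwen 3.5 27B output tokens (millions): {output_mtokens:,.0f}")
```

Under those assumptions, $500 covers about 125,000 minutes of speech-to-text, about 7,143 minutes of the PersonaPlex voice pipeline, or about 667M output tokens on Qwen 3.5 27B.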

Can you match my current provider's pricing?

Yes — that's our commitment. Share your current inference bill and we'll build a side-by-side. If we can't match your rate, we'll tell you upfront. No migration required to test.

What's the minimum commitment for Reserved Capacity?

Minimum 3-month contract. We lock in your rate for the full contract duration. Early-access customers get our lowest-ever pricing.

Is there a free tier after credits are used?

After your $500 in free credits, you pay only for compute at standard rates. No monthly platform fee, no seat fee, no minimums.

Your first $500 is on us.

Sign up, open the playground, run your pipeline. See what PolarGrid can do for your latency — before you pay anything.

No credit card required · Cancel any time · $500 free credits