Bring Your Own Model

Your models.
Our edge. Their speed.

Fine-tuned LLMs, proprietary voice models, custom embeddings — run them on PolarGrid's edge GPU network. Sub-30ms RTT, full data control, no infrastructure to manage.

Bring Your Own Models

Take your model library to the next level on PolarGrid's distributed edge network.

You've fine-tuned your model for your domain, your voice, your data. Running it on a distant hyperscaler defeats the purpose. PolarGrid brings it to the edge: closer to users, faster responses.

Deployment options

How to deploy.

Fine-Tuned Model Hosting

Custom weights

You've fine-tuned a model on your data. We serve it on PolarGrid edge nodes, on the same low-latency infrastructure as our shared models, with nothing for you to manage. Bring LoRA adapters, full fine-tunes, or GGUF weights.

  • Any HuggingFace-compatible checkpoint
  • LoRA & full fine-tune support
  • Pay-per-token, same rates as shared models
  • OpenAI-compatible API
  • No cold starts

Get started →

Proprietary Model Deployment

Enterprise

Your team has built or licensed a model you want running at the edge, not on shared infrastructure. You get dedicated GPU allocation, reserved VRAM, a production SLA, and full isolation.

  • Dedicated GPU allocation
  • Reserved VRAM — no queue, no sharing
  • Node-level SLA
  • Full model isolation
  • Priority routing across nodes

Get started →

How it works

Upload. Deploy. Infer.

01

Upload your model.

Point us to your HuggingFace repo, or upload weights directly. We accept any architecture our inference stack supports: transformers, diffusers, or custom.

02

We deploy to the edge.

Your model is loaded onto PolarGrid GPU nodes and served through our auto-router, which sends each request to the node that offers that user the lowest time to first token (TTFT).
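
To make the routing idea concrete, here is a minimal sketch of lowest-TTFT node selection. It is an illustration only: the `NodeEstimate` shape, the `estimateTtftMs` formula (network round trip plus queue delay plus prompt prefill), and the function names are assumptions made for this example, not a description of how PolarGrid's router is actually implemented.

```typescript
// Hypothetical per-node estimate for one incoming request.
// None of these field names come from PolarGrid; they are assumptions for the sketch.
interface NodeEstimate {
  nodeId: string;
  networkRttMs: number; // round trip between this user and the node
  queueDelayMs: number; // expected wait before the request is scheduled
  prefillMs: number;    // expected time to process the prompt before the first token
}

// Simplified TTFT model (an assumption): the three delays simply add up.
function estimateTtftMs(node: NodeEstimate): number {
  return node.networkRttMs + node.queueDelayMs + node.prefillMs;
}

// Pick the candidate with the lowest estimated TTFT.
// On a tie, the earlier candidate in the array wins.
function pickLowestTtftNode(nodes: NodeEstimate[]): NodeEstimate {
  if (nodes.length === 0) {
    throw new Error("no candidate nodes");
  }
  return nodes.reduce((best, node) =>
    estimateTtftMs(node) < estimateTtftMs(best) ? node : best,
  );
}

// Example: the nearest node is busy, so a slightly farther idle node wins.
const candidates: NodeEstimate[] = [
  { nodeId: "near-but-busy", networkRttMs: 8, queueDelayMs: 120, prefillMs: 60 },
  { nodeId: "farther-but-idle", networkRttMs: 25, queueDelayMs: 10, prefillMs: 60 },
];
const chosen = pickLowestTtftNode(candidates);
console.log(chosen.nodeId);
```

The example shows why routing on TTFT differs from routing on distance alone: the closest node is not always the one that answers first.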

03

Call it like any other model.

Use the standard OpenAI-compatible API with your custom model ID. Same SDKs, same integration — just your model at the edge.

custom-model.ts
// Your fine-tuned model, same API as everything else
import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://autorouter.edge.polargrid.ai/v1",
  apiKey: process.env.POLARGRID_API_KEY,
});

// Call your model by its custom ID
const response = await client.chat.completions.create({
  model: "your-org/your-fine-tuned-model",
  messages: [{ role: "user", content: "..." }],
});

// Edge-routed. Sub-300ms. Your model.
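
If you want to check time to first token for your own model, one option is to stream the response and time the first non-empty chunk. The helper below is a sketch under assumptions: `measureTtft`, `StreamChunk`, and `TtftResult` are names invented for this example, and the chunk shape covers only the fields of an OpenAI-style streaming chunk that the helper reads. The clock is injectable so the helper can be exercised without a network call.

```typescript
// Minimal subset of an OpenAI-style streaming chunk (only the fields read below).
interface StreamChunk {
  choices: Array<{ delta?: { content?: string | null } }>;
}

interface TtftResult {
  ttftMs: number | null; // null if the stream never produced any text
  text: string;          // full concatenated response text
}

// Times the gap between opening the request and the first non-empty text chunk.
// `open` starts the request; the timer starts just before it is called, so
// connection and queueing time are included in the measurement.
async function measureTtft(
  open: () => Promise<AsyncIterable<StreamChunk>>,
  now: () => number = Date.now,
): Promise<TtftResult> {
  const start = now();
  const stream = await open();
  let ttftMs: number | null = null;
  let text = "";
  for await (const chunk of stream) {
    const piece = chunk.choices[0]?.delta?.content ?? "";
    if (piece.length > 0 && ttftMs === null) {
      ttftMs = now() - start;
    }
    text += piece;
  }
  return { ttftMs, text };
}
```

With the `client` from the snippet above, this could be called as `measureTtft(() => client.chat.completions.create({ model: "your-org/your-fine-tuned-model", messages: [{ role: "user", content: "..." }], stream: true }))`. The result is measured from the caller's side, so it reflects your own network path as well as the serving node.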

Ready to bring your model to the edge?

Talk to us about your model, your use case, and your scale. We'll find the right deployment option.