Smart LLM routing and cost control. OpenAI-compatible, your keys.

StepBlend sits between your app and AI providers, routing each request to the best model based on your strategy—lowest cost, fastest, or most reliable.

Your API keys. Your control. No token markup. Use it like the OpenAI API: set base_url to your StepBlend API endpoint and pass your JWT as api_key.

Try it live

Demo mode — limited models and usage

Supported for savings: GPT-4, GPT-4 Turbo, GPT-4.1, GPT-4.1 Mini

Max 800 tokens (~3200 chars). Demo limited to cheaper models only.

Sign up to use your own API keys and get unlimited access

Try Optimizer

Instant integration — OpenAI compatible

Use your existing OpenAI or LangChain code. No new wrappers — just change base URL and key.

OpenAI Python SDK

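from openai import OpenAI

# Point the standard OpenAI client at StepBlend instead of api.openai.com.
client = OpenAI(
    base_url="https://stepblend.com/api/v1",
    api_key="YOUR_STEPBLEND_JWT",
)
# "lowest-cost" is a routing strategy passed in place of a model name.
r = client.chat.completions.create(
    model="lowest-cost",
    messages=[{"role": "user", "content": "Hi"}],
)
print(r.choices[0].message.content)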

LangChain

from langchain_openai import ChatOpenAI

# "balanced" is a routing strategy passed in place of a model name.
llm = ChatOpenAI(
    openai_api_base="https://stepblend.com/api/v1",
    openai_api_key="YOUR_JWT",
    model="balanced",
)
print(llm.invoke("Hello!").content)

curl

curl -X POST https://stepblend.com/api/v1/chat/completions \
  -H "Authorization: Bearer YOUR_JWT" \
  -H "Content-Type: application/json" \
  -d '{"model":"balanced","messages":[{"role":"user","content":"Hi"}]}'

See full OpenAI-compatible docs →

The problem we solve

Most teams hardcode one model, guess at costs, and have no visibility into provider exposure.

You end up:

  • Hardcoding one model
  • Guessing at cost
  • Ignoring vendor exposure
  • Lacking spend visibility
  • Running with no governance layer

StepBlend adds a control layer: route intelligently, cap costs, track everything. Your keys, your infrastructure.

What you get

A routing layer that handles model selection, cost control, and observability.

  • Real-time routing across providers
  • Deterministic strategy enforcement
  • Max cost caps per request
  • Automatic failover
  • Vendor exposure tracking
  • Spend visibility dashboard
  • Drop-in support for the OpenAI SDK, LangChain, LlamaIndex, and Vercel AI SDK

Plus a dashboard to see what's happening across all your AI calls.
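
As a rough illustration of what the "automatic failover" item above means in a routing layer, the sketch below tries providers in priority order and falls back when one raises an error. The call_with_failover helper, the provider labels, and the stub callables are hypothetical stand-ins for real provider calls; StepBlend's actual failover logic is not described on this page.

```python
from typing import Callable, List, Sequence, Tuple


class AllProvidersFailed(RuntimeError):
    """Raised when every provider in the chain raised an error."""


def call_with_failover(
    prompt: str,
    providers: Sequence[Tuple[str, Callable[[str], str]]],
) -> Tuple[str, str]:
    """Try each (name, call) pair in order; return the first success."""
    errors: List[str] = []
    for name, call in providers:
        try:
            return name, call(prompt)
        except Exception as exc:  # a real router would catch narrower errors
            errors.append(f"{name}: {exc}")
    raise AllProvidersFailed("; ".join(errors))


# Hypothetical stubs standing in for real provider SDK calls.
def flaky_provider(prompt: str) -> str:
    raise TimeoutError("simulated timeout")


def healthy_provider(prompt: str) -> str:
    return f"echo: {prompt}"


used, reply = call_with_failover(
    "Hi",
    [("primary", flaky_provider), ("backup", healthy_provider)],
)
print(used, reply)
```

Here the primary stub times out, so the call is served by the backup. A fallback like this is what a "fallback rate" metric would count.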

How it works

Replace direct model calls with one endpoint.

  1. Send requests to /api/route
  2. StepBlend classifies and scores models
  3. Applies strategy + cost constraints
  4. Executes with your API keys
  5. Logs and surfaces metrics in dashboard

We never touch your API keys or resell tokens. Everything runs through your infrastructure.
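
The scoring and constraint steps above can be sketched roughly as follows. This is an illustration only, not StepBlend's implementation: the catalog, the prices, latencies, and success rates, the select_model helper, and the strategy names "fastest" and "most-reliable" are all assumptions. Only "lowest-cost" and the idea of a per-request cost cap come from this page.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ModelInfo:
    name: str
    cost_per_1k_tokens: float  # USD; made-up figure for illustration
    avg_latency_ms: float      # made-up figure
    success_rate: float        # made-up figure, between 0 and 1


# Hypothetical catalog; real prices and latencies will differ.
CATALOG = [
    ModelInfo("gpt-4", 0.03, 900.0, 0.995),
    ModelInfo("gpt-4.1", 0.008, 700.0, 0.99),
    ModelInfo("gpt-4.1-mini", 0.0016, 400.0, 0.985),
]


def estimate_cost(model: ModelInfo, est_tokens: int) -> float:
    """Estimated USD cost of a request of est_tokens tokens."""
    return model.cost_per_1k_tokens * est_tokens / 1000


def select_model(
    strategy: str, est_tokens: int, max_cost: Optional[float] = None
) -> Optional[ModelInfo]:
    """Filter by the cost cap, then pick the best model for the strategy."""
    # Apply the cost constraint: drop models whose estimate exceeds the cap.
    candidates = [
        m for m in CATALOG
        if max_cost is None or estimate_cost(m, est_tokens) <= max_cost
    ]
    if not candidates:
        return None  # nothing fits under the cap

    # Apply the strategy: each one ranks candidates by a different attribute.
    if strategy == "lowest-cost":
        return min(candidates, key=lambda m: m.cost_per_1k_tokens)
    if strategy == "fastest":
        return min(candidates, key=lambda m: m.avg_latency_ms)
    if strategy == "most-reliable":
        return max(candidates, key=lambda m: m.success_rate)
    raise ValueError(f"unknown strategy: {strategy}")


choice = select_model("most-reliable", est_tokens=1000, max_cost=0.01)
print(choice.name if choice else "no model fits the cap")
```

Filtering before ranking keeps the cap a hard constraint: a strategy can never select a model whose estimated cost exceeds it.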

Simple API, powerful routing

Use the OpenAI-compatible endpoint for drop-in compatibility, or the native API for advanced control. Both use the same routing engine, cost caps, logs, and dashboard.

Request

POST /api/v1/chat/completions
Content-Type: application/json
Authorization: Bearer <your_jwt>

{
  "model": "lowest-cost",
  "messages": [{"role": "user", "content": "Summarize this..."}],
  "stepblend_max_cost": 0.01
}

Response

{
  "id": "...",
  "choices": [{ "message": { "content": "Your summary..." } }],
  "usage": { "prompt_tokens": 10, "completion_tokens": 20 },
  "model": "gpt-4.1-mini"
}
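
A minimal sketch of building that request and reading the response with only the Python standard library. The endpoint path, the stepblend_max_cost field, and the response shape are taken from the example above; the build_request and summarize_response helper names are illustrative. The request is constructed but not sent.

```python
import json
import urllib.request
from typing import Optional

BASE_URL = "https://stepblend.com/api/v1"


def build_request(
    jwt: str,
    prompt: str,
    strategy: str = "lowest-cost",
    max_cost: Optional[float] = None,
) -> urllib.request.Request:
    """Build (but do not send) a chat completions request."""
    body = {
        "model": strategy,
        "messages": [{"role": "user", "content": prompt}],
    }
    if max_cost is not None:
        body["stepblend_max_cost"] = max_cost  # per-request cost cap
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {jwt}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def summarize_response(raw: str) -> dict:
    """Extract the selected model, the reply text, and the token total."""
    data = json.loads(raw)
    usage = data.get("usage", {})
    return {
        "model": data["model"],
        "content": data["choices"][0]["message"]["content"],
        "total_tokens": usage.get("prompt_tokens", 0)
        + usage.get("completion_tokens", 0),
    }


req = build_request("YOUR_JWT", "Summarize this...", max_cost=0.01)

# The sample response shown above, as a JSON string.
sample = (
    '{"id": "...", "choices": [{"message": {"content": "Your summary..."}}],'
    ' "usage": {"prompt_tokens": 10, "completion_tokens": 20},'
    ' "model": "gpt-4.1-mini"}'
)
info = summarize_response(sample)
print(info["model"], info["total_tokens"])
```

To actually send the request, pass req to urllib.request.urlopen. With the OpenAI Python SDK, a non-standard field such as stepblend_max_cost can be passed through the extra_body argument of chat.completions.create, assuming the endpoint reads it from the JSON body as shown.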

OpenAI-compatible API docs →

See what's happening

Track spend, model usage, and provider exposure across all your requests.

  • Total AI spend
  • Premium model usage %
  • Provider exposure
  • Fallback rate
  • Latency metrics

Available on the Growth and Scale plans.
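
To make the metrics above concrete, the sketch below derives them from a list of request logs. The log fields (provider, model, cost_usd, used_fallback, latency_ms), the set of "premium" models, and the sample records are assumptions for illustration, not StepBlend's log schema or real data.

```python
from collections import Counter
from statistics import mean

# Assumed classification; which models count as "premium" is a policy choice.
PREMIUM_MODELS = {"gpt-4", "gpt-4-turbo"}

# Made-up sample logs with hypothetical field names.
LOGS = [
    {"provider": "openai", "model": "gpt-4.1-mini", "cost_usd": 0.002,
     "used_fallback": False, "latency_ms": 400},
    {"provider": "openai", "model": "gpt-4", "cost_usd": 0.030,
     "used_fallback": False, "latency_ms": 900},
    {"provider": "openai", "model": "gpt-4.1-mini", "cost_usd": 0.002,
     "used_fallback": True, "latency_ms": 500},
    {"provider": "other-provider", "model": "other-model", "cost_usd": 0.006,
     "used_fallback": False, "latency_ms": 600},
]


def dashboard_metrics(logs: list) -> dict:
    """Aggregate spend, premium usage, exposure, fallbacks, and latency."""
    n = len(logs)
    if n == 0:
        raise ValueError("no logs to summarize")
    by_provider = Counter(r["provider"] for r in logs)
    return {
        "total_spend_usd": sum(r["cost_usd"] for r in logs),
        "premium_usage_pct": 100 * sum(r["model"] in PREMIUM_MODELS for r in logs) / n,
        # Exposure here is each provider's share of requests (not of spend).
        "provider_exposure": {p: c / n for p, c in by_provider.items()},
        "fallback_rate_pct": 100 * sum(r["used_fallback"] for r in logs) / n,
        "avg_latency_ms": mean(r["latency_ms"] for r in logs),
    }


metrics = dashboard_metrics(LOGS)
print(metrics)
```

Provider exposure could equally be measured as a share of spend rather than of requests; the request-share definition is used here only because it is the simplest.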

Try it yourself

Test routing with your own prompts. See which model gets selected and why.

Open Optimizer

Trusted by developers and teams

  • 7+ AI models routed
  • 4 routing strategies
  • 0 token resale

"Cut our AI spend by 40% without changing a line of code. The routing picks cheaper models when quality isn't critical."
Sarah K., Engineering Lead

"One endpoint instead of managing five different provider SDKs. Our keys stay with us, which was a hard requirement."
Marcus T., CTO

"We see cost estimates before each call and can cap spend per request. Makes budgeting predictable."
Jen L., Product Manager

Ready to add control to your AI calls?

Start routing intelligently today. Free tier includes 1,000 requests per month.

  • Your API keys only
  • No token resale
  • API-first integration