The world's first marketplace for AI inference.

Track how you use AI and how much it's costing you

Know how every dollar of inference is spent across your organization — by model, team, project and expense type (CAPEX / OPEX). Real-time, free to start with Keld Atlas.

Understand.
Route your AI workflows through the marketplace

Send a job with a deadline and a ceiling price; Keld routes it through the marketplace to the best-value provider of the model you need — without sacrificing response quality.

Optimise.
Your AI operations in a trusted, enterprise-grade environment

Enterprise-grade data security, guaranteed delivery within your deadline, and a backup model if your first choice can't deliver in time. Zero data retention · SOC 2 (in progress) · GDPR.

Protect.

Free forever for spend mapping · No code changes · Optimise when you're ready

40% · Average cost reduction
100+ · Model providers
99.9% · Jobs delivered within deadline
Real-time · Spend visibility & alerts
What is Keld

The platform that gives you the best intelligence at the best price.

Keld maps how your apps and agents use AI, then optimises it. You keep your stack exactly as it is. The shift is in the unit of work: you don't send a prompt — you send a job, a deadline and a ceiling price, naming a model or just a use case, and Keld runs it on the cheapest provider that fits.
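To make the "job, deadline and ceiling price" idea concrete, here is a minimal sketch of how a router could pick the cheapest provider that fits. It is an illustration only, not Keld's actual routing logic: the `Offer` and `Job` types, the `pick_offer` function, the provider names and the quoted prices are all hypothetical. The deadline and ceiling values reuse the ones from the SDK snippet on this page.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Offer:
    """A hypothetical provider quote for one job (not Keld's real schema)."""
    provider: str
    model: str
    price_usd: float  # quoted price for the whole job
    eta_ms: int       # estimated time to completion


@dataclass(frozen=True)
class Job:
    """The unit of work: a model, a deadline and a ceiling price."""
    model: str
    deadline_ms: int
    ceiling_usd: float


def pick_offer(job: Job, offers: list[Offer]) -> Optional[Offer]:
    """Return the cheapest offer that serves the model, meets the deadline
    and stays at or under the ceiling price; None if nothing fits."""
    eligible = [
        o for o in offers
        if o.model == job.model
        and o.eta_ms <= job.deadline_ms
        and o.price_usd <= job.ceiling_usd
    ]
    return min(eligible, key=lambda o: o.price_usd, default=None)


# Made-up quotes for illustration.
offers = [
    Offer("provider-a", "llama-3.3-70b", 0.010, 9000),  # cheapest, but too slow
    Offer("provider-b", "llama-3.3-70b", 0.011, 6000),  # fits both limits
    Offer("provider-c", "llama-3.3-70b", 0.015, 2000),  # fast, but over the ceiling
]
job = Job(model="llama-3.3-70b", deadline_ms=8000, ceiling_usd=0.012)

chosen = pick_offer(job, offers)
print(chosen.provider if chosen else "no provider fits")
```

In this sketch the deadline and the ceiling act as hard filters and price is the only tie-breaker. A real router would presumably also weigh quality and reliability.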

See what AI really costs

Every dollar attributed by team, model, project and use case — mapped in real time, free to start.

Maximise your AI investment

Send a job, a deadline and a ceiling price; Keld runs it on the cheapest provider of the model you need — often 40% cheaper.

Stay neutral and in control

No lock-in, no black box. Keld is neutral across 100+ models and providers — your choice, always.

Map your AI spend.
Then optimise it.

Keld Atlas starts free, mapping what AI costs you across every team, model, project and use case. When you're ready, optimise the workloads that don't need a premium model or a real-time SLA — run them as Deadline jobs on the cheapest provider that fits.

Atlas · Map every workflow

Every dollar, mapped.

See exactly what AI costs across teams, models and use cases — in real time. Atlas flags which workloads are eligible for a Deadline job, and the savings on the table.

Job category            Spend / mo   Keld Deadline   Save
Batch summarization     $12,400      ✓ Eligible      −42%
Document extraction     $8,100       ✓ Eligible      −45%
Call transcription      $6,900       ✓ Eligible      −47%
Subtitles & captioning  $3,600       ✓ Eligible      −40%
Live chat & copilots    $9,200       Real-time
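As a worked example, the figures in the table above can be rolled up into a single monthly estimate. This is a back-of-the-envelope sketch, not how Atlas computes savings. It assumes each "Save" percentage applies to that category's full monthly spend and that the real-time workload saves nothing.

```python
# Figures copied from the example table: (category, monthly spend in USD,
# quoted saving as a fraction, or None where the workload stays real-time).
rows = [
    ("Batch summarization", 12_400, 0.42),
    ("Document extraction", 8_100, 0.45),
    ("Call transcription", 6_900, 0.47),
    ("Subtitles & captioning", 3_600, 0.40),
    ("Live chat & copilots", 9_200, None),
]

# Total monthly spend across every category, eligible or not.
total_spend = sum(spend for _, spend, _ in rows)

# Estimated saving: only Deadline-eligible categories contribute.
total_saving = sum(spend * pct for _, spend, pct in rows if pct is not None)

print(f"Total spend:    ${total_spend:,.0f}/mo")
print(f"Est. saving:    ${total_saving:,.0f}/mo")
print(f"Blended saving: {total_saving / total_spend:.1%}")
```

Under those assumptions the example account spends $40,200 a month and would save about $13,536, a blended reduction of roughly 33.7%. The blended figure is lower than the per-category percentages because the live chat workload stays real-time.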
Atlas · Deadline jobs

Run at the best price.

Submit a job by naming a specific model — or just the use case. Keld maintains curated model collections for every common AI task, so it runs your job on the cheapest provider that fits, often far below list.

submit by model or by use case

Use case        Runs on            $/1M    Save
Summarization   Llama 3.3 70B      $0.36   −40%
Extraction      Qwen2.5 72B        $0.24   −44%
Coding          DeepSeek V3        $0.17   −39%
Transcription   Whisper large-v3   $0.06   −48%
Live optimisation

Watch Keld optimise every job.

Every dot is a job. Keld picks the best-fit model at the best price and fills a micro-batch against a provider's open capacity. The moment the batch is ready, either because it is full or because its deadline window has elapsed, it runs on that provider's fleet as one concurrent run. Hover any job to see how it was optimised.

Jobs in → Best model picked → Batched → Runs on provider
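The "ready when full, or when the deadline window elapses" rule can be sketched in a few lines. This is a simplified illustration under assumed names, not Keld's scheduler: `MicroBatch`, `max_size` and `window_s` are hypothetical, and the caller passes the clock in so the behaviour is easy to follow.

```python
import time
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class MicroBatch:
    """Hypothetical micro-batch: ready when full or when its window elapses."""
    max_size: int
    window_s: float
    opened_at: float = field(default_factory=time.monotonic)
    jobs: list = field(default_factory=list)

    def add(self, job) -> None:
        """Append a job; a full batch should be dispatched, not extended."""
        if len(self.jobs) >= self.max_size:
            raise ValueError("batch is already full")
        self.jobs.append(job)

    def is_ready(self, now: Optional[float] = None) -> bool:
        """True if the batch is full, or it holds at least one job and
        its window has elapsed since it was opened."""
        now = time.monotonic() if now is None else now
        full = len(self.jobs) >= self.max_size
        expired = bool(self.jobs) and (now - self.opened_at) >= self.window_s
        return full or expired


# Fixed timestamps instead of the real clock, to keep the example deterministic.
batch = MicroBatch(max_size=3, window_s=0.5, opened_at=0.0)
batch.add("job-1")
print(batch.is_ready(now=0.1))  # not full, window still open
batch.add("job-2")
print(batch.is_ready(now=0.6))  # window elapsed with jobs waiting
```

An empty batch never becomes ready in this sketch, so an idle window does not trigger an empty run.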
Two audiences, one platform

Who Keld is built for

Up and running in minutes

Enterprises

One import.
Every workflow mapped.

Replace your OpenAI client with Keld's drop-in SDK. Atlas maps spend by workflow from day one — no other code changes needed.

1 Map — Atlas tags workflows and shows spend by type. Free to start.
2 Optimise — send cost-tolerant work as Deadline jobs, save ~40%.
UI Atlas console — full control for teams who prefer no code.

Works with LangChain, OpenAI SDK and more

SDK docs
# Replace your import — nothing else changes
from keld.openai import openai

client = openai.OpenAI()  # reads KELD_API_KEY
prompt = "Summarise the attached meeting notes."  # example input

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "user", "content": prompt}
    ],
    extra_body={
        "keld_ceiling_usd": 0.012,
        "keld_deadline_ms": 8000,
        "keld_use_case":    "summarise",
    }
)
# Atlas maps it. Runs as a Deadline job if price fits.

# AI Model Providers: accept micro-batched jobs on your own fleet
import os

from keld.trade import FleetWorker, TradeStrategy

worker = FleetWorker(
    api_key=os.environ["KELD_PROVIDER_KEY"],
    fleet_id="gpu-fleet-us-east",
    max_batch_size=32,
)

@worker.on_batch
async def handle(batch):
    # my_inference is your own coroutine that runs the model on a list of prompts
    results = await my_inference(
        batch.prompts
    )
    return [{"text": r} for r in results]

# Trade auto-adjusts your orders to demand
worker.start(
    strategy=TradeStrategy.MAX_YIELD
)
AI Model Providers

Idle capacity,
turned into revenue.

Keld Trade micro-batches incoming jobs into your fleet and monitors live demand, adjusting your orders automatically — maximising yield on every GPU-hour.

1 Micro-batching — accept batched jobs, fill gaps in capacity.
2 Trade — automated order strategy, max yield per GPU-hour.

Any inference runtime

AI Model Provider tools
Savings Calculator · free & public

See what you'd save — in 30 seconds.

Pick a use case and your monthly volume; see your estimated cost and what Keld saves. No account needed. Or browse the Keld Models Index.

Open the calculator →
What Keld owes you

Neutral. Transparent. Discoverable.

Neutrality

Keld favours no provider. Each job runs on the best fit by price, deadline and performance — not on who's paying us.

Transparency

Your spend is mapped by team, model, project and use case, with the savings on the table in plain sight. No black-box pricing.

Discoverability

Models delivering near-frontier quality at a fraction of the cost get found and used — not overlooked.

Zero Data Retention

Prompts and completions are never stored.

SOC 2 (in progress) · GDPR

Enterprise-grade controls, certification underway.

Host your own Atlas

Run Atlas in your own infrastructure.

Consistent evals

Quality is continuously checked as you optimise.

Drop-in, no code changes

Integrations for the platforms you already use

Every integration implements IXP, so mapping spend — and later optimising — is a switch you flip, not a migration.

OpenAI SDK · LangChain · LiteLLM · Vercel AI SDK · LlamaIndex · n8n · Zapier · Python/TS SDK

Start by mapping your spend

It's free, takes minutes, and doesn't touch your production code. Optimising with Deadline jobs is there when you're ready.