Know how every dollar of inference is spent across your organization — by model, team, project and expense type (CAPEX / OPEX). Real-time, free to start with Keld Atlas.
Send a job with a deadline and a ceiling price; Keld routes it through the marketplace to the best-value provider of the model you need — without sacrificing response quality.
Enterprise-grade data security, guaranteed delivery within your deadline, and a backup model if your first choice can't deliver in time. Zero data retention · SOC 2 (in progress) · GDPR.
Free forever for spend mapping · No code changes · Optimise when you're ready
Keld maps how your apps and agents use AI, then optimises it. You keep your stack exactly as it is. The shift is in the unit of work: you don't send a prompt — you send a job, a deadline and a ceiling price, naming a model or just a use case, and Keld runs it on the cheapest provider that fits.
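To make "the cheapest provider that fits" concrete, here is a minimal sketch of that selection rule. It is an illustration only: the `Offer` type, the `pick_offer` function and the sample quotes are hypothetical names and numbers, not part of Keld's SDK, and real routing presumably weighs more than price and estimated latency.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Offer:
    """A hypothetical provider quote for one job (not a Keld API type)."""
    provider: str
    price_usd: float      # quoted price for the whole job
    est_latency_ms: int   # provider's estimated completion time


def pick_offer(offers: list[Offer], ceiling_usd: float, deadline_ms: int) -> Optional[Offer]:
    """Return the cheapest offer that respects both the ceiling price and the deadline."""
    eligible = [
        o for o in offers
        if o.price_usd <= ceiling_usd and o.est_latency_ms <= deadline_ms
    ]
    # Cheapest first; ties on price go to the faster estimate.
    return min(eligible, key=lambda o: (o.price_usd, o.est_latency_ms), default=None)


# Illustrative quotes only.
offers = [
    Offer("provider-a", 0.015, 2000),   # over the ceiling
    Offer("provider-b", 0.009, 12000),  # cheapest, but misses the deadline
    Offer("provider-c", 0.011, 6000),   # fits both constraints
]

chosen = pick_offer(offers, ceiling_usd=0.012, deadline_ms=8000)
print(chosen.provider if chosen else "no provider fits")
```

With these sample quotes the job lands on `provider-c`: it is not the cheapest quote overall, but it is the cheapest one that meets both the deadline and the ceiling.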
Every dollar attributed by team, model, project and use case — mapped in real time, free to start.
Send a job, a deadline and a ceiling price; Keld runs it on the cheapest provider of the model you need — often 40% cheaper.
No lock-in, no black box. Keld is neutral across 100+ models and providers — your choice, always.
Keld Atlas starts free, mapping what AI costs you across every team, model, project and use case. When you're ready, optimise the workloads that don't need a premium model or a real-time SLA — run them as Deadline jobs on the cheapest provider that fits.
See exactly what AI costs across teams, models and use cases — in real time. Atlas flags which workloads are eligible for a Deadline job, and the savings on the table.
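As a rough picture of what spend mapping involves, the sketch below groups per-call costs by any combination of dimensions and totals the spend that could move to a Deadline job. The `Call` record, its `needs_realtime` flag and the eligibility rule are assumptions made for illustration; Atlas's actual data model and criteria are not described here.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Call:
    """One inference call as a hypothetical spend record."""
    team: str
    model: str
    use_case: str
    cost_usd: float
    needs_realtime: bool  # assumed flag: True if the caller needs a real-time SLA


def spend_by(calls, *fields):
    """Sum cost per unique combination of the given record fields."""
    totals = defaultdict(float)
    for call in calls:
        key = tuple(getattr(call, f) for f in fields)
        totals[key] += call.cost_usd
    return dict(totals)


def eligible_spend(calls):
    """Total spend on calls that could tolerate a deadline instead of real time."""
    return sum(c.cost_usd for c in calls if not c.needs_realtime)


# Illustrative records only.
calls = [
    Call("support", "gpt-4o", "summarise", 0.50, False),
    Call("support", "gpt-4o", "chat", 0.25, True),
    Call("search", "gpt-4o", "summarise", 0.25, False),
]

print(spend_by(calls, "team"))
print(spend_by(calls, "team", "use_case"))
print(eligible_spend(calls))
```

Because `spend_by` takes the grouping fields as arguments, the same records can be viewed by team, by model, by use case or by any combination of them.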
Submit a job by naming a specific model, or just the use case. Keld maintains curated model collections for every common AI task, so it runs your job on the cheapest provider that fits, often far below list price.
Every dot is a job. Keld picks the best-fit model at the best price and fills a micro-batch against a provider's open capacity. The moment the batch is ready, either because it is full or because its deadline window has elapsed, it runs on their fleet as one concurrent run. Hover any job to see how it was optimised.
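The "full, or its deadline window has elapsed" rule can be sketched in a few lines. This is a simplified model under stated assumptions: `MicroBatch`, its fields and the explicit `now` timestamps are hypothetical, and the window is measured from the first job added, which may differ from how Keld tracks per-job deadlines.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class MicroBatch:
    """Hypothetical micro-batch that is ready when full or when its window elapses."""
    max_size: int
    window_s: float
    opened_at: Optional[float] = None
    jobs: list = field(default_factory=list)

    def add(self, job_id: str, now: float) -> None:
        """Add a job; the window starts when the first job arrives."""
        if not self.jobs:
            self.opened_at = now
        self.jobs.append(job_id)

    def ready(self, now: float) -> bool:
        """True once the batch is full or its window has elapsed."""
        if not self.jobs:
            return False
        full = len(self.jobs) >= self.max_size
        expired = (now - self.opened_at) >= self.window_s
        return full or expired

    def flush(self) -> list:
        """Hand the jobs over for one concurrent run and reset the batch."""
        jobs, self.jobs, self.opened_at = self.jobs, [], None
        return jobs


batch = MicroBatch(max_size=3, window_s=5.0)
batch.add("job-1", now=0.0)
batch.add("job-2", now=1.0)
print(batch.ready(now=2.0))  # not full and the window is still open
print(batch.ready(now=5.0))  # the window has elapsed
print(batch.flush())
```

Passing `now` in explicitly keeps the sketch deterministic; a real worker would read a monotonic clock instead.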
Every enterprise building apps and agents. Map your AI spend across teams, then optimise it with Deadline jobs. Keld Atlas plus drop-in Integrations.
Explore Keld for Enterprises →
Turn spare capacity into revenue. Keld Trade is the trading platform to place and manage orders, with micro-batching in front of your fleet.
Explore Keld for AI Model Providers →
Replace your OpenAI client with Keld's drop-in SDK. Atlas maps spend by workflow from day one, with no other code changes needed.
Works with LangChain, OpenAI SDK and more
SDK docs

    # Replace your import; nothing else changes
    from keld.openai import openai

    client = openai.OpenAI()  # reads KELD_API_KEY

    prompt = "Summarise the attached report."  # placeholder input

    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "user", "content": prompt}
        ],
        extra_body={
            "keld_ceiling_usd": 0.012,
            "keld_deadline_ms": 8000,
            "keld_use_case": "summarise",
        },
    )
    # Atlas maps it. Runs as a Deadline job if price fits.
    import os

    from keld.trade import FleetWorker, TradeStrategy

    worker = FleetWorker(
        api_key=os.environ["KELD_PROVIDER_KEY"],
        fleet_id="gpu-fleet-us-east",
        max_batch_size=32,
    )


    @worker.on_batch
    async def handle(batch):
        # my_inference is your own async inference function
        results = await my_inference(batch.prompts)
        return [{"text": r} for r in results]


    # Trade auto-adjusts your orders to demand
    worker.start(strategy=TradeStrategy.MAX_YIELD)
Keld Trade micro-batches incoming jobs into your fleet and monitors live demand, adjusting your orders automatically — maximising yield on every GPU-hour.
Any inference runtime
AI Model Provider tools
Pick a use case and your monthly volume; see your estimated cost and what Keld saves. No account needed. Or browse the Keld Models Index.
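The arithmetic behind an estimate like this is simple enough to sketch. Everything below is illustrative: `estimate_monthly`, the per-million-token list price and the monthly volume are hypothetical, and the 40% saving rate is borrowed from the "often 40% cheaper" figure rather than taken from any quote for a specific workload.

```python
def estimate_monthly(tokens_millions: float,
                     list_price_per_million_usd: float,
                     saving_rate: float) -> dict:
    """Estimate monthly list cost, optimised cost and saving, all in USD."""
    if not 0.0 <= saving_rate <= 1.0:
        raise ValueError("saving_rate must be between 0 and 1")
    list_cost = tokens_millions * list_price_per_million_usd
    saving = list_cost * saving_rate
    return {
        "list_cost_usd": round(list_cost, 2),
        "optimised_cost_usd": round(list_cost - saving, 2),
        "saving_usd": round(saving, 2),
    }


# Assumed inputs: 500M tokens a month at a list price of $2.00 per million tokens.
estimate = estimate_monthly(
    tokens_millions=500,
    list_price_per_million_usd=2.00,
    saving_rate=0.40,
)
print(estimate)
```

Under those assumptions a $1,000 monthly list cost would drop to about $600. The saving rate is the input to challenge first, since it varies by model, provider and deadline.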
Keld favours no provider. Each job runs on the best fit by price, deadline and performance — not on who's paying us.
Your spend is mapped by team, model, project and use case, with the savings on the table in plain sight. No black-box pricing.
Models delivering near-frontier quality at a fraction of the cost get found and used — not overlooked.
Prompts and completions are never stored.
Enterprise-grade controls, certification underway.
Run Atlas in your own infrastructure.
Quality is continuously checked as you optimise.
Every integration implements IXP, so mapping spend — and later optimising — is a switch you flip, not a migration.
It's free, takes minutes, and doesn't touch your production code. Optimising with Deadline jobs is there when you're ready.