Cut your AI coding bill by 40%
without changing a line of code.

Cohrint routes every Claude Code, Copilot, and Cursor request to the cheapest model that meets your quality bar. You save money while you sleep. We only get paid when you save.

Start saving in 60 seconds → See real savings data from 3 engineering teams
// how it works
Routing that works while you sleep
01
Classifies every request by intent
Autocomplete, generation, refactor, or explanation — classified in under 50ms. Each intent has a different cost-quality optimum.
02
Routes to the cheapest model that qualifies
Not the cheapest model, period. The cheapest model that meets your quality bar for that specific task. Quality is continuously sampled and measured.
03
Publishes real-time savings to your dashboard
Every routing decision is logged and overridable. You see exactly what was routed where, and exactly how much was saved. No black boxes.
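The three-step loop above can be sketched in a few lines of Python. This is an illustrative model of intent-based routing, not Cohrint's implementation — the model table, prices, quality scores, and the toy `classify_intent` heuristic are all invented for the example:

```python
# Minimal sketch of intent-based routing: classify the request,
# then pick the cheapest model whose measured quality clears the bar.
# All models, prices, and quality scores below are illustrative.

MODELS = [
    # (name, $ per 1M output tokens, sampled quality score per intent)
    ("small-fast",  0.30, {"autocomplete": 0.96, "generation": 0.78, "refactor": 0.70, "explanation": 0.85}),
    ("mid-tier",    3.00, {"autocomplete": 0.97, "generation": 0.90, "refactor": 0.88, "explanation": 0.93}),
    ("frontier",   15.00, {"autocomplete": 0.98, "generation": 0.97, "refactor": 0.96, "explanation": 0.97}),
]

def classify_intent(prompt: str) -> str:
    """Toy classifier standing in for the real sub-50ms one."""
    p = prompt.lower()
    if "refactor" in p:
        return "refactor"
    if "explain" in p or "why" in p:
        return "explanation"
    if len(prompt) < 40:
        return "autocomplete"
    return "generation"

def route(prompt: str, quality_bar: float = 0.85) -> str:
    intent = classify_intent(prompt)
    # Cheapest model that qualifies for this intent — not cheapest overall.
    qualifying = [m for m in MODELS if m[2][intent] >= quality_bar]
    qualifying = qualifying or [MODELS[-1]]  # fall back to the strongest model
    return min(qualifying, key=lambda m: m[1])[0]

print(route("def fib(n):"))                        # short prompt → autocomplete → small-fast
print(route("refactor this module into classes"))  # refactor → mid-tier
```

The key property is in `route`: the quality bar filters first, and price breaks the tie — so a cheap model wins only on tasks where it measurably qualifies.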
// works everywhere
Drop into your stack in 60 seconds
Claude Code
Native MCP support — add one config block and you're live
MCP SERVER
VS Code
MCP server — add .vscode/mcp.json and ask about costs in chat
MCP SERVER
Python
Two-line drop-in proxy for OpenAI and Anthropic SDKs
SDK
TypeScript / JS
createOpenAIProxy() wraps any existing client with zero changes
SDK
MCP-compatible editors
Works with any editor supporting the Model Context Protocol — one config block
MCP SERVER
Cohrint CLI
Transparent wrapper — optimize, forward, track. Works with any AI CLI agent.
CLI TOOL
OTel Collector
Native OpenTelemetry ingestion — auto-track Claude Code, Codex CLI, and Gemini CLI
OTEL
Local Proxy
Privacy-first HTTP proxy — your keys and prompts never leave your machine
PROXY
Codex CLI
Track OpenAI Codex usage via OTel or CLI wrapper
OTEL
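Most of these integrations share one mechanism: point an existing client at a different endpoint. As a sketch of the local-proxy pattern with the standard OpenAI SDK — the port and path below are illustrative placeholders, not documented defaults:

```python
from openai import OpenAI

# Point the standard SDK at a local proxy instead of api.openai.com.
# The base_url below is a placeholder — use your proxy's actual address.
client = OpenAI(
    api_key="sk-...",                      # stays on your machine
    base_url="http://localhost:8787/v1",   # local proxy endpoint (assumed)
)

# Every call now flows through the local proxy, which forwards upstream
# and records tokens and cost locally — keys and prompts never leave the box.
```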
// integration
Two lines. Seriously, that's it.
Python
TypeScript
MCP (Claude Code)
CLI Wrapper
# Before
from openai import OpenAI

# After — only 2 lines changed
import cohrint
from cohrint.proxy.openai_proxy import OpenAI

cohrint.init(api_key="crt_your_key")
client = OpenAI(api_key="sk-...")

# Everything else is identical — Cohrint wraps transparently
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello!"}],
)
# ✓ Tokens: 12 in, 8 out  ✓ Cost: $0.000110  ✓ Latency: 423ms
# ✓ Cheapest alternative: gemini-1.5-flash — save 94%
// Before
import OpenAI from "openai";

// After — only 2 lines changed
import { init, createOpenAIProxy } from "cohrint";
import OpenAI from "openai";

init({ apiKey: "crt_your_key" });
const openai = createOpenAIProxy(new OpenAI());

// Identical API — Cohrint wraps every call automatically
const response = await openai.chat.completions.create({
  model: "gpt-4o",
  messages: [{ role: "user", content: "Hello!" }],
});
// ✓ Captured: tokens, cost, latency, cheapest alternative
// ~/.claude/mcp.json  (or claude-code / VS Code equivalent)
{
  "mcpServers": {
    "cohrint": {
      "command": "npx",
      "args": ["-y", "cohrint-mcp"],
      "env": {
        "COHRINT_API_KEY": "crt_your_key",
        "COHRINT_ORG_ID": "your_org_id"
      }
    }
  }
}

// Then ask Claude Code in chat:
// "How much did I spend on AI this week?"
// "Which model is cheapest for my summarization workflow?"
// "Show requests wasting the most tokens"
# Run via npx — no install needed
$ npx cohrint-cli

# Pipe mode — optimize + forward to Claude
$ echo "Could you please explain kubernetes pods" | cohrint
  ⚡ Optimized: 16 → 12 tokens (saved 4, -25%)

  A Kubernetes pod is the smallest deployable unit...

  💰 Cost: $0.0065 | 💾 Saved: 4 tokens

# REPL mode — switch agents on the fly
$ cohrint
cohrint [claude] ▸ explain load balancers
cohrint [claude] ▸ /gemini summarize this in 2 lines
cohrint [claude] ▸ /compare what is DNS
cohrint [claude] ▸ /session  # interactive mode with /compact, /clear
cohrint [claude] ▸ /summary # dashboard stats in terminal
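For the OTel path, setup is typically a handful of environment variables. The exporter variables below are standard OpenTelemetry names plus Claude Code's telemetry flag; the endpoint is a placeholder, not a real Cohrint address:

```shell
# Enable Claude Code telemetry and export it over OTLP.
export CLAUDE_CODE_ENABLE_TELEMETRY=1
export OTEL_METRICS_EXPORTER=otlp
export OTEL_LOGS_EXPORTER=otlp
export OTEL_EXPORTER_OTLP_PROTOCOL=http/protobuf

# Placeholder endpoint — point this at your Cohrint OTel collector.
export OTEL_EXPORTER_OTLP_ENDPOINT=https://otel.example.invalid
```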
// live demo
See it in action — interactive preview
cohrint.com / app.html
INTERACTIVE
◈ All AI Spend
⌨ CLI Wrapper
🔧 Optimizer
📡 OTel Collector
⚖ Compare
MTD Spend
$4,821
↑ 12%
Tokens Used
182M
↓ eff +8%
Developers
14
3 providers
Budget Used
61%
$3,086 left
Daily spend by provider — last 30 days
CLAUDE CODE
$2,140
8 developers
CODEX CLI
$1,430
5 developers
GEMINI CLI
$820
4 developers
Create free account → Open full dashboard →

No credit card required · Free for teams under $500/mo in AI spend

// pricing
You only pay when you save
FREE
$0
for teams under $500/mo in AI spend
Start for free →
  • Routing with quality control
  • Real-time savings dashboard
  • No credit card
ENTERPRISE
Custom
available after 30 days on Growth
Talk to sales →
  • Custom contract terms
  • Dedicated support
  • SLA & SSO
// customer savings
Real savings from real teams
DESIGN PARTNER 1
$14,200/mo saved

"We pointed Cohrint at our Claude Code fleet on a Friday. By Monday it had rerouted 60% of our autocomplete traffic to a cheaper model with no quality drop."

Platform Engineering Lead · Series B startup
RESERVED FOR
Design Partner 2

Your team's savings data could go here.

Apply now →
RESERVED FOR
Design Partner 3

Your team's savings data could go here.

Apply now →

Enterprise-grade security, privacy-first local proxy, and full audit logging. Read our trust center →

// faq
Common questions
Does Cohrint store my prompts and responses? +
Cohrint stores a short preview of your request and response (first 500 chars) to help you debug expensive or low-quality calls. You can disable this entirely in your SDK config. We never train models on your data, and you can request deletion at any time.
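The 500-character cutoff is simple truncation. A pure-Python sketch of what gets stored — illustrative, not Cohrint's actual code:

```python
def preview(text: str, limit: int = 500) -> str:
    """Truncate a prompt or response to a short, storable preview."""
    return text if len(text) <= limit else text[:limit] + "…"

long_prompt = "x" * 1200
print(len(preview(long_prompt)))  # 501 — 500 chars plus the ellipsis marker
print(preview("short prompt"))    # stored as-is when under the limit
```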
Does adding the Cohrint SDK add latency to my calls? +
No. Cohrint captures event data after your LLM call completes and sends it to our ingest API on a background thread. Your main request path is completely unaffected, so no measurable latency is added.
What happens if the Cohrint ingest API is down? +
Your AI calls continue to work normally — Cohrint is not in the critical path. Events are queued locally and retried when connectivity is restored. Your production traffic is never affected by Cohrint availability.
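The queue-and-retry behavior described here is a standard pattern. A minimal Python sketch — illustrative, not Cohrint's client code — where `send` stands in for the HTTP POST to the ingest API:

```python
import queue
import threading

events: "queue.Queue[dict]" = queue.Queue()
delivered = []

def send(event: dict) -> None:
    """Stand-in for the HTTP POST to the ingest API."""
    delivered.append(event)

def worker() -> None:
    # Drain the queue off the request path; re-queue on failure.
    while True:
        event = events.get()
        if event is None:          # shutdown sentinel
            break
        try:
            send(event)
        except Exception:
            events.put(event)      # retry later when connectivity returns
        finally:
            events.task_done()

t = threading.Thread(target=worker, daemon=True)
t.start()

# The main request path only enqueues — it never blocks on the network.
events.put({"model": "gpt-4o", "tokens_in": 12, "tokens_out": 8})
events.put(None)
t.join()
print(delivered)
```

If `send` raises, the event goes back on the queue instead of being dropped — which is why an ingest outage never touches production traffic.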
Which LLM providers and models are supported? +
Cohrint supports models across 3 providers: OpenAI (GPT-4o, o1, o3, Codex), Anthropic (Claude 3.5/4 family), and Google (Gemini 1.5/2.0). Pricing is updated regularly as new models launch.
Can I self-host Cohrint on my own infrastructure? +
Yes. The Cohrint server is open-source and deployable on Railway, Render, or any VPS in under 10 minutes. Enterprise customers get a Docker Compose package and Kubernetes Helm chart for air-gapped deployments where data never leaves your VPC.
How does the free tier compare to paid plans? +
The free tier gives you 50,000 events/month, the core cost dashboard, 7-day log retention, and 1 user seat — no credit card required, forever. The Team plan ($99/mo) adds unlimited requests, cross-model pricing intelligence, budget alerts, team attribution, and up to 10 seats.
What is the Cohrint CLI and how does it work? +
The Cohrint CLI is a lightweight terminal companion that runs alongside any AI coding agent. It tracks usage costs in real time, provides session-level spend summaries, and surfaces optimization opportunities as you work — without modifying your existing workflow or requiring any configuration changes.
How does cross-platform OTel tracking work? +
Cohrint integrates with Claude Code, Codex CLI, Gemini CLI, and more via standard OpenTelemetry. Setup takes under 2 minutes — no code changes required. Token counts, cost metrics, and session data flow in automatically. For tools that don't report costs natively, Cohrint calculates USD spend from token counts using live model pricing.
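The token-to-USD conversion is straightforward multiplication. A sketch with illustrative per-million-token prices — real prices change over time, which is why Cohrint uses live pricing data:

```python
# Illustrative $ per 1M tokens — real pricing changes; use live data.
PRICES = {
    "gpt-4o": {"in": 2.50, "out": 10.00},
    "gemini-1.5-flash": {"in": 0.075, "out": 0.30},
}

def usd_cost(model: str, tokens_in: int, tokens_out: int) -> float:
    """Compute USD spend from token counts and per-1M-token prices."""
    p = PRICES[model]
    return (tokens_in * p["in"] + tokens_out * p["out"]) / 1_000_000

cost = usd_cost("gpt-4o", 12, 8)
print(f"${cost:.6f}")  # $0.000110
```

With the example prices, 12 input and 8 output tokens on gpt-4o come out to $0.000110 — the same figure shown in the Python integration example above.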
CONTACT US
We're here to help

Reach out anytime — whether you have a question, need help getting started, or want to discuss an enterprise plan.

SUPPORT
Technical Help

SDK issues, integration questions, billing support.

[email protected]
GENERAL
Say Hello

Partnerships, press, feedback, or just a chat.

[email protected]
ENTERPRISE
Sales & Pricing

Custom pricing, dedicated support, SLA discussion.

[email protected]

Your team is overpaying.
We'll fix that.

// two lines of config · savings visible in 60 seconds · we only win when you save

Start saving in 60 seconds →
Coming in 2026: The Cohrint Index — the first public benchmark of AI coding efficiency across real engineering teams.