7 free AI tools that make LLM costs measurable in 2026

LLM token prices vary by more than 100x and agentic apps burn tokens fast. Seven free tools to make your 2026 LLM spend visible, attributable, and controllable.

By Manu Shukla

Founder & Director

June 25, 2026

Measuring LLM spend is the first step to controlling it.

Summary. As of June 2026, published LLM API prices run from about $0.14 per million input tokens for DeepSeek V4 Flash to $30 for GPT-5.4 Pro, a spread of more than 100x, with Claude Sonnet 4.6 at $3/$15 and Gemini 3.1 Pro at $2/$12 for input and output (CloudZero pricing comparison). Industry prices fell roughly 80% from 2025 to 2026, yet most engineering teams cannot say which feature, team, or customer spent that money. That is a measurement problem, and the first pillar of FinOps for AI is visibility: you cannot improve what you do not measure. This guide covers seven free tools that turn raw token usage into a per-call, per-team dollar figure: the LiteLLM gateway, Langfuse (acquired by ClickHouse on 16 January 2026), Cloudflare AI Gateway, Arize Phoenix, PostHog LLM analytics, Traceloop OpenLLMetry, and Opik by Comet. Every one has a free tier or a self-hostable open-source build, and the comparison and pricing below are dated to June 2026.

The reason this matters more in 2026 than in 2024 is agentic workloads. A single agent run can fan out into dozens of model calls, each with its own retries, tool calls, and long context windows. Cheaper per-token prices do not cap the bill when call volume grows faster than prices drop. Cost observability and performance observability are the same problem: you cannot explain a cost spike without knowing which prompts ran, which models were called, and how many retry loops fired. This article is written for engineering leads, platform teams, and founders running LLM workloads who want that visibility without buying an enterprise contract first.
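
The point about call volume outrunning price cuts is simple arithmetic, sketched here. The 80% price drop comes from the summary above; the figure of 12 model calls per agent run is a hypothetical chosen for illustration, and the function name is ours.

```python
def relative_bill(price_change: float, volume_multiplier: float) -> float:
    """Return the new bill as a multiple of the old one.

    price_change is the fractional change in per-token price
    (-0.80 means prices fell by 80%).
    volume_multiplier is how many times more tokens the workload now uses.
    """
    return (1.0 + price_change) * volume_multiplier


# Prices fall 80%, but an agent run now makes 12 calls where the old
# feature made one (hypothetical figure, same tokens per call).
print(round(relative_bill(-0.80, 12), 2))

# Break-even: at 5x the volume, an 80% price cut leaves the bill unchanged.
print(round(relative_bill(-0.80, 5), 2))
```

Under these assumptions the bill grows to about 2.4 times its old size despite the cheaper tokens, which is why volume has to be measured alongside price.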

A quick definition. An LLM cost tool reads the input and output token counts the provider returns for each call, multiplies them by that model's published per-token price, and attaches the result to a trace, a user, a team, or an API key. Some tools sit inline as a gateway or proxy; others ingest telemetry your application emits. The seven below split into those two camps, and most teams end up running one of each.
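
The definition above can be sketched in a few lines. This is a minimal illustration, not how any of the seven tools is implemented: the prices are the Claude Sonnet 4.6 list prices quoted in this article, while the token counts, team name, and key name are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Price:
    input_per_million: float   # USD per 1M input tokens
    output_per_million: float  # USD per 1M output tokens


def call_cost(input_tokens: int, output_tokens: int, price: Price) -> float:
    """Cost in USD of one call: token counts times the per-token price."""
    return (
        input_tokens * price.input_per_million
        + output_tokens * price.output_per_million
    ) / 1_000_000


# List prices quoted in this article for Claude Sonnet 4.6 (June 2026 snapshot).
sonnet = Price(input_per_million=3.00, output_per_million=15.00)

# Hypothetical call: 12,000 input tokens and 800 output tokens.
record = {
    "team": "search",            # hypothetical attribution labels
    "api_key": "key-search-1",
    "cost_usd": call_cost(12_000, 800, sonnet),
}
print(record)
```

The record is the unit every tool below works with: a dollar figure plus the labels needed to attribute it.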

Tool	What it measures	Free option
LiteLLM	Per-key, per-team, per-user spend at the gateway	Open source, self-hosted
Langfuse	Cost and latency per trace, prompt, and model	Free cloud tier and MIT self-host
Cloudflare AI Gateway	Tokens, cost, caching, and errors per request	Free core, no infrastructure
Arize Phoenix	Traces and token costs across agent steps	Open source, self-hosted
PostHog LLM analytics	Cost and usage events alongside product data	100k events per month free
Traceloop OpenLLMetry	Token and latency telemetry via OpenTelemetry	Open source SDK, Apache 2.0
Opik by Comet	Token usage and model cost per trace	Open source and free cloud tier

1. LiteLLM: per-key cost attribution at the gateway

LiteLLM is an open-source proxy that sits between your application and 100-plus model providers behind one OpenAI-compatible API. Because every call passes through it, LiteLLM can track every token and every dollar and attribute spend per virtual key, per team, and per user. You create virtual keys with budgets and rate limits, attach tags to requests, and let each tag carry its own budget. For cost attribution depth, nothing else free comes close, because the gateway sees the whole flow. The open-source build is free to self-host and needs a database such as PostgreSQL; a paid Enterprise tier adds SSO and audit logs for teams that need them.
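
As a rough sketch of what creating a budgeted virtual key can look like, the snippet below builds (but does not send) a request to the proxy's key-generation endpoint. The endpoint path and the field names `team_id`, `max_budget`, `budget_duration`, and `metadata` reflect our reading of the LiteLLM proxy documentation and should be checked against the version you run; the proxy URL, master key, team name, and tag are placeholders.

```python
import json
import urllib.request


def build_key_request(proxy_url: str, master_key: str, team_id: str,
                      max_budget_usd: float, tags: list[str]) -> urllib.request.Request:
    """Build, without sending, a request for a virtual key with a spend cap."""
    payload = {
        "team_id": team_id,
        "max_budget": max_budget_usd,   # spend cap in USD
        "budget_duration": "30d",       # reset window (assumed value)
        "metadata": {"tags": tags},     # labels used for attribution
    }
    return urllib.request.Request(
        url=f"{proxy_url.rstrip('/')}/key/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {master_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Placeholder values; nothing is sent over the network here.
req = build_key_request("http://localhost:4000", "sk-placeholder",
                        "search-team", 50.0, ["feature:search"])
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
```

Passing the request to `urllib.request.urlopen` would submit it to a running proxy. Keeping construction separate from sending makes the payload easy to review before any key is issued.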

There is a serious caveat that engineering teams must weigh. LiteLLM had a hard security year in 2026. In March, attackers published backdoored versions of the litellm PyPI package (1.82.7 and 1.82.8) after stealing publishing credentials through a compromised CI step; the packages were live for about 40 minutes before PyPI quarantined them. Separately, CVE-2026-42208, a pre-auth SQL injection rated CVSS 9.3, was exploited within roughly 36 hours of disclosure and added to the CISA Known Exploited Vulnerabilities catalog on 8 May 2026. Later advisories covered a command-injection flaw and an authentication bypass affecting versions before 1.84.0. The lesson is not to avoid LiteLLM, which remains the strongest open-source cost gateway, but to treat the proxy as a security-sensitive component: run version 1.84.0 or later, keep it patched, put it behind authentication, and isolate it on the network.
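
One way to act on that advice is a small version gate in CI or at service start-up. The sketch below encodes only the version numbers named in this article; it handles plain X.Y.Z version strings and ignores pre-release suffixes, and the function names are ours.

```python
import re
from importlib import metadata

KNOWN_BAD = {"1.82.7", "1.82.8"}   # backdoored releases named in this article
MINIMUM = (1, 84, 0)               # first version this article treats as patched


def parse(version: str) -> tuple:
    """Turn '1.84.0' into (1, 84, 0), keeping only leading digits of each part."""
    parts = []
    for piece in version.split(".")[:3]:
        match = re.match(r"\d+", piece)
        parts.append(int(match.group()) if match else 0)
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def is_acceptable(version: str) -> bool:
    """True if the version is not a known-bad release and meets the minimum."""
    return version not in KNOWN_BAD and parse(version) >= MINIMUM


def installed_status() -> str:
    """Describe the locally installed litellm package, if there is one."""
    try:
        version = metadata.version("litellm")
    except metadata.PackageNotFoundError:
        return "litellm is not installed"
    verdict = "ok" if is_acceptable(version) else "UPGRADE REQUIRED"
    return f"litellm {version}: {verdict}"


print(installed_status())
```

A check like this complements, and does not replace, hash-pinned dependencies and network isolation of the proxy.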

2. Langfuse: cost and latency on every trace

Langfuse is the most widely adopted open-source LLM observability platform, with more than 21,000 GitHub stars as of February 2026. It records a trace for each LLM interaction and tracks cost and latency alongside prompt management, evaluation, and datasets. The free Hobby tier gives 50,000 units per month, 30-day retention, and two users. The core platform is MIT-licensed, so you can self-host the full tracing, evaluation, and cost-tracking stack with no event limit and no licence fee. Paid cloud plans run $29 per month for Core and $199 for Pro, with extra units billed at $8 per 100,000.

In January 2026, ClickHouse acquired Langfuse alongside a $400 million Series D round, at a reported $15 billion valuation. The acquisition formalised a dependency that already existed, since Langfuse had moved its data layer to ClickHouse for high-throughput ingestion. As Langfuse chief executive Marc Klingen put it, "We built Langfuse on ClickHouse because LLM observability and evaluation is fundamentally a data problem." For teams, the practical news is that the MIT licence and the free tier did not change. Langfuse is the default pick when you want self-hostable cost tracking tied to real traces.

3. Cloudflare AI Gateway: a free cost dashboard with no infrastructure

If running a database and a proxy yourself is more than you want, Cloudflare AI Gateway gives you a cost view with zero infrastructure to manage. You point your model calls through the gateway, and the dashboard shows requests, tokens, cost, caching, and errors, with per-request logs that include token usage, cost, and duration. The core features, including analytics, caching, rate limiting, and spend limits, are free. The main cost to watch is log streaming: Logpush to an external service adds $0.05 per million records beyond 10 million per month on paid plans. For a team that wants a cost dashboard live in an afternoon, this is the lowest-effort option on the list.
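
To see whether the Logpush fee matters for your volume, the figures above can be turned into a quick estimate. This sketch uses only the two numbers quoted in this section; the monthly record counts are hypothetical, and actual billing terms should be confirmed on Cloudflare's pricing page.

```python
INCLUDED_RECORDS = 10_000_000    # monthly records before the fee applies
USD_PER_MILLION_EXTRA = 0.05     # fee per additional million records


def logpush_fee(records_per_month: int) -> float:
    """Estimated monthly Logpush fee in USD for a given record volume."""
    extra = max(0, records_per_month - INCLUDED_RECORDS)
    return extra / 1_000_000 * USD_PER_MILLION_EXTRA


# Hypothetical volumes: one below the threshold, one well above it.
for volume in (8_000_000, 250_000_000):
    print(f"{volume:,} records -> ${logpush_fee(volume):.2f}")
```

At 250 million records a month the estimate works out to $12, so for most teams the fee is a rounding error next to the model spend it helps explain.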

4. Arize Phoenix: open-source tracing across agent steps

Arize Phoenix is an open-source observability and evaluation tool that runs locally, in a notebook, in Docker, or in the cloud. It is fully free and self-hostable with no feature gates, under the Elastic License 2.0, and it instruments popular frameworks such as LangChain and LlamaIndex through OpenInference. For cost work, Phoenix shines on agentic traces: it shows the full tree of spans for an agent run, so you can see which step, tool call, or retry loop is burning tokens. Arize also offers a managed cloud, AX, with a free tier and a paid AX Pro plan. Phoenix is the pick when your costs hide inside multi-step agents rather than single calls.
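
The value of a span tree for cost work is the roll-up: each step's subtotal includes everything beneath it. The sketch below shows that idea with a hand-built tree. It does not use the Phoenix API, and the span names and costs are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Span:
    name: str
    cost_usd: float = 0.0   # cost of this span's own model call, if any
    children: list["Span"] = field(default_factory=list)


def subtotal(span: Span) -> float:
    """Cost of a span plus everything beneath it."""
    return span.cost_usd + sum(subtotal(child) for child in span.children)


def report(span: Span, depth: int = 0) -> list[str]:
    """Indented lines showing the rolled-up cost at each level of the tree."""
    lines = [f"{'  ' * depth}{span.name}: ${subtotal(span):.4f}"]
    for child in span.children:
        lines.extend(report(child, depth + 1))
    return lines


# Hypothetical agent run in which one tool call retried twice.
run = Span("agent_run", children=[
    Span("plan", 0.0040),
    Span("search_tool", children=[
        Span("attempt_1", 0.0120),
        Span("retry_1", 0.0120),
        Span("retry_2", 0.0120),
    ]),
    Span("summarise", 0.0060),
])

print("\n".join(report(run)))
```

In this made-up run the retry loop under `search_tool` accounts for most of the total, which is exactly the kind of finding a flat per-call list hides.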

5. PostHog LLM analytics: cost next to product data

PostHog folds LLM observability into the product-analytics platform many teams already run. Its AI observability features are free for 100,000 LLM events per month with 30-day retention. It captures the details of each generation, builds an aggregated metrics dashboard, and lets you query the data with SQL, so you can put LLM cost next to conversion, retention, and the rest of your product metrics. If you already use PostHog, you add cost visibility with no new contract and no separate tool to learn. The value here is correlation: tying a cost number to the product behaviour that caused it.
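
The correlation idea can be illustrated with an in-memory SQLite database standing in for the analytics store. The table names, columns, and rows below are hypothetical and are not PostHog's schema; the point is only the shape of the query that puts cost next to a product metric.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE llm_events (user_id TEXT, feature TEXT, cost_usd REAL);
CREATE TABLE conversions (user_id TEXT, feature TEXT);
INSERT INTO llm_events VALUES
  ('u1', 'summarise', 0.020), ('u1', 'summarise', 0.030),
  ('u2', 'summarise', 0.050), ('u3', 'chat', 0.400), ('u4', 'chat', 0.100);
INSERT INTO conversions VALUES
  ('u1', 'summarise'), ('u2', 'summarise'), ('u3', 'chat');
""")

# Total LLM cost per feature, next to the conversions that feature produced.
rows = db.execute("""
SELECT e.feature,
       ROUND(SUM(e.cost_usd), 3) AS total_cost,
       (SELECT COUNT(*) FROM conversions c
         WHERE c.feature = e.feature) AS conversions
FROM llm_events e
GROUP BY e.feature
ORDER BY e.feature
""").fetchall()

for feature, total_cost, conversions in rows:
    per_conversion = total_cost / conversions if conversions else float("nan")
    print(f"{feature}: ${total_cost:.3f} total, ${per_conversion:.3f} per conversion")
```

In this toy data the chat feature costs ten times more per conversion than summarisation, a comparison that needs both datasets in one place.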

6. Traceloop OpenLLMetry: vendor-neutral cost telemetry

OpenLLMetry, maintained by Traceloop, is an open-source SDK under the Apache 2.0 licence that extends OpenTelemetry for LLM observability. It captures model name and version, prompt and completion token counts, latency, and errors, then exports that telemetry to a wide range of destinations, including Traceloop, Dynatrace, SigNoz, and a plain OpenTelemetry Collector. The advantage is portability: you instrument once with an open standard and avoid lock-in to any single vendor's backend. For platform teams that already run OpenTelemetry for the rest of their stack, OpenLLMetry keeps LLM cost data in the same pipeline as everything else.
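
Because the telemetry is plain span attributes, pricing can be applied anywhere downstream of the exporter. The sketch below assumes attribute names from the OpenTelemetry GenAI semantic conventions (`gen_ai.request.model`, `gen_ai.usage.input_tokens`, `gen_ai.usage.output_tokens`); the names your SDK version emits may differ, so treat them as an assumption. The model identifier is hypothetical, and the prices are the Gemini 3.1 Pro list prices quoted in this article.

```python
from typing import Optional

# USD per 1M tokens as (input, output); the model id is a hypothetical key.
PRICES_PER_MILLION = {"gemini-3.1-pro": (2.00, 12.00)}


def span_cost(attributes: dict) -> Optional[float]:
    """Price one LLM span from its attributes; None if the model is unknown."""
    price = PRICES_PER_MILLION.get(attributes.get("gen_ai.request.model"))
    if price is None:
        return None
    input_price, output_price = price
    input_tokens = attributes.get("gen_ai.usage.input_tokens", 0)
    output_tokens = attributes.get("gen_ai.usage.output_tokens", 0)
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


# A span as it might arrive at a collector (hypothetical values).
span_attributes = {
    "gen_ai.request.model": "gemini-3.1-pro",
    "gen_ai.usage.input_tokens": 5_000,
    "gen_ai.usage.output_tokens": 1_000,
}
print(span_cost(span_attributes))
print(span_cost({"gen_ai.request.model": "unlisted-model"}))
```

Returning `None` for an unpriced model, rather than zero, keeps missing price data visible instead of silently under-reporting spend.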

7. Opik by Comet: cost intelligence per trace

Opik, from Comet, is an open-source platform under the Apache 2.0 licence for tracing, evaluating, and monitoring LLM and RAG applications. Its cost tracking is included free in both the open-source build and the free cloud tier, giving visibility into token usage and model cost per trace, with cost intelligence to track agent usage and spend across engineering teams. You can run it locally from the GitHub source or use the hosted version. Opik is a good fit when you want evaluation and cost in one open tool, so quality regressions and cost regressions show up on the same dashboard.

A note on Helicone

Helicone deserves a mention because it appears on most older lists, and its status changed in 2026. Mintlify acquired Helicone in March 2026, and the project moved into maintenance mode. The open-source proxy, under the Apache 2.0 licence, still routes traffic to 100-plus models and ships security fixes, bug fixes, and new model support, but active feature development has stopped and hosted customers are being migrated. By the time of the acquisition, Helicone had served more than 16,000 organisations and processed over 14.2 trillion tokens. Self-hosting the open-source code remains supported, so Helicone is still usable, but new teams should weigh the frozen roadmap before adopting it.

How the seven compare

The split that matters is gateway versus telemetry. A gateway (LiteLLM, Cloudflare) sits inline and enforces budgets, so it can both measure and block. A telemetry tool (Langfuse, Phoenix, PostHog, OpenLLMetry, Opik) records what happened and explains it, but does not stop a runaway agent on its own. Most production setups pair one of each: a gateway for control, an observability tool for depth. The second axis is hosting. A managed option like Cloudflare AI Gateway gets you a cost view in an afternoon, while a self-hosted open-source build keeps prompt logs and spend data inside your own environment at the cost of running the infrastructure yourself.
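
The difference between measuring and blocking can be shown with a toy budget gate. This is an in-memory sketch of the general pattern, not the enforcement logic of LiteLLM or Cloudflare; the key name, limit, and per-call cost are hypothetical.

```python
class BudgetExceeded(Exception):
    """Raised when a key has already used up its budget."""


class BudgetGate:
    """In-memory stand-in for the per-key budget check a gateway performs."""

    def __init__(self, limits_usd: dict[str, float]) -> None:
        self.limits = dict(limits_usd)
        self.spent = {key: 0.0 for key in limits_usd}

    def authorise(self, key: str) -> None:
        """Call before forwarding a request; blocks keys that are over budget."""
        if self.spent[key] >= self.limits[key]:
            raise BudgetExceeded(
                f"{key} spent ${self.spent[key]:.2f} of ${self.limits[key]:.2f}"
            )

    def record(self, key: str, cost_usd: float) -> None:
        """Call after the provider responds, once the real cost is known."""
        self.spent[key] += cost_usd


gate = BudgetGate({"team-search": 1.00})   # hypothetical key and limit
allowed = 0
blocked = False
for _ in range(5):                         # a runaway loop of $0.40 calls
    try:
        gate.authorise("team-search")
    except BudgetExceeded:
        blocked = True
        break
    gate.record("team-search", 0.40)
    allowed += 1

print(allowed, blocked, round(gate.spent["team-search"], 2))
```

Note that the third call is let through and pushes spend past the limit, because the cost of a call is only known after it completes. A telemetry tool would record the same overshoot but could not stop the fourth call.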

Tool	Free tier (June 2026)	Licence or hosting
LiteLLM	Full proxy free to self-host	MIT, self-host (use 1.84.0+)
Langfuse	50k units/month cloud; unlimited self-host	MIT core, cloud or self-host
Cloudflare AI Gateway	Free core dashboard and logs	Managed, no self-host needed
Arize Phoenix	Fully free, no feature gates	Elastic License 2.0, self-host
PostHog	100k LLM events/month	Open source, cloud or self-host
Traceloop OpenLLMetry	Free SDK, unlimited	Apache 2.0, self-host backend
Opik by Comet	Free cloud tier and self-host	Apache 2.0, cloud or self-host

For context, the prices these tools attach to your tokens are moving targets. The table below shows published list prices as of June 2026; treat them as a snapshot, not a contract.

Model	Input per 1M tokens	Output per 1M tokens
GPT-5.4	$2.50	$15.00
GPT-5.4 Pro	$30.00	$180.00
Claude Sonnet 4.6	$3.00	$15.00
Gemini 3.1 Pro	$2.00	$12.00
DeepSeek V4 Flash	$0.14	$0.28

India-specific considerations

For engineering teams in India, two factors push toward the self-hosted, open-source options on this list. The first is data residency and privacy. Prompt and completion logs routinely contain personal data, and under the Digital Personal Data Protection Act 2023 and the DPDP Rules 2025, that data carries consent and handling duties. Self-hosting Langfuse, Phoenix, Opik, or an OpenLLMetry backend keeps those logs inside your own environment rather than in a third-party cloud, which simplifies the compliance story. The second is cost control on the cost tool itself: paid cloud observability plans, such as Langfuse's at $29 to $199 per month plus $8 per 100,000 extra units, add up as trace volume grows, whereas the MIT and Apache builds cost only the infrastructure they run on. A practical pattern for an India-based product team is LiteLLM as a self-hosted gateway for budgets and attribution, paired with a self-hosted Langfuse or Phoenix for trace-level cost, so both spend data and prompt logs stay in country, provided the hosts themselves run in an Indian region. For the wider picture on running AI in production, see our generative AI enterprise strategy for 2026.

How to choose

Start with the question you most need answered. If it is "which team or customer spent this," begin with a gateway: LiteLLM if you can run it safely, Cloudflare AI Gateway if you want zero infrastructure. If it is "why did this agent run cost so much," begin with trace-level observability: Langfuse or Arize Phoenix. If you already run PostHog or OpenTelemetry, extend what you have with PostHog LLM analytics or OpenLLMetry rather than adding a tool. None of these requires a paid plan to start, so the real cost is the integration time, not the licence. Pick one gateway and one observability tool, instrument a single high-volume workload first, and expand once the numbers line up with your provider invoice.
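
The final check, tracked spend lining up with the provider invoice, can be made explicit with a tolerance. In this sketch the 2% tolerance and the dollar amounts are hypothetical; pick a threshold that matches how much drift you are willing to leave unexplained.

```python
def reconcile(tracked_usd: float, invoiced_usd: float,
              tolerance: float = 0.02) -> tuple[float, bool]:
    """Return (relative gap, whether the gap is within tolerance).

    A negative gap means the tooling saw less spend than the provider billed,
    which usually points to uninstrumented calls or stale price tables.
    """
    if invoiced_usd <= 0:
        raise ValueError("invoiced_usd must be positive")
    gap = (tracked_usd - invoiced_usd) / invoiced_usd
    return gap, abs(gap) <= tolerance


# Hypothetical month: tooling tracked $1,180 against a $1,240 invoice.
gap, ok = reconcile(1180.0, 1240.0)
print(f"gap {gap:+.1%}, within tolerance: {ok}")
```

Here the tooling under-counts by roughly 5%, which fails the check and suggests some traffic is bypassing the gateway or the SDK.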

How eCorpIT can help

eCorpIT is a CMMI Level 5, senior-led engineering organisation in Gurugram that builds and instruments production LLM systems. We set up the gateway-plus-observability pattern described above, wire per-team and per-feature cost attribution into your stack, and self-host the open-source tools so prompt logs and spend data stay inside your environment and aligned with DPDP requirements. If your LLM bill is growing faster than you can explain it, talk to our team or read more about how we work.

References

CloudZero, LLM API pricing comparison in 2026

FinOps Foundation, FinOps for AI overview

PostHog, 7 best free and open source LLM observability tools

LiteLLM docs, spend tracking

LiteLLM, security update: suspected supply chain incident (March 2026)

The Hacker News, LiteLLM CVE-2026-42208 SQL injection exploited within 36 hours

Langfuse, pricing

ClickHouse, ClickHouse welcomes Langfuse

InfoWorld, ClickHouse buys Langfuse as data platforms race to own the AI feedback loop

Cloudflare AI Gateway, pricing

Cloudflare AI Gateway, costs and observability

Arize Phoenix, GitHub repository

PostHog, LLM analytics documentation

Traceloop, OpenLLMetry GitHub repository

Comet, Opik cost tracking documentation

Helicone, joining Mintlify

Last updated: 25 June 2026.

Frequently asked

Quick answers.

01 What is the best free tool to track LLM costs?

There is no single best tool; it depends on your stack. LiteLLM gives the deepest per-key and per-team cost attribution as a gateway, Langfuse pairs tracing with cost and latency tracking, and Cloudflare AI Gateway adds a free cost dashboard with no infrastructure. Most teams combine a gateway with an observability tool.

02 Can I measure LLM spend without sending data to a third party?

Yes. Langfuse, Arize Phoenix, Opik, and Traceloop OpenLLMetry are open source and can be self-hosted at no licence cost, so your prompts and cost data stay on your own infrastructure. This matters under the DPDP Act when prompt logs contain personal data that should not leave your environment.

03 Is LiteLLM safe to use after the 2026 security incidents?

Only on a current, patched version. LiteLLM had a supply-chain compromise in March 2026 and several serious CVEs, including a pre-auth SQL injection rated CVSS 9.3 that was actively exploited and that CISA added to its Known Exploited Vulnerabilities catalog. Run version 1.84.0 or later, behind authentication, on isolated infrastructure.

04 How is LLM cost tracking different from normal cloud cost tracking?

Cloud cost tracking works at the resource level, like servers or storage. LLM cost depends on tokens, which vary per request with prompt length, model choice, and retries. These tools convert token counts into a dollar figure for each call, then attribute it to a user, team, feature, or key.

05 What free tier does Langfuse offer in 2026?

Langfuse keeps a free Hobby tier with 50,000 units per month, 30-day data retention, and two users, plus community support. Its core platform is MIT-licensed, so you can self-host the full tracing, evaluation, and cost-tracking stack with no event limit. ClickHouse acquired Langfuse in January 2026 without changing this.

06 Does Cloudflare AI Gateway cost anything?

The core of Cloudflare AI Gateway is free, including the dashboard that shows requests, tokens, cost, caching, and errors, plus per-request logs and spend limits. You route your model calls through it with no infrastructure to run. Heavy log streaming through Logpush adds a small per-record fee on paid plans.

07 What happened to Helicone?

Mintlify acquired Helicone in March 2026, and the open-source proxy moved into maintenance mode. It still routes traffic to 100-plus models and ships security and bug fixes, but active feature work has stopped, and hosted customers are being migrated. Self-hosting the Apache-2.0 code remains supported for now.

08 How do these tools calculate the cost of a request?

Each tool reads the input and output token counts the LLM provider returns for the call and multiplies them by that model's published per-token price. Gateways like LiteLLM and Cloudflare do this inline; observability tools like Langfuse and Opik attach the cost to the trace they record.

About the author

Manu Shukla

Founder & Director

Founder of eCorpIT. Hands-on engineer leading senior-only delivery for AI apps, custom software, and cloud systems for global clients.
