Kimi-K2.6 · Cloudflare AI Gateway

Claude Code alternative, on your own Cloudflare account.

An open-source terminal coding agent powered by Kimi K2.6 and routed through your own Cloudflare AI Gateway — with per-request logs, response caching, and authoritative cost out of the box.

zsh
kimiflare Ready when you are.
  › Explain this codebase
  › Find and fix a bug
  › Refactor a file
Type a message or /help for commands · ctrl-c to exit · shift+tab to cycle modes
add a /health endpoint
────────────────────────────────────────
thinking… I'll need to read the server file first, then add the endpoint.
read(src/server.ts)
edit src/server.ts
Permission requested
tool: edit
action: edit src/server.ts
@@ -42,6 +42,10 @@
  app.get('/', …)
+  app.get('/health', (_, res) => res.json({ ok: true }))
  …
Allow once   Allow for this session   Deny
edit src/server.ts
Done — added /health that returns { ok: true }.
────────────────────────────────────────
■ Update landing page terminal (12s · ↑ 2.3k tokens)
Read current UI components
Update terminal simulation
Create PR
[edit] k2.6 · medium · thinking…
in 2,847 (1,203 cached) · out 412 · ctx 12% · 0.00321

Recently shipped

  • Turn supervisor architecture with graceful preemption
  • Web search, GitHub read-only, and headless browser tools
  • Tiered skill routing with TUI visibility
  • Extensible JSON themes with WCAG contrast validation
  • KIMI.md drift detection with memory-based staleness
  • Fuzzy @ file picker with inline filtering
  • AI Gateway as the default backend with per-request logs, caching, and cost reconciliation
  • Per-turn latency and per-feature cost breakdown in /cost
  • Context-window guardrails

Coming next

  • Session tree for branching conversations
  • Cost attribution dashboard
  • More MCP server integrations

What it does

01

Plan / Edit / Auto modes

Plan mode blocks all mutating tools for safe research. Edit mode runs read-only tools freely and prompts before each mutating call. Auto mode approves everything for trusted tasks.

02

Live task panel

For multi-step work, the agent publishes a task list with progress icons, elapsed time, and token deltas, so you can see which steps are done and which are still pending.

03

Image understanding

Drop image paths (PNG, JPG, WebP, GIF, BMP up to 5 MB) into any prompt. The model sees them inline — perfect for UI reviews, diagrams, and screenshots.

04

Extensible JSON themes

Fully customizable color palettes with WCAG contrast validation. Pick from built-in presets or define your own. Live preview with Ctrl+T.
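
To make the contrast validation concrete, here is a minimal sketch of the WCAG 2.x contrast-ratio formula applied to one foreground/background pair. The function names and the 4.5:1 default threshold (WCAG AA for normal text) are assumptions for illustration; the page does not say which threshold or code kimiflare uses.

```typescript
// Illustrative sketch of a WCAG 2.x contrast check; not kimiflare's actual code.

/** Parse "#rrggbb" into [r, g, b] with each channel in 0..255. */
function parseHex(hex: string): [number, number, number] {
  if (!/^#[0-9a-f]{6}$/i.test(hex)) throw new Error(`expected #rrggbb, got ${hex}`);
  const n = parseInt(hex.slice(1), 16);
  return [(n >> 16) & 0xff, (n >> 8) & 0xff, n & 0xff];
}

/** Linearize one sRGB channel as specified by WCAG 2.x. */
function channel(c: number): number {
  const s = c / 255;
  return s <= 0.03928 ? s / 12.92 : Math.pow((s + 0.055) / 1.055, 2.4);
}

/** Relative luminance: 0 for black, 1 for white. */
function relativeLuminance(hex: string): number {
  const [r, g, b] = parseHex(hex);
  return 0.2126 * channel(r) + 0.7152 * channel(g) + 0.0722 * channel(b);
}

/** Contrast ratio in the range 1..21, independent of argument order. */
function contrastRatio(fg: string, bg: string): number {
  const a = relativeLuminance(fg);
  const b = relativeLuminance(bg);
  return (Math.max(a, b) + 0.05) / (Math.min(a, b) + 0.05);
}

/** Assumed threshold: 4.5:1, the WCAG AA minimum for normal-size text. */
function passesContrast(fg: string, bg: string, min = 4.5): boolean {
  return contrastRatio(fg, bg) >= min;
}
```

A theme validator could run `passesContrast` over every text/background pair in a palette and reject or warn on the pairs that fail.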

05

Web search, GitHub, and headless browser

Research the web, read GitHub repos, and fetch JavaScript-rendered pages — all without leaving your terminal.

06

LSP semantic code intelligence

Hover, go-to-definition, references, and diagnostics via Language Server Protocol. Auto-configured per project with an interactive wizard.

07

Turn supervisor with tiered skills

The agent picks the right skill depth for the task — from quick edits to deep research — with graceful preemption and visible TUI indicators.

08

Streaming reasoning

Toggle the model's chain-of-thought with /reasoning or Ctrl+R. See how it thinks in real time.

09

Session persistence

Every turn is auto-saved. /resume lists past sessions with message counts in a paginated picker. Never lose your place.

10

Smart permissions

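When you allow a bash command for the session, the approval is keyed by the command's first token, so approving one git command allows all git commands until the session ends. Write and edit show a unified diff before you approve.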
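
The first-token keying can be sketched as a small allowlist. The class and method names below are hypothetical and chosen for illustration; they are not taken from kimiflare's source.

```typescript
// Hypothetical sketch of a session-scoped bash allowlist keyed by first token.
class BashSessionAllowlist {
  private readonly allowed = new Set<string>();

  /** First whitespace-delimited token, e.g. "git" for "git status". */
  static keyFor(command: string): string {
    return command.trim().split(/\s+/)[0] ?? "";
  }

  /** Record the user's "Allow for this session" choice for this command. */
  allowForSession(command: string): void {
    const key = BashSessionAllowlist.keyFor(command);
    if (key !== "") this.allowed.add(key);
  }

  /** True if a command with the same first token was allowed earlier. */
  isAllowed(command: string): boolean {
    return this.allowed.has(BashSessionAllowlist.keyFor(command));
  }
}
```

With this shape, allowing `git status` once means `git push origin main` no longer prompts, while `npm test` still does. The trade-off is fewer prompts in exchange for coarser control, so it is worth session-allowing only commands whose whole family you trust.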

11

262K context window

Read entire modules, large configs, and full stack traces without the model losing track. All on your Cloudflare account.

12

MCP server integration

Plug in external tools via the Model Context Protocol — local stdio servers or remote SSE endpoints. GitHub, Sentry, docs search, databases, and more.

13

AI Gateway — observability by default

Every request flows through your own Cloudflare AI Gateway. Per-request logs, response caching with configurable TTL, authoritative per-turn cost from the logs API, and a per-feature breakdown in /cost. The status bar shows the gateway-confirmed cost and latency for the last turn.
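
As a rough illustration of what "routed through your own gateway" means at the HTTP level, the sketch below builds (but does not send) a Workers AI request addressed to an AI Gateway endpoint. The account ID, gateway ID, token, and helper names are placeholders, and the request body shape is an assumption; kimiflare's actual request format is not described on this page. The `cf-aig-cache-ttl` header is AI Gateway's per-request cache TTL in seconds.

```typescript
// Illustrative sketch; identifiers and body shape are assumptions, not kimiflare internals.

interface GatewayTarget {
  accountId: string; // your Cloudflare account ID (placeholder)
  gatewayId: string; // the AI Gateway you created (placeholder)
  model: string;     // e.g. "@cf/moonshotai/kimi-k2.6"
}

interface PreparedRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

/** AI Gateway URL for the Workers AI provider. */
function gatewayUrl(t: GatewayTarget): string {
  return `https://gateway.ai.cloudflare.com/v1/${t.accountId}/${t.gatewayId}/workers-ai/${t.model}`;
}

/** Build a streaming chat request; pass cacheTtlSeconds to opt in to response caching. */
function prepareRequest(
  t: GatewayTarget,
  apiToken: string,
  prompt: string,
  cacheTtlSeconds?: number,
): PreparedRequest {
  const headers: Record<string, string> = {
    Authorization: `Bearer ${apiToken}`,
    "Content-Type": "application/json",
  };
  if (cacheTtlSeconds !== undefined) {
    headers["cf-aig-cache-ttl"] = String(cacheTtlSeconds);
  }
  return {
    url: gatewayUrl(t),
    method: "POST",
    headers,
    body: JSON.stringify({ messages: [{ role: "user", content: prompt }], stream: true }),
  };
}
```

Because the request passes through the gateway rather than going to Workers AI directly, the gateway can log it, serve it from cache, and attribute cost to it.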

14

Explicit cross-session memory

The agent never surveils your conversation. Memories are stored only when you ask — via remember, recall, and forget tools — with SQLite + embeddings for durable, privacy-respecting retrieval across sessions.

kimiflare (Node.js TUI): user msg → agent loop → runKimi()
POST SSE via your AI Gateway
gateway.ai.cloudflare.com: logs · cache · cost
Workers AI: @cf/moonshotai/kimi-k2.6
tool result ← tool executor ← tool_calls (permission modal for write / edit / bash)

Three ways to work

01

Plan

Read-only research. Mutating tools are hard-blocked. Ask "plan a refactor" and the agent investigates without touching your filesystem. Review, then exit plan mode to execute.

02

Edit

Default mode. The agent calls tools freely for read-only work; mutating tools pause for your approval with a unified diff preview.

03

Auto

Autonomous execution. Every tool call is auto-approved. Use for trusted, well-scoped tasks. The agent still warns before irreversible actions.
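
The three modes amount to a small decision table over (mode, tool). The sketch below is one way to express it. The type and function names are hypothetical, and the set of mutating tools is assumed to be the three marked "prompt" in the core tools table (write, edit, bash).

```typescript
// Hypothetical sketch of mode-based permission gating; not kimiflare's actual code.

type Mode = "plan" | "edit" | "auto";
type Decision = "run" | "ask" | "block";

// Assumption: these are the mutating tools; everything else is read-only.
const MUTATING_TOOLS: ReadonlySet<string> = new Set(["write", "edit", "bash"]);

// What each mode does when the agent requests a mutating tool.
const MUTATING_DECISION: Record<Mode, Decision> = {
  plan: "block", // hard-blocked: research only
  edit: "ask",   // pause for approval
  auto: "run",   // auto-approved
};

/** Decide whether a tool call runs, prompts, or is blocked in the given mode. */
function decide(mode: Mode, tool: string): Decision {
  if (!MUTATING_TOOLS.has(tool)) return "run"; // read-only tools run in every mode
  return MUTATING_DECISION[mode];
}
```

For example, `decide("plan", "edit")` yields `"block"`, while `decide("edit", "grep")` yields `"run"` because grep never touches the filesystem.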

Get started

bash
# Install
npm install -g kimiflare

# Run: onboarding asks whether to use BYOK (bring your own key) or Cloud
kimiflare

# Or start in Cloud mode directly
kimiflare --cloud

Or run without installing: npx kimiflare

bash
# Interactive TUI
kimiflare

# One-shot mode
kimiflare -p "summarize PLAN.md"

# Auto-approve for scripts
kimiflare -p "..." --dangerously-allow-all

# Override model
kimiflare --model @cf/moonshotai/kimi-k2.6

# Stream reasoning to stderr
kimiflare --reasoning

# Image understanding — reference images inline
kimiflare
kimiflare -p "explain this diagram.png"

Core tools

Tool           Permission  Description
read           auto        Read a text file (≤ 2 MB) with optional line range
write          prompt      Create or overwrite a file. Shows a diff before approval
edit           prompt      Replace an exact substring. Fails unless unique match
bash           prompt      Run a shell command. Session-allow keyed by first token
glob           auto        Match files by pattern, sorted by mtime
grep           auto        Regex search. Uses ripgrep if available
web_fetch      auto        Fetch a URL, convert HTML → markdown (≤ 100 KB)
web_search     auto        Search the web and return summarized results
github_read    auto        Read files and issues from public GitHub repos
browser_fetch  auto        Headless browser for JavaScript-rendered pages
tasks_set      auto        Publish a live task list for multi-step work

Plus LSP intelligence (hover, go-to-definition, references, diagnostics), cross-session memory (remember / recall / forget), and MCP extensibility for plugging in external tool servers.
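
The edit tool's "exact substring, unique match" rule is worth seeing in code, since it explains why an edit can fail on a file that clearly contains the text. This is a minimal sketch under that stated rule. The function name and error messages are invented for illustration.

```typescript
// Illustrative sketch of an exact-substring edit that requires a unique match.

/** Replace oldText with newText in content; throw unless oldText occurs exactly once. */
function applyEdit(content: string, oldText: string, newText: string): string {
  if (oldText.length === 0) throw new Error("oldText must not be empty");

  const first = content.indexOf(oldText);
  if (first === -1) throw new Error("oldText not found");

  // Searching from first + 1 also catches overlapping occurrences.
  if (content.indexOf(oldText, first + 1) !== -1) {
    throw new Error("oldText matches more than once; include more surrounding context");
  }

  // Splice manually instead of String.replace so "$&"-style sequences
  // in newText are inserted literally.
  return content.slice(0, first) + newText + content.slice(first + oldText.length);
}
```

Requiring uniqueness keeps the model from silently changing the wrong occurrence: if a snippet appears twice, the call fails and the caller has to widen the snippet until it identifies one location.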