An open-source terminal coding agent powered by Kimi K2.6 and routed through your own Cloudflare AI Gateway — with per-request logs, response caching, and authoritative cost out of the box.
Plan mode blocks all mutating tools for safe research. Edit mode prompts before each mutating call. Auto mode approves everything for trusted tasks.
For multi-step work, the agent publishes a task list with progress icons, elapsed time, and token deltas, so long-running work stays visible at a glance.
Drop image paths (PNG, JPG, WebP, GIF, BMP up to 5 MB) into any prompt. The model sees them inline — perfect for UI reviews, diagrams, and screenshots.
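As a rough illustration of the limits above, an attachment check might look like the following sketch. The function name `checkImageAttachment`, the choice to accept `.jpeg` as JPG, and the reading of 5 MB as 5 × 1024 × 1024 bytes are all assumptions made for the example, not the tool's actual implementation.

```typescript
// Hypothetical sketch: validate an image attachment against the stated limits.
// Assumptions: ".jpeg" counts as JPG, and 5 MB means 5 * 1024 * 1024 bytes.
const ALLOWED_EXTENSIONS = new Set(["png", "jpg", "jpeg", "webp", "gif", "bmp"]);
const MAX_IMAGE_BYTES = 5 * 1024 * 1024;

type AttachmentCheck = { ok: true } | { ok: false; reason: string };

function checkImageAttachment(path: string, sizeBytes: number): AttachmentCheck {
  // Take everything after the last dot as the extension, case-insensitively.
  const dot = path.lastIndexOf(".");
  const ext = dot === -1 ? "" : path.slice(dot + 1).toLowerCase();
  if (!ALLOWED_EXTENSIONS.has(ext)) {
    return { ok: false, reason: `unsupported image type: "${ext}"` };
  }
  if (sizeBytes > MAX_IMAGE_BYTES) {
    return { ok: false, reason: `image is larger than ${MAX_IMAGE_BYTES} bytes` };
  }
  return { ok: true };
}
```

Checking the extension and size before reading the file keeps rejected attachments cheap; a stricter variant could also inspect the file's magic bytes.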
Fully customizable color palettes with WCAG contrast validation. Pick from built-in presets or define your own. Live preview with Ctrl+T.
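For readers unfamiliar with WCAG contrast checks, the sketch below computes the WCAG 2.x contrast ratio between two `#rrggbb` colors. Which threshold the theme validator enforces is not stated here, so the 4.5:1 level (WCAG AA for normal-size text) and the function names are assumptions for illustration only.

```typescript
// Sketch of the WCAG 2.x contrast-ratio formula for two "#rrggbb" colors.
// Function names are hypothetical; the 4.5:1 threshold is an assumed target.

// Convert an 8-bit sRGB channel to its linear-light value.
function linearize(channel: number): number {
  const c = channel / 255;
  return c <= 0.03928 ? c / 12.92 : Math.pow((c + 0.055) / 1.055, 2.4);
}

// Relative luminance: weighted sum of the linearized R, G, and B channels.
function relativeLuminance(hex: string): number {
  const r = linearize(parseInt(hex.slice(1, 3), 16));
  const g = linearize(parseInt(hex.slice(3, 5), 16));
  const b = linearize(parseInt(hex.slice(5, 7), 16));
  return 0.2126 * r + 0.7152 * g + 0.0722 * b;
}

// Contrast ratio is (lighter + 0.05) / (darker + 0.05), so argument order does not matter.
function contrastRatio(foreground: string, background: string): number {
  const a = relativeLuminance(foreground);
  const b = relativeLuminance(background);
  return (Math.max(a, b) + 0.05) / (Math.min(a, b) + 0.05);
}

// WCAG AA asks for at least 4.5:1 for normal-size text.
function passesAA(foreground: string, background: string): boolean {
  return contrastRatio(foreground, background) >= 4.5;
}
```

Black on white gives the maximum possible ratio of about 21:1, while light gray on white falls well below the AA threshold.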
Research the web, read GitHub repos, and fetch JavaScript-rendered pages — all without leaving your terminal.
Hover, go-to-definition, references, and diagnostics via Language Server Protocol. Auto-configured per project with an interactive wizard.
The agent picks the right skill depth for the task — from quick edits to deep research — with graceful preemption and visible TUI indicators.
Toggle the model's chain-of-thought with /reasoning or Ctrl+R. See how it thinks in real time.
Every turn is auto-saved. /resume lists past sessions with message counts in a paginated picker. Never lose your place.
Bash session-allow is keyed by the command's first token, so approving one git command allows every git command for the rest of the session. The write and edit tools show a unified diff before you approve.
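The first-token keying can be sketched as follows. This is a minimal illustration under assumptions: the class name `BashSessionAllow`, its methods, and splitting on whitespace are all hypothetical, and the real tool may tokenize commands differently.

```typescript
// Hypothetical sketch of session-allow keyed by a command's first token.
class BashSessionAllow {
  private readonly allowed = new Set<string>();

  // First whitespace-separated token, e.g. "git" for "git status".
  private static key(command: string): string {
    return command.trim().split(/\s+/)[0] ?? "";
  }

  // Record the user's "allow for this session" choice for a command.
  allowForSession(command: string): void {
    const key = BashSessionAllow.key(command);
    if (key !== "") this.allowed.add(key);
  }

  // True if a command sharing this first token was allowed earlier.
  isAllowed(command: string): boolean {
    return this.allowed.has(BashSessionAllow.key(command));
  }
}
```

Keying on the first token is deliberately coarse: in this sketch, allowing `git status` would also allow `git push --force`, which is worth keeping in mind before granting a session-wide allow.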
Read entire modules, large configs, and full stack traces without the model losing track. All on your Cloudflare account.
Plug in external tools via the Model Context Protocol — local stdio servers or remote SSE endpoints. GitHub, Sentry, docs search, databases, and more.
Every request flows through your own Cloudflare AI Gateway. Per-request logs, response caching with configurable TTL, authoritative per-turn cost from the logs API, and a per-feature breakdown in /cost. The status bar shows the gateway-confirmed cost and latency for the last turn.
The agent never surveils your conversation. Memories are stored only when you ask — via remember, recall, and forget tools — with SQLite + embeddings for durable, privacy-respecting retrieval across sessions.
Plan mode: Read-only research. Mutating tools are hard-blocked. Ask "plan a refactor" and the agent investigates without touching your filesystem. Review, then exit plan mode to execute.
Edit mode: The default. The agent calls tools freely for read-only work; mutating tools pause for your approval with a unified diff preview.
Auto mode: Autonomous execution. Every tool call is auto-approved. Use for trusted, well-scoped tasks. The agent still warns before irreversible actions.
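The three modes amount to a small decision table, sketched below. The names `Mode`, `Decision`, and `decide` are hypothetical, and treating exactly `write`, `edit`, and `bash` as the mutating tools is an assumption drawn from the tools marked "prompt" in the tool table; the sketch also leaves out the warning that Auto mode gives before irreversible actions.

```typescript
// Hypothetical sketch of how the permission modes could gate a tool call.
type Mode = "plan" | "edit" | "auto";
type Decision = "allow" | "prompt" | "block";

// Assumption: the tools marked "prompt" in the tool table are the mutating ones.
const MUTATING_TOOLS = new Set(["write", "edit", "bash"]);

function decide(mode: Mode, tool: string): Decision {
  // Read-only tools run freely in every mode.
  if (!MUTATING_TOOLS.has(tool)) return "allow";
  switch (mode) {
    case "plan":
      return "block"; // research only: mutating tools are hard-blocked
    case "edit":
      return "prompt"; // pause for user approval on each mutating call
    case "auto":
      return "allow"; // trusted tasks: approve everything
  }
}
```

Putting the read-only check first keeps behavior identical across modes for tools such as `read` and `grep`, so switching modes only changes what happens to mutating calls.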
Or run without installing: npx kimiflare
| Tool | Permission | Description |
|---|---|---|
| read | auto | Read a text file (≤ 2 MB) with optional line range |
| write | prompt | Create or overwrite a file. Shows a diff before approval |
| edit | prompt | Replace an exact substring. Fails unless the match is unique |
| bash | prompt | Run a shell command. Session-allow keyed by first token |
| glob | auto | Match files by pattern, sorted by mtime |
| grep | auto | Regex search. Uses ripgrep if available |
| web_fetch | auto | Fetch a URL, convert HTML → markdown (≤ 100 KB) |
| web_search | auto | Search the web and return summarized results |
| github_read | auto | Read files and issues from public GitHub repos |
| browser_fetch | auto | Headless browser for JavaScript-rendered pages |
| tasks_set | auto | Publish a live task list for multi-step work |
Plus LSP intelligence (hover, go-to-definition, references, diagnostics), cross-session memory (remember / recall / forget), and MCP extensibility for plugging in external tool servers.
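The unique-match rule of the edit tool in the table above can be illustrated with a short sketch. The function name `applyEdit`, its parameters, and the error messages are hypothetical; only the rule itself (replace an exact substring, fail unless it matches exactly once) comes from the table.

```typescript
// Hypothetical sketch of an exact-substring edit that requires a unique match.
function applyEdit(content: string, oldText: string, newText: string): string {
  if (oldText === "") throw new Error("oldText must not be empty");

  const first = content.indexOf(oldText);
  if (first === -1) throw new Error("no match found");

  // Searching from first + 1 also catches overlapping occurrences.
  if (content.indexOf(oldText, first + 1) !== -1) {
    throw new Error("match is not unique");
  }

  return content.slice(0, first) + newText + content.slice(first + oldText.length);
}
```

Failing on ambiguous matches forces the caller to include enough surrounding context to pin down one location, which avoids silently editing the wrong occurrence.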