PocketCoder uses the Google Gemini (Antigravity), Anthropic Claude (Kiro), and MiniMax (OpenCode) families of LLMs for code generation. DeepWiki was used to understand the open-source codebases.

Artificial Intelligence and LLM applications, Developer tools and infrastructure, Cybersecurity and data privacy

Inspiration

I am deeply inspired by the open-source community, which is currently moving at lightning speed to build AI tools that consistently rival—and sometimes exceed—the capabilities of proprietary tech giants.

We are seeing amazing open-source projects (like OpenClaw) prove that mobile AI coding is a reality. However, taking these raw, powerful concepts and making them truly secure for enterprise-grade infrastructure is a massive challenge. Currently, developers are forced into a harsh compromise:

  • Tethered & Vendor-Locked Ecosystems: Tools like Claude Remote Control solve the security issue, but they are far from "mobile-first." Sessions must be manually kick-started from a desktop terminal before you can actually leave your desk. Furthermore, they are closed-source and strictly locked to a single LLM provider, defeating the purpose of flexible, sovereign infrastructure.

  • No Irrefutable Audit Trail (Cryptographic Deniability): Chat apps powered by the Signal Protocol (like WhatsApp) use the Double Ratchet algorithm and shared symmetric Message Authentication Codes (MACs). This is explicitly engineered for human privacy and cryptographic deniability. Because the AI server holds the same symmetric key as the user, it can forge messages that look exactly like they came from the user. If a compromised AI wipes your infrastructure, there is no cryptographic way to prove whether you authorized the command or the server forged it. Managing infrastructure requires strict non-repudiation.

  • Infinite Blast Radius: Current AI IDEs and agents run raw on the host OS. Unless you go through the pain of creating a heavily restricted secondary user account, a hallucinating agent can read or write files entirely out of scope.

  • Malware-Prone Tooling: Open-source "skills" are often arbitrary, community-driven executables. Feeding unverified scripts to a stochastic LLM with host access is a recipe for a compromised machine.

  • Mobile UX Nightmares: Trying to read 500-line code diffs or type complex syntax into a mobile SSH terminal is an exercise in frustration.
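The deniability point above can be illustrated with a small Python sketch. It is a deliberately simplified illustration using a plain HMAC, not the Signal Protocol itself, and the names `shared_key`, `mac_tag`, and `verify` are hypothetical. The assumption is only that both endpoints hold the same symmetric key.

```python
import hashlib
import hmac
import secrets

# Hypothetical shared key: in a MAC-based scheme both endpoints hold it.
shared_key = secrets.token_bytes(32)


def mac_tag(key: bytes, message: bytes) -> bytes:
    """Return an HMAC-SHA256 tag for message under key."""
    return hmac.new(key, message, hashlib.sha256).digest()


def verify(key: bytes, message: bytes, tag: bytes) -> bool:
    """Constant-time check that tag matches message under key."""
    return hmac.compare_digest(mac_tag(key, message), tag)


command = b"delete all volumes"

# Tag computed on the user's device.
user_tag = mac_tag(shared_key, command)

# Tag computed by the server, which holds the same key and needs no user input.
server_tag = mac_tag(shared_key, command)

# Both tags verify and are byte-for-byte identical, so a third party
# cannot tell which side actually authored the command.
tags_identical = hmac.compare_digest(user_tag, server_tag)
both_verify = verify(shared_key, command, user_tag) and verify(
    shared_key, command, server_tag
)
print(tags_identical, both_verify)
```

Because either key holder can produce a valid tag, the tag proves that someone with the key sent the command, not which party did.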

The Vision & The Stack: I wanted to build a Sovereign AI platform that captures the magic of open-source mobile agents, but with "paranoid-by-default" security. This inspired me to curate a stack of my favorite tools to close exactly these gaps:

  • OpenCode & Tmux: I chose OpenCode for its unmatched reasoning tooling and paired it with Tmux to give the execution sub-agents a flawless, battle-tested terminal experience.

  • PocketBase: I chose this because it is a brilliantly simple, single compiled Go binary—making it the perfect, lightweight gatekeeper and zero-trust state ledger.

  • Docker MCP Gateway: I used this because it solves two massive problems at once: it dynamically discovers tools (which drastically reduces the LLM's context window overload) AND strictly isolates every single MCP execution.

  • Flutter: Picked for its brilliant multiplatform development capabilities, allowing me to replace insecure chat-bots with a natively secure, JWT-authenticated mobile app.

What it does

PocketCoder is a Sovereign AI orchestration platform. It replaces chat bridges with a dedicated Flutter client and replaces raw host execution with a heavily fortified, Dockerized backend.

Here is how PocketCoder solves the mobile agent crisis:

  • Secure Mobile App (No Messenger Hacks): Instead of relying on Messenger bots, PocketCoder uses a custom Flutter app backed by PocketBase. It uses standard JWT authentication so that every approval is tied to an authenticated user in a verifiable audit trail, which means the backend can safely sit on the open internet (or behind Tailscale). Crucially, all sensitive AI intents are caught by a Relay and held for an explicit, binary "Approve/Deny" gate in the UI on your phone, instead of the system parsing ambiguous free-text replies.
  • The Dockerized Fortress (Zero Host Access): Instead of running raw on your machine, everything is Dockerized. The AI can only see the specific volumes you explicitly mount. Furthermore, we enforce strict separation of concerns: the AI reasoning engine (OpenCode) lives in an isolated container, while actual execution happens in a separate, hardened Sandbox container.
  • Certified, Ephemeral Tooling (No Malware): PocketCoder rejects risky community scripts. It strictly utilizes Docker-certified Model Context Protocol (MCP) servers. Using the Docker MCP Gateway, the system dynamically spins up exactly what the agent needs (e.g., an n8n mcp container), executes the approved task, and instantly discards the container. You control exactly which MCPs are allowed.
  • "Antigravity" Mobile Orchestration: Terminals suck on mobile. Inspired by tools like Antigravity IDE, PocketCoder shifts the mobile UX from nitty-gritty coding to high-level management. You oversee the main agent (Poco) from your phone, approving plans while it delegates the complex terminal work to specialized subagents in the background.
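The "Approve/Deny" gate described above can be sketched as a small state machine. This is a minimal Python sketch under stated assumptions, not PocketCoder's actual Relay code: `ApprovalRelay`, `Intent`, `Status`, and the `executor` callback are all hypothetical names, and real intents would be persisted in PocketBase rather than in memory.

```python
import itertools
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Dict


class Status(Enum):
    PENDING = "pending"
    APPROVED = "approved"
    DENIED = "denied"


@dataclass
class Intent:
    intent_id: int
    command: str
    status: Status = Status.PENDING


class ApprovalRelay:
    """Holds every intent until the user makes a binary decision."""

    def __init__(self, executor: Callable[[str], None]) -> None:
        self._executor = executor  # hypothetical hook into the sandbox
        self._intents: Dict[int, Intent] = {}
        self._ids = itertools.count(1)

    def submit(self, command: str) -> Intent:
        """Record an intent as PENDING; nothing is executed yet."""
        intent = Intent(next(self._ids), command)
        self._intents[intent.intent_id] = intent
        return intent

    def decide(self, intent_id: int, approve: bool) -> Intent:
        """Apply a one-time approve/deny decision to a pending intent."""
        intent = self._intents[intent_id]
        if intent.status is not Status.PENDING:
            raise ValueError("intent was already decided")
        if approve:
            intent.status = Status.APPROVED
            self._executor(intent.command)
        else:
            intent.status = Status.DENIED
        return intent


executed = []
relay = ApprovalRelay(executed.append)
apply_intent = relay.submit("terraform apply")
wipe_intent = relay.submit("rm -rf /data")
relay.decide(apply_intent.intent_id, approve=True)
relay.decide(wipe_intent.intent_id, approve=False)
print(executed)
```

The design choice illustrated here is that the decision is a boolean, so no free-text reply ever has to be interpreted, and a decided intent cannot be replayed.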

How I built it

PocketCoder was built by a solo developer, leveraging an "Alpine Linux" philosophy—using a tiny surface area of custom code to connect standard, battle-tested Unix tools. The tech stack includes:

  • PocketBase (Go): The core backend, zero-trust state ledger, authentication layer, and Relay logic.
  • OpenCode: The core LLM-agnostic reasoning engine (utilizing Gemini/Claude models).
  • CAO (CLI Agent Orchestrator) / Python: Manages the hierarchical multi-agent system and coordinates isolated Tmux sessions.
  • Flutter / Dart: The declarative UI framework used to build the "Green Terminal" mobile client.
  • Docker & Docker Socket Proxy: Provide ephemeral sandboxing and strict network isolation, and restrict the privileges of the backend containers.
  • Rust: Used to build the lightweight Proxy/Shell Bridge for synchronous command execution.
  • MCP Gateway: Dynamically discovers and manages tool servers (e.g., n8n, Terraform).
  • SQLPage: Provides internal database observability.

Challenges I ran into

The biggest hurdle was solidifying the architecture to achieve true "Reasoning vs. Execution Isolation." Because I was gluing together several different complex codebases for the first time (using DeepWiki to navigate the repositories), finding the most efficient architectural design required heavy iteration.

  • The Inter-Process Communication (IPC) Trap: I naively started by using an SSH tunnel to relay messages between the Sandbox sub-agents and the main OpenCode agent. This created a massive ~400 MB RAM overhead! Swapping that out for a simple, internal HTTP request eliminated the bloat entirely and kept the system featherweight.

  • Asynchronous State Management: The event-driven state management was incredibly tricky. Catching asynchronous Server-Sent Events (SSE) from the OpenCode server in PocketBase, ordering them chronologically, managing the user vs. agent conversational turns, and then streaming those updates securely to the Flutter client via another SSE connection required some serious backend mental gymnastics.
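The event-ordering problem described above can be sketched in a few lines of Python. This is an illustrative sketch, not the PocketBase (Go) implementation: it assumes each SSE event carries an integer sequence number in its `id` field, which the source does not state, and `parse_sse` and `ReorderBuffer` are hypothetical names.

```python
import heapq
from typing import Dict, Iterable, Iterator, List, Tuple


def parse_sse(lines: Iterable[str]) -> Iterator[Dict[str, str]]:
    """Yield one dict per SSE event; a blank line terminates an event."""
    event: Dict[str, str] = {}
    for raw in lines:
        line = raw.rstrip("\n")
        if line == "":
            if event:
                yield event
                event = {}
            continue
        if line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        name, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]  # a single leading space is not part of the value
        if name == "data" and "data" in event:
            event["data"] += "\n" + value  # multi-line data field
        else:
            event[name] = value


class ReorderBuffer:
    """Releases payloads strictly in sequence order, buffering any gaps."""

    def __init__(self, first_seq: int = 1) -> None:
        self._next = first_seq
        self._heap: List[Tuple[int, str]] = []

    def push(self, seq: int, payload: str) -> List[str]:
        if seq < self._next:
            return []  # already released, treat as a duplicate
        heapq.heappush(self._heap, (seq, payload))
        ready: List[str] = []
        while self._heap and self._heap[0][0] <= self._next:
            head_seq, head_payload = heapq.heappop(self._heap)
            if head_seq == self._next:
                ready.append(head_payload)
                self._next += 1
            # otherwise it is a buffered duplicate and is dropped
        return ready


# Events arrive out of order: 2, then 1, then 3.
stream = [
    "id: 2\n", "event: message\n", "data: agent: running tests\n", "\n",
    "id: 1\n", "data: user: run the tests\n", "\n",
    "id: 3\n", "data: agent: tests passed\n", "\n",
]

buffer = ReorderBuffer()
ordered: List[str] = []
for ev in parse_sse(stream):
    ordered.extend(buffer.push(int(ev["id"]), ev["data"]))
print(ordered)
```

The ordered list can then be forwarded to the mobile client over the second SSE connection without the client having to re-sort turns.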

Accomplishments that I'm proud of

As a solo developer, architecting the PocketBase system and getting the full multi-agent workflow operational is a huge win, but I am particularly proud of how the Docker MCP Gateway was incorporated.

Because it is an ultra-new tool, successfully wiring it into the architecture was a challenge, but the payoff is massive. It allows PocketCoder to dynamically provision capabilities for the agents on the fly, leveraging the massive Docker MCP ecosystem. Seeing the system flawlessly catch an intent, route it to the Flutter app for a JWT-authenticated approval, and then dynamically spin up those ephemeral Docker tools is incredibly satisfying.

PocketCoder is engineered for efficiency. The entire orchestration stack (database, reasoning engine, execution sandbox, and security proxies) runs in roughly 818 MiB of RAM, broken down as follows:

Component      Role                        RAM Usage
OpenCode       Reasoning Engine            364.7 MiB
Sandbox        Execution Env (Tmux/CAO)    270.9 MiB
MCP Gateway    Dynamic Tooling             88.1 MiB
PocketBase     Security Relay & Ledger     39.1 MiB
Docker Proxy   Secure Socket Bridge        30.3 MiB
SQLPage        Observability Dashboard     25.3 MiB
Total Stack                                ~818.4 MiB

What I learned

Building PocketCoder was an exercise in system architecture and secure container orchestration.

  • Connecting Distributed Systems: I learned firsthand how many ways there are to connect systems: shared volume mounts, HTTP REST APIs, Unix sockets, SSH, and SSE streams. There are plenty of ways to glue services together, but you have to pick the right protocol for the circumstance (I learned the hard way that SSH tunnels are overkill for internal agent routing).

What's next for PocketCoder

  • Mobile Dashboard (In Development): Connect SQLPage and Docker Compose logs to Flutter to monitor the PocketCoder backend in real time.
  • Mobile File Viewer (In Development): Serve AI-generated files to the Flutter client via PocketBase.
  • Mobile Device Push Notifications (In Development): Add push notifications to the mobile client via ntfy (FOSS) and Firebase Cloud Messaging (proprietary) for asynchronous tool approvals.
  • Mobile Bootstrap Deployment (Planned): Use cloud-init from the Flutter mobile app to self-deploy the backend via IaC APIs (e.g. Linode).
  • Terminal Check-Ins (Planned): Allow SSH access to the Tmux sessions to check in on the subagents and watch their progress in real time.
  • Advanced CAO Planning (Future): Implement parallel and serial job orchestration so the supervisor agent can manage complex projects efficiently.
