Inspiration
AI agents are powerful, but interaction is still tool-centric: prompts, configs, dashboards.
We built an interface where delegation is spatial and direct — like walking up to a coworker and stating the outcome.
What It Does
BossRoom is a real-time multiplayer 3D office where AI agents are persistent coworkers.
Real Execution (Not Chat)
Agents execute real work across 900+ enterprise integrations:
- Messaging & Communication: Slack, Microsoft Teams, Discord, WhatsApp Business
- Email & Calendar: Gmail, Outlook, Google Calendar
- Docs & Knowledge: Notion, Confluence, Google Docs/Drive, Airtable
- Whiteboarding & Design: Miro, Figma
- Project & Issue Tracking: Linear, Jira, GitHub, GitLab
- CRM & Sales: Salesforce, HubSpot
- DevOps & Cloud: AWS, GCP, Azure
- Databases & Infra: PostgreSQL, MongoDB, Supabase
- Commerce & Payments: Stripe, Visa Intelligent Commerce (MCP)
- Search & Data: SERP APIs, live product search, external data sources
All actions execute in real user-scoped accounts via secure OAuth: no API keys, no configuration.
Dynamic Multi-Agent Teams
- A Receptionist receives a goal (e.g., “research competitors + write report”).
- It dynamically spawns 3–12 agents via tool calls.
- The LLM decides roles, skills, and leadership.
- A lead agent delegates subtasks to workers.
- Workers post updates to a shared scratchpad feed.
- The lead compiles and finalizes output.
- All agents and workspace state persist in PostgreSQL.
No hardcoded bots. Every workspace builds itself from the goal.
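The team-building flow above could be guarded by a small validation step before agents are spawned. This is a minimal sketch, not BossRoom's actual code: `AgentSpec`, `TeamPlan`, and `validateTeamPlan` are hypothetical names, and the "exactly one lead" and "unique names" rules are assumptions. Only the 3–12 agent range comes from the description.

```typescript
// Hypothetical shapes: the real schema is not shown in this write-up.
interface AgentSpec {
  name: string;
  role: string;
  skills: string[];
  isLead: boolean;
}

interface TeamPlan {
  goal: string;
  agents: AgentSpec[];
}

const MIN_AGENTS = 3; // from the 3-12 range stated above
const MAX_AGENTS = 12;

// Returns a list of problems; an empty list means the plan is usable.
function validateTeamPlan(plan: TeamPlan): string[] {
  const problems: string[] = [];
  const count = plan.agents.length;
  if (count < MIN_AGENTS || count > MAX_AGENTS) {
    problems.push(`expected ${MIN_AGENTS}-${MAX_AGENTS} agents, got ${count}`);
  }
  // Assumption: each workspace has exactly one lead agent.
  const leads = plan.agents.filter((a) => a.isLead);
  if (leads.length !== 1) {
    problems.push(`expected exactly one lead, got ${leads.length}`);
  }
  // Assumption: agent names double as identifiers, so they must be unique.
  const names = new Set(plan.agents.map((a) => a.name));
  if (names.size !== count) {
    problems.push("agent names must be unique");
  }
  return problems;
}
```

Rejecting a malformed plan before any agent is persisted keeps the workspace state in PostgreSQL consistent with what the LLM actually proposed.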
Interaction Layer
- Push-to-talk voice input
- Real-time transcription → execution → spoken response
- Agents have visible states (listening, thinking, working, done, error)
- Proximity-based player voice chat (WebRTC spatial audio)
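The visible agent states listed above lend themselves to a small state machine. The five state names come from the list; the transition table below is an assumption made for illustration, and `canTransition` and `nextState` are hypothetical helper names.

```typescript
type AgentState = "listening" | "thinking" | "working" | "done" | "error";

// Assumed transition table: the write-up lists the states but not the allowed moves.
const TRANSITIONS: Record<AgentState, readonly AgentState[]> = {
  listening: ["thinking", "error"],
  thinking: ["working", "done", "error"],
  working: ["thinking", "done", "error"],
  done: ["listening"],
  error: ["listening"],
};

// True when moving from `from` to `to` is allowed by the table above.
function canTransition(from: AgentState, to: AgentState): boolean {
  return TRANSITIONS[from].includes(to);
}

// Applies a requested transition, keeping the current state if it is not allowed.
function nextState(current: AgentState, requested: AgentState): AgentState {
  return canTransition(current, requested) ? requested : current;
}
```

Ignoring invalid transitions rather than throwing keeps a late or duplicated network message from flipping an indicator into a state the agent never reached.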
World Layer
- Procedurally generated chunked terrain (simplex noise, LOD)
- Physics-based controls
- Multiple avatar models
- In-world 3D speech bubbles + visual state indicators
- In-world integrated views from your favorite apps
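Chunked terrain with LOD usually comes down to two small calculations: which chunk a world position falls in, and how much detail a chunk deserves given its distance from the player. The sketch below illustrates that idea only. The chunk size, the LOD thresholds, and all function names are assumptions, and the simplex-noise height sampling is left out.

```typescript
const CHUNK_SIZE = 32; // world units per chunk side (assumed value)
const LOD_DISTANCES = [2, 4, 8]; // chunk-distance limits per LOD level (assumed values)

interface ChunkCoord {
  cx: number;
  cz: number;
}

// Maps a world-space position to the chunk that contains it.
function worldToChunk(x: number, z: number): ChunkCoord {
  return { cx: Math.floor(x / CHUNK_SIZE), cz: Math.floor(z / CHUNK_SIZE) };
}

// Stable string key, usable as a Map key for a chunk cache.
function chunkKey(c: ChunkCoord): string {
  return `${c.cx},${c.cz}`;
}

// 0 = full detail; higher numbers = coarser meshes.
// Distance is measured in whole chunks (Chebyshev distance).
function lodFor(player: ChunkCoord, chunk: ChunkCoord): number {
  const d = Math.max(Math.abs(chunk.cx - player.cx), Math.abs(chunk.cz - player.cz));
  const level = LOD_DISTANCES.findIndex((limit) => d <= limit);
  return level === -1 ? LOD_DISTANCES.length : level;
}
```

Using `Math.floor` rather than truncation keeps negative coordinates in the correct chunk, which matters once players walk past the world origin.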
Architecture
Frontend
- Next.js 16 + React 19 + TypeScript + Tailwind v4
- React Three Fiber + drei
- Rapier physics
- Zustand (14 stores) synced via WebSocket
- 46 typed WebSocket message types, validated with Zod
- shadcn/ui for panels, scratchpad, product cards
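Typed, validated WebSocket messages follow a common pattern: a discriminated union on a `type` field plus a runtime check on every incoming payload. The project does this with Zod; the dependency-free sketch below shows the same idea with hand-written type guards instead. The two message types and their fields are hypothetical, not taken from the real set of 46.

```typescript
// Two illustrative message types; names and fields are hypothetical.
type ClientMessage =
  | { type: "player:move"; x: number; y: number; z: number }
  | { type: "agent:ask"; agentId: string; text: string };

function isRecord(v: unknown): v is Record<string, unknown> {
  return typeof v === "object" && v !== null && !Array.isArray(v);
}

function isFiniteNumber(v: unknown): v is number {
  return typeof v === "number" && Number.isFinite(v);
}

// Parses raw WebSocket text; returns null for malformed JSON or unknown shapes.
function parseClientMessage(raw: string): ClientMessage | null {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch {
    return null;
  }
  if (!isRecord(data)) return null;

  switch (data.type) {
    case "player:move": {
      const { x, y, z } = data;
      if (isFiniteNumber(x) && isFiniteNumber(y) && isFiniteNumber(z)) {
        return { type: "player:move", x, y, z };
      }
      return null;
    }
    case "agent:ask": {
      const { agentId, text } = data;
      if (typeof agentId === "string" && typeof text === "string") {
        return { type: "agent:ask", agentId, text };
      }
      return null;
    }
    default:
      return null;
  }
}
```

Rebuilding the message from validated fields, rather than casting the parsed object, drops any extra properties a client might send.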
Backend
- Node.js WebSocket game server (domain-driven modules)
- PostgreSQL 15 + Drizzle ORM (7 tables)
- Dynamic agent creation + runtime skill system
- Vercel AI SDK (streamText, multi-step tool calls)
- Vercel AI Gateway (Gemini / Claude / GPT-4o swappable)
- Composio OAuth for Gmail, Calendar, Linear, Stripe, etc.
- MCP support for external tool servers (Visa Intelligent Commerce)
Voice
Two independent spatial pipelines (shared AudioContext):
Agent voice loop
- Mic → WebSocket → Deepgram (STT)
- LLM execution
- Inworld TTS → HRTF spatial playback
Player voice
- PeerJS WebRTC
- HRTF panner per remote player
- Distance-based rolloff
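Distance-based rolloff can be reasoned about outside the browser. The sketch below computes the gain that the Web Audio `PannerNode` applies under its `inverse` distance model. Which distance model and parameter values BossRoom uses is not stated, so the defaults here are assumptions, and `Vec3`, `distanceBetween`, and `inverseDistanceGain` are hypothetical names.

```typescript
interface Vec3 {
  x: number;
  y: number;
  z: number;
}

// Straight-line distance between a listener and a remote player.
function distanceBetween(a: Vec3, b: Vec3): number {
  return Math.hypot(a.x - b.x, a.y - b.y, a.z - b.z);
}

// Gain under the Web Audio "inverse" distance model:
// refDistance / (refDistance + rolloffFactor * (max(d, refDistance) - refDistance))
function inverseDistanceGain(distance: number, refDistance = 1, rolloffFactor = 1): number {
  const d = Math.max(distance, refDistance);
  return refDistance / (refDistance + rolloffFactor * (d - refDistance));
}
```

Inside `refDistance` the gain stays at 1, so players standing next to each other hear one another at full volume, and the level falls off smoothly beyond that radius.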
Infrastructure
- Terraform-managed infrastructure
- Google Cloud Run (WebSocket server, 3600s timeout)
- Cloud SQL (Postgres)
- Cloudflare Pages (frontend)
- Firebase Auth (Google Sign-In)
- Cloud Build → Artifact Registry → Docker deploy
Fully deployed. Not localhost.
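A 3600s request timeout means every WebSocket connection is eventually closed by the platform, so clients need a reconnect plan. The sketch below shows one possible approach, not the project's actual strategy: reconnect proactively shortly before the limit, and use capped exponential backoff after unplanned drops. The margin and backoff constants and both function names are assumptions.

```typescript
const REQUEST_TIMEOUT_MS = 3600 * 1000; // the Cloud Run timeout quoted above
const RECONNECT_MARGIN_MS = 60 * 1000; // assumed safety margin before the limit
const BASE_BACKOFF_MS = 500; // assumed first retry delay
const MAX_BACKOFF_MS = 30000; // assumed cap on retry delay

// Timestamp at which the client should open a fresh connection,
// given when the current one was opened.
function plannedReconnectAt(openedAtMs: number): number {
  return openedAtMs + REQUEST_TIMEOUT_MS - RECONNECT_MARGIN_MS;
}

// Delay before retry number `attempt` (0-based) after an unplanned drop.
function backoffDelayMs(attempt: number): number {
  return Math.min(MAX_BACKOFF_MS, BASE_BACKOFF_MS * 2 ** attempt);
}
```

Reconnecting on the client's schedule lets the new socket be established before the old one is cut, which avoids a visible gap in position and agent-state updates.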
Challenges
- Building a physics-based 3D world with responsive third-person controls
- Real-time multiplayer state sync (positions, agent state, scratchpad, products) over a single multiplexed WebSocket connection
- Designing and validating 46 typed WebSocket message types (end-to-end Zod schema enforcement)
- Dynamic agent spawning (3–12 agents per workspace) with persistent storage and zero race conditions during streaming tool calls
- Multi-step LLM tool orchestration (up to 25 steps/turn) without blocking or state corruption
- Maintaining per-agent memory, role separation, and runtime skill creation
- Dual spatial audio pipelines (agent TTS + WebRTC player voice) sharing one AudioContext without interference
- Real-time STT → LLM → TTS voice loop with spatial playback tied to 3D coordinates
- OAuth scoping per user across 900+ integrations (secure isolation per Firebase UID)
- MCP tool server integration (Visa Intelligent Commerce) with fallback payment rails
- Cloud Run WebSocket deployment (HTTP/1.1, 3600s timeout, SQL proxy sidecar, keepalive strategy)
- Streaming AI responses while preserving deterministic game-state updates
- Procedural chunked terrain generation with LOD and performance constraints
- Shipping production infra (Terraform, Cloud Build, Docker, Cloud SQL, Cloudflare Pages) during a 36-hour hackathon
Accomplishments
- Turned “agent workflows” into a game loop: walk up → ask → watch progress → get the outcome.
- Made non-technical users effective on day one — no prompt craft, no dashboards, no setup rituals.
- Converted messy, multi-step execution into a single clear interaction: users state intent, the system handles planning + delegation + tool actions.
- Made agent work observable: you can see who’s doing what and hear responses spatially, instead of guessing in a black box.
- Built a collaborative feel (multiplayer + proximity voice) so delegating to AI feels like working in a room, not using a tool.
- Shipped real-world execution end-to-end (emails, tickets, meetings, payments) inside a fully deployed product in 36 hours.
~179 commits
6,000+ lines of TypeScript
46 WebSocket message types
14 Zustand stores
7 database tables
Full infrastructure-as-code deployment
This is a working system, not just a prototype.
What We Learned
The interface layer matters as much as the model.
Dynamic team creation — letting the LLM design the org structure per task — was the key architectural unlock.
Coordination becomes intuitive when agents are embodied, stateful, and observable.
What’s Next
- Automatic model routing per task type
- Visible in-world agent-to-agent collaboration
- Cross-workspace skill marketplace
- Expanded MCP integrations
- Deeper in-world commerce flows
Every team will manage fleets of AI agents.
BossRoom is the interface layer.
Built With
- cloudflare-pages
- composio
- deepgram
- drizzle-orm
- firebase
- gemini
- google-cloud-run
- google-cloud-sql
- inworld-ai
- next.js
- node.js
- nx
- peerjs
- postgresql
- rapier
- react
- react-three-fiber
- tailwind-css
- three.js
- typescript
- vercel-ai-sdk
- websocket
- zustand
