Inspiration
Editing videos is powerful but painful. Traditional editors require scrubbing through timelines, hunting clips, and stacking effects. We asked: what if editing was as easy as telling your computer what to do?
That’s how this project was born — inspired by Cursor, but reimagined for video editing. Instead of cut() and drag-and-drop, you just say:
“Trim the part where the guy speaking is silent.”
…and it happens.
How We Built It
- Backend (FastAPI + Twelve Labs + FFmpeg): Handles video/audio processing, trims clips, adds effects, and generates previews.
- NLP (Cohere): Parses natural language into structured editing commands.
- Video Search (Twelve Labs): Finds key moments (like “LeBron silent” or “when the person dies”).
- Frontend: A minimal Cursor-like UI with chat-driven commands, video preview, and instant feedback.
- Dev Environment (Windsurf): Used Windsurf’s AI-first IDE to rapidly prototype, refactor, and debug the entire stack under hackathon time pressure. It cut down iteration time massively.
The workflow looks like this:
User Command → Cohere (parse intent) → Twelve Labs (find moment)
→ Executor (FFmpeg) → Preview video
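To make the workflow above concrete, here is a minimal sketch of the four stages wired together. It is an illustration under assumptions, not our actual code: `parse_intent` and `find_moment` are hypothetical stand-ins for the Cohere and Twelve Labs calls (they return canned values instead of calling any API), and `EditCommand`, `Moment`, and `run_pipeline` are names made up for this example. Only the FFmpeg arguments are real: the `select`/`aselect` filters drop the matched time window and `setpts`/`asetpts` rebuild the timestamps.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class EditCommand:
    action: str  # e.g. "trim"
    query: str   # what to look for in the video


@dataclass
class Moment:
    start: float  # seconds
    end: float    # seconds


def parse_intent(text: str) -> EditCommand:
    """Hypothetical stand-in for the Cohere step: first word is the action."""
    action, _, rest = text.strip().partition(" ")
    return EditCommand(action=action.lower(), query=rest.rstrip("."))


def find_moment(video_id: str, query: str) -> Moment:
    """Hypothetical stand-in for the Twelve Labs step: returns a canned window."""
    return Moment(start=12.0, end=17.5)


def build_ffmpeg_args(cmd: EditCommand, moment: Moment, src: str, dst: str) -> List[str]:
    """Executor step: build FFmpeg arguments that remove the matched window."""
    if cmd.action != "trim":
        raise ValueError(f"unsupported action: {cmd.action}")
    window = f"between(t,{moment.start},{moment.end})"
    return [
        "ffmpeg", "-i", src,
        # Keep only frames outside the window, then regenerate timestamps.
        "-vf", f"select='not({window})',setpts=N/FRAME_RATE/TB",
        "-af", f"aselect='not({window})',asetpts=N/SR/TB",
        dst,
    ]


def run_pipeline(text: str, video_id: str, src: str, dst: str) -> List[str]:
    cmd = parse_intent(text)                  # User Command -> parse intent
    moment = find_moment(video_id, cmd.query)  # -> find moment
    return build_ffmpeg_args(cmd, moment, src, dst)  # -> executor
```

Calling `run_pipeline("Trim the part where the guy speaking is silent.", "demo-video", "in.mp4", "out.mp4")` returns an argument list that could be handed to `subprocess.run` to render the preview.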
What We Learned
- How much AI-powered development environments like Windsurf accelerate building — almost like pair-programming with a senior engineer on demand.
- How multimodal AI (language + video) can completely reshape creative tools.
- The trade-offs between real-time performance and hackathon prototyping.
- How important it is to scope ruthlessly: better to demo 3 magical features than 10 broken ones.
- That even for creative tools, structured pipelines (NLP → search → execution) make everything easier.
Challenges We Faced
- Latency: Running Cohere, our own server, and Twelve Labs in sequence gets slow. For the demo we worked around it with mocks and pre-indexed videos.
- Parsing ambiguity: Natural language is messy. We had to prompt Cohere carefully and fall back on heuristics when parsing failed.
- Media handling: Combining audio overlays with video timelines isn’t trivial, but FFmpeg saved us here.
- Time pressure: 30 hours forced us to keep scope razor-thin.
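The heuristic fallback mentioned under parsing ambiguity could look roughly like the sketch below. Everything here is assumed for illustration: the keyword table, the `parse_command` and `heuristic_parse` names, and the dict shape of a parsed command are hypothetical, and `model_parse` is any callable standing in for the Cohere-backed parser. The idea is simply to trust the model when it returns a known action and to fall back on keyword matching otherwise.

```python
import re
from typing import Callable, Optional

# Hypothetical keyword table: action name -> trigger words.
ACTION_KEYWORDS = {
    "trim": ("trim", "cut", "remove", "delete"),
    "mute": ("mute",),
    "overlay": ("overlay", "add music", "add audio"),
}


def heuristic_parse(text: str) -> Optional[dict]:
    """Keyword matching, used when the model output is missing or unusable."""
    lowered = text.lower()
    for action, keywords in ACTION_KEYWORDS.items():
        for keyword in keywords:
            # Word boundaries avoid matching inside longer words.
            if re.search(rf"\b{re.escape(keyword)}\b", lowered):
                return {"action": action, "query": text.strip()}
    return None


def parse_command(text: str, model_parse: Callable[[str], object]) -> Optional[dict]:
    """Try the model first; fall back on heuristics if it fails or returns junk."""
    try:
        result = model_parse(text)
    except Exception:
        result = None
    if isinstance(result, dict) and result.get("action") in ACTION_KEYWORDS:
        return result
    return heuristic_parse(text)
```

If neither the model nor the keyword table recognizes the request, `parse_command` returns `None`, which the UI can treat as a prompt to ask the user to rephrase.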
What’s Next
- More robust effect libraries (glitch, transitions, auto-cuts).
- Real-time collaboration — multiple users editing the same video via chat.
- Scaling up beyond demos: distributed video rendering for fast results.
Final Thought
We set out to answer one question:
What if video editing was as simple as talking to your video?
This project is our first step toward that future.
Built With
- cohere
- css
- fastapi
- ffmpeg
- html
- javascript
- python
- twelvelabs
- windsurf
