Inspiration

We wanted to build an ad tool that feels less like forced placement and more like a creative editing assistant. A lot of digital ads interrupt the experience they are trying to monetize, so we asked a different question: what if a system could understand the context of a scene first, then generate an ad moment that actually fits?

That idea became CAFAI: Context-Aware Fused Ad Insertion. The goal was to make product placement feel native instead of awkward, and to prove that the same idea could work across both video and website media.

What it does

CAFAI is a creative generation workflow with two lanes:

1. Video ad insertion

A user uploads a source video and a product; the system analyzes the video for candidate insertion windows, ranks the best moment, generates a short branded bridge clip, and stitches that clip back into the original footage as a previewable final cut.
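The "rank the best moment" step can be sketched as a simple ordering over candidate windows. This is a minimal illustration, not CAFAI's actual code: the `Window` type, its fields, and the scoring are all assumptions standing in for whatever the scene analysis and ranking stages really produce.

```go
package main

import (
	"fmt"
	"sort"
)

// Window is a hypothetical candidate insertion window produced by
// scene analysis. Field names are illustrative assumptions.
type Window struct {
	StartSec, EndSec float64
	SceneScore       float64 // how well the scene suits the product
}

// rankWindows orders candidates best-first; ties break toward the
// earlier window so the ad lands as soon as a good moment exists.
func rankWindows(ws []Window) []Window {
	out := append([]Window(nil), ws...) // don't mutate the caller's slice
	sort.Slice(out, func(i, j int) bool {
		if out[i].SceneScore != out[j].SceneScore {
			return out[i].SceneScore > out[j].SceneScore
		}
		return out[i].StartSec < out[j].StartSec
	})
	return out
}

func main() {
	candidates := []Window{
		{StartSec: 3, EndSec: 6, SceneScore: 0.41},
		{StartSec: 12, EndSec: 15, SceneScore: 0.88},
		{StartSec: 27, EndSec: 30, SceneScore: 0.88},
	}
	best := rankWindows(candidates)[0]
	fmt.Printf("insert at %.0f-%.0fs\n", best.StartSec, best.EndSec)
}
```

In the real pipeline the score would come from the analysis and LLM-ranking stages rather than a single number, but the shape of the step is the same: many candidates in, one chosen window out.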

2. Website ad generation

A user provides product info plus article context, and the system generates a matching banner ad and vertical ad designed to fit the tone and topic of the page. Those assets can then be previewed inside example website layouts.
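The website lane boils down to turning one brief into one prompt per placement. The sketch below is purely illustrative: `AdBrief`, the field names, and the prompt wording are assumptions, and the real system would send these prompts to an image model rather than print them.

```go
package main

import "fmt"

// AdBrief is an assumed input shape for the website ad lane,
// combining product info with the article context it should match.
type AdBrief struct {
	Product string
	Tagline string
	Article string
}

// Size pairs a named placement with pixel dimensions.
type Size struct {
	Name          string
	Width, Height int
}

// prompts builds one image-generation prompt per placement, tying
// the product to the article so the ad fits the page it sits on.
func prompts(b AdBrief, sizes []Size) []string {
	out := make([]string, 0, len(sizes))
	for _, s := range sizes {
		out = append(out, fmt.Sprintf(
			"%dx%d %s ad for %s (%q), matching the tone of an article about: %s",
			s.Width, s.Height, s.Name, b.Product, b.Tagline, b.Article))
	}
	return out
}

func main() {
	b := AdBrief{Product: "TrailBrew Coffee", Tagline: "Fuel the climb", Article: "alpine hiking routes"}
	for _, p := range prompts(b, []Size{{"banner", 728, 90}, {"vertical", 300, 600}}) {
		fmt.Println(p)
	}
}
```

The two sizes shown (728x90 and 300x600) are standard IAB-style placements used here as stand-ins for whatever dimensions CAFAI actually targets.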

In short, CAFAI helps turn product promotion into something more contextual, more visual, and more watchable.

How we built it

We built CAFAI as a full-stack hackathon project with a custom frontend and backend pipeline.

Frontend

We built the user-facing product in React, with a playful pink voxel-inspired interface. The frontend includes:

  • a homepage and proof wall
  • a gallery for processed outputs
  • an upload flow for both video ads and website ads
  • a simple About page
  • review pages for generated results

Backend

We used Go for the control plane and API layer. The backend manages:

  • job creation and workflow stages
  • analysis and slot selection
  • generation requests
  • preview rendering
  • website ad asset generation and delivery
  • metadata storage through SQLite
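The workflow stages above can be modeled as a small state machine, which is roughly how a Go control plane would track a job between steps. This is a hedged sketch: the stage names and happy-path order are assumptions, not CAFAI's actual labels.

```go
package main

import "fmt"

// Stage is one step of the insertion workflow. The names below are
// illustrative; the real pipeline's stages may differ.
type Stage string

const (
	StageQueued     Stage = "queued"
	StageAnalyzing  Stage = "analyzing"
	StageRanking    Stage = "ranking"
	StageGenerating Stage = "generating"
	StageStitching  Stage = "stitching"
	StageDone       Stage = "done"
	StageFailed     Stage = "failed"
)

// next defines the happy-path order of stages.
var next = map[Stage]Stage{
	StageQueued:     StageAnalyzing,
	StageAnalyzing:  StageRanking,
	StageRanking:    StageGenerating,
	StageGenerating: StageStitching,
	StageStitching:  StageDone,
}

// advance moves a job forward one stage, or marks it failed when a
// step reports an error. Terminal stages stay where they are.
func advance(s Stage, ok bool) Stage {
	if !ok {
		return StageFailed
	}
	if n, found := next[s]; found {
		return n
	}
	return s
}

func main() {
	s := StageQueued
	for s != StageDone {
		s = advance(s, true)
		fmt.Println(s) // each transition would be persisted to SQLite
	}
}
```

Persisting the current stage per job (in CAFAI's case, to SQLite) is what lets the frontend show understandable progress even while the backend orchestrates slow external services.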

AI / media services

We connected multiple services depending on the task:

  • Azure Video Indexer for scene analysis
  • Azure OpenAI for ranking insertion slots and generating creative prompts
  • Higgsfield Kling as the primary video generation path
  • Azure ML as a fallback generation path
  • Hugging Face / Stable Diffusion XL for website ad image generation

We also used local file storage for previews and generated assets, plus optional Notion logging for job audit history.

Challenges we ran into

One of the biggest challenges was making the project feel like one product instead of a pile of disconnected AI features. Video insertion, static ad generation, previews, galleries, and job tracking all had different technical needs, but the experience still had to feel coherent.

We also ran into the practical challenge of building a pipeline that depends on multiple external services. Different providers have different inputs, speeds, and failure cases, so a lot of work went into keeping the workflow understandable even when the backend was doing complex orchestration.

Another challenge was proving that the generated result was actually believable. It was not enough to say “AI made a clip.” We needed a proof-oriented UI that showed the original scene, the selected insert window, the generated bridge, and the final stitched output.

Accomplishments that we're proud of

We are proud that CAFAI is not just a concept mockup. It is a working multi-step system with a real frontend, backend routes, stored assets, and demo outputs.

Some highlights we are especially proud of:

  • building both a video ad lane and a website ad lane in one project
  • creating a polished frontend with a strong visual identity
  • showing proof assets instead of just final outputs
  • supporting real generated website ads through the backend
  • stitching branded video moments back into source footage
  • making the whole project feel playful on the surface while still technically serious underneath

What we learned

We learned a lot about designing AI workflows as products, not just demos. The most important lesson was that orchestration matters as much as generation. Picking the right moment, structuring the pipeline, showing evidence, and handling state between steps are what make the experience feel useful.

We also learned how much presentation affects trust. A cleaner interface, clearer job states, and visible proof artifacts make users more willing to believe the output. On the engineering side, we got deeper experience with React, Go APIs, provider integration, media handling, and building around imperfect AI outputs.

What's next for CAFAI

Next, we want to make CAFAI more robust and more automatic.

Our next steps would be:

  • asynchronous website ad jobs with better progress tracking
  • stronger failover between providers
  • richer article ingestion from URLs instead of pasted text
  • more export options and packaged deliverables
  • better automation around review and approval
  • more examples, more products, and stronger production-readiness

Long term, we want CAFAI to become a real creative system for context-aware advertising across multiple formats, not just a one-off demo.

Built With

  • azure-blob-storage
  • azure-ml
  • azure-openai
  • azure-video-indexer
  • css
  • go
  • higgsfield-kling
  • hugging-face-inference-api
  • logging
  • mcp
  • notion
  • react
  • sqlite
  • stable-diffusion-xl
  • typescript
  • vite