Frontend (Next.js) + backend (Flask) for adaptive post‑production product placement.
frontend/ # Next.js app
backend/ # Flask API + prompt logic
Frontend:
cd frontend
npm install
npm run dev
Backend:
cd backend
pip install -r requirements.txt
python main.py
The backend uses the Gemini API (via GOOGLE_API_KEY) to generate contextual
text and drive the video workflow.
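Since the Gemini calls depend on GOOGLE_API_KEY, it can be useful to fail fast when the variable is missing. The following is a minimal sketch, not the backend's actual startup code: the helper name `require_api_key` is hypothetical, and only the variable name GOOGLE_API_KEY is taken from this README.

```python
import os


def require_api_key(env=None, name="GOOGLE_API_KEY"):
    """Return the API key from the environment or raise a clear error.

    Hypothetical helper for illustration; not part of the backend.
    """
    # Allow a plain dict to be passed in so the check is easy to test.
    env = os.environ if env is None else env
    key = (env.get(name) or "").strip()
    if not key:
        raise RuntimeError(
            f"{name} is not set; export it before running `python main.py`."
        )
    return key
```

Calling `require_api_key()` before the server starts would surface a missing key as one readable error rather than a failure on the first request.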
The Flask server listens on http://localhost:8000 by default and exposes three endpoints:
- Returns 200 with {"status":"healthy"} when the API is up, and 500 on internal errors.
- Targeted by the frontend “Process items” button.
- Request: multipart/form-data with the following fields:
  - video: required video file (mp4, avi, mov, mkv).
  - image: required reference image (png, jpg, jpeg, webp).
  - text: optional textual instructions/prompts.
- Response: binary stream of the generated video (Content-Type inferred from the file). The Gemini summary is echoed in the X-Generated-Text header when available. Errors return JSON {"error": "..."} with status 400 (bad input) or 500 (processing failure).
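A caller can avoid an avoidable 400 by checking file names against the accepted extensions before uploading. The sketch below is a client-side pre-check only, written under assumptions: the function name `check_upload_names` is hypothetical, the extension sets are copied from the request description above, and case-insensitive matching is a guess about how the server behaves.

```python
from pathlib import Path

# Extensions listed in the request description above.
VIDEO_EXTENSIONS = {"mp4", "avi", "mov", "mkv"}
IMAGE_EXTENSIONS = {"png", "jpg", "jpeg", "webp"}


def check_upload_names(video_name, image_name):
    """Return a list of problems; an empty list means both names look fine."""
    problems = []
    for label, name, allowed in (
        ("video", video_name, VIDEO_EXTENSIONS),
        ("image", image_name, IMAGE_EXTENSIONS),
    ):
        if not name:
            # Both files are required fields of the request.
            problems.append(f"{label} is required")
            continue
        # Assumption: extensions are compared case-insensitively.
        ext = Path(name).suffix.lower().lstrip(".")
        if ext not in allowed:
            problems.append(
                f"{label} has unsupported extension {ext!r}; "
                f"expected one of {sorted(allowed)}"
            )
    return problems
```

An empty list means the names pass this local check; the server remains the authority and may still answer 400 for other reasons.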
Example:
curl -o output.mp4 \
-H "Accept: video/mp4" \
-F "video=@/path/input.mp4" \
-F "image=@/path/reference.jpg" \
-F "text=Swap the soda can for sparkling water" \
http://localhost:8000/analyze

- Same payload as /analyze.
- Response JSON: { "video": "<base64-encoded binary>", "text": "<Gemini narrative>", "filename": "processed_<uuid>.mp4" }
- Errors mirror /analyze (400/500 with {"error": "..."}).
Temporary uploads and generated outputs are deleted automatically once the response is sent, so callers should persist the response locally if needed.
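Because the server keeps nothing after responding, a client that uses the JSON variant has to decode and store the video itself. The following is a rough sketch of that step, assuming the response body has the shape shown above; the function name `save_json_response` and the `out_dir` parameter are hypothetical, and the HTTP call that produces `body` is left out.

```python
import base64
import json
from pathlib import Path


def save_json_response(body, out_dir="."):
    """Decode a JSON response body and write the video to disk.

    Returns (path_to_video, generated_text). Raises RuntimeError when the
    body carries an {"error": "..."} payload instead of a video.
    """
    payload = json.loads(body)
    if "error" in payload:
        raise RuntimeError(payload["error"])
    # Keep only the final path component so a server-supplied name
    # cannot point outside out_dir.
    name = Path(payload["filename"]).name
    target = Path(out_dir) / name
    target.write_bytes(base64.b64decode(payload["video"]))
    return target, payload.get("text", "")
```

The returned path points at the locally persisted copy, and the second value carries the Gemini narrative from the `text` field, or an empty string when that field is absent.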