Media Generation

Generate images, videos, and 3D models directly from chat conversations using hosted AI models from providers such as FAL.ai, OpenAI, and Replicate.

Multi-Modal Capabilities

ChatRAG supports text-to-image, image-to-image, text-to-video, image-to-video, and text-to-3D generation through multiple providers.

Image Generation

Enable Image Generation

NEXT_PUBLIC_IMAGE_GENERATION_ENABLED=true
IMAGE_GENERATION_PROVIDER=fal  # or "openai" or "replicate"
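As a rough illustration of how these two variables might be read on the server, the sketch below resolves the active image provider. The helper name `resolveImageProvider` and the fallback to `fal` when the variable is unset are assumptions for illustration, not ChatRAG's documented behavior.

```typescript
// Hypothetical helper; not part of ChatRAG's published API.
type ImageProvider = "fal" | "openai" | "replicate";
type Env = Record<string, string | undefined>;

const IMAGE_PROVIDERS: readonly ImageProvider[] = ["fal", "openai", "replicate"];

function resolveImageProvider(env: Env): ImageProvider | null {
  // The feature flag must be exactly "true" for generation to be enabled.
  if (env["NEXT_PUBLIC_IMAGE_GENERATION_ENABLED"] !== "true") return null;

  // Assumption: fall back to "fal" when IMAGE_GENERATION_PROVIDER is unset.
  const raw = (env["IMAGE_GENERATION_PROVIDER"] ?? "fal").trim().toLowerCase();
  const match = IMAGE_PROVIDERS.find((p) => p === raw);
  if (!match) {
    throw new Error(`Unsupported IMAGE_GENERATION_PROVIDER: ${raw}`);
  }
  return match;
}
```

Passing the environment as an argument (for example `resolveImageProvider(process.env)`) keeps the function easy to test and makes a typo in the provider name fail loudly at startup.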

FAL.ai (Recommended)

Fast, high-quality image generation with multiple models.

FAL_API_KEY=...
FAL_IMAGE_MODEL=fal-ai/bytedance/seedream/v4/text-to-image
FAL_IMAGE_TO_IMAGE_MODEL=fal-ai/bytedance/seedream/v4/edit

OpenAI (GPT Image)

High-quality, prompt-faithful generations.

OPENAI_API_KEY=...
OPENAI_IMAGE_MODEL=gpt-image-1

Replicate

Access to Stable Diffusion and other models

REPLICATE_API_TOKEN=...

Supported Features

• Text-to-image generation
• Image-to-image editing
• Style transfer
• Automatic storage in Supabase
• Inline display in chat

Video Generation

Enable Video Generation

NEXT_PUBLIC_VIDEO_GENERATION_ENABLED=true
VIDEO_GENERATION_PROVIDER=fal  # Default: FAL.ai
USE_REPLICATE_PROVIDER=false   # Set to true for Replicate

FAL.ai Video Models

State-of-the-art video generation

FAL_API_KEY=...
FAL_VIDEO_TEXT_MODEL=fal-ai/veo3/fast
FAL_VIDEO_IMAGE_MODEL=fal-ai/veo3/image-to-video

• Text-to-video
• Image-to-video animation
• Fast generation times
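The two FAL video variables map naturally onto the two request shapes listed above. The following sketch picks a model ID depending on whether the request carries a source image. The `VideoRequest` type and `selectFalVideoModel` function are hypothetical, and treating the model IDs shown above as fallbacks when a variable is unset is an assumption.

```typescript
type Env = Record<string, string | undefined>;

// Model IDs copied from the configuration above; using them as
// fallbacks when the variables are unset is an assumption.
const DEFAULT_TEXT_MODEL = "fal-ai/veo3/fast";
const DEFAULT_IMAGE_MODEL = "fal-ai/veo3/image-to-video";

// Hypothetical request shape for illustration only.
interface VideoRequest {
  prompt: string;
  imageUrl?: string; // present for "Animate this image" style requests
}

function selectFalVideoModel(env: Env, request: VideoRequest): string {
  if (request.imageUrl) {
    // Image-to-video: animate the supplied image.
    return env["FAL_VIDEO_IMAGE_MODEL"] ?? DEFAULT_IMAGE_MODEL;
  }
  // Text-to-video: generate from the prompt alone.
  return env["FAL_VIDEO_TEXT_MODEL"] ?? DEFAULT_TEXT_MODEL;
}
```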

Alternative Providers

Switch to Replicate for more options

• Runway ML models
• Luma AI
• Kling AI
• Custom model support

3D Model Generation

Enable 3D Generation

NEXT_PUBLIC_3D_GENERATION_ENABLED=true
FAL_3D_MODEL=fal-ai/trellis  # Default Trellis model
USE_REPLICATE_PROVIDER=false # Set to true to use the Replicate alternative

Trellis (FAL.ai)

High-quality 3D model generation from text

• Text-to-3D
• GLB/GLTF export
• Interactive 3D viewer
• Texture generation

Meshy AI (Alternative)

Professional-grade 3D models

REPLICATE_3D_MODEL=...

Storage Configuration

Generated media is automatically stored in Supabase Storage buckets:

chat-images

Stores all generated images from chat conversations

chat-videos

Stores generated video files with metadata

3d-models

Stores 3D model files (GLB/GLTF format)

Buckets are automatically created by the supabase/complete_setup.sql script with proper access policies.
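To make the bucket layout concrete, here is a minimal sketch that maps a media type to its bucket and builds an object path. The bucket names come from the list above. The `storageTarget` helper, the `chatId/fileName` path layout, and the sanitization rule are assumptions for illustration.

```typescript
type MediaKind = "image" | "video" | "model3d";

// Bucket names as listed in this section.
const BUCKETS: Record<MediaKind, string> = {
  image: "chat-images",
  video: "chat-videos",
  model3d: "3d-models",
};

// Replace anything outside a conservative character set with "_".
function sanitize(segment: string): string {
  return segment.replace(/[^A-Za-z0-9._-]/g, "_");
}

// Hypothetical path layout: <chatId>/<fileName> inside the bucket.
function storageTarget(
  kind: MediaKind,
  chatId: string,
  fileName: string
): { bucket: string; path: string } {
  return {
    bucket: BUCKETS[kind],
    path: `${sanitize(chatId)}/${sanitize(fileName)}`,
  };
}
```

The returned `bucket` and `path` could then be handed to whichever Supabase Storage upload call the application uses.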

Usage in Chat

Generate Images

Example prompts:

"Generate an image of a sunset over mountains"
"Create a logo for a tech startup"
"Draw a cartoon character with blue hair"

Generate Videos

Example prompts:

"Create a video of waves crashing on a beach"
"Animate this image [upload]"
"Generate a timelapse of a city at night"

Generate 3D Models

Example prompts:

"Create a 3D model of a coffee mug"
"Generate a 3D spaceship design"
"Make a 3D model of a modern chair"

Best Practices

Start with FAL.ai

FAL.ai offers the best balance of speed, quality, and cost for most use cases.

Monitor API Costs

Media generation can be expensive. Set up usage alerts in your provider dashboard.

Optimize Storage

Configure Supabase Storage lifecycle policies to archive or delete old media.

Use Detailed Prompts

More detailed prompts yield better results. Include style, mood, colors, and composition details.
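One way to make detailed prompts a habit is to assemble them from named parts. The sketch below is purely illustrative: `PromptDetails` and `buildPrompt` are hypothetical names, and the comma-separated phrasing is one reasonable convention rather than a format any provider requires.

```typescript
// Hypothetical structure covering the details recommended above.
interface PromptDetails {
  subject: string;
  style?: string;
  mood?: string;
  colors?: string[];
  composition?: string;
}

function buildPrompt(details: PromptDetails): string {
  const parts: string[] = [details.subject.trim()];
  if (details.style) parts.push(`${details.style} style`);
  if (details.mood) parts.push(`${details.mood} mood`);
  if (details.colors && details.colors.length > 0) {
    parts.push(`color palette: ${details.colors.join(", ")}`);
  }
  if (details.composition) parts.push(details.composition);
  // Join the non-empty parts into a single comma-separated prompt.
  return parts.join(", ");
}

const prompt = buildPrompt({
  subject: "a sunset over mountains",
  style: "watercolor",
  mood: "calm",
  colors: ["orange", "violet"],
  composition: "wide shot",
});
// prompt: "a sunset over mountains, watercolor style, calm mood, color palette: orange, violet, wide shot"
```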

Test Generation Times

Different models have different generation times. Test to find the right speed/quality balance.
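A small timing wrapper makes such comparisons repeatable. This is a minimal sketch: `timeGeneration` is a hypothetical helper, and the `run` callback stands in for whatever function in your code actually calls the provider.

```typescript
interface TimedResult<T> {
  label: string;
  ms: number; // wall-clock duration in milliseconds
  result: T;
}

// Wraps any async generation call and records how long it took.
async function timeGeneration<T>(
  label: string,
  run: () => Promise<T>
): Promise<TimedResult<T>> {
  const start = Date.now();
  const result = await run();
  return { label, ms: Date.now() - start, result };
}
```

Running the same prompt through two model IDs with this wrapper and comparing the `ms` values gives a quick first read on the speed side of the trade-off. Output quality still needs a visual check.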

Complete Configuration Reference

# Enable Features
NEXT_PUBLIC_IMAGE_GENERATION_ENABLED=true
NEXT_PUBLIC_VIDEO_GENERATION_ENABLED=true
NEXT_PUBLIC_3D_GENERATION_ENABLED=true

# Providers
IMAGE_GENERATION_PROVIDER=fal
VIDEO_GENERATION_PROVIDER=fal
USE_REPLICATE_PROVIDER=false

# FAL.ai Configuration
FAL_API_KEY=your_fal_api_key
FAL_IMAGE_MODEL=fal-ai/bytedance/seedream/v4/text-to-image
FAL_IMAGE_TO_IMAGE_MODEL=fal-ai/bytedance/seedream/v4/edit
FAL_VIDEO_TEXT_MODEL=fal-ai/veo3/fast
FAL_VIDEO_IMAGE_MODEL=fal-ai/veo3/image-to-video
FAL_3D_MODEL=fal-ai/trellis

# OpenAI (optional)
OPENAI_API_KEY=your_openai_key
OPENAI_IMAGE_MODEL=gpt-image-1

# Replicate (optional)
REPLICATE_API_TOKEN=your_replicate_token
REPLICATE_3D_MODEL=custom_model_id

# Storage (created automatically)
# Buckets: chat-images, chat-videos, 3d-models
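A startup check can catch a missing credential before the first generation request fails. The sketch below is illustrative only: `missingKeys` is a hypothetical helper, and the rules it encodes (which key each provider needs, and that video and 3D use Replicate when `USE_REPLICATE_PROVIDER=true` and FAL.ai otherwise) are assumptions inferred from the reference above.

```typescript
type Env = Record<string, string | undefined>;

function isOn(env: Env, key: string): boolean {
  return env[key] === "true";
}

// Returns the credential variables that are required but unset or empty.
function missingKeys(env: Env): string[] {
  const required = new Set<string>();

  if (isOn(env, "NEXT_PUBLIC_IMAGE_GENERATION_ENABLED")) {
    // Assumption: "fal" is used when no image provider is set.
    const provider = env["IMAGE_GENERATION_PROVIDER"] ?? "fal";
    if (provider === "openai") required.add("OPENAI_API_KEY");
    else if (provider === "replicate") required.add("REPLICATE_API_TOKEN");
    else required.add("FAL_API_KEY");
  }

  const videoOr3d =
    isOn(env, "NEXT_PUBLIC_VIDEO_GENERATION_ENABLED") ||
    isOn(env, "NEXT_PUBLIC_3D_GENERATION_ENABLED");
  if (videoOr3d) {
    // Assumption: this flag switches video and 3D between the two providers.
    required.add(isOn(env, "USE_REPLICATE_PROVIDER") ? "REPLICATE_API_TOKEN" : "FAL_API_KEY");
  }

  return Array.from(required).filter((key) => !env[key]);
}
```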
