Video AI Indexing

Video AI indexing enables your AI agent to search and answer questions about video content. Videos are automatically analyzed, transcribed, and indexed for semantic search with timestamp context.

How It Works

Video Upload → Processing Pipeline → AI Knowledge Base
     │                │                     │
     │                ▼                     │
     │         ┌─────────────┐             │
     │         │ Segmentation│             │
     │         │ (60s chunks)│             │
     │         └─────────────┘             │
     │                │                     │
     │                ▼                     │
     │         ┌─────────────┐             │
      │         │Transcription│             │
     │         │ (Audio→Text)│             │
     │         └─────────────┘             │
     │                │                     │
     │                ▼                     │
     │         ┌─────────────┐             │
     │         │ AI Analysis │             │
     │         │ (Content)   │             │
     │         └─────────────┘             │
     │                │                     │
     │                ▼                     │
     └────────► Embeddings ────────────────┘

  1. Segmentation - Video split into time-based chunks (default: 60 seconds)
  2. Transcription - Audio converted to text for each segment
  3. AI Analysis - Visual and audio content analyzed
  4. Embedding - Segments embedded for semantic search
  5. Indexing - Content added to knowledge base with timestamps
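
Conceptually, each processed segment becomes one searchable record in the knowledge base. A minimal sketch of what such a record might hold (the field names here are illustrative, not the actual storage schema):

```javascript
// Hypothetical shape of one indexed segment (illustrative only).
const segment = {
  videoId: 'vid_123',
  index: 12,                  // 13th segment of the video
  startTime: 720,             // segment start in seconds (12:00)
  endTime: 780,               // segment end in seconds (13:00)
  transcript: 'Next, wrap the API call in a try-catch block...',
  analysis: 'Instructor demonstrates error handling in code.',
  embedding: []               // vector used for semantic search
};

// Resolving a timestamp to a segment is then a simple range check.
const contains = (seg, t) => t >= seg.startTime && t < seg.endTime;
console.log(contains(segment, 730)); // true
```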

Enabling Video AI

From Media Library

  1. Go to the CMS → Media tab
  2. Upload a video or select an existing one
  3. In the file details panel, find AI Indexing
  4. Toggle Index for AI to ON

Configuration Options

Option | Description | Default
------ | ----------- | -------
Segment Duration | Length of each chunk in seconds | 60
Detect Techniques | Enable domain-specific recognition | On
Extract Audio | Transcribe speech to text | On
Digestion Instructions | Custom context for AI analysis | None

Starting Processing

  1. Configure your options
  2. Click Start Processing
  3. Monitor progress in the status indicator

Processing Status

Status Indicators

Status | Description
------ | -----------
Not Indexed | Video not processed
Queued | Waiting to start
Uploading | Sent to processor
Processing | Analyzing content
Completed | Ready for AI search
Failed | Error occurred

Progress Tracking

While processing:

  • Progress bar - Overall completion (0-100%)
  • Stage - Current processing step
  • Estimated time - Approximate time remaining

Processing Stages

  1. Queued - Job created, waiting in queue
  2. Uploading - Video sent to processing service
  3. Segmenting - Breaking video into chunks
  4. Transcribing - Converting audio to text
  5. Analyzing - AI analyzing each segment
  6. Embedding - Creating vector embeddings
  7. Complete - Indexed and searchable

Configuration Details

Segment Duration

Controls how the video is chunked:

Duration | Best For
-------- | --------
30 seconds | Fast-paced content, frequent topic changes
60 seconds | General purpose (recommended)
120 seconds | Lectures, long-form discussions
300 seconds | Minimal segmentation

Shorter segments give more precise timestamps but take longer to process.
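
The tradeoff can be quantified: segment count grows inversely with segment duration, and timestamp accuracy is roughly half a segment's length (a match can fall anywhere within a segment). A quick sketch:

```javascript
// Estimate segment count and timestamp accuracy for a given configuration.
function estimateIndexing(videoDurationSec, segmentDurationSec = 60) {
  return {
    segments: Math.ceil(videoDurationSec / segmentDurationSec),
    // A match can land anywhere inside a segment, so accuracy is ~half its length.
    timestampAccuracySec: segmentDurationSec / 2
  };
}

console.log(estimateIndexing(1800, 60)); // { segments: 30, timestampAccuracySec: 30 }
console.log(estimateIndexing(1800, 30)); // { segments: 60, timestampAccuracySec: 15 }
```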

Detect Techniques

When enabled, AI identifies domain-specific elements:

  • Educational - Topics, concepts, examples
  • Technical - Code, diagrams, demonstrations
  • Product - Features, comparisons, use cases

Techniques appear in the processing results.

Extract Audio

When enabled:

  • Speech is transcribed to text
  • Transcription indexed for search
  • AI can quote or reference speech

When disabled:

  • Only visual content is analyzed
  • Faster processing
  • Use for music, silent content, etc.

Digestion Instructions

Provide context to improve AI analysis:

Example instructions:

"This is a cooking tutorial. Focus on ingredients,
techniques, and timing. Note any safety warnings."

"This is a software demo. Identify features shown,
keyboard shortcuts mentioned, and tips shared."

"This is an interview. Track who is speaking and
summarize key points made by each person."

Good instructions help AI:

  • Focus on relevant details
  • Use appropriate terminology
  • Extract the most useful information
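
Instructions travel with the processing configuration. A sketch of a request payload carrying digestion instructions, mirroring the `analysisConfig` shape used by `fig1.knowledge.processVideo` in the API Integration section (the URL and file name are placeholders):

```javascript
// Sketch: a processing request that carries digestion instructions.
const request = {
  videoUrl: 'https://cdn.fig1.ai/videos/interview.mp4', // placeholder URL
  fileName: 'interview.mp4',
  duration: 2400, // seconds
  analysisConfig: {
    segmentDuration: 60,
    extractAudio: true,
    digestionInstructions:
      'This is an interview. Track who is speaking and ' +
      'summarize key points made by each person.'
  }
};

// Pass `request` to fig1.knowledge.processVideo(...) as shown in API Integration.
console.log(request.analysisConfig.digestionInstructions);
```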

Querying Video Content

Automatic Integration

Once indexed, video content is automatically searched when users ask questions:

User: "What did the tutorial cover about error handling?"

Agent: "In the tutorial, error handling is covered starting at
       timestamp 12:30. The key points were:
       1. Always wrap API calls in try-catch blocks
       2. Log errors with context for debugging
       3. Show user-friendly messages to customers"

Timestamp References

AI can reference specific moments:

  • "At 5:30 in the video..."
  • "The section starting at 15:00 discusses..."
  • "Between 10:00-12:00, you can see..."
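
If you render these references in your own UI (for example, to make them clickable), you will need the same seconds-to-display conversion. A small helper:

```javascript
// Convert a timestamp in seconds to the m:ss / h:mm:ss style used in replies.
function formatTimestamp(totalSeconds) {
  const s = Math.floor(totalSeconds % 60);
  const m = Math.floor(totalSeconds / 60) % 60;
  const h = Math.floor(totalSeconds / 3600);
  const pad = (n) => String(n).padStart(2, '0');
  return h > 0 ? `${h}:${pad(m)}:${pad(s)}` : `${m}:${pad(s)}`;
}

console.log(formatTimestamp(330));  // "5:30"
console.log(formatTimestamp(3900)); // "1:05:00"
```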

Video Context in Chat

When users are watching a video, provide the video context in your chat request:

const response = await fetch('/api/sdk/agent/chat', {
  method: 'POST',
  headers: {
    'Content-Type': 'application/json',
    'X-Fig1-API-Key': 'your-api-key'
  },
  body: JSON.stringify({
    message: "What technique is being shown right now?",
    context: {
      video: {
        videoId: 'video-id',           // ID from video processing
        currentTimestamp: 330,          // Playback position in seconds (5:30)
        title: 'React Hooks Tutorial',  // Optional: video title for context
        duration: 1800                  // Optional: total duration (30 min)
      }
    }
  })
});

The AI agent will automatically:

  1. Prioritize video content near the current timestamp
  2. Reference specific timestamps in responses
  3. Include play_video actions to suggest jumping to relevant moments

Full Integration Example

Here's a complete React integration for a video training app:

import { useState, useRef } from 'react';

function VideoTrainingChat({ exercise }) {
  const videoRef = useRef<HTMLVideoElement>(null);
  const [sessionId, setSessionId] = useState<string | null>(null);

  const sendMessage = async (message: string) => {
    const video = videoRef.current;

    const response = await fetch('/api/chat', {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({
        message,
        sessionId,
        context: {
          // Link to the CMS content being viewed
          page: `/exercises/${exercise.slug}`,
          contentIds: [exercise._id],

          // Video playback context
          video: {
            videoId: exercise.videoId,
            currentTimestamp: video?.currentTime || 0,
            title: exercise.title,
            duration: video?.duration || 0
          },

          // Optional: user subscription for access control
          user: {
            isAuthenticated: true,
            subscriptions: ['premium']
          }
        }
      })
    });

    const data = await response.json();

    if (data.success) {
      setSessionId(data.data.sessionId);

      // Handle video navigation actions
      if (data.data.actions) {
        for (const action of data.data.actions) {
          if (action.type === 'play_video' && action.payload.timestamp != null) {
            // HTMLVideoElement has no seekTo(); set currentTime to seek
            if (video) video.currentTime = action.payload.timestamp;
          }
        }
      }

      return data.data.message;
    }
  };

  return (
    <div>
      <video ref={videoRef} src={exercise.videoUrl} />
      {/* Chat UI component */}
    </div>
  );
}

Key Context Fields

Field | Type | Description
----- | ---- | -----------
video.videoId | string | Required. Must match the ID from video processing
video.currentTimestamp | number | Current playback position in seconds
video.title | string | Video title for AI context
video.duration | number | Total duration for progress context
contentIds | string[] | CMS content IDs (exercise, lesson, etc.)
page | string | Current page/route identifier

Example Conversation

User watching at 0:40:

"What technique is being shown here?"

AI Response:

At 0:40, the instructor is demonstrating hand placement for the rear naked choke.
Key points:
- Wrap the choking arm around the neck
- Align bicep against one side of the throat
- Keep elbow directly under the chin

Would you like me to explain the finishing squeeze shown at 0:55?

With action:

{
  "actions": [{
    "type": "play_video",
    "payload": { "videoId": "vid_123", "timestamp": 55 },
    "label": "Jump to finishing technique"
  }]
}

Managing Indexed Videos

Viewing Results

After processing completes:

  1. Select the video in Media Library
  2. View the AI Index section
  3. See:
    • Total segments created
    • Techniques detected
    • Transcript length
    • Processing date

Re-indexing

To re-process a video:

  1. Select the video
  2. Click Re-index
  3. Optionally change settings
  4. Start processing

The previous index is replaced when processing completes.

Removing from Index

To stop AI from using a video:

  1. Select the video
  2. Toggle Index for AI to OFF
  3. Confirm removal

The video file is preserved; only the AI index is removed.

API Integration

Manage video processing via SDK:

// Start video processing
const job = await fig1.knowledge.processVideo({
  videoUrl: 'https://cdn.fig1.ai/videos/tutorial.mp4',
  fileName: 'tutorial.mp4',
  duration: 3600,
  analysisConfig: {
    segmentDuration: 60,
    detectTechniques: true,
    extractAudio: true,
    digestionInstructions: 'This is a coding tutorial...'
  }
});

console.log('Job started:', job.jobId);

// Check processing status
const status = await fig1.knowledge.getVideoJobStatus(job.jobId);
console.log(`Progress: ${status.progress}%`);

// List all video jobs
const jobs = await fig1.knowledge.listVideoJobs({
  status: 'processing'
});

// Wait for completion
const result = await fig1.knowledge.waitForVideoProcessing(job.jobId, {
  pollInterval: 5000,
  onProgress: (s) => console.log(`${s.progress}% - ${s.stage}`)
});

if (result.status === 'completed') {
  console.log(`Indexed ${result.result.totalSegments} segments`);
}

See the Knowledge API Reference for complete documentation.

Processing Limits

Tier | Max Duration | Max Size | Concurrent Jobs
---- | ------------ | -------- | ---------------
Starter | 60 minutes | 500 MB | 2
Pro | 180 minutes | 2 GB | 5
Enterprise | Unlimited | Custom | Custom

Processing time varies by video length and complexity.
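
A client-side pre-check against these limits can fail fast before uploading. A sketch: the helper below is hypothetical, with limits mirrored from the table above (adjust if your plan differs):

```javascript
// Pre-flight check mirroring the tier limits table (illustrative helper).
const TIER_LIMITS = {
  starter:    { maxDurationSec: 60 * 60,  maxSizeBytes: 500 * 1024 ** 2 },
  pro:        { maxDurationSec: 180 * 60, maxSizeBytes: 2 * 1024 ** 3 },
  enterprise: { maxDurationSec: Infinity, maxSizeBytes: Infinity }
};

function checkVideoLimits(tier, durationSec, sizeBytes) {
  const limits = TIER_LIMITS[tier];
  const errors = [];
  if (durationSec > limits.maxDurationSec) errors.push('duration exceeds tier limit');
  if (sizeBytes > limits.maxSizeBytes) errors.push('file size exceeds tier limit');
  return { ok: errors.length === 0, errors };
}

console.log(checkVideoLimits('starter', 90 * 60, 100 * 1024 ** 2));
// { ok: false, errors: [ 'duration exceeds tier limit' ] }
```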

Supported Formats

Format | Extension | Notes
------ | --------- | -----
MP4 | .mp4 | Recommended
WebM | .webm | Full support
MOV | .mov | QuickTime
AVI | .avi | Basic support

For best results, use MP4 with H.264 video and AAC audio.

Best Practices

Choose Videos Wisely

Index videos that users will actually ask questions about:

Good candidates:

  • Training and tutorials
  • Product demonstrations
  • Educational lectures
  • How-to guides
  • Webinar recordings

Poor candidates:

  • Background music
  • Ambiguous b-roll footage
  • Very short clips
  • Content already covered in text

Optimize Segment Duration

Content Type | Recommended Duration
------------ | --------------------
Fast tutorials | 30-45 seconds
General content | 60 seconds
Lectures | 90-120 seconds
Slow discussions | 120+ seconds

Write Good Instructions

Good:
"Technical tutorial about React hooks. Focus on code
examples, common mistakes, and best practices mentioned."

Bad:
"This is a video"

Handle Long Videos

For very long videos (1+ hours):

  1. Consider breaking into chapters
  2. Use longer segment durations
  3. Expect longer processing times
  4. Index during off-peak hours

Troubleshooting

Processing Failed

Common causes:

  • Unsupported format - Convert to MP4
  • Corrupted file - Re-upload the video
  • Too long - Check tier limits
  • No audio track - Disable audio extraction
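
After addressing the cause, you can re-submit the job. A sketch using the SDK calls from the API Integration section; note the `status.error` field and the automatic-retry pattern are assumptions for illustration, not confirmed by this page:

```javascript
// Sketch: inspect a failed job and retry with audio extraction disabled.
// Assumes a configured `fig1` client; `status.error` is an assumed field.
async function retryIfFailed(fig1, jobId, originalRequest) {
  const status = await fig1.knowledge.getVideoJobStatus(jobId);
  if (status.status !== 'failed') return status;

  console.warn('Processing failed:', status.error);
  // Common fixes: convert to MP4, re-upload, shorten the video,
  // or (as here) disable audio extraction when there is no audio track.
  return fig1.knowledge.processVideo({
    ...originalRequest,
    analysisConfig: { ...originalRequest.analysisConfig, extractAudio: false }
  });
}
```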

Poor Search Results

Improve results by:

  • Adding digestion instructions
  • Using shorter segments
  • Ensuring audio is clear
  • Checking transcript quality

Slow Processing

Processing time depends on:

  • Video length
  • Resolution
  • Audio complexity
  • Server load

Long videos may take 10-30 minutes.

Timestamps Inaccurate

Timestamp precision depends on segment duration:

  • 60s segments = ±30s accuracy
  • 30s segments = ±15s accuracy

Use shorter segments for time-sensitive content.

Cost Considerations

Video processing uses more resources than text:

  • Processing costs scale with video length
  • Storage required for embeddings
  • Query costs slightly higher

Consider indexing only high-value videos.

Next Steps