Automating the Creative Process: Building Smarter Video Workflows

Apr 22, 2025 · 7 min · Automation, FFmpeg, Python, Video Processing, Productivity

Introduction

April 2025 was dedicated to a question that’s fascinated me for years: Can we automate creativity? While the artistic vision remains human, the repetitive technical tasks that surround video production are ripe for automation. This month, I built a comprehensive video processing pipeline that saved hours of manual work.

The Problem with Traditional Video Workflows

Anyone who’s edited videos knows the pain points:

  1. Format Conversions: Different platforms require different formats
  2. Optimization: Balancing quality and file size
  3. Thumbnails: Creating compelling preview images
  4. Organization: Managing hundreds of raw files
  5. Distribution: Uploading to multiple platforms

Doing this manually for every video is soul-crushing. Time to fix it.

Building the Automation Pipeline

Architecture Overview

Raw Footage → Intake → Process → Optimize → Distribute
                ↓         ↓          ↓           ↓
           Organize   Transcode   Compress    Upload
                              ↓
                        Generate Thumbnails

Phase 1: Intelligent Intake

The first challenge was organizing incoming footage intelligently:

python

import shutil
from pathlib import Path
from datetime import datetime
import exiftool  # pyexiftool
 
class VideoIntake:
    def __init__(self, source_dir, dest_dir):
        self.source = Path(source_dir)
        self.dest = Path(dest_dir)
        
    def organize_by_metadata(self):
        """Organize videos by date and camera"""
        for video in self.source.glob('**/*'):
            if not self.is_video(video):
                continue
                
            metadata = self.extract_metadata(video)
            
            # Create organized structure
            date = metadata.get('date', 'unknown')
            camera = metadata.get('camera', 'default')
            
            destination = self.dest / date / camera / video.name
            destination.parent.mkdir(parents=True, exist_ok=True)
            
            shutil.move(str(video), str(destination))
            
    def extract_metadata(self, file):
        """Extract creation date, camera info, etc."""
        with exiftool.ExifTool() as et:
            metadata = et.get_metadata(str(file))
            
        return {
            'date': self.parse_date(metadata),
            'camera': metadata.get('EXIF:Model', 'unknown'),
            'resolution': f"{metadata.get('ImageWidth')}x{metadata.get('ImageHeight')}",
            'duration': metadata.get('Duration', 0),
        }

    def is_video(self, path: Path) -> bool:
        """Cheap extension check so sidecar, audio and project files are skipped."""
        return path.is_file() and path.suffix.lower() in {'.mp4', '.mov', '.mkv', '.avi', '.mts'}

    def parse_date(self, metadata: dict) -> str:
        """Turn an ExifTool timestamp (YYYY:MM:DD HH:MM:SS) into a YYYY-MM-DD folder name."""
        raw = metadata.get('QuickTime:CreateDate') or metadata.get('EXIF:CreateDate') or ''
        try:
            return datetime.strptime(raw, '%Y:%m:%d %H:%M:%S').strftime('%Y-%m-%d')
        except ValueError:
            return 'unknown'
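
Kicking off an intake pass is then a couple of lines. A minimal sketch, assuming the class above is in scope; the paths are placeholders:

python

# Placeholder paths - point these at your own card dump and library root
intake = VideoIntake('/media/sd_card/DCIM', '/data/footage/raw')
intake.organize_by_metadata()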

Phase 2: Smart Transcoding

Not all videos need the same treatment. The system analyzes each video and applies appropriate settings:

python

import ffmpeg
from pathlib import Path
 
class SmartTranscoder:
    # Presets for different use cases
    PRESETS = {
        'web': {
            'codec': 'libx264',
            'preset': 'medium',
            'crf': 23,
            'maxrate': '5M',
        },
        'social': {
            'codec': 'libx264', 
            'preset': 'fast',
            'crf': 28,
            'maxrate': '2M',
        },
        'archive': {
            'codec': 'libx265',
            'preset': 'slow',
            'crf': 22,
            'maxrate': '10M',
        }
    }
    
    def transcode(self, input_file: Path, output_dir: Path, purpose: str = 'web'):
        """Intelligently transcode based on purpose"""
        preset = self.PRESETS[purpose]
        
        # Analyze input first
        probe = ffmpeg.probe(str(input_file))
        video_info = next(s for s in probe['streams'] if s['codec_type'] == 'video')
        
        # Determine if we need to scale
        width = int(video_info['width'])
        height = int(video_info['height'])
        
        output_file = output_dir / f"{input_file.stem}_{purpose}.mp4"
        
        stream = ffmpeg.input(str(input_file))
        
        # Apply scaling if needed
        if purpose == 'social' and width > 1920:
            stream = stream.filter('scale', 1920, -2)
        
        # Apply preset settings
        stream = ffmpeg.output(
            stream,
            str(output_file),
            vcodec=preset['codec'],
            preset=preset['preset'],
            crf=preset['crf'],
            maxrate=preset['maxrate'],
            bufsize='10M',
            acodec='aac',
            audio_bitrate='128k'
        )
        
        # Run the encode (blocks until ffmpeg finishes)
        stream.run(overwrite_output=True)
        
        return output_file
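
Using it is then one call per target. A rough sketch, assuming the class above is in scope; the source and output directories are placeholders:

python

from pathlib import Path

transcoder = SmartTranscoder()
output_dir = Path('/data/footage/encoded')  # placeholder
output_dir.mkdir(parents=True, exist_ok=True)

for clip in Path('/data/footage/raw').glob('**/*.mp4'):  # placeholder source
    for purpose in ('web', 'social'):
        transcoder.transcode(clip, output_dir, purpose=purpose)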

Phase 3: Thumbnail Generation

Thumbnails can make or break video engagement. The system finds interesting frames automatically:

python

import cv2
import numpy as np
from pathlib import Path
from typing import List
 
class ThumbnailGenerator:
    def __init__(self, video_path: Path):
        self.video = cv2.VideoCapture(str(video_path))
        self.total_frames = int(self.video.get(cv2.CAP_PROP_FRAME_COUNT))

    def sample_frames(self, num_candidates: int) -> List[np.ndarray]:
        """Pull evenly spaced frames from the video as thumbnail candidates."""
        frames = []
        for i in range(num_candidates):
            frame_idx = int(self.total_frames * (i + 0.5) / num_candidates)
            self.video.set(cv2.CAP_PROP_POS_FRAMES, frame_idx)
            ok, frame = self.video.read()
            if ok:
                frames.append(frame)
        return frames

    def find_best_frame(self, num_candidates: int = 30) -> np.ndarray:
        """Find the most interesting frame for thumbnail"""
        candidates = self.sample_frames(num_candidates)
        
        # Score frames based on:
        # 1. Face detection
        # 2. Color variance (avoid black frames)
        # 3. Sharpness (avoid blurry frames)
        # 4. Composition (rule of thirds)
        
        scored = [(frame, self.score_frame(frame)) for frame in candidates]
        best_frame = max(scored, key=lambda x: x[1])[0]
        
        return best_frame
        
    def score_frame(self, frame: np.ndarray) -> float:
        """Score a frame's suitability as thumbnail"""
        score = 0.0
        
        # Color variance (avoid blank/black frames)
        variance = np.var(frame)
        score += min(variance / 1000, 1.0) * 25
        
        # Sharpness (Laplacian variance)
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        sharpness = cv2.Laplacian(gray, cv2.CV_64F).var()
        score += min(sharpness / 500, 1.0) * 25
        
        # Face detection bonus
        face_cascade = cv2.CascadeClassifier(
            cv2.data.haarcascades + 'haarcascade_frontalface_default.xml'
        )
        faces = face_cascade.detectMultiScale(gray, 1.3, 5)
        score += len(faces) * 30
        
        # Composition (check rule of thirds intersections)
        height, width = frame.shape[:2]
        thirds_points = [
            (width // 3, height // 3),
            (width * 2 // 3, height // 3),
            (width // 3, height * 2 // 3),
            (width * 2 // 3, height * 2 // 3),
        ]
        
        # Check for interesting content near thirds intersections
        for x, y in thirds_points:
            region = frame[y-50:y+50, x-50:x+50]
            if region.size > 0:
                score += np.var(region) / 1000
        
        return score
        
    def generate_thumbnail(self, output_path: Path, width: int = 1280):
        """Generate optimized thumbnail"""
        best_frame = self.find_best_frame()
        
        # Resize maintaining aspect ratio
        height = int(best_frame.shape[0] * (width / best_frame.shape[1]))
        resized = cv2.resize(best_frame, (width, height))
        
        # Enhance colors slightly for better appeal
        enhanced = cv2.convertScaleAbs(resized, alpha=1.1, beta=10)
        
        cv2.imwrite(str(output_path), enhanced, [cv2.IMWRITE_JPEG_QUALITY, 90])
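
Generating a thumbnail for a finished encode then looks like this. A small sketch with a placeholder path, assuming the class above is in scope:

python

from pathlib import Path

video = Path('/data/footage/encoded/intro_web.mp4')  # placeholder
thumbnail = video.with_suffix('.jpg')

ThumbnailGenerator(video).generate_thumbnail(thumbnail, width=1280)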

Phase 4: Distribution Automation

Finally, the pipeline pushes the finished video out to each platform automatically:

python

from pathlib import Path
from typing import Dict

from google.oauth2 import service_account
from googleapiclient.discovery import build
from googleapiclient.http import MediaFileUpload
import boto3
 
class VideoDistributor:
    def __init__(self):
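        # init_youtube(), upload_to_s3() and backup_to_archive() wrap the
        # service-account auth, the S3 put and the archive copy (not shown here)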
        self.youtube = self.init_youtube()
        self.s3 = boto3.client('s3')
        
    def distribute(self, video: Path, thumbnail: Path, metadata: Dict):
        """Distribute video to all platforms"""
        results = {}
        
        # Upload to YouTube
        results['youtube'] = self.upload_to_youtube(video, thumbnail, metadata)
        
        # Upload to S3 for website
        results['cdn'] = self.upload_to_s3(video)
        
        # Copy to backup storage
        results['backup'] = self.backup_to_archive(video)
        
        return results
        
    def upload_to_youtube(self, video: Path, thumbnail: Path, metadata: Dict):
        """Upload to YouTube with metadata"""
        body = {
            'snippet': {
                'title': metadata['title'],
                'description': metadata['description'],
                'tags': metadata['tags'],
                'categoryId': '22',  # People & Blogs
            },
            'status': {
                'privacyStatus': 'private',  # Start private
            }
        }
        
        media = MediaFileUpload(str(video), chunksize=-1, resumable=True)
        
        request = self.youtube.videos().insert(
            part='snippet,status',
            body=body,
            media_body=media
        )
        
        response = request.execute()
        video_id = response['id']
        
        # Set thumbnail
        self.youtube.thumbnails().set(
            videoId=video_id,
            media_body=MediaFileUpload(str(thumbnail))
        ).execute()
        
        return {'video_id': video_id, 'url': f'https://youtube.com/watch?v={video_id}'}
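
With all four phases in place, a thin orchestrator ties them back to the diagram at the top. A simplified sketch, assuming the classes above are importable; the metadata dict comes from wherever you keep titles and tags, and error handling is omitted:

python

from pathlib import Path

def process_video(raw_clip: Path, work_dir: Path, metadata: dict) -> dict:
    """Run one clip through transcode -> thumbnail -> distribute."""
    transcoder = SmartTranscoder()
    web_file = transcoder.transcode(raw_clip, work_dir, purpose='web')

    thumbnail = work_dir / f"{raw_clip.stem}_thumb.jpg"
    ThumbnailGenerator(web_file).generate_thumbnail(thumbnail)

    distributor = VideoDistributor()
    return distributor.distribute(web_file, thumbnail, metadata)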

Results and Impact

Time Savings

Before automation:

  • Video processing: 2-3 hours per video
  • Manual format conversions: 30 minutes
  • Thumbnail creation: 15-20 minutes
  • Upload and distribution: 30 minutes
  • Total: 3-4 hours per video

After automation:

  • Setup and initiate: 5 minutes
  • Automated processing: runs in the background
  • Manual review: 10 minutes
  • Total: 15 minutes of active work

Quality Improvements

  • Consistent Output: Same settings applied reliably
  • Better Thumbnails: Algorithm consistently finds engaging frames
  • Optimized Files: Intelligent compression based on content
  • No Human Error: No more forgetting to export at the right resolution

Website Development Integration

This automation pipeline integrates seamlessly with my website:

typescript

---
// Astro component to display videos
import { getVideoMetadata } from '@/utils/videos';
 
const videos = await getVideoMetadata();
---
 
<div class="video-grid">
  {videos.map(video => (
    <div class="video-card">
      <img 
        src={video.thumbnail} 
        alt={video.title}
        loading="lazy"
      />
      <h3>{video.title}</h3>
      <p>{video.description}</p>
      <a href={video.url}>Watch →</a>
    </div>
  ))}
</div>
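
On the pipeline side, the integration is just a manifest file that the site's getVideoMetadata() helper reads. A rough sketch of the export step in Python; the manifest location and the example entry are assumptions made to match the component above:

python

import json
from pathlib import Path

def export_manifest(entries: list[dict], manifest: Path) -> None:
    """Write the video list consumed by the website's video grid."""
    manifest.parent.mkdir(parents=True, exist_ok=True)
    manifest.write_text(json.dumps(entries, indent=2))

# Hypothetical entry, appended after a successful distribute() call
export_manifest(
    [{
        'title': 'April build log',
        'description': 'Automating the video pipeline',
        'thumbnail': '/thumbnails/april-build-log.jpg',
        'url': 'https://youtube.com/watch?v=VIDEO_ID',
    }],
    Path('src/data/videos.json'),  # placeholder location in the Astro project
)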

Lessons Learned

1. Automation Isn’t Always Faster (Initially)

Building this system took weeks. But the ROI is clear: every video going forward saves 3+ hours.

2. Let Machines Do Machine Work

Humans are terrible at:

  • Consistent file naming
  • Remembering optimal export settings
  • Not getting distracted during 20-minute exports

Machines excel at these.

3. The 80/20 Rule Applies

20% of automation effort provides 80% of value:

  • Automatic transcoding
  • Thumbnail generation
  • Organized file structure

The remaining 80% of the effort goes into advanced features that deliver the last 20% of value, and much of it is optional.

What’s Next

May Plans

  1. AI-Powered Editing: Exploring automatic scene detection and editing
  2. Voice Transcription: Automatic captions and searchable content
  3. Content Analysis: Understanding what makes videos perform well

Future Enhancements

  • Parallel Processing: Use all CPU cores for faster encoding (see the sketch after this list)
  • Cloud Integration: Offload heavy processing to cloud GPUs
  • Analytics Integration: Track which automated choices perform best
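
For the parallel-processing idea, the plan is roughly a process pool around SmartTranscoder. A sketch of what I have in mind, not a final design; the worker count and paths are placeholders:

python

from concurrent.futures import ProcessPoolExecutor
from pathlib import Path

def encode_one(clip: Path) -> Path:
    """Worker: transcode a single clip with the web preset."""
    return SmartTranscoder().transcode(clip, Path('/data/footage/encoded'), purpose='web')

clips = sorted(Path('/data/footage/raw').glob('**/*.mp4'))  # placeholder source
with ProcessPoolExecutor(max_workers=4) as pool:            # tune to your machine
    results = list(pool.map(encode_one, clips))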

Open Source

I’m planning to open-source the core pipeline once I clean it up. If you’re interested in video automation, stay tuned!


What repetitive tasks are you automating? Share your automation wins!
