doom-fish/videotoolbox-rs

videotoolbox

Safe Rust bindings for Apple's VideoToolbox framework — hardware-accelerated encode/decode, pixel transfer/rotation, multipass helpers, HDR metadata, motion estimation, RAW processing, and VTFrameProcessor pipelines on macOS.

Status: experimental, but the crate now covers the main public VideoToolbox surfaces used by the doom-fish stack. Objective-C-only APIs use a small Swift bridge behind the frame_processor feature, and executor-agnostic encode/decode/RAW-processing async helpers live in videotoolbox::async_api behind the async feature.

Features

  • Hardware-accelerated encoding + decoding: H.264, HEVC, and ProRes 422/4444
  • Pixel transfer / rotation / utilities: VTPixelTransferSession, VTPixelRotationSession, VTCreateCGImageFromCVPixelBuffer
  • Multipass + HDR helpers: VTFrameSilo, VTMultiPassStorage, VTHDRPerFrameMetadataGenerationSession
  • Advanced processing: VTFrameProcessor, VTMotionEstimationSession, VTRAWProcessingSession
  • Direct IOSurface input/output: zero-copy composition with apple-cf::iosurface
  • Builder pattern: fluent encoder configuration for bitrate, frame rate, keyframe interval, real-time mode, and profile level
  • Executor-agnostic async module: videotoolbox::async_api::{AsyncCompressionSession, AsyncDecompressionSession, AsyncRawProcessingSession} bridges one-shot frame callbacks to Futures and wraps RAW-parameter change notifications as a bounded async stream via doom-fish-utils
  • Mostly pure C bindings: optional Swift bridge only for Objective-C-only APIs
  • Minimal dependencies: apple-cf, plus optional apple-metal for VTFrameProcessor command-buffer integration
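
To illustrate what "bridges one-shot frame callbacks to Futures" can look like without tying the result to a particular executor, here is a minimal std-only sketch. It is not the crate's implementation: `oneshot`, `Completer`, `Completion`, `fake_encode`, `encode_async`, and `block_on` are all hypothetical names invented for this example, and the "encoder" is a thread that returns dummy bytes.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::{Arc, Mutex};
use std::task::{Context, Poll, Wake, Waker};
use std::thread;

// State shared between the callback side and the future side.
struct Shared<T> {
    value: Option<T>,
    waker: Option<Waker>,
}

/// Callback-side handle (hypothetical): consumed by the one-shot callback.
pub struct Completer<T>(Arc<Mutex<Shared<T>>>);
/// Future-side handle (hypothetical): resolves once `complete` is called.
/// Note: if the `Completer` is dropped unused, this sketch never resolves.
pub struct Completion<T>(Arc<Mutex<Shared<T>>>);

pub fn oneshot<T>() -> (Completer<T>, Completion<T>) {
    let shared = Arc::new(Mutex::new(Shared { value: None, waker: None }));
    (Completer(Arc::clone(&shared)), Completion(shared))
}

impl<T> Completer<T> {
    pub fn complete(self, value: T) {
        // Store the value, then wake outside the lock.
        let waker = {
            let mut shared = self.0.lock().unwrap();
            shared.value = Some(value);
            shared.waker.take()
        };
        if let Some(waker) = waker {
            waker.wake();
        }
    }
}

impl<T> Future for Completion<T> {
    type Output = T;
    fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<T> {
        let mut shared = self.0.lock().unwrap();
        match shared.value.take() {
            Some(value) => Poll::Ready(value),
            None => {
                // Remember the most recent waker so `complete` can wake us.
                shared.waker = Some(cx.waker().clone());
                Poll::Pending
            }
        }
    }
}

// Stand-in for a callback-based encode call; the real API is not shown here.
fn fake_encode(frame_index: u32, on_done: impl FnOnce(Vec<u8>) + Send + 'static) {
    thread::spawn(move || on_done(vec![frame_index as u8; 4]));
}

/// Wraps the callback API in a future that any executor can poll.
pub fn encode_async(frame_index: u32) -> Completion<Vec<u8>> {
    let (completer, completion) = oneshot();
    fake_encode(frame_index, move |bytes| completer.complete(bytes));
    completion
}

// Tiny executor used only to drive the demo: parks the thread until woken.
struct ThreadWaker(thread::Thread);
impl Wake for ThreadWaker {
    fn wake(self: Arc<Self>) {
        self.0.unpark();
    }
}

pub fn block_on<F: Future>(future: F) -> F::Output {
    let mut future = Box::pin(future);
    let waker = Waker::from(Arc::new(ThreadWaker(thread::current())));
    let mut cx = Context::from_waker(&waker);
    loop {
        match future.as_mut().poll(&mut cx) {
            Poll::Ready(value) => return value,
            Poll::Pending => thread::park(),
        }
    }
}

fn main() {
    let bytes = block_on(encode_async(7));
    println!("frame produced {} bytes", bytes.len());
}
```

Because the future only depends on `std::task::Waker`, the same wrapper could be awaited under any runtime; `block_on` is here only so the sketch runs on its own.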

Async notes

  • AsyncRawProcessingSession::parameter_changes(...) exposes VTRAWProcessingSessionSetParameterChangedHandler as a bounded async stream.
  • VTDecompressionSessionSetMultiImageCallback remains sync-only for now: the audited C API requires a non-null callback and exposes no clear / unsubscribe hook for an RAII async stream wrapper.
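
The two notes above hinge on the same point: an RAII stream wrapper needs a way to clear its handler when dropped. The sketch below illustrates that under stated assumptions. `FakeSession`, `Subscription`, and `Bounded` are hypothetical types written for this example, the consumer is synchronous for brevity, and the drop-oldest overflow policy is an assumption rather than the documented behavior of doom-fish-utils.

```rust
use std::collections::VecDeque;
use std::sync::{Arc, Mutex};

/// Hypothetical session whose handler can be set and also cleared (`None`).
#[derive(Default)]
struct FakeSession {
    handler: Mutex<Option<Box<dyn Fn(u32) + Send>>>,
}

impl FakeSession {
    fn set_handler(&self, handler: Option<Box<dyn Fn(u32) + Send>>) {
        *self.handler.lock().unwrap() = handler;
    }

    /// Simulates the framework delivering a parameter-change notification.
    fn notify(&self, change: u32) {
        if let Some(handler) = self.handler.lock().unwrap().as_ref() {
            handler(change);
        }
    }
}

/// Bounded buffer; assumed policy: drop the oldest item when full.
struct Bounded {
    items: VecDeque<u32>,
    capacity: usize,
}

/// RAII subscription: installs the handler on creation, clears it on drop.
struct Subscription<'a> {
    session: &'a FakeSession,
    queue: Arc<Mutex<Bounded>>,
}

impl<'a> Subscription<'a> {
    fn new(session: &'a FakeSession, capacity: usize) -> Self {
        assert!(capacity > 0, "capacity must be non-zero");
        let queue = Arc::new(Mutex::new(Bounded { items: VecDeque::new(), capacity }));
        let producer = Arc::clone(&queue);
        session.set_handler(Some(Box::new(move |change: u32| {
            let mut q = producer.lock().unwrap();
            if q.items.len() == q.capacity {
                q.items.pop_front(); // make room by discarding the oldest
            }
            q.items.push_back(change);
        })));
        Subscription { session, queue }
    }

    /// Non-blocking read of the next buffered notification, if any.
    fn try_next(&self) -> Option<u32> {
        self.queue.lock().unwrap().items.pop_front()
    }
}

impl Drop for Subscription<'_> {
    fn drop(&mut self) {
        // This is the step that needs a "clear" hook on the underlying API.
        self.session.set_handler(None);
    }
}

fn main() {
    let session = FakeSession::default();
    {
        let sub = Subscription::new(&session, 2);
        for change in 1..=3 {
            session.notify(change);
        }
        // With capacity 2, the first notification has been discarded.
        while let Some(change) = sub.try_next() {
            println!("change {change}");
        }
    }
    // The subscription was dropped, so this notification goes nowhere.
    session.notify(4);
}
```

An API that demands a non-null callback and offers no way to unset it cannot support the `Drop` step, which is the reason given above for keeping the multi-image callback sync-only.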

Why not bindgen?

The full VideoToolbox SDK surface is large, but the useful set for real macOS media pipelines is still small enough to hand-audit. Hand-writing those declarations gives us:

  • No build-time dependency on clang
  • Type-safe Rust enums for codec types (instead of raw u32 four-character codes)
  • Builder APIs that map ergonomically to VT's CFDictionary property bag
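
As a rough illustration of the second point, a type-safe codec enum can wrap the four-character codes that CoreMedia defines for `kCMVideoCodecType_*`. The enum below is a hypothetical stand-in written for this example; the crate's actual `Codec` type and its method names may differ.

```rust
/// Hypothetical codec enum; not necessarily the crate's own definition.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Codec {
    H264,
    Hevc,
    ProRes422,
    ProRes4444,
}

impl Codec {
    /// Big-endian four-character code, matching CoreMedia's constants.
    pub const fn fourcc(self) -> u32 {
        let tag: &[u8; 4] = match self {
            Codec::H264 => b"avc1",
            Codec::Hevc => b"hvc1",
            Codec::ProRes422 => b"apcn",
            Codec::ProRes4444 => b"ap4h",
        };
        u32::from_be_bytes(*tag)
    }

    /// Maps a raw code back to the enum, rejecting anything unknown.
    pub fn from_fourcc(raw: u32) -> Option<Self> {
        match &raw.to_be_bytes() {
            b"avc1" => Some(Codec::H264),
            b"hvc1" => Some(Codec::Hevc),
            b"apcn" => Some(Codec::ProRes422),
            b"ap4h" => Some(Codec::ProRes4444),
            _ => None,
        }
    }
}

fn main() {
    for codec in [Codec::H264, Codec::Hevc, Codec::ProRes422, Codec::ProRes4444] {
        println!("{:?} -> {:#010x}", codec, codec.fourcc());
    }
}
```

Callers then pass `Codec::H264` instead of a bare `u32`, so a mistyped code fails at compile time rather than surfacing as a runtime error from the framework.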

Requirements

  • macOS 13.0+
  • Apple Silicon or Intel Mac with hardware video encoder

Quick start

use videotoolbox::prelude::*;
use apple_cf::iosurface::IOSurface;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Allocate a 1920×1080 BGRA IOSurface.
    let surface = IOSurface::create(1920, 1080, u32::from_be_bytes(*b"BGRA"), 4)
        .ok_or("failed to allocate")?;

    // Build a real-time H.264 encoder.
    let encoder = CompressionSession::builder(1920, 1080, Codec::H264)
        .with_real_time(true)
        .with_average_bit_rate(8_000_000)
        .with_expected_frame_rate(60.0)
        .with_max_keyframe_interval(120)
        .build()?;

    // Encode one frame and inspect the resulting CMSampleBuffer.
    let encoded = encoder.encode(&surface, (0, 60))?;
    println!("Got {} bytes of H.264", encoded.data.len());

    if let Some(sb) = encoded.cm_sample_buffer() {
        // Hand `sb` straight to avassetwriter::Writer::append_sample for
        // zero-copy muxing — no raw pointer hand-off needed.
        let _ = sb.is_valid();
    }

    Ok(())
}
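
The numbers in the example above relate to each other in a simple way, sketched below. This assumes the `(0, 60)` tuple is a `(value, timescale)` pair in the style of a CMTime, which the README does not state; all helper names are hypothetical and none of them are part of the crate.

```rust
/// Assumption: a timestamp is `(value, timescale)`, like a CMTime.
type Timestamp = (i64, i32);

/// Timestamp for frame `index` when the timescale equals the frame rate.
fn frame_pts(index: i64, fps: i32) -> Timestamp {
    (index, fps)
}

/// Converts a rational timestamp to seconds.
fn pts_seconds(pts: Timestamp) -> f64 {
    pts.0 as f64 / f64::from(pts.1)
}

/// Longest gap between keyframes implied by a max keyframe interval.
fn keyframe_spacing_seconds(max_interval_frames: u32, fps: u32) -> f64 {
    f64::from(max_interval_frames) / f64::from(fps)
}

/// Average encoded frame size implied by an average bit rate (integer bytes).
fn average_frame_bytes(bit_rate: u64, fps: u64) -> u64 {
    bit_rate / 8 / fps
}

fn main() {
    // Same settings as the quick start: 60 fps, 8 Mbit/s, keyframe every 120 frames.
    let pts = frame_pts(90, 60);
    println!("frame 90 is shown at {} s", pts_seconds(pts));
    println!("keyframes at most {} s apart", keyframe_spacing_seconds(120, 60));
    println!("about {} bytes per frame on average", average_frame_bytes(8_000_000, 60));
}
```

Under that assumption, successive frames would be submitted with `(1, 60)`, `(2, 60)`, and so on.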

Composes with the rest of the doom-fish stack

screencapturekit-rs ──► IOSurface ──► videotoolbox-rs ──► H.264 bytes
                                              ↓
                                        avassetwriter-rs (future)
                                              ↓
                                          .mp4 file

Roadmap

  • VTCompressionSession (encoder)
  • VTDecompressionSession (decoder)
  • VTPixelTransferSession (pixel format / colour space conversion)
  • VTPixelRotationSession
  • VTMultiPassStorage + VTFrameSilo (two-pass encoding)
  • VTHDRPerFrameMetadataGenerationSession (Dolby Vision metadata)
  • VTFrameProcessor capability queries (super-resolution / optical flow detection)
  • VTFrameProcessor pipeline (super-resolution + motion blur + temporal noise + frame-rate conversion + optical flow + 2 low-latency variants)
  • VTMotionEstimationSession
  • VTRAWProcessingSession (with parameter introspection)
  • VTProfessionalVideoWorkflow decoder/encoder registration
  • VTCreateCGImageFromCVPixelBuffer
  • HEVC profile-level helpers
  • Executor-agnostic async encode/decode/RAW-processing module behind the async feature

License

Licensed under either of Apache-2.0 or MIT at your option.
