Video delivery at scale (v1.5): TGF-338 spike

Investigation and design output for TGF-338 — Spike: Video delivery infrastructure at scale (v1.5 prerequisite). Where the v1 spike (TGF-337, docs/video-poc/) answered "how do we serve the existing film at all", this spike answers "how do we serve it at scale, with ABR, analytics, and cross-catalog clip playback, without re-platforming".

The headline decision — extend the self-hosted R2 + Worker stack rather than move to Mux or Cloudflare Stream — is recorded as an ADR in the portal repo (ADRs live there, next to ADR-0008): ADR-0009 — v1.5 video delivery at scale.

Documents

Deliverable	Doc	Ticket questions answered
Build-vs-buy comparison (cost, DX, feature fit, ops)	comparison.md	#1, #2, #4, #5
Migration plan v1 → v1.5	migration-plan.md	#8
Backend design sketch (ingest, storage, signed URLs, clips)	backend-design.md	#3, #6, #7
ADR (v1.5 direction + rationale)	ADR-0009	decision

Follow-up implementation epic and stories are filed in Jira and linked from TGF-338.

One-paragraph summary

For a large, static, already-HLS-friendly catalog served from Cloudflare with free egress, self-hosted R2 + Worker delivery stays flat at ~$120/mo at every viewing volume, while Mux, Cloudflare Stream, and AWS CloudFront all start higher and grow with delivered minutes, so there is no volume at which any of them is cheaper. v1.5 therefore extends v1 with four additions: a one-time ABR transcode backfill; a session-scoped access token for cross-catalog browsing; a clip-manifest service that synthesizes playlists for arbitrary time ranges over the already-stored segments, with no re-encode and no new storage (unblocking TGF-339); and Mux Data added for player QoS. Managed Mux Video remains the documented escape hatch for the day user-generated uploads require a persistent transcode pipeline.
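To make the clip-manifest idea concrete, here is a minimal sketch of how a playlist for an arbitrary time range might be synthesized from segments that are already stored. It is not the design in backend-design.md. The `Segment` type, the `build_clip_manifest` function, and the segment paths are hypothetical. It assumes MPEG-TS segments whose durations are known from the source playlist (fMP4 would also need an `EXT-X-MAP` line), and it is written in Python for brevity, although the actual Worker would presumably be JavaScript or TypeScript.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    uri: str         # hypothetical key/URL of an already-stored segment
    duration: float  # seconds, taken from the source playlist's EXTINF


def build_clip_manifest(segments: list[Segment], start: float, end: float) -> str:
    """Build a VOD media playlist that covers [start, end) seconds of the asset.

    Only whole, existing segments are referenced, so nothing is re-encoded or
    copied. The clip is therefore segment-aligned; EXT-X-START hints at the
    exact in-point, and the player is assumed to stop at the out-point itself.
    """
    if not 0 <= start < end:
        raise ValueError("expected 0 <= start < end")

    picked: list[Segment] = []
    first_start = 0.0  # asset time at which the first picked segment begins
    cursor = 0.0       # running asset time while walking the segment list
    for seg in segments:
        seg_start, seg_end = cursor, cursor + seg.duration
        cursor = seg_end
        if seg_end <= start:
            continue   # segment ends before the clip begins
        if seg_start >= end:
            break      # segment begins after the clip ends
        if not picked:
            first_start = seg_start
        picked.append(seg)

    if not picked:
        raise ValueError("clip range lies outside the asset")

    target = math.ceil(max(seg.duration for seg in picked))
    lines = [
        "#EXTM3U",
        "#EXT-X-VERSION:3",  # version 3 permits floating-point EXTINF values
        f"#EXT-X-TARGETDURATION:{target}",
        "#EXT-X-PLAYLIST-TYPE:VOD",
        # Offset of the requested in-point from the start of this playlist.
        f"#EXT-X-START:TIME-OFFSET={start - first_start:.3f},PRECISE=YES",
    ]
    for seg in picked:
        lines.append(f"#EXTINF:{seg.duration:.3f},")
        lines.append(seg.uri)
    lines.append("#EXT-X-ENDLIST")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Ten hypothetical 6-second segments; request the clip from 13 s to 25 s.
    catalog = [Segment(f"film/720p/seg{i:05d}.ts", 6.0) for i in range(10)]
    print(build_clip_manifest(catalog, start=13.0, end=25.0))
```

Because the playlist only points at existing objects, the cost of a clip is one small text response. The trade-off in this sketch is that frame-accurate trimming is left to the player: `EXT-X-START` is a hint that some players ignore, and the out-point is not enforced by the manifest at all.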