Here's a video that's played back in the space where it was captured
It's interactive: drag the viewer to adjust the viewpoint
Locating frames
I was hoping to use the telemetry data from my drone; it produces a text file with its location as it captures video. However, this doesn't include orientation or camera gimbal info, so I wasn't able to map it into a full pose.
So I decided to use COLMAP, a Structure-from-Motion tool that reconstructs a scene from a series of images. COLMAP stores the pose from which each image was captured, which I was able to use to align the video frames. As a bonus, this works for other video sources, not just drone footage.
I wrote some slightly scrappy code to extract and serialise the poses and points into a PLY file that I could load into a WebGL component. You can read some of the process (and see some Gaussian splats) on this Bluesky thread.
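COLMAP's text export (images.txt) lists a world-to-camera quaternion and translation for each registered image, which needs inverting to get the camera's pose in world space. A rough TypeScript sketch of that step (reconstructed for illustration, not my actual code):

import * as THREE from 'three';

interface ImagePose { name: string; position: THREE.Vector3; quat: THREE.Quaternion; }

// Each registered image in images.txt occupies two lines:
// "IMAGE_ID QW QX QY QZ TX TY TZ CAMERA_ID NAME", then a line of its 2D points.
function parseImagesTxt(text: string): ImagePose[] {
  const lines = text.split('\n').filter(l => l.trim() && !l.startsWith('#'));
  const poses: ImagePose[] = [];
  for (let i = 0; i < lines.length; i += 2) { // step by 2 to skip the 2D-point lines
    const [, qw, qx, qy, qz, tx, ty, tz, , name] = lines[i].trim().split(/\s+/);
    // COLMAP stores world-to-camera; invert to get the camera pose in world space.
    const q = new THREE.Quaternion(+qx, +qy, +qz, +qw);
    const t = new THREE.Vector3(+tx, +ty, +tz);
    const camQ = q.clone().invert();
    const camPos = t.clone().negate().applyQuaternion(camQ); // C = -R^T t
    poses.push({ name, position: camPos, quat: camQ });
  }
  return poses;
}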
Implementation
I'm pretty happy with how this is structured: it's a web component that wraps a <video /> element with links to the COLMAP data:
<pose-tracker poses="poses.ply" points="points.ply">
<video src="motocamp.mp4"></video>
</pose-tracker>
Internally the video element is hidden but still drives the playback of the component, which is some HTML controls and a WebGL canvas element.
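The wiring for that is roughly as follows (a minimal sketch, not the actual source; initViewer is a stand-in for the real rendering setup):

// Hypothetical sketch of the custom element.
declare function initViewer(canvas: HTMLCanvasElement, video: HTMLVideoElement,
                            posesUrl: string | null, pointsUrl: string | null): void;

class PoseTracker extends HTMLElement {
  connectedCallback() {
    const video = this.querySelector('video');
    if (!video) return;
    video.style.display = 'none'; // hidden, but still drives playback

    const canvas = document.createElement('canvas');
    this.appendChild(canvas);

    // Hand the video and the linked COLMAP exports over to the renderer.
    initViewer(canvas, video, this.getAttribute('poses'), this.getAttribute('points'));
  }
}
customElements.define('pose-tracker', PoseTracker);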
The canvas is rendered by three.js. The key trick is using a single 2D texture array to stash the historical video frames, with an instanced mesh that allows everything to be drawn together. My original approach for pushing frames was using a 2D canvas to write the pixels into an array buffer, but then I found WebGLArrayRenderTarget, which lets you populate texture arrays directly!
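Capturing a frame into a layer of that array looks roughly like this (a sketch with made-up sizes):

import * as THREE from 'three';

const SIZE = 512, MAX_FRAMES = 256; // made-up frame budget
const frames = new THREE.WebGLArrayRenderTarget(SIZE, SIZE, MAX_FRAMES);

// A fullscreen quad that blits the current video frame.
const videoEl = document.querySelector('video')!;
const blitScene = new THREE.Scene();
const blitCamera = new THREE.OrthographicCamera(-1, 1, 1, -1, 0, 1);
blitScene.add(new THREE.Mesh(
  new THREE.PlaneGeometry(2, 2),
  new THREE.MeshBasicMaterial({ map: new THREE.VideoTexture(videoEl) }),
));

function captureFrame(renderer: THREE.WebGLRenderer, layer: number) {
  renderer.setRenderTarget(frames, layer); // select which layer to write into
  renderer.render(blitScene, blitCamera);
  renderer.setRenderTarget(null);
}

The instanced mesh can then sample frames.texture as a sampler2DArray in its shader, with a per-instance attribute picking the layer.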
I didn't want/need every frame of the video, so I sampled it down (from 60 Hz to 2-5 Hz) and interpolated to find the position at a given timestamp. Orientation is quite straightforward in three.js, but for translation I was really happy when I found curve-interpolator.
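Put together, looking up a pose for a timestamp is something like this (a sketch; the time-to-arc-length mapping is hand-waved):

import * as THREE from 'three';
import { CurveInterpolator } from 'curve-interpolator';

interface PoseSample { t: number; pos: [number, number, number]; quat: THREE.Quaternion; }

// Built once from all sampled positions:
// const curve = new CurveInterpolator(samples.map(s => s.pos), { tension: 0.2 });

function poseAt(samples: PoseSample[], curve: CurveInterpolator, t: number) {
  // Find the samples bracketing the timestamp.
  let i = samples.findIndex(s => s.t >= t);
  if (i < 0) i = samples.length - 1; // past the end: clamp to the last segment
  if (i === 0) i = 1;
  const a = samples[i - 1], b = samples[i];
  const alpha = (t - a.t) / (b.t - a.t);

  // Orientation: slerp between the bracketing quaternions.
  const quat = a.quat.clone().slerp(b.quat, alpha);

  // Translation: evaluate the spline at the matching fraction of the path
  // (assumes roughly even spacing between samples).
  const [x, y, z] = curve.getPointAt((i - 1 + alpha) / (samples.length - 1));

  return { position: new THREE.Vector3(x, y, z), quat };
}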
Data sources & formats
Using structure from motion is cool, but you can potentially get richer data from sensors on the capture device.
For drones, the UZH-FPV Drone Racing Dataset is a great example of the sort of data that's available.
For capturing from a mobile device, WebXR Raw Camera Access could be an option for pose-aligned video.
GoPro cameras have a telemetry format (GPMF) which looks like it captures a bunch of metadata.
And for output formats, I enjoyed using PLY because it's so lightweight and flexible (it can be just a text file!). But if I was doing this properly I'd probably use something like MCAP to link everything together.
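For a flavour of that, a pose file can be as simple as this (an illustrative layout, not the exact schema I used):

ply
format ascii 1.0
element vertex 2
property float t
property float x
property float y
property float z
property float qw
property float qx
property float qy
property float qz
end_header
0.0 1.20 0.35 -3.40 0.98 0.0 0.17 0.0
0.5 1.22 0.36 -3.31 0.97 0.0 0.22 0.0

PLY lets you declare arbitrary named properties in the header, so timestamps and quaternions fit right in alongside positions.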
I've got a fairly shonky pipeline for processing videos now, so if there's something that you think would be interesting give me a shout and I'll run it through!