---
url: /guide/extensions/mp3-encoder.md
---
# @mediabunny/mp3-encoder
Browsers typically have no support for MP3 encoding in their WebCodecs implementations. Given the ubiquity of the format, this extension package provides an MP3 encoder for use with Mediabunny. It is implemented using Mediabunny's [custom coder API](../supported-formats-and-codecs#custom-coders) and uses a highly performant WASM build of the [LAME MP3 Encoder](https://lame.sourceforge.io/) under the hood.
## Installation
This library peer-depends on Mediabunny. Install both using npm:
```bash
npm install mediabunny @mediabunny/mp3-encoder
```
Alternatively, directly include them using a script tag:
```html
<!-- File names are illustrative; use the built distribution files from the releases page -->
<script src="mediabunny.cjs"></script>
<script src="mediabunny-mp3-encoder.cjs"></script>
```
This will expose the global objects `Mediabunny` and `MediabunnyMp3Encoder`. Use `mediabunny-mp3-encoder.d.ts` to provide types for these globals. You can download the built distribution files from the [releases page](https://github.com/Vanilagy/mediabunny/releases).
## Usage
```ts
import { registerMp3Encoder } from '@mediabunny/mp3-encoder';
registerMp3Encoder();
```
That's it - Mediabunny now uses the registered MP3 encoder automatically.
If you want to be more correct, check for native browser support first:
```ts
import { canEncodeAudio } from 'mediabunny';
import { registerMp3Encoder } from '@mediabunny/mp3-encoder';
if (!(await canEncodeAudio('mp3'))) {
registerMp3Encoder();
}
```
## Example
Here, we convert an input file to an MP3:
```ts
import {
Input,
ALL_FORMATS,
BlobSource,
Output,
BufferTarget,
Mp3OutputFormat,
canEncodeAudio,
Conversion,
} from 'mediabunny';
import { registerMp3Encoder } from '@mediabunny/mp3-encoder';
if (!(await canEncodeAudio('mp3'))) {
// Only register the custom encoder if there's no native support
registerMp3Encoder();
}
const input = new Input({
source: new BlobSource(file), // From a file picker, for example
formats: ALL_FORMATS,
});
const output = new Output({
format: new Mp3OutputFormat(),
target: new BufferTarget(),
});
const conversion = await Conversion.init({
input,
output,
});
await conversion.execute();
output.target.buffer; // => ArrayBuffer containing the MP3 file
```
## Implementation details
This library implements an MP3 encoder by registering a custom encoder class with Mediabunny. This class, when initialized, spawns a worker which then immediately loads a WASM build of the LAME MP3 encoder. Then, raw data is sent to the worker and encoded data is received from it. These encoded chunks are then concatenated in the main thread and properly split into separate MP3 frames.
Great care was put into ensuring maximum compatibility of this package; it works with bundlers, directly in the browser, as well as in Node, Deno, and Bun. All code (including worker & WASM) is bundled into a single file, eliminating the need for CDNs or WASM path arguments. This package therefore serves as a reference implementation of WASM-based encoder extensions for Mediabunny.
The WASM build itself is a performance-optimized, SIMD-enabled build of LAME 3.100, with all unneeded features disabled. Because maximum performance was the priority, the build is slightly bigger, but ~130 kB gzipped is still very reasonable in my opinion. In my tests, it encodes 5 seconds of audio in ~90 milliseconds (55x real-time speed).
---
---
url: /guide/converting-media-files.md
---
# Converting media files
The [reading](./reading-media-files) and [writing](./writing-media-files) primitives in Mediabunny provide everything you need to convert media files. However, since this is such a common operation and the details can be tricky, Mediabunny ships with a built-in file conversion abstraction.
It has the following features:
* Transmuxing (changing the container format)
* Transcoding (changing a track's codec)
* Track removal
* Compression
* Trimming
* Video resizing & fitting
* Video rotation
* Video frame rate adjustment
* Audio resampling
* Audio up/downmixing
The conversion API was built to be simple, versatile and extremely performant.
## Basic usage
### Running a conversion
Each conversion process is represented by an instance of `Conversion`. Create a new instance using `Conversion.init(...)`, then run the conversion using `.execute()`.
Here, we're converting to WebM:
```ts
import {
Input,
Output,
WebMOutputFormat,
BufferTarget,
Conversion,
} from 'mediabunny';
const input = new Input({ ... });
const output = new Output({
format: new WebMOutputFormat(),
target: new BufferTarget(),
});
const conversion = await Conversion.init({ input, output });
await conversion.execute();
// output.target.buffer contains the final file
```
That's it! A `Conversion` simply takes an instance of `Input` and `Output`, then reads the data from the input and writes it to the output. If you're unfamiliar with [`Input`](./reading-media-files) and [`Output`](./writing-media-files), check out their respective guides.
::: info
The `Output` passed to the `Conversion` must be *fresh*; that is, it must have no added tracks and be in the `'pending'` state (not started yet).
:::
Unconfigured, the conversion process handles all the details automatically, such as:
* Copying media data whenever possible, otherwise transcoding it
* Dropping tracks that aren't supported in the output format
You should consider inspecting the [discarded tracks](#discarded-tracks) before executing a `Conversion`.
### Monitoring progress
To monitor the progress of a `Conversion`, set its `onProgress` property *before* calling `execute`:
```ts
const conversion = await Conversion.init({ input, output });
conversion.onProgress = (progress: number) => {
// `progress` is a number between 0 and 1 (inclusive)
};
await conversion.execute();
```
This callback is called each time the progress of the conversion advances.
::: warning
A progress of `1` doesn't indicate the conversion has finished; the conversion is only finished once the promise returned by `.execute()` resolves.
:::
::: warning
Tracking conversion progress can slightly affect performance as it requires knowledge of the input file's total duration. This is usually negligible but should be avoided when using append-only input sources such as [`ReadableStreamSource`](./reading-media-files#readablestreamsource).
:::
If you want to monitor the output size of the conversion (in bytes), simply use the `onwrite` callback on your `Target`:
```ts
let currentFileSize = 0;
output.target.onwrite = (start, end) => {
currentFileSize = Math.max(currentFileSize, end);
};
```
### Canceling a conversion
Sometimes, you may want to cancel an ongoing conversion process. For this, use the `cancel` method:
```ts
await conversion.cancel(); // Resolves once the conversion is canceled
```
This automatically frees up all resources used by the conversion process.
## Video options
You can set the `video` property in the conversion options to configure the converter's behavior for video tracks. The options are:
```ts
type ConversionVideoOptions = {
discard?: boolean;
width?: number;
height?: number;
fit?: 'fill' | 'contain' | 'cover';
rotate?: 0 | 90 | 180 | 270;
frameRate?: number;
codec?: VideoCodec;
bitrate?: number | Quality;
forceTranscode?: boolean;
};
```
For example, here we resize the video track to 720p:
```ts
const conversion = await Conversion.init({
input,
output,
video: {
width: 1280,
height: 720,
fit: 'contain',
},
});
```
::: info
The provided configuration will apply equally to all video tracks of the input. If you want to apply a separate configuration to each video track, check [track-specific options](#track-specific-options).
:::
### Discarding video
If you want to get rid of the video track, use `discard: true`.
### Resizing/rotating video
The `width`, `height` and `fit` properties control how the video is resized. If only `width` or `height` is provided, the other value is deduced automatically to preserve the video's original aspect ratio. If both are used, `fit` must be set to control the fitting algorithm:
* `'fill'` will stretch the image to fill the entire box, potentially altering aspect ratio.
* `'contain'` will contain the entire image within the box while preserving aspect ratio. This may lead to letterboxing.
* `'cover'` will scale the image until the entire box is filled, while preserving aspect ratio.
`rotate` rotates the video by the specified number of degrees clockwise. This rotation is applied on top of any rotation metadata in the original input file.
If `width` or `height` is used in conjunction with `rotate`, they control the post-rotation dimensions.
If you want to apply max/min constraints to a video's dimensions, check out [track-specific options](#track-specific-options).
In the rare case that the input video changes size over time, the `fit` field can be used to control the size change behavior (see [`VideoEncodingConfig`](./media-sources#video-encoding-config)). When unset, the behavior is `'passThrough'`.
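For example, here's a sketch that rotates the video by 90° and then fits the rotated result into a 1920×1080 box:
```ts
const conversion = await Conversion.init({
  input,
  output,
  video: {
    rotate: 90, // Applied before resizing
    width: 1920, // Post-rotation width
    height: 1080, // Post-rotation height
    fit: 'contain',
  },
});
```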
### Adjusting frame rate
The `frameRate` property can be used to set the frame rate of the output video in Hz. If not specified, the original input frame rate will be used (which may be variable).
### Transcoding video
Use the `codec` property to control the codec of the output track. This should be set to a [codec](./supported-formats-and-codecs#video-codecs) supported by the output file, or else the track will be [discarded](#discarded-tracks).
Use the `bitrate` property to control the bitrate of the output video. For example, you can use this field to compress the video track. Accepted values are the number of bits per second or a [subjective quality](./media-sources#subjective-qualities). If this property is set, transcoding will always happen. If this property is not set but transcoding is still required, `QUALITY_HIGH` will be used as the value.
If you want to prevent direct copying of media data and force a transcoding step, use `forceTranscode: true`.
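For example, here's a sketch that transcodes the video track to VP9 at a medium subjective quality (assuming the output format supports VP9):
```ts
import { Conversion, QUALITY_MEDIUM } from 'mediabunny';

const conversion = await Conversion.init({
  input,
  output,
  video: {
    codec: 'vp9',
    bitrate: QUALITY_MEDIUM, // Or a number in bits per second, e.g. 2e6
  },
});
```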
## Audio options
You can set the `audio` property in the conversion options to configure the converter's behavior for audio tracks. The options are:
```ts
type ConversionAudioOptions = {
discard?: boolean;
codec?: AudioCodec;
bitrate?: number | Quality;
numberOfChannels?: number;
sampleRate?: number;
forceTranscode?: boolean;
};
```
For example, here we convert the audio track to mono and set a specific sample rate:
```ts
const conversion = await Conversion.init({
input,
output,
audio: {
numberOfChannels: 1,
sampleRate: 48000,
},
});
```
::: info
The provided configuration will apply equally to all audio tracks of the input. If you want to apply a separate configuration to each audio track, check [track-specific options](#track-specific-options).
:::
### Discarding audio
If you want to get rid of the audio track, use `discard: true`.
### Resampling audio
The `numberOfChannels` property controls the channel count of the output audio (e.g., 1 for mono, 2 for stereo). If this value differs from the number of channels in the input track, Mediabunny will perform up/downmixing of the channel data using [the same algorithm as the Web Audio API](https://developer.mozilla.org/en-US/docs/Web/API/Web_Audio_API/Basic_concepts_behind_Web_Audio_API#audio_channels).
The `sampleRate` property controls the sample rate in Hz (e.g., 44100, 48000). If this value differs from the input track's sample rate, Mediabunny will resample the audio.
### Transcoding audio
Use the `codec` property to control the codec of the output track. This should be set to a [codec](./supported-formats-and-codecs#audio-codecs) supported by the output file, or else the track will be [discarded](#discarded-tracks).
Use the `bitrate` property to control the bitrate of the output audio. For example, you can use this field to compress the audio track. Accepted values are the number of bits per second or a [subjective quality](./media-sources#subjective-qualities). If this property is set, transcoding will always happen. If this property is not set but transcoding is still required, `QUALITY_HIGH` will be used as the value.
If you want to prevent direct copying of media data and force a transcoding step, use `forceTranscode: true`.
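Analogously, here's a sketch that compresses the audio track to 64 kbps Opus (assuming the output format supports Opus):
```ts
const conversion = await Conversion.init({
  input,
  output,
  audio: {
    codec: 'opus',
    bitrate: 64e3, // Bits per second
  },
});
```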
## Track-specific options
You may want to configure your video and audio options differently depending on the specifics of the input track. Or, in case a media file has multiple video or audio tracks, you may want to discard only specific tracks or configure each track separately.
For this, instead of passing an object for `video` and `audio`, you can instead pass a function:
```ts
const conversion = await Conversion.init({
input,
output,
// Function gets invoked for each video track:
video: (videoTrack, n) => {
if (n > 1) {
// Keep only the first video track
return { discard: true };
}
return {
// Shrink width to 640 only if the track is wider
width: Math.min(videoTrack.displayWidth, 640),
};
},
// Async functions work too:
audio: async (audioTrack, n) => {
if (audioTrack.languageCode !== 'rus') {
// Keep only Russian audio tracks
return { discard: true };
}
return {
codec: 'aac',
};
},
});
```
For documentation about the properties of video and audio tracks, refer to [Reading track metadata](./reading-media-files#reading-track-metadata).
## Trimming
Use the `trim` property in the conversion options to extract only a section of the input file into the output file:
```ts
type ConversionOptions = {
// ...
trim?: {
start: number; // in seconds
end: number; // in seconds
};
// ...
};
```
For example, here we extract a clip from 10s to 25s:
```ts
const conversion = await Conversion.init({
input,
output,
trim: {
start: 10,
end: 25,
},
});
```
In this case, the output will be 15 seconds long.
If only `start` is set, the clip will run until the end of the input file. If only `end` is set, the clip will start at the beginning of the input file.
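For example, here's a sketch that skips the first 30 seconds and keeps the rest:
```ts
const conversion = await Conversion.init({
  input,
  output,
  trim: {
    start: 30, // No `end`: the clip runs until the end of the input file
  },
});
```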
## Metadata tags
By default, any [descriptive metadata tags](../api/MetadataTags.md) of the input will be copied to the output. If you want to further control the metadata tags written to the output, you can use the `tags` option:
```ts
// Set your own metadata:
const conversion = await Conversion.init({
// ...
tags: () => ({
title: 're:Turning',
artist: 'Alexander Panos',
}),
// ...
});
// Or, augment the input's metadata:
const conversion = await Conversion.init({
// ...
tags: inputTags => ({
...inputTags, // Keep the existing metadata
images: [{ // And add cover art
data: new Uint8Array(...),
mimeType: 'image/jpeg',
kind: 'coverFront',
}],
comment: undefined, // And remove any comments
}),
// ...
});
// Or, remove all metadata
const conversion = await Conversion.init({
// ...
tags: () => ({}),
// ...
});
```
## Discarded tracks
If an input track is excluded from the output file, it is considered *discarded*. The list of discarded tracks can be accessed after initializing a `Conversion`:
```ts
const conversion = await Conversion.init({ input, output });
conversion.discardedTracks; // => DiscardedTrack[]
type DiscardedTrack = {
// The track that was discarded
track: InputTrack;
// The reason for discarding the track
reason:
| 'discarded_by_user'
| 'max_track_count_reached'
| 'max_track_count_of_type_reached'
| 'unknown_source_codec'
| 'undecodable_source_codec'
| 'no_encodable_target_codec';
};
```
Since you can inspect this list before executing a `Conversion`, this gives you the option to decide if you still want to move forward with the conversion process.
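For example, here's a minimal sketch that only proceeds when no track is dropped unexpectedly:
```ts
const conversion = await Conversion.init({ input, output });

const unexpected = conversion.discardedTracks.filter(
  (discarded) => discarded.reason !== 'discarded_by_user',
);

if (unexpected.length === 0) {
  await conversion.execute();
} else {
  console.warn('Conversion aborted; these tracks would be dropped:', unexpected);
}
```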
***
The following reasons exist:
* `discarded_by_user`\
You discarded this track by setting `discard: true`.
* `max_track_count_reached`\
The output had no more room for another track.
* `max_track_count_of_type_reached`\
The output had no more room for another track of this type, or the output doesn't support this track type at all.
* `unknown_source_codec`\
We don't know the codec of the input track and therefore don't know what to do with it.
* `undecodable_source_codec`\
The input track's codec is known, but we are unable to decode it.
* `no_encodable_target_codec`\
We can't find a codec that we are able to encode and that can be contained within the output format. This reason can be hit if the environment doesn't support the necessary encoders, or if you requested a codec that cannot be contained within the output format.
***
On the flip side, you can always query which input tracks made it into the output:
```ts
const conversion = await Conversion.init({ input, output });
conversion.utilizedTracks; // => InputTrack[]
```
---
---
url: /guide/input-formats.md
---
# Input formats
Mediabunny supports a wide variety of commonly used container formats for reading input files. These *input formats* are used in two ways:
* When creating an `Input`, they are used to specify the list of supported container formats. See [Creating a new input](./reading-media-files#creating-a-new-input) for more.
* Given an existing `Input`, its `getFormat` method returns the *actual* format of the file as an `InputFormat`.
## Input format properties
Retrieve the full written name of the format like this:
```ts
inputFormat.name; // => 'MP4'
```
You can also retrieve the format's base MIME type:
```ts
inputFormat.mimeType; // => 'video/mp4'
```
If you want a file's full MIME type, which depends on track codecs, use [`getMimeType`](./reading-media-files#reading-file-metadata) on `Input` instead.
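For example (the exact string depends on the file's tracks):
```ts
await input.getMimeType(); // e.g. => 'video/mp4; codecs="avc1.42c032, mp4a.40.2"'
```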
## Input format singletons
Since input formats don't require any additional configuration, each input format is directly available as an exported singleton instance:
```ts
import {
MP4, // MP4 input format singleton
QTFF, // QuickTime File Format input format singleton
MATROSKA, // Matroska input format singleton
WEBM, // WebM input format singleton
MP3, // MP3 input format singleton
WAVE, // WAVE input format singleton
OGG, // Ogg input format singleton
} from 'mediabunny';
```
You can use these singletons when creating an input:
```ts
import { Input, MP3, WAVE, OGG } from 'mediabunny';
const input = new Input({
formats: [MP3, WAVE, OGG],
// ...
});
```
You can also use them for checking the actual format of an `Input`:
```ts
import { MP3 } from 'mediabunny';
const isMp3 = (await input.getFormat()) === MP3;
```
There is a special `ALL_FORMATS` constant exported by Mediabunny which contains every input format singleton. Use this constant if you want to support as many formats as possible:
```ts
import { Input, ALL_FORMATS } from 'mediabunny';
const input = new Input({
formats: ALL_FORMATS,
// ...
});
```
::: info
Using `ALL_FORMATS` means [demuxers](https://en.wikipedia.org/wiki/Demultiplexer_\(media_file\)) for all formats must be included in the bundle, which can increase the bundle size significantly. Use it only if you need to support all formats.
:::
## Input format class hierarchy
In addition to singletons, input format classes are structured hierarchically:
* `InputFormat` (abstract)
* `IsobmffInputFormat` (abstract)
* `Mp4InputFormat`
* `QuickTimeInputFormat`
* `MatroskaInputFormat`
* `WebMInputFormat`
* `Mp3InputFormat`
* `WaveInputFormat`
* `OggInputFormat`
This means you can also perform input format checks using `instanceof` instead of `===` comparisons. For example:
```ts
import { Mp3InputFormat, MatroskaInputFormat, IsobmffInputFormat } from 'mediabunny';
// Check if the file is MP3:
(await input.getFormat()) instanceof Mp3InputFormat;
// Check if the file is Matroska (MKV + WebM):
(await input.getFormat()) instanceof MatroskaInputFormat;
// Check if the file is MP4 or QuickTime:
(await input.getFormat()) instanceof IsobmffInputFormat;
```
::: info
Well, actually 🤓☝️, the QuickTime File Format is technically not an instance of the ISO Base Media File Format (ISOBMFF) - instead, ISOBMFF is a standard originally inspired by QTFF. However, as the two are extremely similar and are used in the same way, we consider QTFF an instance of `IsobmffInputFormat` for convenience.
:::
---
---
url: /guide/installation.md
---
# Installation
Install Mediabunny using your favorite package manager:
::: code-group
```bash [npm]
npm install mediabunny
```
```bash [yarn]
yarn add mediabunny
```
```bash [pnpm]
pnpm add mediabunny
```
```bash [bun]
bun add mediabunny
```
:::
::: info
Mediabunny requires a JavaScript environment that can run ECMAScript 2021 or later and is expected to run in all modern browsers. For types, TypeScript 5.7 or later is required.
:::
Then, simply import it like this:
```ts
import { ... } from 'mediabunny'; // ESM
const { ... } = require('mediabunny'); // or CommonJS
```
ESM is preferred because it gives you tree shaking.
You can also just include the library using a script tag in your HTML:
```html
<!-- File name is illustrative; use a built distribution file from the releases page -->
<script src="mediabunny.cjs"></script>
```
This will add a `Mediabunny` object to the global scope. You can provide types for this global using `mediabunny.d.ts`.
You can download a built distribution file from the [releases page](https://github.com/Vanilagy/mediabunny/releases). Use the `*.cjs` builds for normal script tag inclusion, or the `*.mjs` builds for script tags with `type="module"` or direct imports via ESM. Including the `mediabunny.d.ts` declaration file in your TypeScript project will declare a global `Mediabunny` namespace.
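For example, with the script tag build loaded and `mediabunny.d.ts` included, the library's exports are available on the `Mediabunny` global (a minimal sketch):
```ts
// Destructure what you need from the global namespace
const { Input, Output, ALL_FORMATS } = Mediabunny;
```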
---
---
url: /guide/introduction.md
---
# Introduction
Mediabunny is a JavaScript library for reading, writing, and converting media files (like MP4 or WebM), directly in the browser. It aims to be a complete toolkit for high-performance media operations on the web. It's written from scratch in pure TypeScript, has zero dependencies, and is extremely tree-shakable, meaning you only include what you use. You can think of it a bit like [FFmpeg](https://ffmpeg.org/), but built for the web's needs.
## Features
Here's a long list of stuff this library does:
* Reading metadata from media files
* Extracting media data from media files
* Creating new media files
* Converting media files
* Hardware-accelerated decoding & encoding (via the WebCodecs API)
* Support for multiple video, audio and subtitle tracks
* Read & write support for many container formats (.mp4, .mov, .webm, .mkv, .mp3, .wav, .ogg, .aac), including variations such as MP4 with Fast Start, fragmented MP4, or streamable Matroska
* Support for 25 different codecs
* Lazy, optimized, on-demand file reading
* Input and output streaming, arbitrary file size support
* File location independence (memory, disk, network, ...)
* Utilities for compression, resizing, rotation, resampling, trimming
* Transmuxing and transcoding
* Microsecond-accurate reading and writing precision
* Efficient seeking through time
* Pipelined design for efficient hardware usage and automatic backpressure
* Custom encoder & decoder support for polyfilling
* Low- & high-level abstractions for different use cases
* Performant everything
* Node.js support
...and there's probably more.
## Use cases
Mediabunny is a general-purpose toolkit and can be used in infinitely many ways. But, here are a few ideas:
* File conversion & compression
* Displaying file metadata (duration, dimensions, ...)
* Extracting thumbnails
* Creating videos in the browser
* Building a video editor
* Live recording & streaming
* Efficient, sample-accurate playback of large files via the Web Audio API
Check out the [Examples](/examples) page for demo implementations of many of these ideas!
## Getting started
To get going with Mediabunny, here are some starting points:
* Check out [Quick start](./quick-start) for a collection of useful code snippets
* Start with [Reading media files](./reading-media-files) if you want to do read operations.
* Start with [Writing media files](./writing-media-files) if you want to do write operations.
* Start with [Converting media files](./converting-media-files) if you care about file conversions.
* Dive into [Packets & samples](./packets-and-samples) for a deeper understanding of the concepts underlying this library.
## Motivation
Mediabunny is the evolution of my previous libraries, [mp4-muxer](https://github.com/Vanilagy/mp4-muxer) and [webm-muxer](https://github.com/Vanilagy/webm-muxer), which were both created due to the advent of the WebCodecs API. While they fulfilled their job just fine, I saw a few pain points:
* Lots of duplicated code between the two libraries, otherwise very similar API.
* No help with the difficulties of navigating the WebCodecs API & related browser APIs.
* "mp4-demuxer when??"
This library is the result of unifying these libraries into one, solving all the above issues, and expanding the scope. Now:
* Changing the output file format is a single-line change; the rest of the API is identical.
* Lots of abstractions on top of the WebCodecs API & browser APIs are provided.
* mp4-demuxer now.
Due to tree shaking, if you only need an MP4 or WebM muxer, this library's bundle size will still be very small.
### Migration
If you're coming from mp4-muxer or webm-muxer, you should migrate to Mediabunny. For that, refer to these guides:
* [Guide: Migrating from mp4-muxer to Mediabunny](https://github.com/Vanilagy/mp4-muxer/blob/main/MIGRATION-GUIDE.md)
* [Guide: Migrating from webm-muxer to Mediabunny](https://github.com/Vanilagy/webm-muxer/blob/main/MIGRATION-GUIDE.md)
## Technical overview
At its core, Mediabunny is a collection of multiplexers and demultiplexers, one of each for every container format. Demultiplexers stream data from *sources*, while multiplexers stream data to *targets*. Every demultiplexer is capable of extracting file metadata as well as compressed media data, while multiplexers write metadata and encoded media data into a new file.
Mediabunny then provides several wrappers around the WebCodecs API to simplify usage: for reading, it creates decoders with the correct codec configuration and efficiently decodes media data in a pipelined way. For writing, it figures out the necessary codec configuration and sets up encoders which are then used to encode raw media data, while respecting the backpressure applied by the encoder. Extracting the right decoder configuration from a media file can be tricky and sometimes involves diving into encoded media packet bitstreams.
The conversion abstraction is built on top of Mediabunny's reading and writing primitives and combines them both in a heavily-pipelined way, making sure reading and writing happen in lockstep. It also consists of a lot of conditional logic probing output track compatibility, decoding support, and finding encodable codec configurations. It makes use of the Canvas API for video processing operations, and uses a custom implementation for audio resampling and up/downmixing.
---
---
url: /guide/media-sinks.md
---
# Media sinks
## Introduction
*Media sinks* offer ways to extract media data from an `InputTrack`. Different media sinks provide different levels of abstraction and cater to different use cases.
For information on how to obtain input tracks, or how to generally read data from media files, refer to [Reading media files](./reading-media-files).
### General usage
> General usage patterns of media sinks will be demonstrated using a fictional `FooSink`.
Media sinks are like miniature "namespaces" for retrieving media data, scoped to a specific track. This means that you'll typically only need to construct one sink per type for a track.
```ts
const track = await input.getPrimaryVideoTrack();
const sink = new FooSink(track);
```
Constructing the sink is virtually free and does not perform any media data reads.
To read media data, each sink offers a different set of methods. You can call these methods as many times as you want; their calls will be independent since media sinks are stateless[^1].
```ts
await sink.getFoo(1);
await sink.getFoo(2);
await sink.getFoo(3);
```
[^1]: Almost: `CanvasSink` becomes stateful when using a [canvas pool](#canvas-pool).
### Async iterators
Media sinks make heavy use of [async iterators](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/AsyncIterator). They allow you to iterate over a set of media data (like all frames in a video track) efficiently, only having to read small sections of the file at any given point.
Async iterators are extremely ergonomic with `for await...of` loops:
```ts
for await (const foo of sink.foos()) {
console.log(foo.timestamp);
}
```
Just like in regular `for` loops, the `break` statement can be used to exit the loop early. This will automatically clean up any internal resources (such as decoders) used by the async iterator:
```ts
// Loop only over the first 5 foos
let count = 0;
for await (const foo of sink.foos()) {
console.log(foo.timestamp);
if (++count === 5) break;
}
```
Async iterators are also useful outside of `for` loops. Here, the `next` method is used to retrieve the next item in the iteration:
```ts
const foos = sink.foos();
const foo1Result = await foos.next();
const foo2Result = await foos.next();
const foo1 = foo1Result.value; // Might be `undefined` if the iteration is complete
```
::: warning
When you manually use async iterators, make sure to call `return` on them once you're done:
```ts
await foos.return();
```
This ensures all internally held resources are freed.
:::
### Decode vs. presentation order
Packets may appear out-of-order in the file, meaning the order in which they are decoded does not correspond to the order in which the decoded data is displayed (see [B-frames](./media-sources#b-frames)). The methods on media sinks differ with respect to which ordering they use to query and retrieve packets. So, just keep these definitions in mind:
* **Presentation order:** The order in which the data is to be presented; sorted by timestamp.
* **Decode order:** The order in which packets must be decoded; not always sorted by timestamp.
## General sinks
There is one media sink which can be used with any `InputTrack`:
### `EncodedPacketSink`
This sink can be used to extract raw, [encoded packets](./packets-and-samples#encodedpacket) from media files and is the most elementary media sink. `EncodedPacketSink` is useful if you don't care about the decoded media data (for example, you're only interested in timestamps), or if you want to roll your own decoding logic.
Start by constructing the sink from any `InputTrack`:
```ts
import { EncodedPacketSink } from 'mediabunny';
const sink = new EncodedPacketSink(track);
```
You can retrieve specific packets given a timestamp in seconds:
```ts
await sink.getPacket(5); // => EncodedPacket | null
// Or, retrieving only packets with type 'key':
await sink.getKeyPacket(5); // => EncodedPacket | null
```
When retrieving a packet using a timestamp, the last packet (in [presentation order](#decode-vs-presentation-order)) with a timestamp less than or equal to the search timestamp will be returned. The methods return `null` if there exists no such packet.
There is a special method for retrieving the first packet (in [decode order](#decode-vs-presentation-order)):
```ts
await sink.getFirstPacket(); // => EncodedPacket | null
```
The last packet (in [presentation order](#decode-vs-presentation-order)) can be retrieved like so:
```ts
await sink.getPacket(Infinity); // => EncodedPacket | null
```
Once you have a packet, you can retrieve the packet's successor (in [decode order](#decode-vs-presentation-order)) like so:
```ts
await sink.getNextPacket(packet); // => EncodedPacket | null
// Or jump straight to the next packet with type 'key':
await sink.getNextKeyPacket(packet); // => EncodedPacket | null
```
These methods return `null` if there is no next packet.
These methods can be combined to iterate over a range of packets. Starting from an initial packet, call `getNextPacket` in a loop to iterate over packets:
```ts
let currentPacket = await sink.getFirstPacket();
while (currentPacket) {
console.log('Packet:', currentPacket);
// Do something with the packet
currentPacket = await sink.getNextPacket(currentPacket);
}
```
While this approach works, `EncodedPacketSink` also provides a dedicated `packets` iterator function, which iterates over packets in [decode order](#decode-vs-presentation-order):
```ts
for await (const packet of sink.packets()) {
// ...
}
```
You can also constrain the iteration using a packet range, where the iteration will go from the starting packet up to (but excluding) the end packet:
```ts
const start = await sink.getPacket(5);
const end = await sink.getPacket(10, { metadataOnly: true });
for await (const packet of sink.packets(start, end)) {
// ...
}
```
The `packets` method is more performant than manual iteration as it will intelligently preload future packets before they are needed.
#### Verifying key packets
By default, packet types are determined using the metadata provided by the containing file. Some files can erroneously label some delta packets as key packets, leading to potential decoder errors. To be guaranteed that a key packet is actually a key packet, you can enable the `verifyKeyPackets` option:
```ts
// If the packet returned by this method has type: 'key', it's guaranteed
// to be a key packet.
await sink.getPacket(5, { verifyKeyPackets: true });
// Returned packets are guaranteed to be key packets
await sink.getKeyPacket(10, { verifyKeyPackets: true });
await sink.getNextKeyPacket(packet, { verifyKeyPackets: true });
// Also works for the iterator:
for await (const packet of sink.packets(
undefined,
undefined,
{ verifyKeyPackets: true },
)) {
// ...
}
```
::: info
`verifyKeyPackets` only works when `metadataOnly` is not also enabled.
:::
#### Metadata-only packet retrieval
Sometimes, you're only interested in a packet's metadata (timestamp, duration, type, ...) and not in its encoded media data. All methods on `EncodedPacketSink` accept a final `options` parameter which you can use to retrieve [metadata-only packets](./packets-and-samples#metadata-only-packets):
```ts
const packet = await sink.getPacket(5, { metadataOnly: true });
packet.isMetadataOnly; // => true
packet.data; // => Uint8Array([])
```
Retrieving metadata-only packets is more efficient for some input formats: Only the metadata section of the file must be read, not the media data section.
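For example, here's a sketch that counts a video track's packets without touching the media data:
```ts
const sink = new EncodedPacketSink(track);

let packetCount = 0;
for await (const packet of sink.packets(undefined, undefined, { metadataOnly: true })) {
  packetCount++; // `packet.data` is empty; timestamp, duration & type are still available
}

console.log(`Track contains ${packetCount} packets`);
```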
## Video data sinks
These sinks can only be used with an `InputVideoTrack`.
### `VideoSampleSink`
Use this sink to extract decoded [video samples](./packets-and-samples#videosample) (frames) from a video track. The sink will automatically handle the decoding internally.
::: info
All operations of this sink use [presentation order](#decode-vs-presentation-order).
:::
Create the sink like so:
```ts
import { VideoSampleSink } from 'mediabunny';
const sink = new VideoSampleSink(videoTrack);
```
#### Single retrieval
You can retrieve the sample presented at a given timestamp in seconds:
```ts
await sink.getSample(5);
// Extracting the first sample:
await sink.getSample(await videoTrack.getFirstTimestamp());
// Extracting the last sample:
await sink.getSample(Infinity);
```
This method returns the last sample with a timestamp less than or equal to the search timestamp, or `null` if there is no such sample.
#### Range iteration
You can use the `samples` iterator method to iterate over a contiguous range of samples:
```ts
// Iterate over all samples:
for await (const sample of sink.samples()) {
console.log('Sample:', sample);
// Do something with the sample
sample.close();
}
// Iterate over all samples in a specific time range:
for await (const sample of sink.samples(5, 10)) {
// ...
sample.close();
}
```
The `samples` iterator yields the samples in [presentation order](#decode-vs-presentation-order) (sorted by timestamp).
#### Sparse iteration
Sometimes, you may want to retrieve the samples for multiple timestamps at once (for example, for generating thumbnails). While you could call `getSample` multiple times, the `samplesAtTimestamps` method provides a more efficient way:
```ts
for await (const sample of sink.samplesAtTimestamps([0, 1, 2, 3, 4, 5])) {
// `sample` is either VideoSample or null
sample?.close();
}
// Any timestamp sequence is allowed:
sink.samplesAtTimestamps([1, 2, 3]);
sink.samplesAtTimestamps([4, 5, 5, 5]);
sink.samplesAtTimestamps([10, -2, 3]);
```
This method is more efficient than multiple calls to `getSample` because it avoids decoding the same packet twice.
In addition to arrays, you can pass any iterable into this method:
```ts
sink.samplesAtTimestamps(new Set([2, 3, 3, 4]));
sink.samplesAtTimestamps((function* () {
for (let i = 0; i < 5; i++) {
yield i;
}
})());
sink.samplesAtTimestamps((async function* () {
const firstTimestamp = await videoTrack.getFirstTimestamp();
const lastTimestamp = await videoTrack.computeDuration();
for (let i = 0; i <= 100; i++) {
yield firstTimestamp + (lastTimestamp - firstTimestamp) * i / 100;
}
})());
```
Passing an async iterable is especially useful when paired with `EncodedPacketSink`. Imagine you want to retrieve every key frame. A naive implementation might look like this:
```ts
// Naive, bad implementation: // [!code error]
const packetSink = new EncodedPacketSink(videoTrack);
const keyFrameTimestamps: number[] = [];
let currentPacket = await packetSink.getFirstPacket();
while (currentPacket) {
keyFrameTimestamps.push(currentPacket.timestamp);
currentPacket = await packetSink.getNextKeyPacket(currentPacket);
}
const sampleSink = new VideoSampleSink(videoTrack);
const keyFrameSamples = sampleSink.samplesAtTimestamps(keyFrameTimestamps);
for await (const sample of keyFrameSamples) {
// ...
sample.close();
}
```
The issue with this implementation is that it first iterates over all key packets before yielding the first sample. The better implementation is this:
```ts
// Better implementation:
const packetSink = new EncodedPacketSink(videoTrack);
const sampleSink = new VideoSampleSink(videoTrack);
const keyFrameSamples = sampleSink.samplesAtTimestamps((async function* () {
let currentPacket = await packetSink.getFirstPacket();
while (currentPacket) {
yield currentPacket.timestamp;
currentPacket = await packetSink.getNextKeyPacket(currentPacket);
}
})());
for await (const sample of keyFrameSamples) {
// ...
sample.close();
}
```
### `CanvasSink`
While `VideoSampleSink` extracts raw decoded video samples, you can use `CanvasSink` to extract these samples as canvases instead. In doing so, certain operations such as scaling and rotating can also be handled by the sink. The downside is the additional VRAM requirements for the canvases' framebuffers.
::: info
This sink yields `HTMLCanvasElement` whenever possible, and falls back to `OffscreenCanvas` otherwise (in Worker contexts, for example).
:::
Create the sink like so:
```ts
import { CanvasSink } from 'mediabunny';
const sink = new CanvasSink(videoTrack, options);
```
Here, `options` has the following type:
```ts
type CanvasSinkOptions = {
width?: number;
height?: number;
fit?: 'fill' | 'contain' | 'cover';
rotation?: 0 | 90 | 180 | 270;
poolSize?: number;
};
```
* `width`\
The width of the output canvas in pixels. When omitted but `height` is set, the width will be calculated automatically to maintain the original aspect ratio. Otherwise, the width will be set to the original width of the video.
* `height`\
The height of the output canvas in pixels. When omitted but `width` is set, the height will be calculated automatically to maintain the original aspect ratio. Otherwise, the height will be set to the original height of the video.
* `fit`\
*Required* when both `width` and `height` are set, this option sets the fitting algorithm to use.
* `'fill'` will stretch the image to fill the entire box, potentially altering aspect ratio.
* `'contain'` will contain the entire image within the box while preserving aspect ratio. This may lead to letterboxing.
* `'cover'` will scale the image until the entire box is filled, while preserving aspect ratio.
* `rotation`\
The clockwise rotation by which to rotate the raw video frame. Defaults to the rotation set in the file metadata. Rotation is applied before resizing.
* `poolSize`\
See [Canvas pool](#canvas-pool).
Some examples:
```ts
// This sink yields canvases with the unaltered display dimensions of the track,
// and respecting the track's rotation metadata.
new CanvasSink(videoTrack);
// This sink yields canvases with a width of 1280 and a height that maintains the
// original display aspect ratio.
new CanvasSink(videoTrack, {
width: 1280,
});
// This sink yields square canvases, with the video frame scaled to completely
// cover the canvas.
new CanvasSink(videoTrack, {
width: 512,
height: 512,
fit: 'cover',
});
// This sink yields canvases with the unaltered coded dimensions of the track,
// and without applying any rotation.
new CanvasSink(videoTrack, {
rotation: 0,
});
```
The methods for retrieving canvases are analogous to those on `VideoSampleSink`:
* `getCanvas`\
Gets the canvas for a given timestamp; see [Single retrieval](#single-retrieval).
* `canvases`\
Iterates over a range of canvases; see [Range iteration](#range-iteration).
* `canvasesAtTimestamps`\
Iterates over canvases at specific timestamps; see [Sparse iteration](#sparse-iteration).
These methods yield `WrappedCanvas` instances:
```ts
type WrappedCanvas = {
// A canvas element or offscreen canvas.
canvas: HTMLCanvasElement | OffscreenCanvas;
// The timestamp of the corresponding video sample, in seconds.
timestamp: number;
// The duration of the corresponding video sample, in seconds.
duration: number;
};
```
#### Canvas pool
By default, a new canvas is created for every canvas yielded by this sink. If you know you'll keep only a few canvases around at any given time, you should make use of the `poolSize` option. This integer value specifies the number of canvases in the pool; these canvases are then reused in a ring buffer / round-robin type fashion. This keeps the amount of allocated VRAM constant and relieves the browser from constantly allocating/deallocating canvases. A pool size of 0 or `undefined` disables the pool.
An illustration using a pool size of 3:
```ts
const sink = new CanvasSink(videoTrack, { poolSize: 3 });
const a = await sink.getCanvas(42);
const b = await sink.getCanvas(42);
const c = await sink.getCanvas(42);
const d = await sink.getCanvas(42);
const e = await sink.getCanvas(42);
const f = await sink.getCanvas(42);
assert(a.canvas === d.canvas);
assert(b.canvas === e.canvas);
assert(c.canvas === f.canvas);
assert(a.canvas !== b.canvas);
assert(a.canvas !== c.canvas);
```
If each canvas is used only within a single loop iteration, a pool size of 1 is sufficient:
```ts
const sink = new CanvasSink(videoTrack, { poolSize: 1 });
const canvases = sink.canvases();
for await (const { canvas, timestamp } of canvases) {
// ...
}
```
## Audio data sinks
These sinks can only be used with an `InputAudioTrack`.
### `AudioSampleSink`
Use this sink to extract decoded [audio samples](./packets-and-samples#audiosample) from an audio track. The sink will automatically handle the decoding internally.
Create the sink like so:
```ts
import { AudioSampleSink } from 'mediabunny';
const sink = new AudioSampleSink(audioTrack);
```
The methods for retrieving samples are analogous to those on `VideoSampleSink`.
* `getSample`\
Gets the sample for a given timestamp; see [Single retrieval](#single-retrieval).
* `samples`\
Iterates over a range of samples; see [Range iteration](#range-iteration).
* `samplesAtTimestamps`\
Iterates over samples at specific timestamps; see [Sparse iteration](#sparse-iteration).
These methods yield [`AudioSample`](./packets-and-samples#audiosample) instances.
For example, let's use this sink to calculate the average loudness of an audio track using [root mean square](https://en.wikipedia.org/wiki/Root_mean_square):
```ts
const sink = new AudioSampleSink(audioTrack);
let sumOfSquares = 0;
let totalSampleCount = 0;
for await (const sample of sink.samples()) {
const bytesNeeded = sample.allocationSize({ format: 'f32', planeIndex: 0 });
const floats = new Float32Array(bytesNeeded / 4);
sample.copyTo(floats, { format: 'f32', planeIndex: 0 });
for (let i = 0; i < floats.length; i++) {
sumOfSquares += floats[i] ** 2;
}
totalSampleCount += floats.length;
}
const averageLoudness = Math.sqrt(sumOfSquares / totalSampleCount);
```
### `AudioBufferSink`
While `AudioSampleSink` extracts raw decoded audio samples, you can use `AudioBufferSink` to directly extract [`AudioBuffer`](https://developer.mozilla.org/en-US/docs/Web/API/AudioBuffer) instances instead. This is particularly useful when working with the Web Audio API.
Create the sink like so:
```ts
import { AudioBufferSink } from 'mediabunny';
const sink = new AudioBufferSink(audioTrack);
```
The methods for retrieving audio buffers are analogous to those on `VideoSampleSink`:
* `getBuffer`\
Gets the buffer for a given timestamp; see [Single retrieval](#single-retrieval).
* `buffers`\
Iterates over a range of buffers; see [Range iteration](#range-iteration).
* `buffersAtTimestamps`\
Iterates over buffers at specific timestamps; see [Sparse iteration](#sparse-iteration).
These methods yield `WrappedAudioBuffer` instances:
```ts
type WrappedAudioBuffer = {
// An AudioBuffer that can be used with the Web Audio API.
buffer: AudioBuffer;
// The timestamp of the corresponding audio sample, in seconds.
timestamp: number;
// The duration of the corresponding audio sample, in seconds.
duration: number;
};
```
For example, let's use this sink to play the last 10 seconds of an audio track:
```ts
const sink = new AudioBufferSink(audioTrack);
const audioContext = new AudioContext();
const lastTimestamp = await audioTrack.computeDuration();
const baseTime = audioContext.currentTime;
for await (const { buffer, timestamp } of sink.buffers(lastTimestamp - 10)) {
const source = audioContext.createBufferSource();
source.buffer = buffer;
source.connect(audioContext.destination);
source.start(baseTime + timestamp);
}
```
---
---
url: /guide/media-sources.md
---
# Media sources
## Introduction
*Media sources* provide APIs for adding media data to an output file. Different media sources provide different levels of abstraction and cater to different use cases.
For information on how to use media sources to create output tracks, check [Writing media files](./writing-media-files).
Most media sources follow this code pattern to add media data:
```ts
await mediaSource.add(...);
```
### Closing sources
When you're done using the source, meaning no additional media data will be added, it's best to close the source as soon as possible:
```ts
mediaSource.close();
```
Closing sources manually is *technically* not required and will happen automatically when finalizing the `Output`. However, if your `Output` has multiple tracks and not all of them finish supplying their data at the same time (for example, adding all audio first and then all video), closing sources early will improve performance and lower memory usage. This is because the `Output` can better "plan ahead", knowing it doesn't have to wait for certain tracks anymore (see [Packet buffering](./writing-media-files#packet-buffering)). Therefore, it is good practice to always manually close all media sources as soon as you are done using them.
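As a sketch, assuming an `audioBufferSource` and a `videoSampleSource` (both introduced below) feed the same `Output`, and `audioBuffers`/`videoSamples` are stand-ins for your data:
```ts
// All audio data is known up front, so add it and close the source immediately
for (const audioBuffer of audioBuffers) {
  await audioBufferSource.add(audioBuffer);
}
audioBufferSource.close(); // The Output no longer waits on this track

// The video data trickles in over a longer period of time
for await (const sample of videoSamples) {
  await videoSampleSource.add(sample);
  sample.close();
}
videoSampleSource.close();
```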
### Backpressure
Media sources are the means by which backpressure is propagated from the output pipeline into your application logic. The `Output` may want to apply backpressure if the encoders or the [StreamTarget](./writing-media-files#streamtarget)'s writable can't keep up.
Backpressure is communicated by media sources via promises. All media sources with an `add` method return a promise:
```ts
mediaSource.add(...); // => Promise
```
This promise resolves when the source is ready to receive more data. In most cases, the promise will resolve instantly, but if some part of the output pipeline is overworked, it will remain pending until the output is ready to continue. Therefore, by awaiting this promise, you automatically propagate backpressure into your application logic:
```ts
// Wrong: // [!code error]
while (notDone) { // [!code error]
mediaSource.add(...); // [!code error]
} // [!code error]
// Correct:
while (notDone) {
await mediaSource.add(...);
}
```
### Video encoding config
All video sources that handle encoding internally require you to specify a `VideoEncodingConfig`, specifying the codec configuration to use:
```ts
type VideoEncodingConfig = {
codec: VideoCodec;
bitrate: number | Quality;
bitrateMode?: 'constant' | 'variable';
latencyMode?: 'quality' | 'realtime';
keyFrameInterval?: number;
fullCodecString?: string;
hardwareAcceleration?: 'no-preference' | 'prefer-hardware' | 'prefer-software';
scalabilityMode?: string;
contentHint?: string;
sizeChangeBehavior?: 'deny' | 'passThrough' | 'fill' | 'contain' | 'cover';
onEncodedPacket?: (
packet: EncodedPacket,
meta: EncodedVideoChunkMetadata | undefined
) => unknown;
onEncoderConfig?: (
config: VideoEncoderConfig
) => unknown;
};
```
* `codec`: The [video codec](./supported-formats-and-codecs#video-codecs) used for encoding.
* `bitrate`: The target number of bits per second. Alternatively, this can be a [subjective quality](#subjective-qualities).
* `bitrateMode`: Can be used to control constant vs. variable bitrate.
* `latencyMode`: The latency mode as specified by the WebCodecs API. Browsers default to `'quality'`. Media stream-driven video sources will automatically use the `'realtime'` setting.
* `keyFrameInterval`: The maximum interval in seconds between two adjacent key frames. Defaults to 5 seconds. More frequent key frames improve seeking behavior but increase file size. When using multiple video tracks, this value should be set to the same value for all tracks.
* `fullCodecString`: Allows you to optionally specify the full codec string used by the video encoder, as specified in the [WebCodecs Codec Registry](https://www.w3.org/TR/webcodecs-codec-registry/). For example, you may set it to `'avc1.42001f'` when using AVC. Keep in mind that the codec string must still match the codec specified in `codec`. If you don't set this field, a codec string will be generated automatically.
* `hardwareAcceleration`: A hint that configures the hardware acceleration method of this codec. This is best left on `'no-preference'`.
* `scalabilityMode`: An encoding scalability mode identifier as defined by [WebRTC-SVC](https://w3c.github.io/webrtc-svc/#scalabilitymodes*).
* `contentHint`: An encoding video content hint as defined by [mst-content-hint](https://w3c.github.io/mst-content-hint/#video-content-hints).
* `sizeChangeBehavior`: Video frames may change size over time. This field controls the behavior in case this happens. Defaults to `'deny'`.
* `onEncodedPacket`: Called for each successfully encoded packet. Useful for determining encoding progress.
* `onEncoderConfig`: Called when the internal encoder config, as used by the WebCodecs API, is created. You can use this to introspect the full codec string.
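For instance, here's a sketch of an encoding config with a tighter key frame interval and an introspection hook (the field values are illustrative):
```ts
import { VideoSampleSource, QUALITY_HIGH } from 'mediabunny';

const sampleSource = new VideoSampleSource({
  codec: 'avc',
  bitrate: QUALITY_HIGH,
  keyFrameInterval: 2, // A key frame at least every 2 seconds
  onEncoderConfig: (config) => {
    console.log(config.codec); // The full codec string that was generated
  },
});
```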
### Audio encoding config
All audio sources that handle encoding internally require you to specify an `AudioEncodingConfig`, specifying the codec configuration to use:
```ts
type AudioEncodingConfig = {
codec: AudioCodec;
bitrate?: number | Quality;
bitrateMode?: 'constant' | 'variable';
fullCodecString?: string;
onEncodedPacket?: (
packet: EncodedPacket,
meta: EncodedAudioChunkMetadata | undefined
) => unknown;
onEncoderConfig?: (
config: AudioEncoderConfig
) => unknown;
};
```
* `codec`: The [audio codec](./supported-formats-and-codecs#audio-codecs) used for encoding. Can be omitted for uncompressed PCM codecs.
* `bitrate`: The target number of bits per second. Alternatively, this can be a [subjective quality](#subjective-qualities).
* `bitrateMode`: Can be used to control constant vs. variable bitrate.
* `fullCodecString`: Allows you to optionally specify the full codec string used by the audio encoder, as specified in the [WebCodecs Codec Registry](https://www.w3.org/TR/webcodecs-codec-registry/). For example, you may set it to `'mp4a.40.2'` when using AAC. Keep in mind that the codec string must still match the codec specified in `codec`. If you don't set this field, a codec string will be generated automatically.
* `onEncodedPacket`: Called for each successfully encoded packet. Useful for determining encoding progress.
* `onEncoderConfig`: Called when the internal encoder config, as used by the WebCodecs API, is created. You can use this to introspect the full codec string.
### Subjective qualities
Mediabunny provides five subjective quality options as an alternative to manually providing a bitrate. From a subjective quality, a bitrate will be calculated internally based on the codec and track information (width, height, sample rate, ...).
```ts
import {
QUALITY_VERY_LOW,
QUALITY_LOW,
QUALITY_MEDIUM,
QUALITY_HIGH,
QUALITY_VERY_HIGH,
} from 'mediabunny';
```
## Video sources
Video sources feed data to video tracks on an `Output`. They all extend the abstract `VideoSource` class.
### `VideoSampleSource`
This source takes [video samples](./packets-and-samples#videosample), encodes them, and passes the encoded data to the output.
```ts
import { VideoSampleSource } from 'mediabunny';
const sampleSource = new VideoSampleSource({
codec: 'avc',
bitrate: 1e6,
});
await sampleSource.add(videoSample);
videoSample.close(); // If it's not needed anymore
// You may optionally force samples to be encoded as key frames:
await sampleSource.add(videoSample, { keyFrame: true });
```
### `CanvasSource`
This source simplifies a common pattern: A single canvas is repeatedly updated in a render loop and each frame is added to the output file.
```ts
import { CanvasSource, QUALITY_MEDIUM } from 'mediabunny';
const canvasSource = new CanvasSource(canvasElement, {
codec: 'av1',
bitrate: QUALITY_MEDIUM,
});
await canvasSource.add(0.0, 0.1); // Timestamp, duration (in seconds)
await canvasSource.add(0.1, 0.1);
await canvasSource.add(0.2, 0.1);
// You may optionally force frames to be encoded as key frames:
await canvasSource.add(0.3, 0.1, { keyFrame: true });
```
### `MediaStreamVideoTrackSource`
This is a source for use with the [Media Capture and Streams API](https://developer.mozilla.org/en-US/docs/Web/API/Media_Capture_and_Streams_API). Use this source if you want to pipe a real-time video source (such as a webcam or screen recording) to an output file.
```ts
import { MediaStreamVideoTrackSource } from 'mediabunny';
// Get the user's screen
const stream = await navigator.mediaDevices.getDisplayMedia({ video: true });
const videoTrack = stream.getVideoTracks()[0];
const videoTrackSource = new MediaStreamVideoTrackSource(videoTrack, {
codec: 'vp9',
bitrate: 1e7,
});
// Make sure to allow any internal errors to properly bubble up
videoTrackSource.errorPromise.catch((error) => ...);
```
This source requires no additional method calls; data will automatically be captured and piped to the output file as soon as `start()` is called on the `Output`. Make sure to call `stop()` on `videoTrack` after finalizing the `Output` if you don't need the user's media anymore.
::: info
If this source is the only MediaStreamTrack source in the `Output`, then the first video sample added by it starts at timestamp 0. If there are multiple, then the earliest media sample across all tracks starts at timestamp 0, and all tracks will be perfectly synchronized with each other.
:::
::: warning
`MediaStreamVideoTrackSource`'s internals are detached from the typical code flow but can still throw, so make sure to utilize `errorPromise` to deal with any errors and to stop the `Output`.
:::
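For a complete picture, here's a sketch of a minimal screen-recording flow built around this source (output format, recording duration, and teardown are illustrative):
```ts
import { Output, Mp4OutputFormat, BufferTarget } from 'mediabunny';

const output = new Output({
  format: new Mp4OutputFormat(),
  target: new BufferTarget(),
});
output.addVideoTrack(videoTrackSource);

await output.start(); // Capture begins here
await new Promise((resolve) => setTimeout(resolve, 5000)); // Record ~5 seconds
await output.finalize();

videoTrack.stop(); // We no longer need the user's screen
```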
### `EncodedVideoPacketSource`
The most barebones of all video sources, this source can be used to directly pipe [encoded packets](./packets-and-samples#encodedpacket) of video data to the output. This source requires that you take care of the encoding process yourself, which enables you to use the WebCodecs API manually or to plug in your own encoding stack. Alternatively, you may retrieve the encoded packets directly by reading them from another media file, allowing you to skip decoding and reencoding video data.
```ts
import { EncodedVideoPacketSource } from 'mediabunny';
// You must specify the codec name:
const packetSource = new EncodedVideoPacketSource('vp9');
await packetSource.add(packet1);
await packetSource.add(packet2);
```
> [!IMPORTANT]
> You must add the packets in decode order.
You will need to provide additional metadata alongside your first call to `add` to give the `Output` more information about the shape and form of the video data. This metadata must be in the form of the WebCodecs API's `EncodedVideoChunkMetadata`. It might look like this:
```ts
await packetSource.add(firstPacket, {
decoderConfig: {
codec: 'vp09.00.31.08',
codedWidth: 1280,
codedHeight: 720,
colorSpace: {
primaries: 'bt709',
transfer: 'iec61966-2-1',
matrix: 'smpte170m',
fullRange: false,
},
description: undefined,
},
});
```
`codec`, `codedWidth`, and `codedHeight` are required for all codecs, whereas `description` is required for some codecs. Additional fields, such as `colorSpace`, are optional. The [WebCodecs Codec Registry](https://www.w3.org/TR/webcodecs-codec-registry/) specifies the formats of `codec` and `description` for each video codec, which you must adhere to.
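As mentioned above, the packets can also come straight from another media file, skipping the decode/reencode steps entirely. Here's a sketch of such a copy pipeline, assuming `inputVideoTrack` is an `InputVideoTrack` whose codec and decoder config are known, and that `packetSource` has already been added as a video track to your `Output`:
```ts
import { EncodedPacketSink, EncodedVideoPacketSource } from 'mediabunny';

const sink = new EncodedPacketSink(inputVideoTrack);
const packetSource = new EncodedVideoPacketSource(inputVideoTrack.codec!);

const decoderConfig = (await inputVideoTrack.getDecoderConfig())!;

let firstPacket = true;
for await (const packet of sink.packets()) {
  // `packets()` iterates in decode order, as required by `add`
  await packetSource.add(packet, firstPacket ? { decoderConfig } : undefined);
  firstPacket = false;
}
```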
#### B-frames
Some video codecs use *B-frames*, which are frames that require both the previous and the next frame to be decoded. For example, you may have something like this:
```md
Frame 1: 0.0s, I-frame (key frame)
Frame 2: 0.1s, B-frame
Frame 3: 0.2s, P-frame
```
The decode order for these frames will be:
```md
Frame 1 -> Frame 3 -> Frame 2
```
Some file formats have an explicit notion of both a "decode timestamp" and a "presentation timestamp" to model B-frames or out-of-order decoding. However, Mediabunny packets only specify their *presentation timestamp*. Decode order is determined by the order in which you add the packets, so in our example, you must add the packets like this:
```ts
await packetSource.add(packetForFrame1); // 0.0s
await packetSource.add(packetForFrame3); // 0.2s
await packetSource.add(packetForFrame2); // 0.1s
```
You are allowed to provide wildly out-of-order presentation timestamp sequences, but there is a hard constraint:
> \[!IMPORTANT]
> A packet you add must not have a smaller timestamp than the largest timestamp you added before adding the last key frame.
This is quite a mouthful, so this example will hopefully clarify it:
```md
# Legal:
Packet 1: 0.0s, key frame
Packet 2: 0.3s, delta frame
Packet 3: 0.2s, delta frame
Packet 4: 0.1s, delta frame
Packet 5: 0.4s, key frame
Packet 6: 0.5s, delta frame
# Also legal:
Packet 1: 0.0s, key frame
Packet 2: 0.3s, delta frame
Packet 3: 0.2s, delta frame
Packet 4: 0.1s, delta frame
Packet 5: 0.4s, key frame
Packet 6: 0.35s, delta frame
Packet 7: 0.3s, delta frame
Packet 8: 0.5s, delta frame
# Illegal:
Packet 1: 0.0s, key frame
Packet 2: 0.3s, delta frame
Packet 3: 0.2s, delta frame
Packet 4: 0.1s, delta frame
Packet 5: 0.4s, key frame
Packet 6: 0.25s, delta frame
```
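To make the rule concrete, here's a small hypothetical checker. `timestamp` and `isKeyFrame` stand in for however you track these properties; packets are passed in the order they'd be added (decode order):
```ts
type PacketInfo = { timestamp: number; isKeyFrame: boolean };

const isLegalOrder = (packets: PacketInfo[]) => {
	let maxTimestampBeforeLastKeyFrame = -Infinity;
	let maxTimestamp = -Infinity;

	for (const packet of packets) {
		if (packet.timestamp < maxTimestampBeforeLastKeyFrame) {
			return false; // Violates the constraint
		}
		if (packet.isKeyFrame) {
			maxTimestampBeforeLastKeyFrame = maxTimestamp;
		}
		maxTimestamp = Math.max(maxTimestamp, packet.timestamp);
	}

	return true;
};
```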
## Audio sources
Audio sources feed data to audio tracks on an `Output`. They all extend the abstract `AudioSource` class.
### `AudioSampleSource`
This source takes [audio samples](./packets-and-samples#audiosample), encodes them, and passes the encoded data to the output.
```ts
import { AudioSampleSource } from 'mediabunny';
const sampleSource = new AudioSampleSource({
codec: 'aac',
bitrate: 128e3,
});
await sampleSource.add(audioSample);
audioSample.close(); // If it's not needed anymore
```
### `AudioBufferSource`
This source directly accepts instances of `AudioBuffer` as data, simplifying usage with the Web Audio API. The first AudioBuffer will be played at timestamp 0, and any subsequent AudioBuffer will be appended after all previous AudioBuffers.
```ts
import { AudioBufferSource, QUALITY_MEDIUM } from 'mediabunny';
const bufferSource = new AudioBufferSource({
codec: 'opus',
bitrate: QUALITY_MEDIUM,
});
await bufferSource.add(audioBuffer1);
await bufferSource.add(audioBuffer2);
await bufferSource.add(audioBuffer3);
```
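The `AudioBuffer` instances can come from anywhere - decoded from a file, or rendered with the Web Audio API, for example. Here's a hypothetical sketch that renders one second of a 440 Hz sine wave and appends it:
```ts
import { AudioBufferSource, QUALITY_MEDIUM } from 'mediabunny';

const bufferSource = new AudioBufferSource({
	codec: 'opus',
	bitrate: QUALITY_MEDIUM,
});

// Render one second of audio with the Web Audio API
const audioContext = new OfflineAudioContext(2, 48000, 48000);
const oscillator = audioContext.createOscillator();
oscillator.frequency.value = 440;
oscillator.connect(audioContext.destination);
oscillator.start();

const renderedBuffer = await audioContext.startRendering();
await bufferSource.add(renderedBuffer); // Plays at timestamp 0
```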
### `MediaStreamAudioTrackSource`
This is a source for use with the [Media Capture and Streams API](https://developer.mozilla.org/en-US/docs/Web/API/Media_Capture_and_Streams_API). Use this source if you want to pipe a real-time audio source (such as a microphone or audio from the user's computer) to an output file.
```ts
import { MediaStreamAudioTrackSource } from 'mediabunny';
// Get the user's microphone
const stream = await navigator.mediaDevices.getUserMedia({ audio: true });
const audioTrack = stream.getAudioTracks()[0];
const audioTrackSource = new MediaStreamAudioTrackSource(audioTrack, {
codec: 'opus',
bitrate: 128e3,
});
// Make sure to allow any internal errors to properly bubble up
audioTrackSource.errorPromise.catch((error) => ...);
```
This source requires no additional method calls; data will automatically be captured and piped to the output file as soon as `start()` is called on the `Output`. Make sure to call `stop()` on `audioTrack` after finalizing the `Output` if you don't need the user's media anymore.
::: info
If this source is the only MediaStreamTrack source in the `Output`, then the first audio sample added by it starts at timestamp 0. If there are multiple, then the earliest media sample across all tracks starts at timestamp 0, and all tracks will be perfectly synchronized with each other.
:::
::: warning
`MediaStreamAudioTrackSource`'s internals are detached from the typical code flow but can still throw, so make sure to use `errorPromise` to handle any errors and to stop the `Output`.
:::
### `EncodedAudioPacketSource`
The most barebones of all audio sources, this source can be used to directly pipe [encoded packets](./packets-and-samples#encodedpacket) of audio data to the output. This source requires that you take care of the encoding process yourself, which enables you to use the WebCodecs API manually or to plug in your own encoding stack. Alternatively, you may retrieve the encoded packets directly by reading them from another media file, allowing you to skip decoding and reencoding audio data.
```ts
import { EncodedAudioPacketSource } from 'mediabunny';
// You must specify the codec name:
const packetSource = new EncodedAudioPacketSource('aac');
await packetSource.add(packet);
```
You will need to provide additional metadata alongside your first call to `add` to give the `Output` more information about the shape and form of the audio data. This metadata must be in the form of the WebCodecs API's `EncodedAudioChunkMetadata`. It might look like this:
```ts
await packetSource.add(firstPacket, {
decoderConfig: {
codec: 'mp4a.40.2',
numberOfChannels: 2,
sampleRate: 48000,
description: new Uint8Array([17, 144]),
},
});
```
`codec`, `numberOfChannels`, and `sampleRate` are required for all codecs, whereas `description` is required for some codecs. The [WebCodecs Codec Registry](https://www.w3.org/TR/webcodecs-codec-registry/) specifies the formats of `codec` and `description` for each audio codec, which you must adhere to.
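As a sketch of the packet passthrough mentioned above - copying audio packets from an input file without decoding and re-encoding them - assuming the reading APIs (`getPrimaryAudioTrack()`, `getDecoderConfig()`, [`EncodedPacketSink`](./media-sinks#encodedpacketsink)) covered elsewhere in this guide:
```ts
import {
	Input,
	ALL_FORMATS,
	BlobSource,
	Output,
	Mp4OutputFormat,
	BufferTarget,
	EncodedAudioPacketSource,
	EncodedPacketSink,
} from 'mediabunny';

const input = new Input({
	source: new BlobSource(file),
	formats: ALL_FORMATS,
});
const audioTrack = (await input.getPrimaryAudioTrack())!; // Assuming it exists

const output = new Output({
	format: new Mp4OutputFormat(),
	target: new BufferTarget(),
});
const packetSource = new EncodedAudioPacketSource(audioTrack.codec!);
output.addAudioTrack(packetSource);
await output.start();

const sink = new EncodedPacketSink(audioTrack);
const decoderConfig = (await audioTrack.getDecoderConfig())!;

let isFirst = true;
for await (const packet of sink.packets()) {
	// Metadata is only required alongside the first call to `add`
	await packetSource.add(packet, isFirst ? { decoderConfig } : undefined);
	isFirst = false;
}

await output.finalize();
```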
## Subtitle sources
Subtitle sources feed data to subtitle tracks on an `Output`. They all extend the abstract `SubtitleSource` class.
### `TextSubtitleSource`
This source feeds subtitle cues to the output from a text file in which the subtitles are defined.
```ts
import { TextSubtitleSource } from 'mediabunny';
const textSource = new TextSubtitleSource('webvtt');
const text =
`WEBVTT

00:00:00.000 --> 00:00:02.000
This is your last chance.

00:00:02.500 --> 00:00:04.000
After this, there is no turning back.

00:00:04.500 --> 00:00:06.000
If you take the blue pill, the story ends.

00:00:06.500 --> 00:00:08.000
You wake up in your bed and believe whatever you want to believe.

00:00:08.500 --> 00:00:10.000
If you take the red pill, you stay in Wonderland

00:00:10.500 --> 00:00:12.000
and I show you how deep the rabbit hole goes.
`;
await textSource.add(text);
```
If you add the entire subtitle file at once, make sure to [close the source](#closing-sources) immediately after:
```ts
textSource.close();
```
You can also add cues individually in small chunks:
```ts
import { TextSubtitleSource } from 'mediabunny';
const textSource = new TextSubtitleSource('webvtt');
await textSource.add('WEBVTT\n\n');
await textSource.add('00:00:00.000 --> 00:00:02.000\nHello there!\n\n');
await textSource.add('00:00:02.500 --> 00:00:04.000\nChunky chunks.\n\n');
```
The chunks must satisfy a few constraints: a cue must be fully contained within a single chunk and cannot be split across multiple chunks (although one chunk can contain multiple cues). Also, the WebVTT preamble must be added first, all at once.
---
---
url: /llms.md
---
# Mediabunny and LLMs
While Mediabunny is proudly human-generated, we want to encourage any and all usage of Mediabunny, even when the vibes are high.
Mediabunny is still new and is unlikely to be in the training data of modern LLMs, but we can still make the AI perform extremely well by just giving it a little more context.
***
Give one or more of these files to your LLM:
### [mediabunny.d.ts](/mediabunny.d.ts)
This file contains the entire public TypeScript API of Mediabunny and is commented extremely thoroughly.
### [llms.txt](/llms.txt)
This file provides an index of Mediabunny's guide, which the AI can then further dive into if it wants to.
### [llms-full.txt](/llms-full.txt)
This is just the entire Mediabunny guide in a single file.
---
---
url: /guide/output-formats.md
---
# Output formats
## Introduction
An *output format* specifies the container format of the data written by an `Output`. Mediabunny supports many commonly used container formats, each having format-specific options.
Many formats also offer *data callbacks*, which are special callbacks that fire for specific data regions in the output file.
### Output format properties
All output formats have a common set of properties you can query.
```ts
// Get the format's file extension:
format.fileExtension; // => '.mp4'
// Get the format's base MIME type:
format.mimeType; // => 'video/mp4'
// Check which codecs can be contained by the format:
format.getSupportedCodecs(); // => MediaCodec[]
format.getSupportedVideoCodecs(); // => VideoCodec[]
format.getSupportedAudioCodecs(); // => AudioCodec[]
format.getSupportedSubtitleCodecs(); // => SubtitleCodec[]
// Check if the format supports video tracks with rotation metadata:
format.supportsVideoRotationMetadata; // => boolean
```
Refer to the [compatibility table](./supported-formats-and-codecs.md#compatibility-table) to see which codecs can be used with which output format.
Formats also differ in the amount and types of tracks they can contain. You can retrieve this information using:
```ts
format.getSupportedTrackCounts(); // => TrackCountLimits
type TrackCountLimits = {
video: { min: number, max: number },
audio: { min: number, max: number },
subtitle: { min: number, max: number },
total: { min: number, max: number },
};
```
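For example, you can use this to check up front whether a format can hold the tracks you intend to add:
```ts
const counts = format.getSupportedTrackCounts();

// Can this format hold one video and one audio track?
const fitsOneOfEach =
	counts.video.max >= 1
	&& counts.audio.max >= 1
	&& counts.total.max >= 2;
```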
### Append-only writing
Some output format configurations write in an *append-only* fashion. This means they only ever add new data to the end, and never have to seek back to overwrite a previously-written section of the file. Or, put formally: the byte offset of any write is exactly equal to the number of bytes written before it.
Append-only formats, in combination with [`StreamTarget`](./writing-media-files#streamtarget), have some useful properties. They enable use with [Media Source Extensions](https://developer.mozilla.org/en-US/docs/Web/API/Media_Source_Extensions_API) and allow for trivial streaming across the network, such as for file uploads.
## MP4
This output format creates MP4 files.
```ts
import { Output, Mp4OutputFormat } from 'mediabunny';
const output = new Output({
format: new Mp4OutputFormat(options),
// ...
});
```
The following options are available:
```ts
type IsobmffOutputFormatOptions = {
fastStart?: false | 'in-memory' | 'fragmented';
minimumFragmentDuration?: number;
onFtyp?: (data: Uint8Array, position: number) => unknown;
onMoov?: (data: Uint8Array, position: number) => unknown;
onMdat?: (data: Uint8Array, position: number) => unknown;
onMoof?: (data: Uint8Array, position: number, timestamp: number) => unknown;
};
```
* `fastStart`\
Controls the placement of metadata in the file. Placing metadata at the start of the file is known as "Fast Start" and provides certain benefits: The file becomes easier to stream over the web without range requests, and sites like YouTube can start processing the video while it's uploading. However, placing metadata at the start of the file can require more processing and memory in the writing step. This library provides full control over the placement of metadata by setting `fastStart` to one of these options:
* `false`\
Disables Fast Start, placing the metadata at the end of the file. Fastest and uses the least memory.
* `'in-memory'`\
Produces a file with Fast Start by keeping all media chunks in memory until the file is finalized. This produces a high-quality and compact output at the cost of a more expensive finalization step and higher memory requirements.
::: info
This option ensures [append-only writing](#append-only-writing), although all the writing happens in bulk, at the end.
:::
* `'fragmented'`\
Produces a *fragmented MP4 (fMP4)* file, evenly placing sample metadata throughout the file by grouping it into "fragments" (short sections of media), while placing general metadata at the beginning of the file. Fragmented files are ideal in streaming contexts, as each fragment can be played individually without requiring knowledge of the other fragments. Furthermore, they remain lightweight to create no matter how large the file becomes, as they don't require media to be kept in memory for very long. However, fragmented files are not as widely and wholly supported as regular MP4 files, and some players don't provide seeking functionality for them.
::: info
This option ensures [append-only writing](#append-only-writing).
:::
::: warning
This option requires [packet buffering](./writing-media-files#packet-buffering).
:::
* `undefined`\
The default option; it behaves like `'in-memory'` when using [`BufferTarget`](./writing-media-files#buffertarget) and like `false` otherwise.
* `minimumFragmentDuration`\
Only relevant when `fastStart` is `'fragmented'`. Sets the minimum duration in seconds a fragment must have to be finalized and written to the file. Defaults to 1 second.
* `onFtyp`\
Will be called once the ftyp (File Type) box of the output file has been written.
* `onMoov`\
Will be called once the moov (Movie) box of the output file has been written.
* `onMdat`\
Will be called for each finalized mdat (Media Data) box of the output file. Using this callback is only recommended in combination with `fastStart: 'fragmented'`; otherwise, there will be a single monolithic mdat box, which might require large amounts of memory.
* `onMoof`\
Will be called for each finalized moof (Movie Fragment) box of the output file. The fragment's start timestamp in seconds is also passed.
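As an illustration, the data callbacks combine naturally with `fastStart: 'fragmented'` to stream a file while it's being written; `sendToServer` here is a hypothetical upload function:
```ts
import { Output, Mp4OutputFormat, BufferTarget } from 'mediabunny';

const output = new Output({
	format: new Mp4OutputFormat({
		fastStart: 'fragmented',
		minimumFragmentDuration: 2, // Fragments of at least 2 seconds
		onFtyp: (data, position) => sendToServer(data, position),
		onMoov: (data, position) => sendToServer(data, position),
		onMoof: (data, position) => sendToServer(data, position),
		onMdat: (data, position) => sendToServer(data, position),
	}),
	target: new BufferTarget(),
});
```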
## QuickTime File Format (.mov)
This output format creates QuickTime files (.mov).
```ts
import { Output, MovOutputFormat } from 'mediabunny';
const output = new Output({
format: new MovOutputFormat(options),
// ...
});
```
The available options are the same `IsobmffOutputFormatOptions` used by [MP4](#mp4).
## WebM
This output format creates WebM files.
```ts
import { Output, WebMOutputFormat } from 'mediabunny';
const output = new Output({
format: new WebMOutputFormat(options),
// ...
});
```
The following options are available:
```ts
type MkvOutputFormatOptions = {
appendOnly?: boolean;
minimumClusterDuration?: number;
onEbmlHeader?: (data: Uint8Array, position: number) => void;
onSegmentHeader?: (data: Uint8Array, position: number) => unknown;
onCluster?: (data: Uint8Array, position: number, timestamp: number) => unknown;
};
```
* `appendOnly`\
Configures the output to write data in an append-only fashion. This is useful for live-streaming the output as it's being created. Note that when enabled, certain features like file duration or seeking will be disabled or impacted, so don't use this option when you want to write out a media file for later use.
::: info
This option ensures [append-only writing](#append-only-writing).
:::
* `minimumClusterDuration`\
Sets the minimum duration in seconds a cluster must have to be finalized and written to the file. Defaults to 1 second.
* `onEbmlHeader`\
Will be called once the EBML header of the output file has been written.
* `onSegmentHeader`\
Will be called once the header part of the Matroska Segment element has been written. The header data includes the Segment element and everything inside it, up to (but excluding) the first Matroska Cluster.
* `onCluster`\
Will be called for each finalized Matroska Cluster of the output file. The cluster's start timestamp in seconds is also passed.
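As a sketch, these callbacks could feed a [Media Source Extensions](https://developer.mozilla.org/en-US/docs/Web/API/Media_Source_Extensions_API) `SourceBuffer` for live playback - assuming `sourceBuffer` was created with a matching WebM MIME type and that appends are queued so they never overlap:
```ts
import { Output, WebMOutputFormat, BufferTarget } from 'mediabunny';

const output = new Output({
	format: new WebMOutputFormat({
		appendOnly: true,
		onEbmlHeader: (data) => sourceBuffer.appendBuffer(data),
		onSegmentHeader: (data) => sourceBuffer.appendBuffer(data),
		onCluster: (data) => sourceBuffer.appendBuffer(data),
	}),
	target: new BufferTarget(),
});
```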
## Matroska (.mkv)
This output format creates Matroska files (.mkv).
```ts
import { Output, MkvOutputFormat } from 'mediabunny';
const output = new Output({
format: new MkvOutputFormat(options),
// ...
});
```
The available options are the same `MkvOutputFormatOptions` used by [WebM](#webm).
## Ogg
This output format creates Ogg files.
```ts
import { Output, OggOutputFormat } from 'mediabunny';
const output = new Output({
format: new OggOutputFormat(options),
// ...
});
```
::: info
This format ensures [append-only writing](#append-only-writing).
:::
The following options are available:
```ts
type OggOutputFormatOptions = {
onPage?: (data: Uint8Array, position: number, source: MediaSource) => unknown;
};
```
* `onPage`\
Will be called for each finalized Ogg page of the output file. The [media source](./media-sources) backing the page's track (logical bitstream) is also passed.
## MP3
This output format creates MP3 files.
```ts
import { Output, Mp3OutputFormat } from 'mediabunny';
const output = new Output({
format: new Mp3OutputFormat(options),
// ...
});
```
The following options are available:
```ts
type Mp3OutputFormatOptions = {
xingHeader?: boolean;
onXingFrame?: (data: Uint8Array, position: number) => unknown;
};
```
* `xingHeader`\
Controls whether the Xing header, which contains additional metadata as well as an index, is written to the start of the MP3 file. Defaults to `true`.
::: info
When set to `false`, this option ensures [append-only writing](#append-only-writing).
:::
* `onXingFrame`\
Will be called once the Xing metadata frame is finalized, which happens at the end of the writing process. This callback only fires if `xingHeader` isn't set to `false`.
::: info
Most browsers don't support encoding MP3. Use the official [`@mediabunny/mp3-encoder`](./extensions/mp3-encoder) package to polyfill an encoder.
:::
## WAVE
This output format creates WAVE (.wav) files.
```ts
import { Output, WavOutputFormat } from 'mediabunny';
const output = new Output({
format: new WavOutputFormat(options),
// ...
});
```
The following options are available:
```ts
type WavOutputFormatOptions = {
large?: boolean;
onHeader?: (data: Uint8Array, position: number) => unknown;
};
```
* `large`\
When enabled, an RF64 file will be written, allowing file sizes to exceed 4 GiB, which is otherwise not possible with regular WAVE files.
* `onHeader`\
Will be called once the file header is written. The header consists of the RIFF header, the format chunk, and the start of the data chunk (with a placeholder size of 0).
## ADTS
This output format creates ADTS (.aac) files.
```ts
import { Output, AdtsOutputFormat } from 'mediabunny';
const output = new Output({
format: new AdtsOutputFormat(options),
// ...
});
```
The following options are available:
```ts
type AdtsOutputFormatOptions = {
onFrame?: (data: Uint8Array, position: number) => unknown;
};
```
* `onFrame`\
Will be called for each ADTS frame that is written.
---
---
url: /guide/packets-and-samples.md
---
# Packets & samples
## Introduction
Media data in Mediabunny is present in two different forms:
* **Packet:** Encoded media data, the result of an encoding process
* **Sample:** Raw, uncompressed, presentable media data
In addition to data, both packets and samples carry additional metadata, such as timestamp, duration, width, etc.
Packets are represented with the `EncodedPacket` class, which is used for both video and audio packets. Samples are represented with the `VideoSample` and `AudioSample` classes:
* `VideoSample`: Represents a single frame of video.
* `AudioSample`: Represents a (typically short) section of audio.
Samples can be encoded into packets, and packets can be decoded into samples:
```mermaid
flowchart LR
A[VideoSample]
B[AudioSample]
C[EncodedPacket]
D[VideoSample]
E[AudioSample]
A -- encode --> C
B -- encode --> C
C -- decode --> D
C -- decode --> E
```
### Connection to WebCodecs
Packets and samples in Mediabunny correspond directly with concepts of the [WebCodecs API](https://w3c.github.io/webcodecs/):
* `EncodedPacket`\
-> `EncodedVideoChunk` for video packets\
-> `EncodedAudioChunk` for audio packets
* `VideoSample`\
-> `VideoFrame`
* `AudioSample`\
-> `AudioData`
Since Mediabunny makes heavy use of the WebCodecs API, its own classes are typically used as wrappers around the WebCodecs classes. However, this wrapping comes with a few benefits:
1. **Independence:** This library remains functional even if the WebCodecs API isn't available. Encoders and decoders can be polyfilled using [custom coders](./supported-formats-and-codecs#custom-coders), and the library can run in non-browser contexts such as Node.js.
2. **Extensibility:** The wrappers serve as a namespace for additional operations, such as `toAudioBuffer()` on `AudioSample`, or `draw()` on `VideoSample`.
3. **Consistency:** While WebCodecs uses integer microsecond timestamps, Mediabunny uses floating-point second timestamps everywhere. With these wrappers, all timing information is always in seconds and the user doesn't need to think about unit conversions.
Conversion is easy:
```ts
import { EncodedPacket, VideoSample, AudioSample } from 'mediabunny';
// EncodedPacket to WebCodecs chunks:
encodedPacket.toEncodedVideoChunk(); // => EncodedVideoChunk
encodedPacket.toEncodedAudioChunk(); // => EncodedAudioChunk
// WebCodecs chunks to EncodedPacket:
EncodedPacket.fromEncodedChunk(videoChunk); // => EncodedPacket
EncodedPacket.fromEncodedChunk(audioChunk); // => EncodedPacket
// VideoSample to VideoFrame:
videoSample.toVideoFrame(); // => VideoFrame
// VideoFrame to VideoSample:
new VideoSample(videoFrame); // => VideoSample
// AudioSample to AudioData:
audioSample.toAudioData(); // => AudioData
// AudioData to AudioSample:
new AudioSample(audioData); // => AudioSample
```
::: info
`VideoSample`/`AudioSample` instances created from their WebCodecs API counterpart are very efficient; they simply maintain a reference to the underlying WebCodecs API instance and do not perform any unnecessary copying.
:::
### Negative timestamps
While packet and sample durations cannot be negative, packet and sample timestamps can.
A negative timestamp represents a sample that starts playing before the composition does (the composition always starts at 0). Negative timestamps are typically a result of a track being trimmed at the start, either to cut off a piece of media or to synchronize it with the other tracks. Therefore, you should avoid presenting any sample with a negative timestamp.
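During playback, this can be as simple as skipping such samples (`present` being a hypothetical display function):
```ts
if (videoSample.timestamp < 0) {
	videoSample.close(); // Starts before the composition; don't present it
} else {
	present(videoSample);
}
```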
## `EncodedPacket`
An encoded packet represents encoded media data of any type (video or audio). They are the result of an *encoding process*, and you can turn encoded packets into actual media data using a *decoding process*.
### Creating packets
To create an `EncodedPacket`, you can use its constructor:
```ts
constructor(
data: Uint8Array,
type: 'key' | 'delta',
timestamp: number, // in seconds
duration: number, // in seconds
sequenceNumber?: number,
byteLength?: number,
);
```
::: info
You probably won't ever need to set `sequenceNumber` or `byteLength` in the constructor.
:::
For example, here we're creating a packet from some encoded video data:
```ts
import { EncodedPacket } from 'mediabunny';
const encodedVideoData = new Uint8Array([...]);
const encodedPacket = new EncodedPacket(encodedVideoData, 'key', 5, 1/24);
```
Alternatively, if you're coming from WebCodecs encoded chunks, you can create an `EncodedPacket` from them:
```ts
import { EncodedPacket } from 'mediabunny';
// From EncodedVideoChunk:
const encodedPacket = EncodedPacket.fromEncodedChunk(encodedVideoChunk);
// From EncodedAudioChunk:
const encodedPacket = EncodedPacket.fromEncodedChunk(encodedAudioChunk);
```
### Inspecting packets
Encoded packets have a bunch of read-only data you can inspect. You can get the encoded data like so:
```ts
encodedPacket.data; // => Uint8Array
```
You can query the type of packet:
```ts
encodedPacket.type; // => PacketType ('key' | 'delta')
```
* A *key packet* can be decoded directly, independently of other packets.
* A *delta packet* can only be decoded after the packet before it has been decoded.
For example, in a video track, it is common to have a key frame every few seconds. When seeking, if the user seeks to a position shortly after a key frame, the decoded data can be shown quickly; if they seek far away from a key frame, the decoder must first crunch through many delta frames before it can show anything.
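As a sketch of what this means for a seeking implementation - assuming the `EncodedPacketSink` methods covered in [Media sinks](./media-sinks#encodedpacketsink) and an already-configured WebCodecs `VideoDecoder`:
```ts
import { EncodedPacketSink } from 'mediabunny';

const sink = new EncodedPacketSink(videoTrack);

// Jump to the last key packet at or before the seek target...
let packet = await sink.getKeyPacket(seekTime);

// ...then decode forward through the delta packets up to the target
while (packet && packet.timestamp <= seekTime) {
	decoder.decode(packet.toEncodedVideoChunk());
	packet = await sink.getNextPacket(packet);
}
```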
#### Determining a packet's actual type
The `type` field is derived from metadata in the containing file, which can sometimes (in rare cases) be incorrect. To determine a packet's actual type with certainty, you can do this:
```ts
// `packet` must come from the InputTrack `track`
const type = await track.determinePacketType(packet); // => PacketType | null
```
This determines the packet's type by looking into its bitstream. `null` is returned when the type couldn't be determined.
***
You can query the packet's timing information:
```ts
encodedPacket.timestamp; // => Presentation timestamp in seconds
encodedPacket.duration; // => Duration in seconds
// There also exist integer microsecond versions of these:
encodedPacket.microsecondTimestamp;
encodedPacket.microsecondDuration;
```
`timestamp` and `duration` are both given as floating-point numbers.
::: warning
Timestamps can be [negative](#negative-timestamps).
:::
***
A packet also has a quantity known as a *sequence number*:
```ts
encodedPacket.sequenceNumber; // => number
```
When [reading packets from an input file](./media-sinks#encodedpacketsink), this number specifies the relative ordering of packets. If packet $A$ has a lower sequence number than packet $B$, then packet $A$ comes first (in [decode order](./media-sinks#decode-vs-presentation-order)). If two packets have the same sequence number, then they represent the same media sample.
Sequence numbers have no meaning on their own and only make sense when comparing them to other sequence numbers. If a packet has sequence number $n$, it does not mean that it is the $n$th packet of the track.
Negative sequence numbers mean the packet's ordering is undefined. When creating an `EncodedPacket`, the sequence number defaults to -1.
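For example, an array of packets with defined sequence numbers can be brought into decode order like so:
```ts
// Sort packets into decode order
packets.sort((a, b) => a.sequenceNumber - b.sequenceNumber);
```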
### Cloning packets
Use the `clone` method to create a new packet from an existing packet. While doing so, you can change its timestamp and duration.
```ts
// Creates a clone identical to the original:
packet.clone();
// Creates a clone with the timestamp set to 10 seconds:
packet.clone({ timestamp: 10 });
```
### Metadata-only packets
[`EncodedPacketSink`](./media-sinks#encodedpacketsink) can create *metadata-only* packets:
```ts
await sink.getFirstPacket({ metadataOnly: true });
```
Metadata-only packets contain all the metadata of the full packet, but do not contain any data:
```ts
packet.data; // => Uint8Array([])
```
You can still retrieve the *size* that the data would have:
```ts
packet.byteLength; // => number
```
Given a packet, you can check if it is metadata-only like so:
```ts
packet.isMetadataOnly; // => boolean
```
## `VideoSample`
A video sample represents a single frame of video. It can be created directly from an image source, or be the result of a decoding process. Its API is modeled after [VideoFrame](https://developer.mozilla.org/en-US/docs/Web/API/VideoFrame).
### Creating video samples
Video samples have an image source constructor and a raw constructor.
::: info
The constructor of `VideoSample` is very similar to [`VideoFrame`'s constructor](https://developer.mozilla.org/en-US/docs/Web/API/VideoFrame/VideoFrame), but uses second timestamps instead of microsecond timestamps.
:::
#### Image source constructor
This constructor creates a `VideoSample` from a `CanvasImageSource`:
```ts
import { VideoSample } from 'mediabunny';
// Creates a sample from a canvas element
const sample = new VideoSample(canvas, {
timestamp: 3, // in seconds
duration: 1/24, // in seconds
});
// Creates a sample from an image element, with some added rotation
const sample = new VideoSample(imageElement, {
timestamp: 5, // in seconds
rotation: 90, // in degrees clockwise
});
// Creates a sample from a VideoFrame (timestamp will be copied)
const sample = new VideoSample(videoFrame);
```
#### Raw constructor
This constructor creates a `VideoSample` from raw pixel data given in an `ArrayBuffer`:
```ts
import { VideoSample } from 'mediabunny';
// Creates a sample from pixel data in the RGBX format
const sample = new VideoSample(buffer, {
format: 'RGBX',
codedWidth: 1280,
codedHeight: 720,
timestamp: 0,
});
// Creates a sample from pixel data in the YUV 4:2:0 format
const sample = new VideoSample(buffer, {
format: 'I420',
codedWidth: 1280,
codedHeight: 720,
timestamp: 0,
});
```
See [`VideoPixelFormat`](https://w3c.github.io/webcodecs/#enumdef-videopixelformat) for a list of pixel formats supported by WebCodecs.
### Inspecting video samples
A `VideoSample` has several read-only properties:
```ts
// The internal pixel format in which the frame is stored
videoSample.format; // => VideoPixelFormat | null
// Raw dimensions of the sample
videoSample.codedWidth; // => number
videoSample.codedHeight; // => number
// Transformed display dimensions of the sample (after rotation)
videoSample.displayWidth; // => number
videoSample.displayHeight; // => number
// Rotation of the sample in degrees clockwise. The raw sample should be
// rotated by this amount when it is presented.
videoSample.rotation; // => 0 | 90 | 180 | 270
// Timing information
videoSample.timestamp; // => Presentation timestamp in seconds
videoSample.duration; // => Duration in seconds
videoSample.microsecondTimestamp; // => Presentation timestamp in microseconds
videoSample.microsecondDuration; // => Duration in microseconds
// Color space of the sample
videoSample.colorSpace; // => VideoColorSpace
```
While all of these properties are read-only, you can use the `setTimestamp`, `setDuration` and `setRotation` methods to modify some of the metadata of the video sample.
::: warning
Timestamps can be [negative](#negative-timestamps).
:::
### Using video samples
Video samples provide several ways to access their frame data.
You can convert a video sample to a WebCodecs [`VideoFrame`](https://developer.mozilla.org/en-US/docs/Web/API/VideoFrame) to access additional data or to pass it to a [`VideoEncoder`](https://developer.mozilla.org/en-US/docs/Web/API/VideoEncoder):
```ts
videoSample.toVideoFrame(); // => VideoFrame
```
This method is virtually free if the video sample was constructed using a `VideoFrame`.
::: warning
The `VideoFrame` returned by this method **must** be closed separately from the video sample.
:::
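For example, when encoding a sample with a WebCodecs `VideoEncoder` (assuming an already-configured encoder):
```ts
const videoFrame = videoSample.toVideoFrame();
videoEncoder.encode(videoFrame);

videoFrame.close(); // Close the frame...
videoSample.close(); // ...and, separately, the sample (once it's no longer needed)
```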
***
It's also common to draw video samples to a `