Learn how to use requestVideoFrameCallback() to work more efficiently with videos in the browser.
There's a new Web API on the block, defined in the HTMLVideoElement.requestVideoFrameCallback() specification. The requestVideoFrameCallback() method allows web authors to register a callback that runs in the rendering steps when a new video frame is sent to the compositor. This is intended to allow developers to perform efficient per-video-frame operations, such as video processing and painting to a canvas, video analysis, or synchronization with external audio sources.
Difference from requestAnimationFrame()
Operations like drawing a video frame to a canvas via drawImage() made through this API are synchronized as a best effort with the frame rate of the video playing on screen. Unlike window.requestAnimationFrame(), which usually fires about 60 times per second, requestVideoFrameCallback() is tied to the actual video frame rate, with an important exception:
The effective rate at which callbacks are run is the lesser of the video's rate and the browser's rate. This means a 25fps video playing in a browser that paints at 60Hz would fire callbacks at 25Hz, while a 120fps video in that same 60Hz browser would fire callbacks at only 60Hz.
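To make the difference concrete, here is a minimal sketch that counts how often each API fires during one second of playback (it assumes a playing video element named video on the page):
// Count how often each callback fires during one second of playback.
let rafCount = 0;
let vfcCount = 0;

const countRaf = () => {
  rafCount++;
  window.requestAnimationFrame(countRaf);
};

const countVfc = () => {
  vfcCount++;
  video.requestVideoFrameCallback(countVfc);
};

window.requestAnimationFrame(countRaf);
video.requestVideoFrameCallback(countVfc);

setTimeout(() => {
  // For a 25fps video on a 60Hz display, expect roughly 60 and 25.
  console.log(`rAF: ${rafCount}, rVFC: ${vfcCount}`);
}, 1000);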
What's in a name?
Due to its similarity to window.requestAnimationFrame(), the method was initially proposed as video.requestAnimationFrame(), but I'm happy with the new name, requestVideoFrameCallback(), which was agreed on after a lengthy discussion. Yay, bikeshedding for the win!
Browser support and feature detection
The method is already implemented in Chromium, and Mozilla folks like it. For what it's worth, I have also filed a WebKit bug asking for it. Feature detection of the API works like this:
if ('requestVideoFrameCallback' in HTMLVideoElement.prototype) {
  // The API is supported.
}
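If the API is unavailable, one possible fallback is to emulate it on top of window.requestAnimationFrame(). The sketch below is an approximation, not a full polyfill: it fires at the display rate rather than the video rate and can only populate part of the metadata dictionary:
// Rough fallback for browsers without requestVideoFrameCallback().
// It fires at the display refresh rate, not the video frame rate.
if (!('requestVideoFrameCallback' in HTMLVideoElement.prototype)) {
  HTMLVideoElement.prototype.requestVideoFrameCallback = function (callback) {
    return window.requestAnimationFrame((now) => {
      callback(now, {
        mediaTime: this.currentTime,
        presentationTime: now,
        expectedDisplayTime: now,
        width: this.videoWidth,
        height: this.videoHeight,
      });
    });
  };
}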
Using the requestVideoFrameCallback() method
If you have ever used the requestAnimationFrame() method, you will immediately feel at home with requestVideoFrameCallback(). You register an initial callback once, and then re-register it whenever the callback fires.
const doSomethingWithTheFrame = (now, metadata) => {
console.log(now, metadata);
video.requestVideoFrameCallback(doSomethingWithTheFrame);
};
video.requestVideoFrameCallback(doSomethingWithTheFrame);
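The registration returns a handle, and the specification also defines a companion cancelVideoFrameCallback() method that takes this handle, so a pending callback can be canceled, for example, during teardown:
// Keep the returned handle so the pending callback can be canceled.
const handle = video.requestVideoFrameCallback(doSomethingWithTheFrame);

// Later, for example, when tearing the component down:
video.cancelVideoFrameCallback(handle);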
In the callback, now is a DOMHighResTimeStamp and metadata is a VideoFrameMetadata dictionary with the following properties:
- presentationTime, of type DOMHighResTimeStamp: The time at which the user agent submitted the frame for composition.
- expectedDisplayTime, of type DOMHighResTimeStamp: The time at which the user agent expects the frame to be visible.
- width, of type unsigned long: The width of the video frame, in media pixels.
- height, of type unsigned long: The height of the video frame, in media pixels.
- mediaTime, of type double: The media presentation timestamp (PTS) in seconds of the frame presented (for example, its timestamp on the video.currentTime timeline).
- presentedFrames, of type unsigned long: A count of the number of frames submitted for composition. Allows clients to determine if frames were missed between instances of VideoFrameRequestCallback (see the sketch after this list).
- processingDuration, of type double: The elapsed duration in seconds from the submission of the encoded packet with the same presentation timestamp (PTS) as this frame (for example, the same as the mediaTime) to the decoder until the decoded frame was ready for presentation.
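As an example of how these fields can be used, here is a minimal sketch that uses the presentedFrames counter to detect dropped frames between callbacks (it assumes a playing video element named video):
let lastPresentedFrames = 0;

const checkMissedFrames = (now, metadata) => {
  // presentedFrames increments by one per composited frame, so a gap
  // larger than one means frames were missed between callbacks.
  if (lastPresentedFrames > 0) {
    const missed = metadata.presentedFrames - lastPresentedFrames - 1;
    if (missed > 0) {
      console.warn(`Missed ${missed} frame(s) between callbacks.`);
    }
  }
  lastPresentedFrames = metadata.presentedFrames;
  video.requestVideoFrameCallback(checkMissedFrames);
};

video.requestVideoFrameCallback(checkMissedFrames);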
For WebRTC applications, additional properties may appear (a usage sketch follows the list):
- captureTime, of type DOMHighResTimeStamp: For video frames coming from either a local or remote source, this is the time at which the frame was captured by the camera. For a remote source, the capture time is estimated using clock synchronization and RTCP sender reports to convert RTP timestamps to capture time.
- receiveTime, of type DOMHighResTimeStamp: For video frames coming from a remote source, this is the time at which the encoded frame was received by the platform, that is, the time at which the last packet belonging to this frame was received over the network.
- rtpTimestamp, of type unsigned long: The RTP timestamp associated with this video frame.
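For instance, for a remote WebRTC stream these timestamps can be combined into rough per-frame delay estimates. The sketch below is illustrative; it assumes a playing video element named video that is backed by a remote WebRTC source:
const logWebRtcDelays = (now, metadata) => {
  // captureTime and receiveTime are only present for WebRTC sources.
  if (metadata.captureTime !== undefined && metadata.receiveTime !== undefined) {
    const networkDelay = metadata.receiveTime - metadata.captureTime;
    const receiveToDisplay = metadata.expectedDisplayTime - metadata.receiveTime;
    console.log(
      `network: ${networkDelay.toFixed(1)}ms, ` +
      `receive-to-display: ${receiveToDisplay.toFixed(1)}ms`);
  }
  video.requestVideoFrameCallback(logWebRtcDelays);
};

video.requestVideoFrameCallback(logWebRtcDelays);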
Note that width and height may differ from videoWidth and videoHeight in certain cases (for example, an anamorphic video may have rectangular pixels).
Of special interest in this list is mediaTime. In Chromium's implementation, we use the audio clock as the time source backing video.currentTime, whereas the mediaTime is directly populated with the presentationTimestamp of the frame. The mediaTime is what you should use if you want to identify frames exactly in a reproducible way, including to identify exactly which frames you missed.
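For example, for a video with a known, constant frame rate, you can derive a frame number from the mediaTime. This is a sketch under that assumption; the 25fps value is purely illustrative:
// Derive a frame number from mediaTime, assuming a constant, known
// frame rate. The 25fps value here is an illustrative assumption.
const FPS = 25;

const identifyFrame = (now, metadata) => {
  const frameNumber = Math.round(metadata.mediaTime * FPS);
  console.log(`Showing frame #${frameNumber}`);
  video.requestVideoFrameCallback(identifyFrame);
};

video.requestVideoFrameCallback(identifyFrame);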
Unfortunately, the video element does not guarantee frame-accurate seeking. This has been an ongoing subject of discussion. WebCodecs will eventually allow for frame-accurate applications.
If things seem one frame off…
Vertical sync (or just vsync) is a graphics technology that synchronizes the frame rate of a video and the refresh rate of a monitor. Since requestVideoFrameCallback() runs on the main thread, but, under the hood, video compositing happens on the compositor thread, everything from this API is a best effort, and we do not offer any strict guarantees. What may be happening is that the API can be one vsync late relative to when a video frame is rendered. It takes one vsync for changes made to the web page through the API to appear on screen (the same as for window.requestAnimationFrame()). So if you keep updating the mediaTime or frame number on your web page and compare that against the numbered video frames, eventually the video will look like it is one frame ahead.
What is really happening is that the frame is ready at vsync x, the callback is fired and the frame is rendered at vsync x+1, and changes made in the callback are rendered at vsync x+2. You can check whether the callback is one vsync late (and the frame is already rendered on screen) by checking whether metadata.expectedDisplayTime is roughly now or one vsync in the future. If it is within five to ten microseconds of now, the frame has already been rendered; if the expectedDisplayTime is approximately sixteen milliseconds in the future (assuming your browser/screen refreshes at 60Hz), then you are in sync with the frame.
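Translated into code, that check could look like the following sketch (the tolerances follow the rough numbers above and may need tuning for your setup):
const checkVsync = (now, metadata) => {
  // Both timestamps are in milliseconds, so 0.01 is ~10 microseconds.
  const delta = metadata.expectedDisplayTime - now;
  if (delta < 0.01) {
    // The frame has already been rendered; the callback is one vsync late.
    console.log('One vsync late; frame already on screen.');
  } else {
    // Roughly one vsync (~16ms at 60Hz) ahead: in sync with the frame.
    console.log(`Frame will display in ~${delta.toFixed(2)}ms.`);
  }
  video.requestVideoFrameCallback(checkVsync);
};

video.requestVideoFrameCallback(checkVsync);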
Demo
I have created a small Glitch demo that shows how frames are drawn on a canvas at exactly the frame rate of the video and where the frame metadata is logged for debugging purposes. The core logic is just a couple of lines of JavaScript.
let paintCount = 0;
let startTime = 0.0;

const updateCanvas = (now, metadata) => {
if (startTime === 0.0) {
startTime = now;
}
ctx.drawImage(video, 0, 0, canvas.width, canvas.height);
const elapsed = (now - startTime) / 1000.0;
const fps = (++paintCount / elapsed).toFixed(3);
fpsInfo.innerText = `video fps: ${fps}`;
metadataInfo.innerText = JSON.stringify(metadata, null, 2);
video.requestVideoFrameCallback(updateCanvas);
};
video.requestVideoFrameCallback(updateCanvas);
Conclusions
I have done frame-level processing for a long time without having access to the actual frames, based only on video.currentTime. I implemented video shot segmentation in JavaScript in a rough-and-ready manner; you can still read the accompanying research paper. Had requestVideoFrameCallback() existed back then, my life would have been much simpler…
Acknowledgements
The requestVideoFrameCallback API was specified and implemented by Thomas Guilbert. This article was reviewed by Joe Medley and Kayce Basques. Hero image by Denise Jans on Unsplash.