

WebRTC used to be about capturing some media and sending it from Point A to Point B. Now it is common to use ML to analyze and manipulate media in real time for things like virtual backgrounds, augmented reality, noise suppression, intelligent cropping, and much more. To better accommodate this growing trend, the web platform has been exposing more of its internals to give developers lower-level access. The result is not only more control within existing APIs, but also a bunch of new APIs like Insertable Streams, WebCodecs, Streams, WebGPU, and WebNN. So how do all these new APIs work together? That is exactly what W3C specialists François Daoust and Dominique Hazaël-Massieux (Dom) decided to find out. In case you forgot, the W3C is the World Wide Web Consortium that standardizes the Web. François and Dom are long-time standards guys with a deep history of helping to make the web what it is today.

This is the first of a two-part series of articles that explores the future of real-time video processing with WebCodecs and Streams. This first part reviews the steps and pitfalls of building a multi-step video processing pipeline out of Workers, TransformStream, and VideoFrame, using both existing and the newest web APIs, and looks at the role of WebCodecs in client-side processing: do we need WebCodecs at all, and what about the Canvas? Part two will explore the actual processing of video frames. I am thrilled about the depth and insights these guides provide on these cutting-edge approaches – enjoy!

Note: these APIs are new and may not work in your browser.

The capture of raw audio and video streams from microphones and cameras relies on getUserMedia. In simple WebRTC video conferencing scenarios, audio and video streams captured on one device are sent to another device, possibly going through some intermediary server.
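As a quick refresher, a minimal capture sketch looks like this. The constraints and the "preview" element id are illustrative choices for this example, not requirements of the API:

```ts
// Ask the browser for camera and microphone access. The exact
// constraints (resolution, whether to capture audio) are up to the
// application.
const stream = await navigator.mediaDevices.getUserMedia({
  audio: true,
  video: { width: 1280, height: 720 },
});

// Preview the capture in a <video> element. The "preview" id is a
// hypothetical element on our page.
const video = document.getElementById("preview") as HTMLVideoElement;
video.srcObject = stream;
await video.play();
```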

Raw media streams then need to be encoded for transport and sent over to the receiving side. Received streams must be decoded before they can be rendered. The resulting video pipeline is illustrated below. Web applications do not see these separate encode/send and receive/decode steps in practice – they are entangled in the core WebRTC API and under the control of the browser.
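Here is a rough sketch of what that entanglement looks like from the application's point of view: the app hands tracks to an RTCPeerConnection and gets ready-to-render tracks back, while encoding, packetization, and decoding all happen inside the browser. Signaling is omitted, and the "remote" element id is a hypothetical placeholder:

```ts
const pc = new RTCPeerConnection();

// Send side: handing tracks to the connection is all the app does;
// encoding and packetization happen inside the browser.
const local = await navigator.mediaDevices.getUserMedia({ video: true });
for (const track of local.getTracks()) {
  pc.addTrack(track, local);
}

// Receive side: by the time ontrack fires, the browser has already
// received and decoded the media into a renderable track.
pc.ontrack = (event) => {
  const remote = document.getElementById("remote") as HTMLVideoElement;
  remote.srcObject = event.streams[0];
};

// Offer/answer signaling between the two peers is omitted here.
```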

If you want to add the ability to do something like remove users’ backgrounds, the most scalable and privacy-respecting option is to do it client-side, before the video stream is sent to the network. This operation needs access to the raw pixels of the video stream. Said differently, it needs to take place between the capture and encode steps. Similarly, on the receiving side, you may want to give users options like adjusting colors and contrast, which also require raw pixel access, this time between the decode and render steps. As illustrated below, this adds an extra processing step to each side of the resulting video pipeline.
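One way to get at those raw pixels today is the Insertable Streams approach: MediaStreamTrackProcessor exposes a captured track as a ReadableStream of VideoFrame objects, and MediaStreamTrackGenerator turns processed frames back into an ordinary track. The sketch below assumes Chromium's current pre-standard interfaces (names may still change), declares their shapes by hand because DOM typings may not include them yet, and uses an identity transform as a stand-in for real processing:

```ts
// These interfaces are Chromium-only at the time of writing and may be
// absent from TypeScript's DOM typings, so we declare the minimal
// shapes used below (an assumption, not an official definition).
declare class MediaStreamTrackProcessor {
  constructor(init: { track: MediaStreamTrack });
  readonly readable: ReadableStream<VideoFrame>;
}
declare class MediaStreamTrackGenerator extends MediaStreamTrack {
  constructor(init: { kind: "audio" | "video" });
  readonly writable: WritableStream<VideoFrame>;
}

const camera = await navigator.mediaDevices.getUserMedia({ video: true });
const [track] = camera.getVideoTracks();

// Expose captured frames as a ReadableStream of VideoFrame objects.
const processor = new MediaStreamTrackProcessor({ track });

// A sink that turns processed frames back into an ordinary video track.
const generator = new MediaStreamTrackGenerator({ kind: "video" });

// Identity transform as a stand-in for real processing: a background
// remover would repaint each frame here before enqueueing it.
const transform = new TransformStream<VideoFrame, VideoFrame>({
  transform(frame, controller) {
    controller.enqueue(frame);
  },
});

// Wire capture -> process -> track. pipeTo resolves only when the
// stream closes, so we intentionally do not await it.
processor.readable.pipeThrough(transform).pipeTo(generator.writable);

// The generator behaves like any MediaStreamTrack: it can be rendered
// locally or passed to RTCPeerConnection.addTrack for encoding.
const processed = new MediaStream([generator]);
```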
