Recently, we built NeetoRecord, a Loom alternative. The desktop application was built using Electron. In this series of blogs, we capture how we built the desktop application and the challenges we ran into. This blog is part 3 of the series. You can also read part 1, part 2, part 4, part 5, part 6, part 7, part 8, and part 9.
Modern tools like Zoom and Google Meet allow us to blur or completely replace our background in real-time video, creating a polished and distraction-free environment regardless of where we are.
This is possible because of advancements in machine learning. In this blog, we'll explore how to achieve real-time background blurring and replacement using TensorFlow's body segmentation capabilities.
TensorFlow body segmentation is a computer vision technique that involves dividing an image into distinct regions corresponding to different parts of a human body. It typically employs deep learning models, such as convolutional neural networks (CNNs), to analyze an image and predict pixel-level labels. These labels indicate whether each pixel belongs to a specific body part, like the head, torso, arms, or legs.
The segmentation process often starts with a pre-trained model, which has been trained on large datasets. The model processes the input image through multiple layers of convolutions and pooling, gradually refining the segmentation map. The final output is a precise mask that outlines each body part, allowing for applications in areas like augmented reality, fitness tracking, and virtual try-ons.
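To make the idea of a pixel-level label map concrete, here is a tiny, purely illustrative sketch. The array values and helper below are made up for explanation and are not part of any TensorFlow API.
// A hypothetical 3x3 label map: 1 marks a "person" pixel, 0 marks background.
const labelMap = [
  [0, 1, 0],
  [1, 1, 1],
  [0, 1, 0],
];

// A renderer could use such a map to decide, per pixel, whether to blur it
// (background) or keep it sharp (person).
const isPersonPixel = (x, y) => labelMap[y][x] === 1;

console.log(isPersonPixel(1, 1)); // true  -> keep sharp
console.log(isPersonPixel(0, 0)); // false -> blur or replace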
To learn more about TensorFlow and body segmentation, check out the resources below.
We'll create a simple React app that streams video from the webcam.
import React, { useRef, useEffect } from "react";
const App = () => {
const videoRef = useRef(null);
useEffect(() => {
const getVideo = async () => {
try {
const stream = await navigator.mediaDevices.getUserMedia({
video: true,
});
if (videoRef.current) {
videoRef.current.srcObject = stream;
}
} catch (err) {
console.error("Error accessing webcam: ", err);
}
}
getVideo();
return () => {
if (videoRef.current && videoRef.current.srcObject) {
videoRef.current.srcObject.getTracks().forEach(track => track.stop());
}
};
}, []);
return (
<div>
<video ref={videoRef} autoPlay width="640" height="480" style={transform: 'scaleX(-1)'}/>
</div>
);
}
export default App;
In the code above, we render a <video>
element, and once the app is mounted,
we obtain the video stream from the user's webcam using
navigator.mediaDevices.getUserMedia
. This call will prompt the user to grant
permission to access their camera. Once the user grants permission, the video
stream is captured and rendered in the <video>
element.
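As a side note, getUserMedia also accepts constraints, so we can hint the browser toward a resolution that matches the canvas we'll draw on later. A minimal sketch of the same call with constraints (the browser may still deliver a slightly different resolution depending on the device):
// Hint the browser toward frames that match the 640x480 canvas; the actual
// resolution it delivers may still differ per device.
const stream = await navigator.mediaDevices.getUserMedia({
  video: { width: { ideal: 640 }, height: { ideal: 480 } },
});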
Next, let's add the necessary TensorFlow packages.
yarn add @tensorflow/tfjs-core @tensorflow/tfjs-converter @tensorflow-models/body-segmentation @mediapipe/selfie_segmentation
@tensorflow/tfjs-core is the core JavaScript package for TensorFlow, @tensorflow/tfjs-converter lets TensorFlow.js load models converted from TensorFlow, @tensorflow-models/body-segmentation contains all the functions we need for body segmentation, and @mediapipe/selfie_segmentation is our pre-trained model.
The TensorFlow body segmentation package provides a pre-trained
MediaPipeSelfieSegmentation
model for segmenting the human body in images and
videos. This model is specifically designed for the upper body. If our
requirement involves the entire body, we may want to consider other models like
BodyPix.
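For reference, the same @tensorflow-models/body-segmentation package also exposes BodyPix through a similar createSegmenter call. The sketch below is only a rough illustration; the exact config options (architecture, outputStride, multiplier, quantBytes) and the need for a TF.js backend such as @tensorflow/tfjs-backend-webgl should be verified against the package docs.
import * as bodySegmentation from "@tensorflow-models/body-segmentation";
// Assumption: BodyPix runs on the tfjs runtime and therefore needs a backend
// package such as @tensorflow/tfjs-backend-webgl installed and imported.

const createBodyPixSegmenter = async () => {
  const model = bodySegmentation.SupportedModels.BodyPix;
  // Illustrative values; tune them per the BodyPix documentation.
  const segmenterConfig = {
    architecture: "MobileNetV1",
    outputStride: 16,
    multiplier: 0.75,
    quantBytes: 2,
  };
  return bodySegmentation.createSegmenter(model, segmenterConfig);
};
For this blog, though, we'll stick with MediaPipeSelfieSegmentation.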
We need to load the MediaPipeSelfieSegmentation model to create a segmenter:
import * as bodySegmentation from "@tensorflow-models/body-segmentation";

const createSegmenter = async () => {
  const model = bodySegmentation.SupportedModels.MediaPipeSelfieSegmentation;
  const segmenterConfig = {
    runtime: "mediapipe",
    solutionPath: "https://cdn.jsdelivr.net/npm/@mediapipe/selfie_segmentation",
    modelType: "general",
  };
  return bodySegmentation.createSegmenter(model, segmenterConfig);
};
We load the model from a CDN, configure the runtime as mediapipe
, and set the
modelType to general
. Then, we create the segmenter
using the
bodySegmentation.createSegmenter
method.
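As an aside, the MediaPipe selfie segmentation solution also ships a lighter landscape variant. If the docs for your package version confirm it, switching is just a config change; the snippet below is an assumption to verify, not something the blog's code relies on.
// Assumption: "landscape" is an accepted modelType in this package version;
// verify in the body-segmentation docs. It trades some accuracy for speed.
const fasterConfig = {
  runtime: "mediapipe",
  solutionPath: "https://cdn.jsdelivr.net/npm/@mediapipe/selfie_segmentation",
  modelType: "landscape",
};
With that noted, let's wrap the segmenter creation in a small videoBackground.js module.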
// ./videoBackground.js
import * as bodySegmentation from "@tensorflow-models/body-segmentation";

const createSegmenter = async () => {
  const model = bodySegmentation.SupportedModels.MediaPipeSelfieSegmentation;
  const segmenterConfig = {
    runtime: "mediapipe",
    solutionPath: "https://cdn.jsdelivr.net/npm/@mediapipe/selfie_segmentation",
    modelType: "general",
  };
  return bodySegmentation.createSegmenter(model, segmenterConfig);
};

class VideoBackground {
  #segmenter;

  getSegmenter = async () => {
    if (!this.#segmenter) {
      this.#segmenter = await createSegmenter();
    }
    return this.#segmenter;
  };
}

const videoBackground = new VideoBackground();

export default videoBackground;
Here, we define a VideoBackground
class and create an instance of it. Inside
the class, the getSegmenter
function ensures that the segmenter
is created
only once, so we don't have to recreate it each time.
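For example, repeated calls resolve to the same cached instance, so the model is downloaded and initialized only on the first call:
// Both calls resolve to the same cached segmenter instance.
const first = await videoBackground.getSegmenter();
const second = await videoBackground.getSegmenter();
console.log(first === second); // true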
Before we continue further, let's update our demo app. Since we are going to modify the video, we need a <canvas/> to display the modified output. Let's add that to the app.
// rest of the code...
const App = () => {
  const canvasRef = useRef();
  // rest of the code...

  return (
    <div>
      <video
        ref={videoRef}
        autoPlay
        width="640"
        height="480"
        style={{ display: "none" }}
      />
      <canvas
        ref={canvasRef}
        width="640"
        height="480"
        style={{ transform: "scaleX(-1)" }}
      />
    </div>
  );
};
Also, hide the <video>
element by setting display: "none"
since we don't
want to display the raw video.
Next, create a function within the VideoBackground
class to blur the video.
// rest of the code...
class VideoBackground {
  // rest of the code...
  #animationId;

  stop = () => {
    cancelAnimationFrame(this.#animationId);
  };

  blur = async (canvas, video) => {
    const foregroundThreshold = 0.5;
    const edgeBlurAmount = 15;
    const flipHorizontal = false;
    const blurAmount = 5;
    const segmenter = await this.getSegmenter();

    const processFrame = async () => {
      const segmentation = await segmenter.segmentPeople(video);
      await bodySegmentation.drawBokehEffect(
        canvas,
        video,
        segmentation,
        foregroundThreshold,
        blurAmount,
        edgeBlurAmount,
        flipHorizontal
      );
      this.#animationId = requestAnimationFrame(processFrame);
    };

    this.#animationId = requestAnimationFrame(processFrame);
  };
}
The blur
function takes video
and canvas
references. It uses
requestAnimationFrame
to continuously draw the resulting image onto the
canvas
. First, it creates a body segmentation using the
segmenter.segmentPeople
function by passing the video reference. This allows
us to identify which pixels belong to the background and foreground.
To achieve the blurred effect, we use the bodySegmentation.drawBokehEffect
function, which applies a blur to the background pixels. This function accepts
additional configurations like foregroundThreshold
, blurAmount
, and
edgeBlurAmount
, which we can adjust to customize the effect.
We've also added a stop
function to halt video processing by canceling the
recursive requestAnimationFrame
calls.
import React, { useRef, useEffect, useState } from "react";
function App() {
const [cameraReady, setCameraReady] = useState(false);
// rest of the code...
<video
// rest of the code...
onLoadedMetadata={() => setCameraReady(true)}
/>;
// rest of the code...
}
Before calling the blur
function, ensure the video is loaded by waiting for
the onLoadedMetadata
event to be triggered.
All set; let's blur the video background.
import React, { useRef, useEffect, useState } from "react";
import videoBackground from "./videoBackground";
function App() {
const [cameraReady, setCameraReady] = useState(false);
const videoRef = useRef(null);
const canvasRef = useRef();
useEffect(() => {
async function getVideo() {
try {
const stream = await navigator.mediaDevices.getUserMedia({
video: true,
});
if (videoRef.current) {
videoRef.current.srcObject = stream;
}
} catch (err) {
console.error("Error accessing webcam: ", err);
}
}
getVideo();
return () => {
if (videoRef.current && videoRef.current.srcObject) {
videoRef.current.srcObject.getTracks().forEach(track => track.stop());
}
};
}, []);
useEffect(() => {
if (!cameraReady) return;
videoBackground.blur(canvasRef.current, videoRef.current);
return () => {
videoBackground.stop();
};
}, [cameraReady]);
return (
<div className="App">
<video
ref={videoRef}
autoPlay
width="640"
height="480"
style={{ display: "none" }}
onLoadedMetadata={() => setCameraReady(true)}
/>
<canvas ref={canvasRef} width="640" height="480" />
</div>
);
}
export default App;
Here, we added another useEffect
that triggers when cameraReady
is true
.
Inside this useEffect
, we call the videoBackground.blur
function, passing
the canvas
and video
refs. When the component unmounts, we stop the video
processing by calling the videoBackground.stop()
function.
If we feel that just blurring is not enough and want to completely replace the
background, we need to remove the background from the video and place an
<img/>
behind the <canvas/>
. To remove the background, we can utilize the
bodySegmentation.toBinaryMask
function. This function returns an ImageData whose alpha channel is 255 for background pixels and 0 for foreground pixels. We can use this information to locate the background pixels in the original frame data and set their alpha to 0, making them fully transparent.
// rest of the code...
class VideoBackground {
  // rest of the code...
  remove = async (canvas, video) => {
    const context = canvas.getContext("2d");
    const segmenter = await this.getSegmenter();

    const processFrame = async () => {
      context.drawImage(video, 0, 0);
      const segmentation = await segmenter.segmentPeople(video);
      const binaryMask = await bodySegmentation.toBinaryMask(segmentation);
      const imageData = context.getImageData(
        0,
        0,
        video.videoWidth,
        video.videoHeight
      );
      // imageData format: [R, G, B, A, R, G, B, A, ...]
      // The loop below visits only the alpha channel of each pixel.
      for (let i = 3; i < imageData.data.length; i += 4) {
        // In the binary mask, background pixels have an alpha of 255.
        if (binaryMask.data[i] === 255) {
          imageData.data[i] = 0; // Make the background pixel fully transparent.
        }
      }
      await bodySegmentation.drawMask(canvas, imageData);
      this.#animationId = requestAnimationFrame(processFrame);
    };

    this.#animationId = requestAnimationFrame(processFrame);
  };
}
Similar to the blurring process, inside processFrame
, we first create the
segmentation using segmenter.segmentPeople
and convert it to a binary mask
using bodySegmentation.toBinaryMask
. We then obtain the original image data
with context.getImageData
. Next, we loop through the image data to make the
background pixels transparent. Finally, we draw the result on the canvas using
bodySegmentation.drawMask
.
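As an aside, since we have already edited imageData directly, the standard Canvas 2D API could paint it back as well. This is just an alternative sketch, not what the code above uses:
// Alternative inside processFrame: draw the edited pixels straight onto the
// canvas with the Canvas 2D API instead of bodySegmentation.drawMask.
context.putImageData(imageData, 0, 0);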
Before calling this function, let's modify our demo app. Rather than replacing the blur feature, we'll add an option to switch between none, blur, and image effects. We'll also include a background image.
const BACKGROUND_OPTIONS = ["none", "blur", "image"];

function App() {
  const [backgroundType, setBackgroundType] = useState(BACKGROUND_OPTIONS[0]);
  // rest of the code...

  return (
    <div>
      {/* rest of the code... */}
      {backgroundType === "image" && (
        <img
          style={{
            position: "absolute",
            top: 0,
            bottom: 0,
            width: "640px",
            height: "480px",
          }}
          src="/bgImage.png"
          alt="Virtual background"
        />
      )}
      {/* rest of the code... */}
      <div>
        <select
          value={backgroundType}
          onChange={e => setBackgroundType(e.target.value)}
        >
          {BACKGROUND_OPTIONS.map(option => (
            <option value={option} key={option}>
              {option}
            </option>
          ))}
        </select>
      </div>
    </div>
  );
}
Here, we added a <select>
element to choose between none
, blur
, and
image
, and an <img>
element to display the background image, which will
serve as our virtual background.
All set. Now, let's update the useEffect
.
useEffect(() => {
  if (!cameraReady || backgroundType === "none") return;
  const bgFn =
    backgroundType === "blur" ? videoBackground.blur : videoBackground.remove;
  bgFn(canvasRef.current, videoRef.current);

  return () => {
    videoBackground.stop();
  };
}, [cameraReady, backgroundType]);
Based on the selection, we will call either videoBackground.blur
or
videoBackground.remove
.
The full working example can be found in this GitHub repo.
If this blog was helpful, check out our full blog archive.