Let's light a torch and explore MediaStreamTrack's capabilities

Finally, after months of changing APIs, sparse documentation, and insufficient examples, exciting new features are arriving in today's release of Chrome (59). What makes this version so special compared to the countless versions before?

MediaStream Image Capture

The final release of Chrome 59 ships with MediaStream Image Capture capabilities, and you are no longer required to enable the Experimental Web Platform Features flag (through chrome://flags) for them to work.

This new specification builds on the functionality already present in Media Capture and Streams and extends it with so-called capabilities. The idea is to provide a universal interface for controlling a video device, either during streaming or for still shots. Depending on what your device supports, you can now control things like exposure, focus, zoom, and even the torch. Hence, no more excuses (e.g. "we must go native") when you want to build a fully-fledged photo app directly in the web browser.

Getting Started

Instead of paragraph-long explanations of how this all works, I would like to dive directly into the code and shed some light on the bits and pieces. In case you want to code along, I recommend connecting your beloved Android phone to your Developer Tools (chrome://inspect/#devices) and coding directly in the console. First, make sure both your host and your phone have Chrome version >= 59 installed (beta is fine too). Additionally, serve a minimal index.html over HTTPS so you have privileged access to these features (or make use of the linked codepen collection).
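Before loading anything, you can quickly verify in the console that the secure-context requirement is met. A minimal sanity check (nothing Image-Capture-specific yet):

// getUserMedia is only exposed in secure contexts (https or localhost),
// so this tells you whether your setup is correct.
if (navigator.mediaDevices && navigator.mediaDevices.getUserMedia) {
  console.log('getUserMedia is available, we are good to go');
} else {
  console.warn('getUserMedia is not available. Are you serving over https?');
}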

This html snippet is the necessary boilerplate for all the JavaScript code that follows.

<html>
  <body>
    <video autoplay></video>
  </body>
</html>

Get a Camera Stream

We will start off by acquiring a stream from the device's camera facing away from the user. Why? Because typically the back-facing cameras are much more capable than the user-facing ones. You can also try to change facingMode to 'user' but many of the features we will explore won't work there.

navigator.mediaDevices.getUserMedia({  
  video: {
    facingMode: 'environment',
  }
})
.then((stream) => {
  const video = document.querySelector('video');
  video.srcObject = stream;
})
.catch(err => console.error('getUserMedia() failed: ', err));

(https://codepen.io/serratus/pen/PjwJQp)

After an initial permission request, you should now see your camera's stream in the browser.

Querying Capabilities

Before we can even start changing the settings of our camera, we need to ask for its capabilities first. As mentioned before, not every camera exposes the same functionality, which is why we have to make sure to use only what is actually supported. To do that, we kindly ask our MediaStreamTrack instance by calling getCapabilities(), which returns an object representing its functionality.

During my experiments I noticed that getCapabilities might return an empty object when called too early, even though it shouldn't. What counts as too early? Should we wait a few hundred milliseconds and then try again? But what if the track does not expose any capabilities at all (e.g. my MacBook)? In that case the call always returns an empty object, and waiting for something that only eventually happens is never a good idea. The official specs on getCapabilities don't mention anything about this. What could be a viable solution?
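For reference, a bounded poll would at least avoid waiting forever. Here is a rough sketch of that idea (waitForCapabilities is a hypothetical helper of my own, and the interval/timeout values are arbitrary); the following sections cover what I actually tried:

// Hypothetical helper: poll getCapabilities() until it returns a non-empty
// object, or give up after a hard timeout.
function waitForCapabilities(track, interval = 100, timeout = 2000) {
  return new Promise((resolve) => {
    const started = Date.now();
    const poll = () => {
      const capabilities = track.getCapabilities();
      if (Object.keys(capabilities).length > 0 ||
          Date.now() - started >= timeout) {
        // may still be empty if the track exposes no capabilities at all
        resolve(capabilities);
      } else {
        window.setTimeout(poll, interval);
      }
    };
    poll();
  });
}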

Events

My initial thought: there must be an event that is triggered whenever this data is ready to be read. After attaching countless event handlers to the <video> element, including loadedmetadata, progress, canplay (and many more: Media Events), I realized that most of the events still fire too early. Events like progress and canplay worked most of the time, but the behavior differs across devices. So no luck there.

Wait and see

Back at the drawing board, I came up with a hacky but seemingly working solution: waiting for the loadedmetadata event of the <video> element, plus an additional delay of 500 ms, did the trick. It might not be an ideal solution, but it's what worked on all of my tested devices (Nexus 5, Moto G 2014, Galaxy S6):

// get the active track of the stream
const track = stream.getVideoTracks()[0];

video.addEventListener('loadedmetadata', (e) => {  
  window.setTimeout(() => (
    onCapabilitiesReady(track.getCapabilities())
  ), 500);
});

function onCapabilitiesReady(capabilities) {  
  console.log(capabilities);
}

(https://codepen.io/serratus/pen/NgPaoY)

Now, if your device is capable enough, you should see an object printed to the console:

(Screenshot: the capabilities object as printed by getCapabilities())

This object describes the capabilities of the currently active track, and each of its keys can be changed according to its definition. Some of the properties are Arrays, such as exposureMode, while others are instances of MediaSettingsRange.
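To make that concrete, here is an illustrative sketch of such an object (the keys and values are made up; what you actually get depends entirely on your device):

// Illustrative shape only, actual keys and values vary per device
const capabilities = {
  exposureMode: ['continuous', 'manual'],  // Array of supported modes
  zoom: {min: 1, max: 8, step: 0.5},       // MediaSettingsRange: min/max/step
  torch: true                              // boolean: the device has a torch
};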

Changing Settings

Now that we know what adjustments we can make to our stream, let's change them already!

The aforementioned capabilities are nothing more than additional constraints on our MediaStreamTrack, which we usually define when initially requesting a stream via getUserMedia. Applying constraints to an already live track is tricky, especially when it involves changing the dimensions, frame rate, or other physical properties of the stream. However, MediaStreamTrack now exposes a new method, applyConstraints, that does exactly that, restricted to constraints that DO NOT change the physical properties of the stream. An overview of supported constraints can be found in the Chromium source code, where the parameters are checked within ImageCapture::HasNonImageCaptureConstraints.

That said, all of the keys that getCapabilities() returns can be changed with applyConstraints(), though they must be wrapped inside the advanced constraint. Let's see what this might look like:

// called when our hacky check above finds capabilities
function onCapabilitiesReady(capabilities) {  
  if (capabilities.zoom) {
    track.applyConstraints({
      advanced: [{zoom: capabilities.zoom.max}]
    })
    .catch(e => console.log(e));
  }
}

(https://codepen.io/serratus/pen/zzxaOL)

We wait until our capabilities are valid and then apply a zoom set to capabilities.zoom.max. Notice the .catch? MediaStreamTrack::applyConstraints returns a Promise that resolves once all the settings have been applied. So watch out for potential unhandled rejections, and debounce high-frequency calls.
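If you wire the zoom up to a slider, for example, a tiny debounce keeps you from flooding the track with calls. A minimal sketch (setZoomDebounced and the 100 ms delay are my own choices):

// Sketch: debounce zoom updates (e.g. from an <input type="range">) so that
// applyConstraints is not called for every single slider movement.
let zoomTimer;
function setZoomDebounced(track, zoom, delay = 100) {
  window.clearTimeout(zoomTimer);
  zoomTimer = window.setTimeout(() => {
    track.applyConstraints({advanced: [{zoom: zoom}]})
      .catch(e => console.log(e));
  }, delay);
}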

Let there be light

Changing the zoom is fun, but let's explore the darkness with some light. Turning on and off the torch is one of the coolest features, especially when dealing with image-recognition/processing tasks that require good lighting. A good use-case would be in QuaggaJS where detecting and decoding barcodes is very difficult in low-light conditions. Let's get to it:

function onCapabilitiesReady(capabilities) {  
  if (capabilities.torch) {
    track.applyConstraints({
      advanced: [{torch: true}]
    })
    .catch(e => console.log(e));
  }
}

(https://codepen.io/serratus/pen/LLErbK)

Notice the difference from the previous code snippet? We've only changed the constraint from zoom to torch and set its value to true. That's it.
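If you want to flip the light from a button, a small toggle helper is enough. A sketch (toggleTorch is my own name, and we remember the state ourselves; you could also read it back via getSettings, covered below):

// Sketch: toggle the torch on and off, keeping track of the state ourselves
let torchOn = false;
function toggleTorch(track) {
  torchOn = !torchOn;
  return track.applyConstraints({advanced: [{torch: torchOn}]})
    .catch(e => console.log(e));
}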

However, this might not work on your device, even though it exposes the torch capability. While testing, it worked on a Nexus 5, as well as on a Samsung Galaxy S6, but my Moto G 2014 refused to comply. I haven't found the cause for this problem yet.

What's the current value of each capability?

Initially, when an app boots up, you probably want to know the current state of the track. This is where MediaStreamTrack.getSettings() comes in handy: it returns an object that is similar in shape to what getCapabilities returns.

function onCapabilitiesReady(capabilities) {  
  console.log(track.getSettings());
}

(https://codepen.io/serratus/pen/QgwBwV)

Usually, the return value of getSettings contains more properties, including physical data such as height and width.
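For illustration, a typical result might look something like this (all values made up; the exact set of properties varies by device and browser):

// Illustrative output of track.getSettings(), actual values vary per device
{
  deviceId: "…",
  facingMode: "environment",
  width: 1280,
  height: 720,
  frameRate: 30,
  zoom: 1,
  torch: false
}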

What has changed?

In case your application does not keep track of the changes made to the MediaStreamTrack, how would you know what has changed? MediaStreamTrack::getConstraints() comes to the rescue: this method returns all the changes that have been made via applyConstraints.

function onCapabilitiesReady(capabilities) {
  if (capabilities.zoom) {
    track.applyConstraints({
      advanced: [{zoom: capabilities.zoom.max}]
    })
    .then(() => {
      console.log(track.getConstraints());
    })
    .catch(e => console.log(e));
  }
}

(https://codepen.io/serratus/pen/owgyQj)

And the result object of getConstraints now contains an advanced property, which is an array with one single entry. That entry is exactly what we set previously:
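Assuming the maximum zoom was 8, as in the made-up capabilities sketch earlier, the logged object would look like this:

// Illustrative console output; zoom is whatever capabilities.zoom.max was
{advanced: [{zoom: 8}]}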

What's next?

All the features and experiments in this post were conducted using Chrome 59 for Android, but what about the other platforms? The people working on the MediaStream Image Capture specification also maintain a browser/feature matrix where they keep track of the Implementation Status. As of now, Chrome for Android is on par with the Linux/Chromium implementation, followed by the respective versions for Windows and Mac. Other browser vendors are lagging further behind. Once again, Safari does not even consider supporting any of these features. Who knows how long we have to wait until WebRTC finally lands on an iOS device (the corresponding WebKit Feature Status hasn't changed much).

However, I'm still a strong believer that the web platform is the future, and that everyone will be able to code their own camera apps. Enjoy :-)

Codepens