Vladimir Vukićević

I Make Fun Stuff

First Steps for VR on the Web

There has been a lot of excitement around Virtual Reality recently, with good reason. Display devices such as the Oculus Rift and input devices such as the Leap Motion, PrioVR, Sixense Stem and many others are creating a strong ecosystem where a high-quality, consumer-level VR experience can be delivered.

The opportunity for VR on the Web is particularly exciting. The Web is a vibrant, connected universe where many different types of experiences can be created and shared. People can be productive, have fun and learn all from within their browser. It is, arguably, an early version of the Metaverse — the browser is the portal through which we access it. It’s not perfect, though, and lacks many of the “virtual” and “immersive” aspects. Given that, could we not expand the Web to include the immersive elements of a fully three-dimensional virtual universe? Is it possible for the Web to evolve to become the Metaverse that Stephenson envisioned?

Today, we want to take some of the first steps in finding an answer to this question. We are adding native support for VR devices to early experimental builds of Firefox, so that Web developers can start experimenting with adding VR interactivity to their websites and content. This is only the first of many steps that we'll be taking over the coming weeks and months.

The initial technical vision for VR on the Web includes:

  • Rendering Canvas (WebGL or 2D) to VR output devices
  • Rendering 3D Video to VR output devices (as directly as possible)
  • Rendering HTML (DOM+CSS) content to VR output devices – taking advantage of existing CSS features such as 3D transforms
  • Mixing WebGL-rendered 3D Content with DOM rendered 3D-transformed content in a single 3D space
  • Receiving input from orientation and position sensors, with a focus on reducing latency from input/render to final presentation

In particular, Web content should not need to be aware of the particulars of the VR output device, beyond the fact that one is present and has certain standard rendering characteristics (e.g. a specific projection matrix that’s needed). For example, in the case of the Oculus Rift, content should not need to apply the Rift-specific distortion rendering effect.

This initial step has the seed technical functionality needed to support the first type of VR content listed above: receiving sensor input and rendering Canvas/WebGL content to VR. Over the coming weeks, I’ll be expanding the scope and support of VR in the Web platform so that all of the above will be possible. In addition, my colleague at Mozilla, Josh Carpenter, will be approaching the problem from a user experience and design angle: figuring out best practices for bringing VR to Web content, what the experience of browsing a VR Web might feel like, what creating VR Web content could be like, and what it could mean to browse the current Web using a VR interface.

If you’re interested in joining the discussion or providing feedback, please join us on the web-vr-discuss mailing list. As with all technology experimentation, I look forward to seeing what Web developers do with this functionality!


This is an early preview build of work-in-progress code. It is intended for developers looking to experiment with VR on the Web. Windows and OS X builds are available below, with Linux coming soon. Only the Oculus Rift is currently supported, though other devices will come soon (including Google’s Cardboard and similar phone harnesses!).

How to report bugs: Because this code is not yet in the upstream Mozilla repository, please report any VR-specific issues to me via GitHub issues on my gecko-dev repo. Once the code is merged into the main repository, bugs will be tracked using Bugzilla as usual. Expect bugs and crashes, which are part of the fun!

I’ve modified the three.js cubes sample to add support for VR. Press “f” to toggle fullscreen using VR. If no Oculus Rift is plugged in, a DK1 will be simulated, but there will be no orientation input.

The features proposed here are all subject to rapid change; they are not indicative of any standard, and may even change or break from release to release of any upcoming builds. This code is currently not in the main Firefox source tree, though that process has started. The goal of this release is to allow for some initial experimentation and feedback. The API supported in this release allows for:

  • Making an element fullscreen, with VR post-processing. For this release, the element is expected to contain left/right split content.
  • Querying recommended field of view settings per eye, and setting per-eye FOV
  • Querying current sensor state (orientation, position)

Not implemented yet, but next on the list and coming very soon:

  • Synchronization between requestAnimationFrame and sensor timings (i.e. the orientation value that you currently query is instantaneous, and not tied to what it is predicted to be when the frame is rendered)
  • Automatic window positioning on HMD when fullscreen is applied. (You need to move the browser window to the HMD display.)
  • Browser-supported left/right rendering of CSS content.

Technical details on usage:

The mozGetVRDevices entry point will provide a list of all VR devices to a given callback:

navigator.mozGetVRDevices(vrDeviceCallback);

The callback will receive an array of devices; currently each will be either an HMDVRDevice or a PositionSensorVRDevice. You’ll want to find an HMD device, and then find its associated sensor:

var vrHMD, vrSensor;
function vrDeviceCallback(vrdevs) {
  // First, find an HMD -- just use the first one we find
  for (var i = 0; i < vrdevs.length; ++i) {
    if (vrdevs[i] instanceof HMDVRDevice) {
      vrHMD = vrdevs[i];
      break;
    }
  }

  if (!vrHMD)
    return;

  // Then, find that HMD's position sensor
  for (var i = 0; i < vrdevs.length; ++i) {
    if (vrdevs[i] instanceof PositionSensorVRDevice &&
        vrdevs[i].hardwareUnitId == vrHMD.hardwareUnitId)
    {
      vrSensor = vrdevs[i];
      break;
    }
  }

  if (!vrHMD || !vrSensor) {
    alert("Didn't find a HMD and sensor!");
    return;
  }

  startRendering();
}

Once you have a vrHMD, you can query its configuration (initialized to the recommended per-eye FOV values):

leftFOV = vrHMD.getCurrentEyeFieldOfView("left");
rightFOV = vrHMD.getCurrentEyeFieldOfView("right");

leftTranslation = vrHMD.getEyeTranslation("left");
rightTranslation = vrHMD.getEyeTranslation("right");
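
These values can be fed into a stereo camera rig. Here’s a minimal sketch using three.js (as in the cubes demo), not the demo’s exact code: it approximates the asymmetric per-eye FOV with a symmetric vertical FOV, and assumes the FOV objects expose upDegrees/downDegrees and the translations expose x/y/z components.

// Rough stereo camera setup (sketch only).
// Assumes leftFOV/rightFOV have upDegrees/downDegrees and the
// translations have x/y/z components, per the queries above.
var aspect  = (window.innerWidth / 2) / window.innerHeight;
var vertFOV = leftFOV.upDegrees + leftFOV.downDegrees;

var cameraLeft  = new THREE.PerspectiveCamera(vertFOV, aspect, 0.1, 1000);
var cameraRight = new THREE.PerspectiveCamera(vertFOV, aspect, 0.1, 1000);

// Offset each eye camera by its translation (roughly half the IPD each way)
cameraLeft.position.set(leftTranslation.x, leftTranslation.y, leftTranslation.z);
cameraRight.position.set(rightTranslation.x, rightTranslation.y, rightTranslation.z);

A production renderer would instead build proper off-axis projection matrices from the up/down/left/right degrees, but the approximation is enough to start experimenting.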

Sample HTML might look like this:

<div id="container">
  <canvas id="content" width="1920" height="1080"></canvas>
</div>

To request full screen of the container with VR distortion applied, call:

document.addEventListener("mozfullscreenchange", fullScreenChange, false);

var container = document.getElementById("container");
container.mozRequestFullScreen({ vrDisplay: vrHMD });

If vrHMD is null or not present, this will act as a regular full screen request. Once you receive the mozfullscreenchange event, start doing left/right rendering:

function fullScreenChange() {
  if (document.mozFullScreenElement && vrHMD) {
    // reconfigure for two-camera left/right split rendering
    vrEnabled = true;
  } else {
    // reconfigure for non-VR single camera rendering
    vrEnabled = false;
  }
}

When rendering a frame, you can query the current orientation and position from the sensor:

var state = vrSensor.getState();
if (state.orientation) {
  ... use state.orientation.xyzw as a quaternion ...
}
if (state.position) {
  ... use state.position.xyz as position ...
}
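
Putting the pieces together, a per-frame render loop might look roughly like the following. This is a hedged sketch assuming the three.js setup from the earlier snippet (a shared scene, a regular non-VR camera, the two eye cameras, and a WebGLRenderer sized to the fullscreen canvas); the actual cubes demo differs in detail.

function render() {
  requestAnimationFrame(render);

  var state = vrSensor.getState();
  if (state.orientation) {
    // Apply the HMD orientation (a unit quaternion) to both eye cameras
    var q = state.orientation;
    cameraLeft.quaternion.set(q.x, q.y, q.z, q.w);
    cameraRight.quaternion.set(q.x, q.y, q.z, q.w);
  }

  if (vrEnabled) {
    // Left/right split rendering: one viewport per eye
    var w = renderer.domElement.width / 2;
    var h = renderer.domElement.height;
    renderer.autoClear = false;
    renderer.clear();
    renderer.setViewport(0, 0, w, h);
    renderer.render(scene, cameraLeft);
    renderer.setViewport(w, 0, w, h);
    renderer.render(scene, cameraRight);
  } else {
    renderer.render(scene, camera);
  }
}
render();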

In the cubes demo, note that the entire contents of the “container” div are made full screen – it can include content other than just a WebGL canvas. While there is not yet support for the browser itself doing appropriate left/right rendering of CSS 3D transformed content, you are not prevented from experimenting with this!

Two Presentations: PAX Dev 2013 and Advanced JS 2013

I have not blogged in a while, because I’m not a good person. Ahem. I’ve also been behind on posting presentations that I’ve given, so I’m catching up on that now.

The first one is a presentation I gave at PAX Dev 2013, about the nuts and bolts involved in bringing a C++ game (or really any app) to the web using Emscripten. The slides are available here, and walk through the steps involved in going from existing C++ source to getting it running on the web.

The second one is a presentation I gave at Advanced.JS, about how asm.js can be used in traditional JavaScript client-side apps to speed up processing or add new features that would be difficult to rewrite in pure JS. The slides are available here.

Faster Firefox (re-)Build Speeds

One of the common complaints about Firefox development is that build speed is slow, and that the normal “edit, compile, test” cycle is painful. Over time, developers learn tricks (“If I change file X, I have to rebuild in foo/, bar/, and toolkit/library/”), but this is error prone. New developers don’t have any of that knowledge, and they’re already struggling with so many new things that a slow build system significantly hurts their learning.

Ehsan and BenWa had an idea for how to make this better. The core idea was to annotate the relevant build rules to emit specific rules for each target and its inputs, thereby generating an alternate, faster, non-recursive build system as part of the first build. Others have had this idea before, but usually only for testing or gathering some data. BenWa did a first prototype, generating makefile fragments that could be included together. I did some hacking to create a little python script for the fragment generation, which let us start experimenting with generating ninja files instead of using make – for what we were doing, ninja is perfect, since its purpose is to execute “generated build systems”.

BenWa is working on extending it to generate project files for some IDEs. Since we have the full dependency graph, we can do parallel builds with 100% CPU utilization, and can also distribute builds with little overhead.

Here’s the main result that you’re probably wondering about, all on my Windows 7, quad-core 4th Gen Core i7 (8 threads), 7200 RPM hard drive, 16GB memory desktop:

  • Clean tree
    • top level pymake (-s -j8)
      • 490 seconds (8 minutes, 10 seconds)
    • top level ninja
      • 1.46 seconds
  • Touch a deep inner source (gfxGDIShaper.cpp – requires libxul relink)
    • top level pymake (-s -j8)
      • 544 seconds (9 minutes, 4 seconds)
    • magic knowledge pymake (pymake -s -j8 -C gfx/thebes && pymake -s -j8 -C toolkit/library)
      • 66 seconds
    • top level ninja
      • 61 seconds (3 total commands executed)
  • Touch some core headers (nsTArray.h, nsTSubstring.h)
    • top level pymake (-s -j8)
      • 33 minutes, 14 seconds
    • top level ninja
      • 21 minutes, 48 seconds

I was going to make a chart, but it would have needed a logarithmic axis.

These commands of course don’t do the same thing; running make will redo the export step as well as recurse into a bunch of other targets that are never seen by hacky.mk. But the result is the same for most developers, and the time savings are significant. These numbers are also a good target for other, more comprehensive build system work; the results above are bound purely by my compiler and linker speed (and I/O; I need to retry this on an SSD), plus the overhead of wrapper scripts for python, executing bash, and so on. There’s a bit of room for improvement! It would be very difficult to end up with faster builds than the above, given equivalent parallelization.

How to use it

  1. Clone hacky.mk from github: http://github.com/bgirard/hacky.mk. Then run ./hackify ../path/to/mozilla-central. It’ll patch rules.mk and copy some files into place (alternatively, you can patch rules.mk by hand and symlink the hackymake dir from the repo as build/tools/hackymake and js/src/build/tools/hackymake).
  2. Do a normal build. You’ll want to do a full build into a brand new objdir.
  3. While that’s going, clone http://github.com/martine/ninja.git and build it. Copy the binary somewhere in your PATH.
  4. When the Firefox build is done, run $srcdir/tools/hackymake/reninja
  5. To build, run ninja

What it can do:

  • Rebuild objects from C/C++ sources (for JS, libxul, and friends)
  • Rebuild libraries
  • On Windows, copy exported include files into your objdir’s dist/include dir if the original changes (since we can’t use symlinks on Windows). Caveat: not JS yet; JS has its own export rules that aren’t hooked up.

What it can’t do:

Basically, anything not listed above.

It doesn’t do anything outside of (mostly) libxul or JS deps. It doesn’t know anything about any of the external build systems (e.g. nspr).

It doesn’t know how to build binaries (e.g. firefox.exe, the js shell). It also doesn’t know how to regenerate headers from IDL files, or IPC stubs from ipdl files. None of that would be very hard to add; it just wasn’t a top priority.

It can’t handle additions or removals of source files, at least not automatically. If you just run a regular “make” in the directory where the source file was added and you rerun reninja, that should be enough. Removals are harder; manually remove the relevant file from $objdir/.hacky and run reninja.

It can’t handle changes to the build configuration. Build configuration changes are unlikely to ever be handled; they’ll require a rebuild using the regular build system first.

Next steps, how can you help?

  • Use it, tell others about it
  • Provide feedback; Github issues work fine for this
  • We’ll work to get it into the tree
  • Add support for additional rules (binaries, idl/ipdl generators, etc!)

Mobile Graphics Pushing Forward

SIGGRAPH was this week, and while I didn’t attend, I had a few thoughts about some of the announcements that were made. Mozilla is pretty serious about wanting to increase the level of graphics capability accessible to the web on both the desktop and mobile. As some sharp observers may have noticed, we (and by we, I mean largely Guillaume Abadie) at Mozilla have been working quietly on implementing the bulk of OpenGL ES 3.0 capabilities as extensions to WebGL 1 (all disabled behind a pref by default, for now!). “WebGL 2”, when it’s specified, will likely just be the union of all of these extensions, but with everything on by default. A WebGL 2 tied to OpenGL ES 3.0 is a big step towards exposing more modern graphics capabilities, but it’s still behind what’s current on the desktop. What we need is a unified API on both desktop and mobile that can allow access to the maximum capabilities of each.
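
If you’re curious which extensions a particular build actually exposes to content (after flipping the relevant prefs), a quick probe from any page will show the current list:

// List every WebGL extension this build/driver combination exposes.
var canvas = document.createElement("canvas");
var gl = canvas.getContext("webgl") || canvas.getContext("experimental-webgl");
console.log(gl.getSupportedExtensions());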

NVIDIA’s announcement of Project Logan, their next-generation mobile processor, is pretty exciting. By bringing desktop-class GL 4.4 in addition to ES 2 and ES 3 to mobile, I’m hoping that it will spur the mobile industry to push the mobile graphics envelope forward, instead of stagnating for years as it did with ES2. Having the latest version of Android support ES 3 is a good start, but it’s still behind what’s available on the desktop. While having OpenGL 4.4 isn’t immediately going to help WebGL – we still need to deliver a consistent API for the web – the presence of the “desktop” API on a mobile platform blurs the line between the two. The difference between what can be done on a traditional desktop or laptop PC vs a tablet or a smartphone is shrinking rapidly. The difference in graphics capabilities should shrink as well.

Under the hood, Firefox can take advantage of the additional capabilities even if they aren’t exposed to content through WebGL. We currently have many paths to access the various hardware acceleration APIs on different platforms. Being able to access desktop-class GL on a mobile device will enable us to bring some acceleration paths to mobile that should make Firefox really shine on Logan-class mobile hardware. I’m looking forward to getting my hands on a Logan-powered device and running some content through Firefox for Android!

WebRAW: Using Emscripten and asm.js to Process RAW Images

Asm.js is getting lots of visibility because it has successfully brought near-native code performance to the web. This is a huge accomplishment, but I think it will take some time until developers figure out how to realize its potential. Many of the initial high-profile examples have been games, such as BananaBread, or engines, such as Epic’s Unreal Engine 3 powering the Epic Citadel demo. This is with good reason – games demand high performance, low latency, and consistent execution.

But there are other interesting use cases for asm.js and the Emscripten compiler. I do photography as a hobby, and I use Adobe’s Lightroom to manage my photos and to perform color correction and other image-editing tasks. A few years ago, with the introduction of TraceMonkey, the original JavaScript JIT for Mozilla’s SpiderMonkey JS engine, I created a simple “Darkroom” demo. The app allowed loading a JPEG image and performing some image adjustments.

But now, I wanted to see if it was possible to go further – I wanted to work with the original raw files from my camera in the browser. To see what the base performance would be like, I compiled LibRAW using Emscripten. It required few changes (I mainly added some access methods that an approach like embind would do automatically). The initial results were quite good, and directly in line with what we have consistently seen with Emscripten and asm.js performance. These are the times required to unpack and process a 36MP losslessly-compressed RAW file from my D800. The input NEF file is about 40MB in size.

Native:  7,812ms average
asm.js: 12,868ms average (1.64x)

You can play with a demo I put together. For best results, use a recent Firefox Nightly build. I believe the Aurora alpha builds should also work well. Drag and drop one of your own images onto the page. The embedded thumbnail will be extracted and put into the filmstrip at the top, along with some basic information about the image. Clicking on the thumbnail will load and decode the full raw image. The full image display will let you click to zoom and drag to pan around.

The demo’s UI is entirely done in regular HTML. The RAW processing happens in a web worker, where the asm.js-compiled LibRAW is loaded and the processing executed. The results are then sent back via postMessage for display to the user.
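
Roughly, the wiring looks like this; a sketch with made-up file and function names (libraw-worker.js, dropZone, drawDecodedImage), not the demo’s actual code:

// main page: hand the dropped file's bytes to the worker, display the result
var worker = new Worker("libraw-worker.js");   // loads the asm.js LibRAW build

worker.onmessage = function (e) {
  // e.data holds the decoded pixels plus dimensions
  drawDecodedImage(e.data.pixels, e.data.width, e.data.height);
};

// dragover must be cancelled for the drop event to be delivered
dropZone.addEventListener("dragover", function (ev) { ev.preventDefault(); });

dropZone.addEventListener("drop", function (ev) {
  ev.preventDefault();
  var reader = new FileReader();
  reader.onload = function () {
    // Transfer the ArrayBuffer to avoid copying ~40MB of RAW data
    worker.postMessage({ raw: reader.result }, [reader.result]);
  };
  reader.readAsArrayBuffer(ev.dataTransfer.files[0]);
});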

It’s obviously not functional for real usage, and is purely an interesting tech demo. In particular, it uses some hardcoded defaults instead of selecting a proper white balance or choosing other appropriate processing parameters. It is also rather careless with memory; it could be much improved, but memory usage was not my focus. The performance is promising, and shows that Emscripten and asm.js are powerful tools that are usable in conjunction with standard web technologies instead of as just a porting method for native applications.

After I created this demo a week ago, I read an article about pics.io, who are working to bring the full photo-processing workflow to the browser. I hope they succeed! They’re not using asm.js; instead they have custom decoders written to take advantage of WebGL, offloading much of the processing to the GPU. This is a great approach for performance, but it does mean that they are limited in the formats they support – currently, only Adobe’s DNG. A combination of an asm.js-compiled existing decoding library along with their WebGL-based manipulation tools could be a winner.

Unreal JavaScript

At the 2013 Game Developers’ Conference, Alon and I from Mozilla and Josh Adams from Epic Games presented a talk called “Fast and Awesome HTML5 Games”. We surprised people by showing off Unreal Engine 3 running in Firefox — compiled from C++ source with Emscripten, running smoothly and efficiently. Today, Epic is making the Epic Citadel demo available, so that you can try it out for yourself.

For best results, it needs a recent Firefox Nightly (Firefox 23 or better). However, because the core technologies are just standard web technologies, it will run in Firefox 20 (the current released version) — but with some performance degradation and a lack of Web Audio-dependent audio effects. We’ve had success in running it in other browsers, but it’s somewhat hit and miss – it heavily depends on the quality of the WebGL implementation, memory management, and JavaScript engine. Now that the demo is available, we expect that other browser vendors will fix any remaining issues quickly.

Here’s a video (1080p!) of both the Epic Citadel demo, as well as some gameplay footage from the unreleased “Sanctuary” demo:

Goals

Working with Epic helped us prove that the Emscripten approach was viable even for large, performance-intensive codebases. We also wanted to demonstrate that with Emscripten, the web becomes just another porting and build target that integrates nicely in existing frameworks. During the week working with Epic to build the demo, we went from things failing to compile to shooting bots in a UT3 level. Having an early build of an asm.js-enabled Firefox (with Odinmonkey) got us the performance boost we needed to hit great frame rates and a very playable experience.

Technical and Porting Details

The engine is compiled down to 60MB of minified asm.js-flavoured JavaScript. This is a large chunk of JS for browsers to digest. Initially, our parse and compile time was close to 40-45s, which is not a great experience. Thanks to some efforts by Nick Nethercote, Sean Stangl, and others, that’s down to around 10s currently (less on some hardware; it’s highly dependent on CPU speed). The work to improve parsing and compilation of this large codebase also translated into gains on more normal JS workloads. We have plans in progress to reduce this further by doing a custom parser for asm.js, as well as enabling compilation caching in IndexedDB.

On the platform side, this was the first big test of our new Web Audio implementation. Ehsan wrote an OpenAL-compatible wrapper that mapped to Web Audio, and this worked very well with the existing OpenAL code that was in UE. The Sanctuary demo was enhanced by adding non-full-screen pointer lock support — a web page can request pointer lock independently of fullscreen, making games that want to lock the mouse possible as part of web pages instead of requiring them to enter a full-screen mode.

The experience really reinforced the main porting strategy that we suggest to others for getting existing content on the Web using Emscripten: make it work natively first using the same libraries that Emscripten is mapping. If you can launch the application direct from the command line, using SDL and OpenGL ES 2.0 for video, OpenAL for audio, and otherwise using pure C libraries, you should see quick results on the Web.

The Web Is The Platform

With Emscripten, the web is just another platform to target. It can be targeted alongside other platforms, with additional development cost similar to porting to another OS. No longer does choosing to deploy on the web mean rewriting in JavaScript, and worse, continuing to maintain that JavaScript version in parallel with the native version.

Personally, it’s really exciting to get to this point a short two years after the first WebGL specification was introduced at GDC 2011. With WebGL, Emscripten, and asm.js, the pieces really fell into place to make this possible. It’s even more exciting to be able to do this with a high-profile engine known for its quality and performance. Seeing the web get to this point is pretty amazing, and I look forward to seeing what the industry does with it.

More Information

For more information, please check out the slides from our talk:

Vladimir Vukićević – The Web Is Your Next Platform
Josh Adams – Unreal Engine 3 in JavaScript
Alon Zakai – Compiling C/C++ to JavaScript

Also, please visit the new Mozilla Developer’s Network Games landing page, which we’ll be expanding in the coming weeks. The Emscripten project and information about asm.js are also useful if you’d like to take a look at what it would take to port your own games or other apps.