November 2016

Volume 31 Number 11


Introduction to the HoloLens

By Adam Tuliper | November 2016

This is an exciting year for new groundbreaking devices. It’s said to be the year of virtual reality (VR) and augmented reality (AR), and some very notable and highly anticipated devices have started shipping, including the HoloLens, HTC Vive and Oculus Rift CV1. John Riccitiello, the CEO of Unity Technologies, said of AR/VR: “It’s a once-in-a-generation technology that’s so compelling it literally changes everything,” and I couldn’t agree more. HoloLens, the exciting new device from Microsoft, is capable of blending the real and virtual worlds.

What Is HoloLens?

HoloLens is an untethered, fully self-contained Windows 10 computer that rests comfortably on your head. It’s what’s known as a mixed reality device—one that tries to blend the real and digital worlds. You see objects placed in the world that look and, to an extent, act like they’re in the real world. In contrast, VR immerses you in an environment; you typically don’t see anything around you but that virtual world, and you generally aren’t visually aware of the real world outside your head-mounted display (HMD). This experience can take you flying in outer space while you sit in your office chair. AR, meanwhile, tries to enhance the world around you with extra data, such as markers or heads-up information that may pertain to your location. Some AR headsets simply throw text and images on a screen overlapping whatever you’re looking at.

With the HoloLens, you can bring applications and objects into the world around you that understand your environment. If you want an application pinned to the wall or in mid-air like a digital screen, as shown in Figure 1, no problem. Such apps stay put, even when you leave your room and come back the next day. I’m constantly leaving virtual windows open in other rooms, to be surprised when I go back days later and they’re still there. And that’s not all. Suppose you want a skeleton standing in front of you in your living room that you can walk around and inspect (including climbing on your couch to look at the top of the head). Again, no problem. Drop a virtual 3D object, say a ball—referred to as a hologram—into your world and it will fall, hit your real table and stop. Move the table and the ball will fall and hit your real floor. The HoloLens understands the world around you, and most people are absolutely amazed the first time they try it (though I’m still waiting to be able to download Kung Fu into my brain).

App on Wall
Figure 1 App on Wall

In this article, I’ll be covering the HoloLens development environment and the three pillars of input—gaze, gesture and voice, which represent the primary ways to interact with the HoloLens. More HoloLens features like spatial mapping, spatial audio and coordinate systems will be covered in my next article.

The Tools and SDK

There isn’t a separate SDK for the HoloLens—it’s simply part of the Windows 10 SDK. You just need Visual Studio 2015 with Update 3 (or later) installed, with the “Tools and Windows 10 SDK” option checked during the install. If you missed that option, just run the Update 3 (or later) installer again and select it.

Because the HoloLens is a Windows 10 device, it runs Universal Windows Platform (UWP) apps. That’s right—any Windows 10 app can potentially run not only on the desktop, phone and Xbox, but also on the HoloLens. There’s published guidance on taking your Windows Phone and Windows Store non-UWP apps and converting them to UWP. To ensure that a UWP app will show up in the Windows Store for the HoloLens, make sure you’ve allowed the HoloLens device family in the Windows Dev Center when your application is published, as shown in Figure 2. As is the case with any app on any platform, you need to ensure your application will look good on the particular devices you’re targeting, as different devices have varying resolutions and memory capabilities.

Ensuring the Windows Dev Center Submission Supports the HoloLens
Figure 2 Ensuring the Windows Dev Center Submission Supports the HoloLens

The two options for developing holographic applications—the kind that can take full advantage of the HoloLens feature set—are DirectX and Unity. You can also create a plain UWP app using C# and XAML, but it will run as a 2D application on the HoloLens and you won’t get the ability to drop holograms all around in space. There are project templates in Visual Studio for holographic applications that can display holograms, as shown in Figure 3. If the prerequisites mentioned earlier aren’t installed, you may not see these templates. Unity (my preferred development environment for the HoloLens) has support out of the box in the HoloLens Technical Preview (HTP) builds available on its Web site.

HoloLens DirectX Templates
Figure 3 HoloLens DirectX Templates

At some point, however, you must touch Visual Studio, even if you’re using Unity, though the development experience is quite different from using pure DirectX. Deploying to the HoloLens is done through Visual Studio and can be done via a local USB connection, a remote connection or through the emulator, as shown in Figure 4.

Different Ways to Deploy to the HoloLens
Figure 4 Different Ways to Deploy to the HoloLens

What about documentation? MSDN has long been a mainstay for developers, but years of additions made it harder to find what you needed. Now the HoloLens documentation is being completely redone. The HoloLens team intends its documentation to be excellent, and you should be able to find most things in one spot. If you find something missing, let the HoloLens team know. We’re in a new era of making many things better at Microsoft, and the HoloLens team is helping to lead the way on good, solid documentation.

The Hardware The HoloLens displays at 60 FPS, so it’s extremely important to ensure your experiences run at this frame rate. You’ll also want to ensure you stay within the device’s memory constraints (as of this writing, 900MB of RAM available for your app). If you’re new to developing graphical experiences, it’s tempting to grab any art asset and place it in the scene. You may think, “I can grab a 3D model of my city, drop it into my scene and explore it!” But this would likely cause performance issues on any platform. Keep in mind that most games implement many tricks to reduce work per frame—batching draw calls, occluding geometry, showing lower-resolution models for objects that are farther away (called level of detail, or LOD) and more—and the HoloLens is no exception. Performance recommendations are documented, and some examples of optimization can be seen in the “Optimizing Your Games” module.

The Emulator The HoloLens emulator can get you pretty far in the development process. Technically, you don’t need a HoloLens at all to develop an experience, just as you don’t need a phone to develop for a phone. At some point, however, you’ll obviously want to test on a device. The emulator can emulate gestures, head position and voice, but it does have limitations. For one, when you’re in an emulator, you won’t see the real world around your holograms, as Figure 5 shows, though you can assign a shader to your spatial mesh, which will be covered in the next article.

Using the HoloLens Emulator
Figure 5 Using the HoloLens Emulator

There are four predefined room models you can select in the emulator. Because you don’t have a HoloLens on your head, the emulator has to know what a room looks like from a 3D standpoint. I’m jumping ahead to the next article a bit, but the HoloLens understands the space around you and is constantly updating its understanding of that space, as shown in Figure 6. This “behind the scenes” view of what the HoloLens sees shows a room that’s loaded by default in the emulator; it’s visible in the Windows Device Portal, which you can access for the emulator or for a real HoloLens device. You can save a room model you’ve scanned with the HoloLens and load it into the emulator—which means you can develop multi-user experiences using a HoloLens and the emulator, or multiple instances of the emulator, and load in your own custom rooms to test.

Figure 6 The Default Room Model Being Displayed in the Emulator

Windows Holographic

Windows 10 contains APIs across desktop, mobile and Xbox for the Windows Holographic platform. It’s through these APIs that you can interact not only with the HoloLens but with other future devices, as well, including initiatives from other vendors, such as Intel’s new Project Alloy untethered headset, HTC, Qualcomm and many others. Some of the APIs for Windows Holographic can be found in the following new namespaces (I recommend looking at these to get an idea of some of the underlying classes and structures):

  • Windows.Graphics.Holographic
  • Windows.Perception
  • Windows.Perception.People
  • Windows.Perception.Spatial
  • Windows.UI.Input.Spatial

There are three pillars of input on the HoloLens—gaze, gesture and voice, and these are what I’ll focus on now. If you’ve read any of my prior articles on game development or watched any of my online videos for Microsoft Virtual Academy, you know I’m very fond of the game engine Unity, so the code samples that follow are Unity C#-based.

Gaze is essentially the position and orientation of the HoloLens itself—in practice, where the user’s head is and where it’s pointing. To understand what gaze represents, let’s look within the Windows.Perception.People namespace at the HeadPose class, which contains three Vector3 properties (a Vector3 struct contains just x, y, z single-precision values):

  • ForwardDirection: Gets the forward direction of the HoloLens.
  • Position: Gets the position of the HoloLens.
  • UpDirection: Gets which direction is up for the HoloLens.
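
To make these properties concrete, here’s a minimal sketch in plain C# (using System.Numerics rather than the actual HoloLens APIs, with made-up values) of how a position and forward direction define a gaze ray you can sample points along:

```csharp
using System;
using System.Numerics;

class GazeRaySketch
{
  static void Main()
  {
    // Hypothetical HeadPose-style values: the app starts at the origin,
    // with the user looking straight down the z-axis.
    Vector3 position = new Vector3(0f, 0f, 0f);
    Vector3 forwardDirection = new Vector3(0f, 0f, 1f);

    // A point along the gaze ray is position + distance * direction.
    // Here, the point 2 meters straight ahead of the user:
    Vector3 ahead = position + 2f * Vector3.Normalize(forwardDirection);
    Console.WriteLine(ahead); // <0, 0, 2>
  }
}
```

This position-plus-direction pair is exactly what ray casting (discussed next) consumes.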

This information lets you determine where the user is and in which direction they’re looking. When an app launches on the HoloLens, the starting position is 0,0,0. From that position it’s then easy to determine where and in which direction the user moves. Let’s look first at how to detect the object at which the user is looking. The only requirement is that the object has a collider component so it can be detected, which in Unity is trivial to add to any object. In Unity, instead of the HeadPose, you simply use the Camera.main information, which (behind the scenes) maps to the HeadPose. If you’re used to working in Unity with the camera, nothing is different. With this information, you can shoot an invisible arrow out and find out what it hits (a common technique in 3D programming called ray casting). I just need a game object like a cube with the Interactible script (Figure 7) on it and another separate empty game object with the InteractibleManager (Figure 8) on it.

Figure 7 The Interactible Class

public class Interactible : MonoBehaviour
{
  // The materials we’ll set to highlight.
  private Material[] defaultMaterials;

  void Awake()
  {
    // Get the materials from our renderer.
    // A material contains a shader, which lights/colors the object.
    defaultMaterials = GetComponent<Renderer>().materials;
  }

  void GazeEntered()
  {
    for (int i = 0; i < defaultMaterials.Length; i++)
    {
      // This assumes we're using a shader that has a _Highlight
      // property, as does the Custom/SpecularHighlight one
      // from Holograms 210. This adds lighting color
      // to "highlight" the object.
      defaultMaterials[i].SetFloat("_Highlight", .25f);
    }
  }

  void GazeExited()
  {
    for (int i = 0; i < defaultMaterials.Length; i++)
    {
      defaultMaterials[i].SetFloat("_Highlight", 0f);
    }
  }
}

Figure 8 The InteractibleManager Class

public class InteractibleManager : MonoBehaviour
{
  private GameObject lastHit;

  // Every frame, see if we're looking at a hologram (i.e., a game object).
  void Update()
  {
    RaycastHit hitInfo;
    if (Physics.Raycast(Camera.main.transform.position,
      Camera.main.transform.forward, out hitInfo))
    {
      // The game object we've hit.
      var tempGO = hitInfo.collider.gameObject;
      // See if this object contains our Interactible class.
      if (tempGO.GetComponent<Interactible>() != null)
      {
        lastHit = tempGO;
        // Loosely coupled way to call a method in Unity on our g.o.
        lastHit.SendMessage("GazeEntered");
      }
    }
    else if (lastHit != null)
    {
      // No object detected in Gaze; let's deselect the last one.
      lastHit.SendMessage("GazeExited");
      lastHit = null;
    }
  }
}

I’m presenting just an overview of the main ideas here. You’ll find much more detail in Holograms 210, the Holographic Academy course on gaze concepts, including stabilizing the head position to avoid jerky movements when selecting objects. In short, Camera.main is the HoloLens position. You shoot your invisible ray from this position in the Camera.main.forward direction to find the first collider you hit. If you find something, all that’s needed is to highlight it by setting a variable that the shader will use. For the example in Figure 7, I used the shader from Holograms 210 and just set the _Highlight value to .25 when selected. You could use the standard Unity shader, as well, and set the emission color instead, like so:

defaultMaterials[i].SetColor("_EmissionColor", new Color(0f, 0f, 0f, .2f));

You need to be aware of the effect of shaders on the performance of the HoloLens. The device, while amazing, is constrained by its size, as is often the case with hardware. Nearly every experience and game made today has time allocated for optimization, and the HoloLens is no different. When developing for the HoloLens with Unity, the first thing you’ll want to do is swap out the shaders for the optimized variants located in the HoloToolkit.

The HoloToolkit contains many helper functions and objects that can greatly aid HoloLens development, including cursors, gaze stabilizers, gesture and hand managers, plane detectors, follow-me scripts, examples of sharing across multiple HoloLens devices, and much more. The majority of this functionality is in the HoloToolkit-Unity repository, so be sure you look there and not just at the basic HoloToolkit repository—though the latter is useful in its own right and contains various standalone sample projects and other library code that HoloToolkit-Unity utilizes.

When you have objects that aren’t in the viewable area—be it in a game on the screen, VR, AR or mixed reality—it’s useful to provide a small indicator telling the user in which direction to look for an object if they start to look away, as shown in Figure 9. This is a very common technique in video games, and it’s incredibly useful in the mixed-reality world.

Displaying a Directional Indicator
Figure 9 Displaying a Directional Indicator

The code in Figure 10 can be used to determine if a particular hologram is visible.

Figure 10 Determining If a Hologram Is Visible

[Tooltip("Allowable percentage (up to 30%) inside the holographic frame to continue to show a directional indicator.")]
[Range(-0.3f, 0.3f)]
public float TitleSafeFactor = 0.1f;

// Determine if the game object is visible within a certain percentage.
// The viewport is 0,0 to 1,1 (bottom left of screen to top right).
private bool IsTargetVisible()
{
  // This will return true if the target's mesh is within the Main
  // Camera's view frustum.
  Vector3 targetViewportPosition =
    Camera.main.WorldToViewportPoint(gameObject.transform.position);
  return (targetViewportPosition.x > TitleSafeFactor &&
    targetViewportPosition.x < 1 - TitleSafeFactor &&
    targetViewportPosition.y > TitleSafeFactor &&
    targetViewportPosition.y < 1 - TitleSafeFactor &&
    targetViewportPosition.z > 0);
}

Once you know an object isn’t visible (or is just barely visible), you can figure out in which direction to show a directional indicator. Because you have access to the camera position and any game object position, it’s easy to figure out where a hologram is compared to where the user is looking. To get a directional vector representing an arrow from the camera to a particular game object, you simply subtract one position from the other and normalize the result to unit length (a normalized vector is easiest to use in many calculations), like so:

Vector3 camToObjectDirection =
  (gameObject.transform.position - Camera.main.transform.position).normalized;

Showing directional indicators does require a couple of other things to be set up, but this is the gist of the functionality. Once you know the direction from the HoloLens (that is, the camera) to the hologram, you can display an indicator on the screen. You can check out this feature in more detail in the HoloToolkit-Unity.
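As an illustration of the math involved (plain C# with System.Numerics—a sketch, not the HoloToolkit’s actual implementation), once a target fails the visibility test you can derive the 2D direction for the on-screen arrow from the target’s viewport position relative to the center of the view:

```csharp
using System;
using System.Numerics;

class DirectionalIndicatorSketch
{
  // Given a target's viewport position (0,0 bottom-left to 1,1 top-right,
  // as returned by WorldToViewportPoint), return a normalized 2D direction
  // from the screen center toward the target -- the way the arrow points.
  static Vector2 IndicatorDirection(Vector2 targetViewportPosition)
  {
    Vector2 center = new Vector2(0.5f, 0.5f);
    return Vector2.Normalize(targetViewportPosition - center);
  }

  static void Main()
  {
    // A target off the right edge of the view, level with the camera:
    // the indicator should point straight along +x.
    Vector2 dir = IndicatorDirection(new Vector2(1.4f, 0.5f));
    Console.WriteLine(dir); // <1, 0>
  }
}
```

The same idea works in world space with the camToObjectDirection vector shown earlier, projected into the camera’s plane.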

Gesture is the next way of providing input to the HoloLens. Gestures can be performed with a hand, the clicker or voice commands. Hand gestures are detected by one of the cameras on the front of the device, within what’s called the gesture frame—an expanse that extends well beyond the viewing area on all sides, allowing you to keep your hand closer to your body rather than holding it way out in front every time. You typically don’t tap on holograms directly; instead, you tap in the air and determine what the user is looking at via gaze.

The HoloLens supports two categories of gestures. The first includes discrete gestures—a single quick action such as the air tap shown in Figure 11, a click with the included Bluetooth clicker, saying the command “Select,” or even a double tap. The second category is continuous gestures, which happen over time and are generally triggered when you press and hold and then move your hand. Continuous gestures provide support for navigation and manipulation. Navigation gestures give you a starting position of 0,0,0—wherever you start the gesture—and are then constrained to a virtual cube in space, letting you move from -1 to 1 on any axis within that cube to provide (among other use cases) smooth rotation around one or more axes, or smooth scrolling. Manipulation gestures, rather than constraining movement to a virtual cube, allow 1:1 movement between hands and holograms—think of painting virtually, or nice fluid motions to position holograms around your environment; you get world-relative positional information in x,y,z.

An Air Tap
Figure 11 An Air Tap
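
The navigation-cube idea can be sketched in a few lines of plain C#. Note that the HoloLens delivers already-normalized values for navigation gestures; this hypothetical helper (with a made-up cube size) just shows what the -1 to 1 mapping means:

```csharp
using System;

class NavigationGestureSketch
{
  // Navigation gestures report values clamped to a virtual cube from
  // -1 to 1 on each axis, relative to where the gesture started.
  static float ToNavigationAxis(float start, float current, float cubeHalfSize)
  {
    float normalized = (current - start) / cubeHalfSize;
    return Math.Max(-1f, Math.Min(1f, normalized));
  }

  static void Main()
  {
    // Hand started at x = 0.10 m; cube half-size assumed to be 0.15 m.
    Console.WriteLine(ToNavigationAxis(0.10f, 0.175f, 0.15f)); // 0.5
    Console.WriteLine(ToNavigationAxis(0.10f, 0.40f, 0.15f));  // clamped to 1
  }
}
```

A value like 0.5 might then drive half-speed scrolling or rotation, which is what makes navigation gestures feel smooth and controllable.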

As a developer you can, of course, hook into the gestures—air tap, hold, navigation and manipulation. Currently there’s no support for custom gestures, but you do have access to quite a bit of data via the built-in ones. You can also track hand position (in the ready state or pressed state) in x,y,z, and detect when hands enter or are about to leave the gesture frame—useful for giving the user feedback before tracking is lost.

Figure 12 The GestureManager Class

public class GestureManager : MonoBehaviour
{
  private GestureRecognizer gestureRecognizer;

  void Start()
  {
    gestureRecognizer = new GestureRecognizer();
    // We can register here for more than just tap (ex. Navigation).
    gestureRecognizer.TappedEvent += (source, tapCount, ray) =>
    {
      // A tap has been detected. Raycast using the provided ray,
      // get the hit game object and act on it as was done previously.
    };
    gestureRecognizer.StartCapturingGestures();
  }

  void OnDestroy()
  {
    gestureRecognizer.StopCapturingGestures();
  }
}

You can add a GestureManager class like the one in Figure 12 to register for the tapped event, which will let you know, for example, when the user has performed a tap gesture via the Select voice command, a controller or a hand (regardless of where they tapped). You then need to determine what was being looked at when the user tapped, just as was done previously. It’s important to note that the TappedEvent in Figure 12 is passed a ray that represents where the head was when the event happened. In the case of fast head movement, you want to ensure you raycast from where the gaze was when the user tapped, not where it is when the event arrives, which may be slightly off from the original location.

Voice commands on the HoloLens are supported at the system level through Cortana, and apps share the same speech engine as all other UWP apps, which requires the Microphone device capability in the project’s package.appxmanifest file. The audio processing is hardware-accelerated and provides 16-kHz to 48-kHz, 24-bit audio streams of the user’s voice and the ambient environment. Built-in Cortana commands include “Select” and “Hey Cortana <command>,” while in-app commands include “Select” (instead of air-tapping) and “Place.”
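
For reference, the relevant portion of package.appxmanifest looks something like the following (the internetClient capability is my addition here, included because dictation relies on a network speech service):

```xml
<Capabilities>
  <!-- Required for voice commands and dictation. -->
  <DeviceCapability Name="microphone" />
  <!-- Dictation uses a remote speech service, so network access is needed. -->
  <Capability Name="internetClient" />
</Capabilities>
```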

Responding to custom voice commands is possible and straightforward. Just add your keywords and delegate code to execute and start listening, as shown in Figure 13.

Figure 13 Adding Keywords and Delegate Code

public class SpeechManager : MonoBehaviour
{
  KeywordRecognizer keywordRecognizer;
  Dictionary<string, System.Action> keywords =
    new Dictionary<string, System.Action>();

  void Start()
  {
    keywords.Add("Reset level", () =>
    {
      // Call the OnReset method on every descendant object to reset their state.
      // We could also do a full-level reload via
      // SceneManager.LoadScene(SceneManager.GetActiveScene().buildIndex);
      this.BroadcastMessage("OnReset");
    });

    // Tell the KeywordRecognizer about our keywords.
    keywordRecognizer = new KeywordRecognizer(keywords.Keys.ToArray());
    // Register a callback for the KeywordRecognizer and start recognizing!
    keywordRecognizer.OnPhraseRecognized += KeywordRecognizer_OnPhraseRecognized;
    keywordRecognizer.Start();
  }

  private void KeywordRecognizer_OnPhraseRecognized(
    PhraseRecognizedEventArgs args)
  {
    System.Action keywordAction;
    if (keywords.TryGetValue(args.text, out keywordAction))
    {
      keywordAction.Invoke();
    }
  }
}

For those looking to do speech-to-text, dictation is available, as well, via the DictationRecognizer class, which raises DictationHypothesis, DictationResult, DictationComplete and DictationError events. Do note that dictation requires Wi-Fi connectivity. My coworker Jared Bienz has added some nice support for text-to-speech, as well, which you can find in the HoloToolkit for Unity.
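
A minimal dictation sketch in Unity might look like the following (this assumes Unity’s UnityEngine.Windows.Speech namespace; exact event signatures may differ slightly across Unity versions):

```csharp
using UnityEngine;
using UnityEngine.Windows.Speech;

public class DictationSketch : MonoBehaviour
{
  private DictationRecognizer dictationRecognizer;

  void Start()
  {
    dictationRecognizer = new DictationRecognizer();
    // Fired repeatedly as the engine refines its guess while you speak.
    dictationRecognizer.DictationHypothesis += (text) =>
      Debug.Log("Hypothesis: " + text);
    // Fired when a phrase is finalized.
    dictationRecognizer.DictationResult += (text, confidence) =>
      Debug.Log("Result: " + text);
    dictationRecognizer.DictationComplete += (cause) =>
      Debug.Log("Complete: " + cause);
    dictationRecognizer.DictationError += (error, hresult) =>
      Debug.LogError("Error: " + error);
    dictationRecognizer.Start();
  }

  void OnDestroy()
  {
    dictationRecognizer.Dispose();
  }
}
```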

Wrapping Up

The HoloLens opens up a new era in how you can experience the world around you, mixing reality and the virtual world. Get started with the HoloLens developer documentation and keep an eye there for announcements. Stay tuned for my next article, which will discuss my favorite HoloLens feature: spatial mapping.

Adam Tuliper is a senior technical evangelist with Microsoft living in sunny SoCal. He’s a Web dev/game dev author and all-around tech lover. Find him on Twitter: @AdamTuliper.

Thanks to the following Microsoft technical expert for reviewing this article: Jackson Fields
Jackson Fields is a software engineer at Microsoft.
