I’m sitting in a small interview room with one of the smartest of the hundred-odd candidates I’ve interviewed so far this year at Oculus Research—and one of the hardest to get a clear fix on. There’s no doubt about whether she can do the work, but that’s only half the picture; what I don’t know is whether she’d be happy doing it. So I roll out a question I’ve asked hundreds of times before: “What are you excited to do over the next few years?”
Responses tend to be safe, bland interview answers—“do challenging work,” “build new things,” “be around smart people I can learn from”—but this time I get an answer I’ve never heard before.
“I want to invent the future.”
She’s come to the right place.
Our world is constantly being reinvented, in areas ranging from space travel to medicine to the Internet. The future we’re creating is one where truly immersive human-oriented computing is an integral part of everyday life for billions of people. This future includes both virtual reality (VR) and augmented reality (AR), where VR is built around headsets, like the Oculus Rift and the Samsung Gear VR, that block all external light and display purely computer-generated scenes, while AR involves goggles and glasses with see-through lenses that can add virtual images to the real world. Today, AR and VR are very different experiences, but as they mature over the next decade, they will converge, and the core experience that unites them will be the ability to mix the real and virtual worlds freely.
VR and AR will together change our lives as fundamentally as personal computers and smartphones have, and quite possibly even more. Over the last 40 years, personal computers, smartphones, and tablets have given us constant, near-instantaneous access to the digital world through 2D screens, in the process touching almost every aspect of our lives. Over the next 40 years, AR and VR will allow us to actually live in a mix of the real and virtual worlds, and this will once again radically change the way we work, play, and communicate.
Arthur C. Clarke said: “Any sufficiently advanced technology is indistinguishable from magic,” and mature AR and VR will truly seem magical. Consider AR glasses, which will let you transcend space, conjure objects and devices into existence at will, and amplify your senses, memory, and cognition to superhuman levels.
Imagine wearing a pair of glasses that let you visit with your parents no matter where you are, tour the Louvre on your lunch break, and walk and talk with a friend on the other side of the planet and truly feel that they are at your side. Think of how the patterns of where we choose to live would change. Imagine that your glasses replace all your electronic devices – phones, TVs, computers, e-book readers, game consoles, the whole lot – with virtual versions, in the process making them inexpensive and instantly upgradeable. Instead of mounting an expensive big-screen TV in your living room, you would pay a few dollars to have a virtual big-screen TV available, wherever you happen to be. And imagine that your glasses let you see in low light, hear in noisy environments, remember people’s names, find the fastest way to your destination, and leave virtual sticky notes for friends anywhere in the world. Imagine that they automatically feed you relevant information whatever you’re doing – put them on and you can beat Ken Jennings at “Jeopardy.” And, of course, imagine they do everything your smartphone does today, but always instantly accessible.
Who wouldn’t leap at the chance to wear those glasses?
Alas, that’s not possible today, because the technology isn’t there yet; we’re working on it, but right now bottling the magic of AR is just an aspiration. Virtual reality, on the other hand, is already more than aspirational; it’s shipping now and there’s definitely some magic there, but today’s headsets are just a beginning. You can see my predictions for the next five years for VR here, but even that’s early days; in the not too distant future, virtual reality will immerse us in visuals, audio, and even haptics of stunning clarity, and VR will become the preferred environment for work, play, and communication when we’re not on the move.
The most important step forward for VR, however, will be the ability to import the real world into the virtual world. VR headsets can be equipped with sensors capable of reconstructing models of the real world in real-time, which can then be decorated, modified, enhanced, and shared with others. Virtual images can be mixed with the real world to augment reality, and avatars reflecting the appearance, movement, and unique characteristics of real people can share that space. This is a very different technology from see-through AR, but the result is the same: the combination of virtual and real to create something more powerful than either.
AR and VR together form such a broad platform—the real world and then some—that it will enhance each of our lives in very different ways. I personally can’t wait for a personal virtual workspace, which is how I fully expect to be doing the bulk of my work before my career is over. Working with other people from around the world will be an important part of this; just being able to share a virtual whiteboard with Yaser Sheikh from our Pittsburgh lab would be a big win. The ability to instantly switch between workspaces will be a game-changer as well. The real key, however, is that after more than 30 years, I will finally be able to have all the (virtual) big-screen monitors I want!
VR has been tantalizingly close for 30 years, always on the verge of becoming the next big thing. Much has changed recently, though, largely thanks to Moore’s Law and technology developed for smartphones. It’s risky to say this time is different, but this time may indeed be different; millions of VR headsets have already shipped, and there’s a race to build the first true AR glasses. Both AR and VR might finally be ready to take off, and when they do, it will change our lives more than anyone imagines. The virtual world has the potential to provide experiences every bit as rich as the real world, across a far greater range, and it is not an exaggeration to say that VR and AR have the potential to greatly expand the full range of human experience.
However, we have barely started to realize that potential, and a great deal remains to be done, in areas ranging from optics and displays to computer vision to audio to graphics to user interfaces to experiences, and much, much more. Oculus Research’s goal is to develop all those pieces and bring them together to make VR and AR together the platform of the future. Getting there will take many years and a ton of innovation, so it will require a critical mass of vision, resources, and a long-term perspective—and, most of all, extraordinary people.
Doug Lanman exudes his characteristic positive energy as his distinctive voice, with a hint of his childhood in Oklahoma, booms across the table between us.
“I had an idea the other day,” he says.
That’s how a lot of conversations with Doug start. He specializes in computational imaging, and you have the sense that there’s a part of his mind that never stops thinking about new ways to combine optics and computation. The objective is to get the right photons into your eye to produce the best possible approximation of reality, something he’s pursued obsessively at MIT Media Lab, NVIDIA, and now Oculus Research.
Doug is particularly interested in making it possible for people to use VR comfortably all day long. One of the big barriers to that is that current VR headsets focus at a fixed distance, which can be associated with visual fatigue and discomfort, especially for near viewing. There are many possible ways to approach this problem, including holograms, multi-focal displays, multi-lens lightfield displays, and varifocal displays, but none has yet gotten past the research stage.
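The fixed-focus problem above is usually quantified in diopters (the reciprocal of distance in meters), because focus error is roughly constant in diopter space. As a minimal sketch of why near viewing is the painful case, here is a toy calculation of the mismatch between where the eyes converge and where a fixed-focus display forces them to accommodate. This is my own illustration, not Oculus code, and the comfort thresholds mentioned in the comments are rough rules of thumb, not research results:

```python
# Toy illustration: the vergence-accommodation mismatch of a fixed-focus
# headset, measured in diopters (D = 1 / distance_in_meters).

def diopters(distance_m: float) -> float:
    """Convert a viewing distance in meters to diopters."""
    return 1.0 / distance_m

def focus_conflict(display_focal_m: float, object_m: float) -> float:
    """Mismatch (in diopters) between the virtual object's distance, where
    the eyes converge, and the display's fixed focal distance, where the
    eyes must accommodate."""
    return abs(diopters(object_m) - diopters(display_focal_m))

# A headset focused at 2 m is fine for distant content...
print(focus_conflict(2.0, 10.0))   # 0.4 D of conflict
# ...but the conflict grows sharply at arm's length, which is why
# near work in a fixed-focus headset is fatiguing.
print(focus_conflict(2.0, 0.4))    # 2.0 D of conflict
```

A varifocal display attacks the problem by driving `display_focal_m` to match the distance of whatever the user is looking at, driving the conflict toward zero.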
Doug has investigated every one of the approaches mentioned above; in fact, he recently published about a new one, focal surface displays, which you can read about here and here. He eventually decided that varifocal, where the lens deforms or moves relative to the screen in order to alter the focal distance, was promising enough to take the next step with, and formed a cross-disciplinary team of about 40 people to build a prototype that proved the approach worked and that could eventually lead to a shipping headset that solves depth of focus.
The DNA of research labs spans the spectrum from blue sky to advanced product development. Oculus Research sits somewhere in the middle, looking for breakthroughs on high-impact, genuinely unsolved problems, but always with an eye to getting the results out into the world. Doug came into Oculus Research as a world-class researcher focused on publication. He’s continued to do world-class research and to publish, but the work he and his team have done may also someday allow millions of people to comfortably work with virtual objects within arm’s length and read in VR for however long they want. And Doug’s found, to his surprise, that that combination is more rewarding than pure research.
Doug’s work is just one thread among dozens in the technology tapestry that we’re weaving. In order for VR and AR to become part of everyday life, we need to integrate computer vision, optics, displays, user interface, user experience, audio, haptics, perceptual science, material science, silicon, operating systems, nanofabrication, animation, rendering, hand tracking, eye tracking, speech recognition—and more—into systems capable of delivering magical experiences. That takes great researchers like Doug, of course. But it also takes something else—the ability to turn research ideas into working devices.
From the start, Oculus Research has been structured around rapid iteration of full hardware/software stacks. We have everything needed to develop prototype VR and AR systems from start to finish, and since hardware always takes longer, we’ve paid particular attention to being able to build whatever’s needed as quickly as possible.
Rapid iteration requires a great shop, and our machinists make the CNCs hum. There’s also everything else you might want for rapid prototyping, including 3D printers, laser and water jet cutters, and PCB and flex fabrication. And there’s a crack engineering team to put it all to work.
All this gives us the ability to iterate rapidly and control every aspect of the process. As a result, we were able to build Doug’s varifocal prototype entirely in-house.
Each research team also has everything it needs to move quickly, ranging from one of the best anechoic chambers in the world to the Sausalito mocap stage to the state-of-the-art device fabrication and microassembly capabilities at our research office in Cork, Ireland.
Software is of course equally important; the varifocal project depended on real-time firmware to control the hardware, novel rendering techniques to produce the correct depth of focus, some nice demo programming to prove out the experience, and user study software to figure out how much difference varifocal actually makes.
Software engineering across Oculus Research spans a broad spectrum, from firmware through drivers and APIs to operating systems, demos, simulations, networking, databases, machine learning, GPUs, computer vision, apps, test suites, games, and more. As just one example, for AR the entire operating system stack will have to be rethought and restructured around power management, from drivers and the graphics pipeline up through the app model. Over my career I’ve written code in every one of those areas, and loved it; alas, my job involves little coding these days, but Oculus Research would have been the ultimate playground for me.
There are a great many aspects to VR and AR, but the first thing most people think of is seeing virtual objects. As it happens, it is also one of the hardest parts of enabling great virtual experiences.
To tackle that, we have one of the best optics teams in the world, equipped with facilities that enable them to push the state of the art across a wide variety of technologies. For example, some of the most promising approaches for AR glasses involve waveguides—flat pieces of glass or plastic that light can be injected into so that it bounces along lengthwise and eventually deflects out and into the pupil. Because they’re flat and thin, waveguides lend themselves to glasses-like form factors, but there are many complications with image quality, see-through quality, field of view, depth of focus, efficiency, and manufacturability. Solving those problems requires sophisticated computation combined with rapid experimentation across a variety of new and emerging approaches, so we’ve built a state-of-the-art clean room designed for nanofabricating optical structures in order to build our own custom waveguides.
The equipment and the team together enable rapid end-to-end development of new technology across disciplines. For instance, when she was a post-doctoral fellow at Berkeley, perceptual scientist Marina Zannoli had dreamed of building a testbed that would unify research on depth of focus across many different display technologies, such as Doug’s varifocal system. As a post-doc, Marina had no way to bring together the optical and engineering expertise needed, but upon arriving at Oculus, Marina teamed up with optical scientist Yusufu Sulai, who had recently completed his post-doc in the field of retinal imaging. Together, Yusufu and Marina designed a first-of-its-kind tool for probing the limits of the human visual system, and within a year, Yusufu had the full system built, operating to spec, and deployed for experiments. You can read about Marina and Yusufu’s system here.
The man who built the optics team is Scott McEldowney, a 30-year industry veteran who still bikes to work every day. Scott took great care over several years to assemble a unique team capable of performing the research and development required to move beyond the state of the art, which is fitting, given Scott’s trademark quote: “In order to do something great, you have to say no to many good things.”
And we’re lucky enough to have the critical mass of equipment and especially people to do something great.
Straight out of grad school at Caltech, Sean Keller was the architect of the lowest-power radiation-hardened microprocessor ever made, by two orders of magnitude. He had to invent a new type of circuit analysis to make that happen. Sean’s not afraid to venture into new, unexplored territory, to say the least.
That’s fortunate, because he has now taken on the very different challenge of leading the user interface team, and that is perhaps the single greatest challenge for AR. Not to diminish the other challenges—getting the right photons into your eyes in AR is insanely hard, and making computer vision work within the power and weight budget of glasses is right up there as well—but at least they’re well-understood problems.
Whatever the standard AR user interface ultimately ends up being—and it will be years before that’s determined—it’s going to be something completely new, as clean a break from anything that’s come before as the mouse/GUI-based interface was from punch cards, printouts, and teletype machines. You’re going to have to be able to interact with your AR glasses in all the contexts you encounter in your day, so the interface will have to be multimodal. Hand gestures are good, but you’re unlikely to want to use them while you’re face to face with someone, and you won’t be able to use them at all if you’re carrying something. Voice is another good option, but it’s not a great choice in a meeting or in a noisy room. A handheld controller can be effective, but only when you have it with you (and haven’t misplaced or lost it), and only if your hands are free and it’s socially acceptable to be using it. Each mode of interaction has its strengths, but no one mode can meet all the needs, and the challenge is to design an interface that can switch seamlessly between them and decide which to use at any given moment.
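The arbitration problem described above—each modality has contexts where it works and contexts where it fails, and the system must pick appropriately—can be sketched in a few lines. This is purely my own hypothetical illustration of the shape of the problem, not an actual AR interface design; all the names and rules here are invented:

```python
# Hypothetical sketch: naive priority-based arbitration among input
# modalities based on the user's current context. A real system would
# blend modes and learn per-user preferences rather than hard-code rules.

from dataclasses import dataclass

@dataclass
class Context:
    hands_free: bool        # not carrying anything
    in_conversation: bool   # face to face with someone
    quiet_enough: bool      # voice input viable and socially acceptable
    controller_in_hand: bool

def choose_modality(ctx: Context) -> str:
    if ctx.controller_in_hand and ctx.hands_free:
        return "controller"   # precise, but only when available and usable
    if ctx.quiet_enough and not ctx.in_conversation:
        return "voice"        # hands-free, but awkward in meetings/noise
    if ctx.hands_free and not ctx.in_conversation:
        return "gesture"      # expressive, but not while carrying things
    return "gaze"             # fallback: eye tracking + subtle confirmation

# Walking down a noisy street, hands free:
print(choose_modality(Context(hands_free=True, in_conversation=False,
                              quiet_enough=False, controller_in_hand=False)))
# → gesture
```

Even this toy version shows why the problem is hard: the interesting work is not the dispatch itself but sensing the context reliably and switching without the user ever noticing.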
Because AR glasses are enhancements for your perception, memory, and cognition, they’ll also need to anticipate what you want—and just as important, what you don’t want. As I noted earlier, if you can’t remember someone’s name, it would be great to have the glasses remind you. At the same time, if you walk into work and the glasses insist on telling you the name of every single person you see, you’ll never wear them again. AR will ultimately need to be a cloud of inference that surrounds you all day, helping you so intuitively that when you take the glasses off, it will feel like part of your brain has gone to sleep.
You might reasonably wonder why Sean is leading the user interface team, rather than a well-known UI researcher. In my experience, the key to true generational leaps is to have great problem solvers working on them, regardless of previous experience. As Thomas Kuhn observed in The Structure of Scientific Revolutions, it’s fresh faces, unattached to existing approaches, who end up trying the new, risky approaches that lead to paradigm shifts. And the truth is, VR and AR are developing so rapidly that there are no experts right now, only smart people who want to apply their skills and creativity to solving one of the hardest and most interesting multi-disciplinary problems around.
Taking new, risky approaches requires rock-solid organizational support, and for Oculus Research, the commitment comes straight from the top. Mark Zuckerberg frequently describes VR and AR together as the next computing platform, and as keys to Facebook’s 10-year strategy—check out any of his last few F8 keynotes (F8 2017, F8 2016) or many of the quarterly earnings calls. And, in fact, our AR program is the direct result of Mark’s vision—it started because Mark felt it was a long-term investment we needed to make.
Mark’s vision makes perfect sense, because both AR and VR fit seamlessly into Facebook’s mission to bring the world closer together. Even at this very early stage, social VR like Facebook Spaces shows the potential power of virtual communities. And AR will enable people to be connected more closely no matter where they are or what they’re doing.
I’ll be honest—when Mark first raised the topic of AR, I literally said that I wasn’t sure what it was useful for. That earned me a look of disbelief that was useful incentive to think a lot harder about AR’s potential. Three years later, I’m fully convinced that we’ll all be wearing AR glasses one of these years—myself included—but it was Mark’s vision that first got me thinking in that direction, and that convinced me to sign up to make AR happen.
While AR glasses have the potential to be one of the most important technologies of the twenty-first century, that won’t happen unless some very challenging practical constraints are overcome. They must be light and comfortable enough to wear all day, run off a wearable battery for many hours per charge without getting uncomfortably hot, work in full sunlight and in darkness, and have excellent visual and audio quality, both virtual and real. They must be completely socially acceptable – in fact, they need to be stylish. They need an entirely new user interface. Finally, all the rendering, display, audio, computer vision, communication, and interaction functionality needed to support virtual objects, telepresence, and perceptual/mental superpowers must come together in a system that operates within the above constraints. (See this for a broader overview of what it will take to make AR work.)
There is no combination of existing technologies that meets all those requirements today. The honest truth is that the laws of physics may make it impossible to ever build true all-day AR glasses; there’s no Moore’s Law for optics, batteries, weight, or thermal dissipation. My guess is that it is in fact possible (obviously, or I wouldn’t be trying to make it happen), and if it is possible, I think it’s highly likely that all-day AR glasses will happen within the next ten years, but it is an astonishingly difficult technical challenge on half a dozen axes, and a host of breakthroughs are going to be needed.
AR is also a largely unexplored space, so there’s no way to know in advance what the experiences are that will make AR glasses worth wearing all day. All of which means that our AR glasses effort, which spans all of the above, is an ongoing joint evolution of research, engineering, and experience prototyping; so, despite the name Oculus Research, the AR effort is in fact a mix of research, incubation, and product development.
Tackling such a huge, ambitious, multifaceted project requires close teamwork and constant communication across a large, diverse set of specialists and generalists, spanning user experience, hardware, software, optics, displays, sensing, silicon, perceptual science, computer vision, audio, user interface, operating systems, system architecture, program management, and more. It also requires fostering creativity and the ability to innovate among the various specialists and sub-projects, while still maintaining the discipline needed to get to the overall goal. That delicate balancing act perfectly suits Laura Fryer, the General Manager of AR Incubation.
Laura is a straight-talking, perpetually upbeat veteran of the games industry, with decades of management and production experience spanning Gears of War, the original Xbox, a four-year VP stint at WB Games that produced the 2014 Game of the Year, Shadow of Mordor, and the creation of the Epic Seattle office. I knew her from our time together at Microsoft, and jumped at the chance to get her for Oculus Research, because I knew we’d need her combination of great people skills and hard-nosed accountability. Her leadership style is key to enabling a diverse collection of passionate and strongly opinionated researchers and engineers to not only coexist but become something more than the sum of the parts, a team that makes better decisions than any individual could.
Laura’s first priority is always user experience, a lesson drilled home over the course of shipping dozens of games. As she puts it, “People buy experiences, not technology.” In the largely unexplored, technically challenging space of AR glasses, she emphasizes that neither technology nor experience alone can be a solution; it’s the creative dialogue between the two that will be the key.
Laura has an interesting observation about the people and culture that make Oculus Research unique: “Here, you are surrounded by people who are mostly smarter than you (at least in their specialty), and they truly do question everything, and we encourage that, because anything less won’t solve AR glasses. If you are used to mostly being the smartest/most right person, this is a hard place to be, but it’s the most remarkable opportunity for personal and professional growth that I’ve seen.”
Making AR glasses work is extraordinarily difficult. Success will require correspondingly extraordinary people, and Laura is a powerful force bringing those people together as a team.
It’s fair to say that I initially underestimated what it would take to move all this forward. When I joined Oculus, I signed up to build a 30-50 person research team. Oops. Orders of magnitude matter.
Fortunately, I also underestimated how interesting it would be.
There may have been a day in the last two years when Richard Newcombe wasn’t deeply, passionately energized, but if there was, I missed it. He messages me in the evening, in the middle of the night, first thing in the morning—he must sleep, but I’m not sure when. He is a human dynamo, and all that energy is focused on one thing: developing technology that can sense and understand the state of the world.
One example is a system that can scan a room and construct a digital model from it, which can then be used to render the room in VR, making it possible to mix the real and virtual worlds. Most people would call this computer vision, but in Richard’s mind computer vision is just part of the foundation for what he calls machine perception. Machine perception fuses various tracking systems, simultaneous localization and mapping (SLAM), machine learning, distributed networks, databases, and AI into systems that can build and maintain a dynamic model of the world, enabling personalized, contextual AI that can start to understand the parts of the world that matter to you—exactly what AR glasses will need in order to make you smarter.
Like any other modern data-based technology, machine perception gets better the more data it has to work with. The challenge is always how to get enough data to kick off that virtuous spiral of ever more data. To bootstrap the process, Richard and his Surreal Vision team have built a complete apartment, precisely measured everything in it, and constructed an exhaustive inventory of its contents at the level of detail with which a person might interact with them in a typical day. This can then be used to observe what people really do in a living space (as opposed to in artificial study settings), and as ground truth against which to measure the performance of various machine perception approaches.
Richard looks a bit like an Oxford don on the rare days when he wears his tweed jacket, an impression enhanced by his English accent. And he is one of the best computer vision researchers in the world, having won the best paper award at CVPR in 2015. Nonetheless, he is anything but an ivory tower academic. The Surreal team is focused on advancing the state of the art and then getting it out into the world—in VR headsets, in AR glasses, in smartphones, wherever it can be useful.
Ultimately, all of the Surreal team’s work is directed toward answering a fundamental question: what is it possible to know about the world? What happened in the past; what can we know about the present; what can we predict about the future? Assuming we’re not in the Matrix, there’s a world of real things out there, but we only know about them by the various bits of information, such as sound waves, scents, and photons, that they transmit to our eyes, ears, nose, and so on. Richard’s interest lies in extracting the maximum amount of information from those traces. That’s really what computer vision is about: sensing energy from the real world—photons landing on a camera sensor, for example—then evaluating the probabilities of various possible states of the sensed region of the world in order to reconstruct the most probable one. (That’s exactly what our own perceptual system does; optical illusions are simply instances where the most probable state happens to be wrong.) Thus, the process of reconstructing the state of the world is sometimes referred to as “collapsing the probability distribution.”
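In its simplest textbook form, that reconstruction is Bayesian inference: a prior over possible world states is combined with a sensor's measurement likelihood, and the distribution "collapses" to the single most probable state. Here is a toy sketch of that idea—my own illustration with an invented scenario, not anything Surreal Vision actually does:

```python
# Toy Bayesian illustration of "collapsing the probability distribution":
# a noisy sensor measures the distance to a wall; Bayes' rule combines a
# prior with a Gaussian measurement likelihood, and the reconstruction is
# the most probable (MAP) state.

import math

STATES = [0.0, 1.0, 2.0, 3.0]      # candidate wall distances, in meters
PRIOR  = [0.25, 0.25, 0.25, 0.25]  # no initial preference among them

def likelihood(measurement: float, state: float, sigma: float = 0.5) -> float:
    """Gaussian sensor model: how likely this reading is, given the state."""
    return math.exp(-((measurement - state) ** 2) / (2 * sigma ** 2))

def posterior(measurement: float) -> list:
    """Bayes' rule: prior times likelihood, normalized to sum to one."""
    unnorm = [p * likelihood(measurement, s) for p, s in zip(PRIOR, STATES)]
    total = sum(unnorm)
    return [u / total for u in unnorm]

def map_estimate(measurement: float) -> float:
    """Collapse the distribution to its single most probable state."""
    post = posterior(measurement)
    return STATES[max(range(len(STATES)), key=post.__getitem__)]

print(map_estimate(1.2))   # → 1.0: the most probable world state
```

Real machine perception replaces the four discrete states with enormous continuous state spaces and fuses many sensors over time, but the principle is the same: the answer is the state the evidence makes most probable, and, as with optical illusions, it can occasionally be wrong.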
When I first met Richard, I thought computer vision was simply a way to track a headset so the right virtual scene could be drawn. As we got to know each other, he gradually educated me about the wider scope of his thinking about machine perception, but it was half a year before he revealed the full scope of his ambition.
I remember it well. He leaned forward conspiratorially, a gleam in his eye, and said: “What I’m really trying to figure out is how to collapse the probability distribution for the entire universe.”
I wouldn’t bet against him.
Sixty years ago, a psychologist named J. C. R. Licklider had a vision of a world in which humans interacted directly with computers designed to enhance human capabilities. He nurtured that vision at ARPA in the 1960s (where he also planted the seed that grew into the Internet). His aide and eventual successor at ARPA, Bob Taylor, later led the Computer Science Lab at the Xerox Palo Alto Research Center that brought all the pieces together in the 1970s, resulting in the laser printer, Ethernet, and the Alto, the first true personal computer and the progenitor of the Mac, Windows, tablets, and smartphones. Thanks to Licklider and Xerox PARC, we are able to interact with the virtual world and each other through 2D surfaces whenever we want, wherever we go.
That was the first great wave of human-oriented computing, and it changed almost every aspect of our lives, but it wasn’t the end of the story. The full potential of human-oriented computing will only be realized when rather than interacting with the virtual world through flat portals, we live in a world that intertwines virtual and real however we want, and that is what AR and VR are all about. This is the second great wave, and unless it proves to be literally impossible to create good enough virtual experiences, augmented and virtual reality are the future, as surely as personal computers and all that’s sprung from them were the future half a century ago.
That doesn’t mean that VR and AR will automatically happen—they will require enormously sophisticated technology and breakthroughs across multiple fields. The magic can only happen when a critical mass of interlocking, collaborative talent and resources comes together, as happened at Xerox PARC 45 years ago. There are just a few places in the world that have the combination of vision, resources, business model, and people that makes that possible, and fewer still where that critical mass has actually come together.
You’ve just seen one of them.
Thanks for joining me today as I pulled the curtain back a bit on Oculus Research. There’s a lot more going on here, and over the upcoming months you’ll be hearing more about what we’re working on.
I’ll be honest and say that the primary reason I’ve written this is not so you can learn what we’re doing and how awesome it is (although it is), or so that some of the people doing groundbreaking work can be recognized for it (although they should), but because we’re looking for special people—smart, motivated people who want to change the world. And not just researchers—changing the world requires great programmers, hardware engineers, PMs, managers, and more, especially since our mission is centered on building useful technology and getting it out into the world. I wrote this because it was the most effective way I could think of to reach out to those exceptional people and get them to consider whether Oculus Research might be the most exciting, fulfilling, interesting place for them to spend the next however many years. So if, as you were reading, you started thinking, “Wow, this is what I need to be doing!”—please get in touch.
I look forward to hearing from you.