
Virtual reality and augmented reality displays: advances and future perspectives

Kun Yin 1, Ziqian He 1, Jianghao Xiong 1, Junyu Zou 1, Kun Li 2 and Shin-Tson Wu 1,3

Published 8 April 2021 • © 2021 The Author(s). Published by IOP Publishing Ltd
Journal of Physics: Photonics, Volume 3, Number 2
Citation: Kun Yin et al 2021 J. Phys. Photonics 3 022010
DOI: 10.1088/2515-7647/abf02e


Author affiliations

1 College of Optics and Photonics, University of Central Florida, Orlando, FL 32816, United States of America

2 Goertek Electronics, 5451 Great America Parkway, Suite 301, Santa Clara, CA 95054, United States of America

Author notes

3 Author to whom any correspondence should be addressed.

Shin-Tson Wu https://orcid.org/0000-0002-0943-0440

  • Received 10 December 2020
  • Accepted 18 March 2021
  • Published 8 April 2021

Peer review information

Method: Single-anonymous. Revisions: 1. Screened for originality: yes.


Virtual reality (VR) and augmented reality (AR) are revolutionizing the ways we perceive and interact with various types of digital information. These near-eye displays have attracted significant attention and efforts due to their ability to reconstruct the interactions between computer-generated images and the real world. With rapid advances in optical elements, display technologies, and digital processing, some VR and AR products are emerging. In this review paper, we start with a brief development history and then define the system requirements based on visual and wearable comfort. Afterward, various VR and AR display architectures are analyzed and evaluated case by case, including some of the latest research progress and future perspectives.


Original content from this work may be used under the terms of the Creative Commons Attribution 4.0 license. Any further distribution of this work must maintain attribution to the author(s) and the title of the work, journal citation and DOI.

1. Introduction

As promising next-generation displays, virtual reality (VR) and augmented reality (AR) provide an attractive new way for people to perceive the world. Unlike conventional display technologies, such as TVs, computers, and smartphones, which place a panel in front of the viewer, VR and AR displays are designed to revolutionize the interactions between the viewer, the display, and the surrounding environment. As information-acquisition media, VR and AR displays bridge the gap between computer-generated (CG) images and the real world. On the one hand, VR displays generate a fully immersive virtual environment from CG images, with a sufficient field of view (FOV) to provide a refreshing virtual experience without relying on the viewer's real environment. On the other hand, AR displays offer see-through capability that enriches the surrounding environment. By overlapping virtual images with the real world, viewers can immerse themselves in an imaginative world that combines fiction and reality.

Although some commercial VR and AR displays have emerged in recent years, the origin of this technology can be traced back to the last century [ 1 ]. With the introduction of the head-mounted display (HMD) and the virtual environment in the 1960s [ 2 , 3 ], such a novel display concept was once considered state-of-the-art. However, lacking flat panel displays, image rendering capabilities, related sensors, wireless data transfer, and well-designed optical components, this display technology, then ahead of its time, stalled. Fortunately, with the rapid development of optics [ 4 – 6 ], high-resolution displays [ 7 ], and information technologies [ 8 ] in recent years, VR and AR are blooming again. Because of the impressive visual experience and the high degree of interaction between viewers and CG images, VR and AR are promising for widespread applications, including, but not limited to, healthcare, education, engineering design, manufacturing, and entertainment.

The goal of VR and AR displays is to provide reality-like clear images that can simulate, merge into, or rebuild the surrounding environment without wearer discomfort [ 9 , 10 ]. Specifically, visual comfort has to meet the requirements of the human visual system based on the eye-to-brain imaging process; otherwise the viewer will perceive the scene as unreal or unclear, or even become dizzy and nauseous. Usually, the human eye has a large FOV: about 160° in the horizontal and 130° in the vertical direction for each eye (monocular vision), and the overlapped binocular vision still covers 120° in the horizontal direction [ 11 ]. In parallel, the dioptric power of the eye lens and the rotation of the eyeballs work together to focus on different parts of a real object with the correct depth of field while blurring the rest [ 12 ]. Therefore, to achieve visual comfort, the optical system should provide an adequate FOV, generate 3D images with matched depth and high resolution, and offer sufficient contrast and brightness, to name just a few requirements. Regarding wearable comfort, a compact and lightweight structure is desired for long-time use. At present, owing to the trade-offs between different optical components and system designs, it is still challenging for VR and AR to meet all these goals. Therefore, in this paper we focus on advanced VR and AR architectures aimed at visual and wearable comfort, and provide a more comprehensive understanding of the status quo.

2. Advanced architectures for VR displays

Figure 1(a) depicts a schematic diagram of a VR optical system. For visual comfort, a broad FOV covering the human vision range can be achieved by designing a compact eyepiece with a low f-number (f/#) [ 13 ]. However, because the experience is fully immersive in a completely virtual environment, the main issue lies in CG-3D image generation. When evaluating the capability of generating 3D images in VR, an important aspect of the human visual system is stereo perception. Real observation of a 3D object induces an accommodation cue (the focus of the eyes) and a vergence cue (the relative rotation of the eyes) that match each other (figure 1(b): left) [ 14 , 15 ]. However, most current VR systems have only one fixed display plane with different rendered contents. To capture the image information, the viewer's eyes focus on the display plane, but the position of the CG-3D object is usually not in that plane. As a result, the brain drives the two eyes to converge on the virtual 3D object while each eye lens accommodates to the display plane, leading to mismatched accommodation and vergence distances (figure 1(b): right). This phenomenon is called the vergence–accommodation conflict (VAC) [ 16 ], which causes dizziness and nausea. Besides visual comfort, the overall weight and volume of the system also limit the usage time and applications. To achieve wearable comfort, the system should be as light as possible while keeping a broad FOV in the virtual space. In this section, we focus on advanced VR architectures that address 3D image generation to mitigate the VAC and reduce the headset volume.
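The severity of the VAC is often quantified as the dioptric difference between the accommodation distance (the fixed display plane) and the vergence distance (the rendered object). A minimal sketch of this common metric, with illustrative distances not taken from the paper:

```python
def vac_mismatch_diopters(display_dist_m: float, virtual_obj_dist_m: float) -> float:
    """Vergence-accommodation conflict expressed in dioptres (1/m).

    The eyes accommodate on the fixed display plane but converge on the
    rendered 3D object; the conflict is the difference of the two
    distances expressed in dioptres.
    """
    return abs(1.0 / display_dist_m - 1.0 / virtual_obj_dist_m)

# Display plane fixed at 2 m, virtual object rendered at 0.5 m:
print(vac_mismatch_diopters(2.0, 0.5))  # 1.5 D of conflict
```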


Figure 1.  (a) The layout of a VR optical system. (b) The root cause of the VAC issue. The accommodation cue coincides with the vergence cue when viewing a real object (left). Mismatch occurs when viewing a virtual object displayed in a fixed plane (right).


2.1. VAC mitigation

2.1.1. Multi-focal system

The multi-focal display was proposed to solve the VAC problem of HMDs in the late 1990s [ 17 ]. The basic principle of a multi-focal system is to generate multiple image planes or shift the position of image planes to match the vergence distance and accommodation distance, thereby overcoming the VAC issue. Based on different architectures and principles, multi-focal VR systems can be categorized into space multiplexing, time multiplexing, and polarization multiplexing systems.

Space multiplexing can simultaneously generate multiple image planes at different depths. To achieve this goal, Rolland et al [ 18 ] proposed a straightforward method that physically stacks multiple screens based on transparent panels, as illustrated in figure 2(a). However, the transparent panels not only increase the cost but also exhibit obvious moiré patterns once multiple panels are stacked together [ 19 ]. To avoid this problem, beam splitters (BSs) can be utilized to build the space multiplexing system, as figure 2(b) shows [ 20 ]. In this design, the display panel is placed on one side, while the BSs reflect different parts of the display. Since the distance between each BS and the human eye is different, the images appear at different depths. Space multiplexing provides a direct solution to the VAC in VR displays and maintains image quality and frame rate. However, this architecture requires multiple display panels or BSs, which leads to dramatically increased weight and volume. Recently, a focal-plane display with a phase-only spatial light modulator (SLM) has been demonstrated [ 21 ]. This architecture can achieve multiple focal planes with reduced system size and weight, but it requires an expensive SLM, and the image quality is not yet ready for commercial products.


Figure 2.  A schematic diagram of space multiplexing using multiple (a) transparent panels and (b) BSs. Time multiplexing by (c) shifting the display and (d) applying a tunable lens. Polarization multiplexing by (e) applying PPML.

The time multiplexing method relies on dynamic components that rapidly change the panel distance (figure 2(c)) or the effective focal length (figure 2(d)) [ 22 , 23 ]. The panel distance is usually changed by a mechanical motor, which limits stability and modulation rate. For time multiplexing, the modulation rate of the dynamic components should be at least N times the display frame rate (N being the number of image planes) to avoid motion blur; for example, rendering four image planes at a 60 Hz frame rate requires a modulation rate of at least 240 Hz. Therefore, compared with mechanically tuning the panel position, tuning the effective focal length through an electrically driven eyepiece is more favourable. Although it is still challenging to fabricate an adaptive lens with a wide tuning range and fast response time, this method reduces the number of physical elements, so the system volume is much more compact than that of space multiplexing.

Polarization multiplexing generates multiple image planes based on different polarization states. To distinguish the polarization states, the most critical optical component is a polarization-dependent lens with different focal lengths for two orthogonal polarization states. Two such examples are (a) the Pancharatnam–Berry phase lens, based on left-handed circularly polarized (LCP)/right-handed circularly polarized (RCP) light, and (b) the birefringent lens, based on horizontally/vertically linearly polarized light [ 24 , 25 ]. Figure 2(e) depicts the basic polarization multiplexing system. The light emitted from the display panel transmits through a pixelated polarization modulation layer (PPML), which modulates the ratio of the two orthogonal polarization states, so the light intensity of each pixel in the corresponding focal plane can be adjusted independently. The PPML can be a polarization rotator for a linearly polarized system [ 26 ] or an integrated polarization rotator and quarter-wave plate for a circularly polarized system [ 27 ]. The advantage of polarization multiplexing is that it generates multiple image planes without sacrificing frame rate or enlarging the system volume. However, its major limitation is that only two orthogonal polarization states can be utilized. It should be mentioned that these multiplexing approaches can be combined; for example, time or space multiplexing can be combined with polarization multiplexing to increase the number of focal planes [ 27 , 28 ].
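As a Jones-calculus sketch of the linearly polarized variant described above, a per-pixel rotation angle sets how a pixel's power is split between the two focal planes of the birefringent lens (the rotator-per-pixel model here is an assumption for illustration, not the authors' implementation):

```python
import numpy as np

def plane_intensities(rotation_deg: float):
    """Power sent to the two focal planes for unit-intensity horizontal input.

    A per-pixel polarization rotator turns the linear polarization by
    rotation_deg; the birefringent lens focuses the horizontal component
    to plane 1 and the vertical component to plane 2.
    """
    a = np.radians(rotation_deg)
    jones = np.array([np.cos(a), np.sin(a)])  # rotated linear polarization
    return jones[0] ** 2, jones[1] ** 2

for angle in (0, 30, 45, 90):
    print(angle, np.round(plane_intensities(angle), 3))
# 0 and 90 deg place the pixel entirely in one plane; 45 deg splits it 50/50.
```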

2.1.2. Micro-lens array system

Unlike using a large single lens as an eyepiece, another advanced architecture adds a micro-lens array (MLA) in front of the display panel to globally or individually change the position of virtual images in a VR system [ 10 ]. When the MLA is precisely aligned with the display panel, a small movement of the MLA leads to a large focus change of the virtual image. As a result, instead of moving a thicker display panel or a bulkier lens over a longer range, pushing or pulling the MLA plate a small distance can significantly mitigate the VAC. It is worth mentioning that the focus of an MLA based on liquid crystal materials can be switched dynamically, with an effect equivalent to several microns of travel, so the movement of virtual images can be obtained without any mechanical motion, as shown in figure 3(a). Furthermore, as figure 3(b) shows, if each MLA element can be controlled precisely and independently, a specific focus can be produced for each lenslet, generating pixelated depth. These techniques are suitable for VR displays as well as for free-space couplers in AR displays. It is worth mentioning that in MLA systems the resolution is usually an important issue that needs further improvement.
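The thin-lens equation shows why a micron-scale change in the panel-to-MLA separation produces a large shift of the virtual image; a minimal sketch with illustrative values (not from the paper):

```python
def virtual_image_distance_mm(f_mm: float, object_dist_mm: float) -> float:
    """Thin-lens equation, 1/v = 1/f - 1/u; u < f yields a virtual image (v < 0)."""
    return 1.0 / (1.0 / f_mm - 1.0 / object_dist_mm)

f = 2.0  # lenslet focal length (mm), illustrative
for gap in (1.90, 1.95, 1.99):  # panel-to-MLA separation (mm), just inside f
    print(f"{gap} mm -> virtual image at {virtual_image_distance_mm(f, gap):.0f} mm")
# 1.90 -> -38 mm, 1.99 -> -398 mm: a 90 um mechanical shift moves the
# virtual image by roughly 0.36 m, which is the leverage exploited above.
```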


Figure 3.  A schematic diagram of focus tuning systems based on (a) an electronically addressable MLA and (b) an individually tunable MLA. A schematic diagram of (c) a real object and (d) light field with an MLA.

2.1.3. Light field system

To mitigate the VAC, both temporally and spatially multiplexed displays have been proposed. However, due to a limited or discrete tuning range, these methods can only partially recreate a 3D object with the correct depth. Rather than changing the image focus, light field displays ideally recreate a physical wavefront similar to that created by a real object. The light field (e.g. integral imaging) [ 29 – 31 ] can be generated by a lens array that converts the light from display pixels into rays with arbitrary spatial angles. As depicted in figure 3(d), the spatial points correspond to the pixels on the display panel. To display a virtual 3D object, we trace the points on the object and light the corresponding pixels on the display panel; the light field at those points can then be approximated with discrete emitting rays. Although this method provides correct depth information and retinal blur, the resolution is sacrificed. If the amount of information is taken into consideration, it is not surprising that approaches aiming to show true 3D information cannot offer sufficient resolution, owing to the limited bandwidth of current devices. Generally, the resolution is limited by the display and the individual lens. Although a high-resolution display has been proposed, the pixel pitch is still determined by the diffraction limit of the employed lens [ 29 ]. These approaches should gradually mature in the long run and eventually reach a satisfactory level for viewers, but at the current stage the main drawbacks of this architecture are resolution loss, increased refresh-rate demands, and/or redundancy of display panels.
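The resolution sacrifice follows directly from the fact that panel pixels are shared between spatial positions and view directions; a back-of-the-envelope sketch with illustrative numbers:

```python
# Spatio-angular trade-off in an integral-imaging light field display.
panel_px = (2400, 2400)  # display panel resolution (illustrative)
views = (8, 8)           # angular samples, i.e. pixels behind each lenslet

spatial_res = (panel_px[0] // views[0], panel_px[1] // views[1])
print(spatial_res)       # (300, 300): 64 views cost a 64x drop in spatial resolution
```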

2.2. Pancake VR

As discussed above, aside from visual comfort, wearable comfort is another important consideration. To reduce the volume and weight of a VR system, and thereby improve its wearable comfort, a compact optical design that also accounts for the headset's centre of gravity is urgently needed. Recently, polarization-based folded optics (or pancake optics) with further reduced form factors have attracted increasing attention. The system was originally proposed for flight simulators [ 32 ] and has gained renewed interest with the rapid development of VR [ 33 , 34 ]. The basic concept is to create a cavity that folds the optical path into a smaller space. The working mechanism is illustrated in figure 4(a). The cavity lies between a BS and a reflective polarizer. The BS (a metallic or dielectric half mirror) has 50% transmittance, and it flips the handedness of incident polarized light upon reflection. The reflective polarizer selectively transmits light with one polarization state and reflects the orthogonal one; it can be a wire-grid polarizer, a birefringent multi-layer film, or a cholesteric liquid crystal (CLC). The former two respond to linear polarization, while the latter responds to circular polarization. To explain the working principle, we use circularly polarized light as an example. As shown in figure 4(a), the incoming RCP light in region A first passes through the BS (50%) and is reflected by the reflective polarizer. It is then reflected by the BS again (25%) and thereby flipped to the LCP state. Finally, the LCP light passes the reflective polarizer and enters region C. Because of the BS, only 25% of the total energy is delivered to the viewer's side, so system efficiency is an important issue in pancake VR systems. Practical systems often involve one or more refractive elements, which can be placed in any of the labelled regions A–C. The surfaces of the reflective polarizer and BS can also be curved according to design requirements. An example with a refractive lens placed in region B is plotted in figure 4(b); the BS (half reflector) in this case is coated on the curved surface of the lens.
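The 25% energy budget follows directly from the two interactions with the 50/50 beam splitter; a minimal bookkeeping sketch, assuming an ideal half mirror and a lossless reflective polarizer:

```python
# Energy budget of the folded (pancake) path for the RCP input described above.
half_mirror_T = 0.5  # transmission through the 50/50 beam splitter
half_mirror_R = 0.5  # reflection off the beam splitter (flips handedness)

throughput = 1.0
throughput *= half_mirror_T  # RCP enters the cavity through the BS
throughput *= 1.0            # reflected by the reflective polarizer (ideal)
throughput *= half_mirror_R  # reflected by the BS again, now LCP
throughput *= 1.0            # LCP transmits through the reflective polarizer
print(throughput)            # 0.25: at most 25% of the light reaches the eye
```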


Figure 4.  Polarization-based folded optics. (a) An illustration of the working principle. (b) An example of folded optics with a refractive lens. (c) A CLC-based reflective polarizer with optical power. (d) A reflective hologram with angular selectivity.

All the above discussions consider only traditional geometric optics, where the optical power is provided by reflection or refraction at curved surfaces. Recent advances in holographic optics, however, offer an even wider range of choices for optical elements. Both the reflective polarizer and the BS can be flat holographic films [ 35 ]. As figure 4(c) shows, the reflective polarizer can gain focusing power by patterning the CLC molecules: the polarization selectivity of the CLC provides optical power for one circular polarization and total transparency for the other. The BS can also be replaced by a phase hologram, often fabricated by holographic exposure of a photopolymer [ 36 ]. Its index modulation is usually small, resulting in narrow angular and spectral responses. This angular selectivity can be utilized to boost the overall system efficiency. As depicted in figure 4(d), for a given reflective hologram, light within the angular response is reflected with flipped handedness, while incident light that does not meet the Bragg condition traverses the hologram. With this feature, both the transmission and reflection efficiencies of the BS can approach 100%, so the overall system efficiency can in principle be improved from 25% to nearly 100%. However, the narrow angular and spectral selectivity also implies the need for a directional backlight with narrow spectral linewidth, which could be challenging in practical implementations.

3. Advanced architectures for AR displays

In contrast to the immersive experience provided by VR displays, AR displays aim for see-through systems that overlap CG images with the physical environment. To obtain this unique visual experience with wearable comfort, the near-eye system needs high transmittance, a sufficient FOV, and a compact form factor. Freeform optics can provide a broad FOV with high transmittance; however, their prism shape results in a relatively large volume and heavy weight. To reduce the system size while keeping a sufficient FOV, lightguide-based structures and free-space couplers are commonly used to strike a delicate balance between visual comfort and wearable comfort.

3.1. Freeform prisms and BS architectures

Freeform prisms have been extensively investigated, aided by the development of diamond-turning machines. Typically, a freeform prism used in an AR system needs a partially reflective surface and a total internal reflection (TIR) surface to overlap the CG images with the transmitted surroundings. As shown in figure 5(a), this configuration incorporates two refraction surfaces, a TIR surface, and a partial reflection surface into one element, and therefore allows extra design freedom [ 37 , 38 ]. This design provides high-quality images with a wide FOV, but the prism volume makes the entire system bulky and heavy. Another common example of a freeform-based AR device uses a specially designed BS cube as the coupler. In figure 5(b), the magnifying optics is a reflective concave mirror attached directly to the BS cube, which leaves more freedom for further optimization. This architecture provides the simplest solution to an AR display with a broad FOV, but at a larger form factor. Moreover, there is a trade-off between the FOV and the eyebox (or exit pupil) due to the conservation of étendue, which is proportional to the product of the FOV and the eyebox: the larger the FOV, the smaller the eyebox [ 39 ].


Figure 5.  A schematic AR diagram with (a) a freeform prism and (b) a specially designed BS cube.

3.2. Lightguide-based architectures

Compared to the freeform design, the lightguide-based structure offers a more balanced performance between visual comfort and wearable comfort, especially thanks to its compact and thin form factor [ 40 , 41 ]. Over the past decade, the lightguide-based near-eye display (LNED) has become one of the most widely used architectures for AR displays and is applied in many commercial products, such as HoloLens 2, Magic Leap 1, and Lumus devices. For an LNED, the input and output couplers are the pivotal optical elements affecting system performance. Typically, the input coupler has a high efficiency so that the light emitted from the optical engine is fully utilized. In contrast, the output coupler has a low, gradient efficiency across the exit pupil to ensure an expanded and uniform eyebox (see the sketch below). According to the coupler design, LNEDs can be categorized into grating-based lightguides (figure 6(a)) and geometrical lightguides (figure 6(b)).
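To see why the output coupler needs a gradient rather than uniform efficiency, consider replicating the exit pupil N times with equal brightness: each successive interaction must extract a growing fraction of the remaining guided light. A minimal sketch (illustrative, not from the paper):

```python
def gradient_efficiencies(n_bounces: int):
    """Out-coupling fraction at each interaction for uniform eyebox brightness."""
    return [1.0 / (n_bounces - k) for k in range(n_bounces)]

effs = gradient_efficiencies(4)
print([round(e, 3) for e in effs])  # [0.25, 0.333, 0.5, 1.0]

# Check: every interaction delivers the same power to the eyebox.
remaining, delivered = 1.0, []
for e in effs:
    delivered.append(round(remaining * e, 3))
    remaining *= 1.0 - e
print(delivered)                    # [0.25, 0.25, 0.25, 0.25]
```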


Figure 6.  Schematic diagrams of (a) grating-based and (b) geometrical lightguide-based AR architectures.

3.2.1. Grating-based lightguide

As shown in figure 6(a), the display light is coupled into the lightguide by an input grating and then propagates inside the lightguide through TIR. When it encounters the output grating, the light is replicated and diffracted into the viewer's eyes. To provide a comprehensive understanding, we first analyze the FOV limit theoretically and then discuss the commonly used grating couplers. For a diffraction grating, the first-order grating equation can be stated as:

\( n_{\rm out}\sin\theta_{\rm out}=n_{\rm in}\sin\theta_{\rm in}+\lambda/\Lambda \)   (1)

where θ_in and θ_out represent the incident angle and diffracted angle, respectively, n_in and n_out are the refractive indices of the incident medium and output medium, λ is the wavelength in vacuum, and Λ is the grating period. With this simple grating equation, the maximum system FOV can be calculated. If we assume the FOV in air is centrosymmetric, then the viewing angle in air (θ_air) is related to the minimum/maximum guiding angles (θ_min/θ_max) in the lightguide as:

\( n_{\rm air}\sin\theta_{\rm air}=n_{\rm g}\sin\theta_{\max}-\lambda/\Lambda=\lambda/\Lambda-n_{\rm g}\sin\theta_{\min} \)   (2)

where n_g is the refractive index of the lightguide, n_air is the refractive index of air, θ_min can be set to the TIR angle in the lightguide, and θ_max should be less than 90°. Thus, the maximum horizontal FOV is [ 42 ]:

\( \mathrm{FOV}_{\max}=2\sin^{-1}\!\left[\dfrac{n_{\rm g}\left(\sin\theta_{\max}-\sin\theta_{\min}\right)}{2\,n_{\rm air}}\right] \)   (3)

Figure 7(a) shows the FOV as a function of n_g and θ_max. In the ideal case where θ_max = 90° and n_g = 2, the maximum system FOV is only 60°. In practical designs, such a high-index lightguide substrate is still challenging to manufacture, and θ_max cannot approach 90° for image-quality reasons. This FOV limit holds for most grating-based lightguide AR systems, although some methods can circumvent it: with a different system configuration [ 42 ] the FOV can be expanded to 100°, and by leveraging polarization-dependent optical elements the FOV can be nearly doubled [ 43 ]. In equation (3) the FOV appears independent of the wavelength, but the wavelength dependency is implicitly embedded in equation (2). For the extreme case with θ_max = 90° and n_g = 2, if the waveguide is designed at 535 nm, then the grating period is calculated to be 357 nm and the horizontal FOV is [−30°, 30°]. Using the same grating period for blue (e.g. 450 nm) and red (e.g. 630 nm), with the assumption that the angle ranges in the lightguide are the same, leads to FOVs of [−15°, 48°] and [−50°, 14°], respectively. Thus, more than one grating is needed to obtain the same FOV for the RGB colours. Although implementing three gratings with narrow spectral bandwidths for R, G, and B in one lightguide is possible, it is still hard to eliminate colour crosstalk among the gratings. A more common choice is to use two (e.g. one for R, and one for G and B) or three (e.g. R, G, and B) lightguides [ 44 ], slightly sacrificing the system's compactness. Another important aspect is that the spectral response of most gratings depends on the incident angle. This can be well illustrated using a volume Bragg grating (VBG) as an example. For a VBG, the central wavelength is defined by the Bragg condition as:


Figure 7.  (a) FOV as a function of lightguide refractive indexes and maximum guiding angles. (b) Angle dependency of a VBG ( n eff = 1.5) designed for 535 nm and a diffraction angle of 50° at normal incidence. The inset shows the definition of θ , which is the angle of incident light relative to the normal direction of Bragg planes. For reflective VBGs: diffraction efficiency as a function of (c) wavelength and (d) incident angle; for transmissive VBGs: diffraction efficiency as a function of (e) wavelength and (f) incident angle. Simulations are based on rigorous coupled wave analysis.

\( \lambda_{\rm c}=2\,n_{\rm eff}\,\Lambda_{\rm B}\cos\theta \)   (4)

where θ represents the incident light angle with respect to the normal direction of the Bragg planes (see the inset in figure 7(b)), Λ_B is the period of the Bragg planes, and n_eff is the effective refractive index of the VBG. If a VBG (e.g. n_eff = 1.5) is designed for normally incident green light (λ = 535 nm) with a 50° diffraction angle in the lightguide, then the angle-dependent central wavelength can be calculated, as figure 7(b) depicts. For such a VBG, the central wavelength shifts from green to blue as the incident angle increases. Therefore, when designing a VBG-based lightguide AR display for full-colour operation, such colour crosstalk should be carefully analyzed.
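The FOV limit of equation (3), the colour-dependent FOV ranges quoted above, and the angular shift of the VBG central wavelength in equation (4) can all be checked numerically; a sketch that reproduces the quoted numbers (the 65° Bragg-plane angle below follows from the assumed normal-incidence, 50° diffraction geometry):

```python
import numpy as np

def max_fov_deg(n_g: float, theta_max_deg: float, n_air: float = 1.0) -> float:
    """Maximum horizontal FOV of a grating-based lightguide, equation (3)."""
    sin_tir = n_air / n_g  # theta_min set to the TIR angle
    sin_max = np.sin(np.radians(theta_max_deg))
    return 2 * np.degrees(np.arcsin(n_g * (sin_max - sin_tir) / (2 * n_air)))

print(max_fov_deg(n_g=2.0, theta_max_deg=90))  # 60.0 deg, the ideal-case limit

# Colour dependence: grating designed at 535 nm (period ~357 nm), n_g = 2.
n_g, period = 2.0, 535 / 1.5                   # lambda/period = 1.5 at 535 nm
sin_guided = np.array([0.5, 1.0])              # guided range [TIR angle, 90 deg]
for lam in (450, 535, 630):
    sin_air = n_g * sin_guided - lam / period  # equation (2)
    print(lam, np.round(np.degrees(np.arcsin(sin_air)), 1))
# 450 -> [-15.2 47.6], 535 -> [-30. 30.], 630 -> [-50. 13.5]

# VBG colour crosstalk, equation (4): central wavelength versus incident angle.
n_eff = 1.5
theta_b = np.radians((180 - 50) / 2)           # Bragg-plane angle for this design
period_b = 535 / (2 * n_eff * np.cos(theta_b))
for theta_deg in (55, 60, 65, 70, 75):
    lam_c = 2 * n_eff * period_b * np.cos(np.radians(theta_deg))
    print(theta_deg, round(lam_c, 1))          # 535 nm at 65 deg, bluer at larger angles
```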

In terms of grating couplers, two types are commonly used in lightguide AR: holographic VBGs and surface relief gratings (SRGs). In holographic VBGs, a sinusoidal refractive index modulation in the volume is introduced by interference exposure of holographic photopolymers. The refractive index modulation can be described by [ 45 ]:

\( n(\mathbf{r})=n_{0}+\Delta n\,\cos(\mathbf{K}\cdot\mathbf{r}) \)   (5)

where n_0 is the average refractive index, Δn is the amplitude of the index modulation, and K is the grating vector.

Unlike holographic VBGs, which have refractive index modulation in the bulk, SRGs have specially designed microstructures on the surface, which can be mass-produced by nanoimprinting [ 48 ]. The surface structures offer a large degree of design freedom: the grating shapes can be blazed, slanted, binary, or even analogue, according to different needs [ 10 ]. The spectral and angular responses of SRGs strongly depend on the shape of the surface structures. Owing to the high refractive index contrast between the substrate and air, the structure height can be submicron while still achieving high diffraction efficiency.

Besides the holographic VBG and SRG, the CLC-based polarization volume grating (PVG) is also a strong contender [ 49 , 50 ]. Due to their volume-grating nature, PVGs can be treated as a branch of holographic VBGs, and their spectral and angular responses are very similar. However, PVGs exhibit some unique properties. First, PVGs are strongly circular-polarization dependent, a property originating from the CLC structure [ 51 ], while VBGs and SRGs have only weak polarization dependency on linear polarizations. For example, a left-handed reflective PVG diffracts only the LCP light within its bandwidth into the first order, while transmitting the RCP light. This feature is useful for designing polarization-dependent optical elements. Second, if we use equation (5) to approximately describe the behaviour of PVGs (in fact, to describe PVGs the refractive indices in equation (5) should be replaced by dielectric constants), Δn can be very large: for instance, if the host liquid crystal has a birefringence of 0.2, the effective Δn can be as large as 0.5∼0.6 of the birefringence when approximated as a VBG. As a result, the spectral and angular bandwidths of PVGs can be much larger than those of holographic VBGs. Moreover, recent studies show that multi-layer PVGs or gradient-pitch PVGs can be readily fabricated to further enlarge the angular bandwidth [ 52 , 53 ].

3.2.2. Geometrical lightguide

Compared to grating-based lightguides, geometrical lightguides need more complex designs (e.g. spatially variant coatings) to achieve gradient efficiency, and it is relatively hard to add optical power to the output. However, the working principle is very simple, and all the designs are based on surface reflection. Generally, geometrical lightguides use embedded reflective surfaces as the exit pupil expander to reflect and replicate the light [ 54 , 55 ].

As figure 6(b) shows, a series of cascaded, embedded, partially reflective surfaces can be used as output couplers in the geometrical lightguide architecture. Because the embedded surfaces are purely reflective, this approach yields good colour uniformity over the entire FOV. However, the cascaded design produces a louvre effect [ 10 ], which is unfavourable for see-through devices. Recently, this effect has been reduced through better cutting, polishing, coating, and design, but it remains a limitation, and the complicated fabrication processes place a greater burden on manufacturers. As extensions, the embedded partially reflective surfaces can be designed as flat surfaces (figure 6(b)), pin-shaped mirror arrays (figure 8(a)), microprism arrays (figure 8(b)), or curved surfaces in a curved lightguide (figure 8(c)) [ 56 ].


Figure 8.  A schematic diagram of geometrical lightguide AR: (a) microprism array, (b) pin-shaped mirror array, and (c) curved coupler.

3.3. Free-space coupler-based architectures

Unlike freeform optical devices or LNEDs, free-space couplers have greater freedom in the architecture, and there are no special restrictions on volume or TIR. Undoubtedly, due to large degrees of freedom, numerous architectures based on free-space couplers have been proposed, but each design has its pros and cons. These systems can be classified into three categories based on the working principles: reflective coupler, diffusive coupler, and diffractive coupler.

3.3.1. Reflective coupler

A reflective free-space coupler is based on surface reflection from a flat or curved surface. Because of the high-transmittance requirement, these surfaces should be partially reflective, balancing sufficient reflection against transmittance. Figure 9(a) depicts the most straightforward architecture with a flat coupler, a tilted partially reflective surface: the CG images emitted from the display are collimated by the lens and then reflected into the viewer's eye by the flat coupler. To further simplify the system, the flat coupler can be replaced by a partially reflective curved or freeform surface with a specially designed profile, as shown in figure 9(b). This design targets the direct use of a large display panel, such as a smartphone display, rather than a micro-display with complex off-axis imaging optics. The architecture has been successfully applied in Meta 2 by Meta Vision, DreamGlass by Dream World, and North Star by Leap Motion. Due to the large display panel and the curved reflective surface, such a reflective coupler exhibits a relatively broad FOV but also a large system volume.


Figure 9.  A schematic diagram of reflective free-space coupler-based AR: (a) a flat coupler, and (b) a curved coupler. Illustrations of diffusive free-space coupler-based AR: (c) a single diffuser, and (d) multiple diffusers.

3.3.2. Diffusive coupler

A diffusive free-space coupler is based on light scattering from optical elements [ 57 ]. In such a system, the displayed images are directly projected onto the coupler, which is usually a diffuser with a flat or curved surface. As illustrated in figure 9(c), the light is scattered by the coupler so that the image is formed on the diffuser surface. Usually, the image source is a liquid-crystal-on-silicon (LCoS) panel or a digital micro-mirror device, and the image resolution is determined by the display and the projection lens. To retain see-through capability, the diffuser should have angular selectivity, scattering the off-axis incident image while transmitting the environmental light in front of the eye. Consequently, the system can accommodate more than one diffuser and thereby construct a 3D image with multiple planes [ 58 ], similar to the multiplane designs in VR systems. As depicted in figure 9(d), each diffuser scatters the incoming light at its corresponding incident angle without interfering with the others.

3.3.3. Diffractive coupler

A diffractive free-space coupler is based on flat diffractive optical elements with designed phase profiles, such as lenses or freeform phase elements [ 59 , 60 ]. More specifically, architectures based on diffractive couplers can be divided into free-space systems, Maxwellian systems, and integral imaging systems. Free-space diffractive couplers, as illustrated in figure 10(a), are utilized in pupil-forming systems: relay optics first image the object and then deliver the relayed image to the viewer's eye through the diffractive coupler [ 61 , 62 ]. The image source includes, but is not limited to, a conventional 2D display or a laser light source. However, due to the nature of diffractive flat optics and the off-axis system configuration, aberrations such as coma and astigmatism are large and need to be tackled with sophisticated optical design or image pre-processing. The Maxwellian system adopts the principle of the Maxwellian view [ 63 ], which directly forms a focus-free image on the retina. The diffractive coupler can be a reflective off-axis lens with a designed focal length [ 64 , 65 ]. Because the light needs to be focused onto the pupil, the eyebox of a Maxwellian system is relatively small; to expand it, exit-pupil shifting can be applied to increase the area covered by the focal points [ 66 ]. Generally, the image light is focused by the coupler and the focal spot is located at the eye lens. As a result, the image on the retina stays in focus no matter how the optical power of the eye lens changes. Depending on the image source, the system can be realized with an LCoS panel (figure 10(b)) or a laser beam scanner (LBS) (figure 10(c)) for a simpler design. The light field approach with an MLA can also be applied to AR, just as in the VR displays discussed above [ 67 , 68 ]. As depicted in figure 10(d), a typical configuration uses a projection system to relay the original image from the image source to near the focus of the diffractive coupler, similar to the free-space configuration. The relayed image then works in the same way as depicted in figure 3(d) and produces the light field of a 3D virtual object. Similar to the multiplexing methods in VR displays, these different AR architectures are not mutually exclusive; on the contrary, they can be combined to balance their respective advantages and trade-offs, and even to enable new features [ 69 ].


Figure 10.  A schematic diagram of diffractive free-space coupler-based AR: (a) a free-space diffractive coupler, a Maxwellian system with (b) SLM and (c) LBS, and (d) an integral imaging system.

To quantitatively summarize the performance of AR architectures in terms of visual comfort and wearable comfort, table 1 compares the form factor and FOV of the different coupling methods. It should be mentioned that, for each architecture, the performance can be improved beyond the quoted values, but at the cost of other parameters. Therefore, the values listed in table 1 represent typical conditions rather than strict limits.

Table 1.  Performance comparison of various AR architectures.

a These values depend not only on the FOV and eyebox design but also on the optical engine. b Typical values taken from products and prototypes.

4. Conclusions

In this review, we have summarized the advanced architectures, with their different optical components, of rapidly evolving VR and AR systems, including the most recent optical research and products, and analyzed the systems case by case in terms of visual and wearable comfort. With various advanced architectures offering unique features, such as mitigating the VAC through adjustable lenses, achieving compact size with polarizing films, and providing a large FOV through freeform optics, VR and AR displays present both scientific significance and broad application prospects. Although it is still challenging at the current stage for these architectures to meet all the requirements of visual and wearable comfort, learning about and reviewing advanced systems will certainly help focus attention on unresolved issues and inspire more elegant solutions.

Acknowledgments

The authors are indebted to GoerTek Electronics for financial support.

Data availability statement

All data that support the findings of this study are included within the article (and any supplementary files).



Full-colour 3D holographic augmented-reality displays with metasurface waveguides

  • Manu Gopakumar (ORCID: orcid.org/0000-0001-9017-4968) 1,
  • Gun-Yeal Lee (ORCID: orcid.org/0000-0001-6274-8776) 1,
  • Suyeon Choi 1,
  • Brian Chao 1,
  • Yifan Peng 2,
  • Jonghyun Kim 3 &
  • Gordon Wetzstein (ORCID: orcid.org/0000-0002-9243-6885) 1

Nature (2024). Open access, published 08 May 2024.


Subjects: Nanophotonics and plasmonics

Emerging spatial computing systems seamlessly superimpose digital information on the physical environment observed by a user, enabling transformative experiences across various domains, such as entertainment, education, communication and training 1,2,3 . However, the widespread adoption of augmented-reality (AR) displays has been limited due to the bulky projection optics of their light engines and their inability to accurately portray three-dimensional (3D) depth cues for virtual content, among other factors 4,5 . Here we introduce a holographic AR system that overcomes these challenges using a unique combination of inverse-designed full-colour metasurface gratings, a compact dispersion-compensating waveguide geometry and artificial-intelligence-driven holography algorithms. These elements are co-designed to eliminate the need for bulky collimation optics between the spatial light modulator and the waveguide and to present vibrant, full-colour, 3D AR content in a compact device form factor. To deliver unprecedented visual quality with our prototype, we develop an innovative image formation model that combines a physically accurate waveguide model with learned components that are automatically calibrated using camera feedback. Our unique co-design of a nanophotonic metasurface waveguide and artificial-intelligence-driven holographic algorithms represents a significant advancement in creating visually compelling 3D AR experiences in a compact wearable device.

Emerging augmented-reality (AR) systems offer new experiences to users and have far-reaching implications for applications that span entertainment, education, communication, training, behavioural therapy and basic vision research 1,2,3 . To unlock their full potential in consumer applications, however, AR display systems must be compact—ideally no larger than conventional eyeglasses—to enable comfort and style for all-day use. Among the plethora of optical designs proposed for such near-eye displays 6,7 , waveguide image combiners are the most promising solution for AR glasses because of their compact form factors. Current waveguide designs, however, require projection optics with a thickness proportional to the focal length of the projection lens (Fig. 1a), introducing optical bulk, and they are limited to displaying two-dimensional (2D) images at a fixed distance to the user. These limitations result in reduced perceptual realism and visual discomfort due to the vergence–accommodation conflict 4,5 and, even with small projector optics, it is challenging to achieve a device form factor that matches the style of common eyeglasses.

Figure 1.

a , Conventional AR glasses use amplitude SLMs, such as organic light-emitting diodes or micro light-emitting diodes, which require a projector-based light engine that is typically at least as thick as the focal length f of the projection lens. b , The design of our holographic AR glasses uses a phase-only SLM that can be mounted very close to the in-coupling grating, thereby minimizing the device form factor. Additionally, unlike conventional AR glasses, our holographic design can provide full 3D depth cues for virtual content, as illustrated by the bunny (adapted from the Stanford Computer Graphics Laboratory). c , Compact 3D-printed prototype illustrating the components of our holographic AR glasses in a wearable form factor.

Holographic principles 8 could enable the ‘ultimate display’ 9 using their ability to produce perceptually realistic 3D content using ultrathin optical films 10,11 . This ability motivated previous attempts to adapt digital holography to AR display configurations 12,13 ; though promising, these methods failed to achieve the compact form factors and high 3D image quality required to unlock future spatial computing applications.

Here we develop a new AR display system that pairs a lensless holographic light engine with a metasurface waveguide optimized for full-colour optical-see-through (OST) AR display applications in a compact form factor (Fig. 1b). Compared with other waveguides, our optical system is unique in enabling the relay of full-colour 3D holographic images with high uniformity and see-through efficiency. This remarkable capability is enabled by the use of inverse-designed metasurface 14,15,16 grating couplers. Metasurfaces 17,18 have been demonstrated to offer higher diffraction efficiency 19 , spectral selectivity 20 , Q-factor 21 and transmittance 22 than conventional refractive and diffractive optical elements in applications including AR 23 , virtual reality 24 and wearable devices 20 . Unlike these approaches, ours not only optimizes the devices and demonstrates novel applications of metasurfaces, but also co-designs the entire optical system, including the geometry of a high-index glass waveguide and the metasurface grating couplers, to enable compatibility with holographic AR display systems. Waveguide holography has been described in recent work for non-see-through virtual reality settings 25 , but it has seen limited adoption because of its poor image quality. To address this challenge, we develop a mathematical model that describes the propagation of coherent waves in a waveguide using a combination of physically accurate modelling techniques and artificial intelligence. The learnable parts of this model are automatically calibrated using camera feedback with our prototype. This approach significantly advances recent artificial-intelligence-driven holography algorithms 26,27,28,29 by making them suitable for compact waveguides in see-through AR configurations. With our system, we obtained high-quality, full-colour multiplane 3D holographic images using a single OST AR waveguide. Compared with related optical designs 30,31,32,33 , our system provides unprecedented full-colour image quality in a compact form factor, enabling a path towards true 3D holographic AR glasses.

Inverse-designed metasurface waveguide

For OST AR displays, it is critical to provide the user with an unobstructed view of the physical environment while overlaying digital information on their vision of the world. Waveguide image combiners are thin transparent optical systems that have become the industry norm for these applications 7 , enabling the aforementioned capabilities. Our metasurface waveguide system design optimizes compactness, dispersion correction, transmission efficiency and angular uniformity to meet the high demands of 3D-capable AR applications.

Precise manipulation of coherent wavefronts in a waveguide system is crucial for holographic displays, but it is very challenging due to the interfering nature of coherent light. We address this challenge using a high-index glass material with a homogeneous design of all-glass metasurfaces (Fig. 2). For a compact waveguide system that minimizes boundary reflection and interference, a single-layer coupler is necessary. This coupler must guide broadband visible light through the waveguide at a high diffraction angle, ensuring total internal reflection (TIR). The critical angle, \( \theta_{\mathrm{c}}(\lambda)=\sin^{-1}\left(\frac{1}{n(\lambda)}\right) \), dictates that shorter wavelengths λ require a higher refractive index n to achieve TIR. Our numerical analysis indicates that a refractive index of 1.8 or higher is necessary to transmit all red, green and blue wavelengths through a single coupler, with a higher index expanding the field of view. This underscores the importance of employing a high-index material in our system design. In addition, the high-index glass (n > 1.8), with a complex refractive index denoted as \( \tilde{n}=n+ik \), assures minimal absorption loss (k ≈ 0) and provides sufficient light–matter interaction, whereas typical glass (n < 1.5) is insufficient to locally manipulate electromagnetic waves due to weak light–matter interaction. As a result, the high-index glass metasurface attains a balance between high see-through efficiency and diffraction efficiency, surpassing the capabilities of typical glass metasurfaces.

Figure 2.

a , Visualization of the waveguide geometry for full-colour operation. b , Electric field maps at red (638 nm), green (521 nm) and blue (445 nm) wavelengths for light passing through the metasurface out-coupler towards the user’s eye. The black arrows illustrate the wave vectors of the incident and diffracted light. c , Visualization of the inverse-designed metasurfaces optimized for waveguide couplers. The period ( Λ ) and height ( H ) of the metasurfaces are 384 nm and 220 nm, respectively. d , Scanning electron microscope images of the fabricated metasurfaces. e , The simulated and experimentally measured transmittance spectra of unpolarized light for the inverse-designed metasurfaces in the visible range, corresponding to see-through efficiency for real-world scenes. f , The simulated (dashed lines) transfer functions along the x axis for the conventional single-lined gratings and the simulated (solid lines) and experimentally measured (circles) transfer functions for our inverse-designed metasurfaces. The colour of the plots corresponds to the red, green and blue wavelengths. The designed metasurfaces are much more efficient than conventional gratings in green and blue, but, due to the very large diffraction angle of red, further improvement of the efficiency of the red channel is more difficult. g , Uniformities of the transfer functions for the conventional gratings without optimization and the inverse-designed metasurfaces with optimization. Scale bars, 400 nm ( b ), 2 μm ( d , left), 200 nm ( d , right). E , electromagnetic field.

Although the high-index glass enables propagation of broadband light with TIR, dispersion correction is further required for full-colour operation. Dispersion-engineered metasurfaces could be an option 34,35 as a device-level solution, but they often have insufficient degrees of freedom to meet the system performance required for AR applications (namely, high uniformity and see-through efficiency). To this end, we correct the chromatic dispersion at the system level through geometric design of the metasurface waveguide system and k-vector matching of the input and output couplers. The in- and out-couplers are designed to have the same momentum but with opposite direction, so they can couple the incident light in and out without observable dispersion 7 . Additionally, to spatially match the couplers, we design a dispersion-compensating waveguide geometry by precisely engineering the waveguide thickness and the dimensions and distances of the symmetric metasurface couplers. The lateral displacement of a replicated pupil inside the waveguide can be expressed as \( l(\lambda)=2d_{\mathrm{wg}}\tan\left(\sin^{-1}\left(\frac{\lambda}{n(\lambda)\Lambda}\right)\right) \), where d_wg, λ and Λ are the waveguide thickness, the wavelength of light in free space and the grating period, respectively. Our idea is to design the waveguide geometry to have a suitable least common multiple of the l(λ) function for the red, green and blue wavelengths, which can be described by \( \exists\,d_{\mathrm{wg}},\Lambda:\mathrm{LCM}\left(l(\lambda_{\mathrm{R}}),l(\lambda_{\mathrm{G}}),l(\lambda_{\mathrm{B}})\right)<L_{\mathrm{wg}} \), where L_wg is the maximum length between in- and out-couplers for a compact near-eye display and LCM is the least-common-multiple function. Specifically, we set d_wg and Λ to 5 mm and 384 nm, respectively; with these parameters, the red, green and blue wavefronts from the in-coupler propagate through the waveguide through one, three and five internal reflections, respectively, before meeting at the out-coupler, as illustrated in Fig. 2a.
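As a quick numerical check of this geometry, the displacement formula can be evaluated per colour; the SF6 refractive indices below are approximate catalogue values assumed for illustration:

```python
import numpy as np

# Lateral pupil displacement l(lambda) = 2 * d_wg * tan(asin(lambda / (n * Lambda)))
# for the d_wg = 5 mm, Lambda = 384 nm geometry described in the text.
d_wg, period = 5.0e-3, 384e-9
for lam, n in ((638e-9, 1.799), (521e-9, 1.814), (445e-9, 1.838)):  # approx. SF6 indices
    sin_theta = lam / (n * period)  # first-order diffraction angle inside the glass
    assert sin_theta > 1.0 / n      # TIR condition holds for all three colours
    theta = np.arcsin(sin_theta)
    l = 2 * d_wg * np.tan(theta)
    print(f"{lam * 1e9:.0f} nm: theta = {np.degrees(theta):.1f} deg, l = {l * 1e3:.1f} mm")
# Red takes one long hop (~24 mm); green and blue take several shorter hops
# whose multiples land near the same out-coupler position.
```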

To optimize the geometry of the metasurface gratings for maximum diffraction efficiency and uniformity of angular response, we employ a rigorous-coupled-wave-analysis solver 36 . Our metasurface couplers operate in transverse electric polarization mode to provide a more uniform optical response. The optimization process uses the gradient descent method, starting from a randomly initialized geometry in the 2D spatial domain and utilizing the Adam solver 37 to refine the profiles of the metasurface gratings. The loss function in the optimization loop maximizes the sum of the first diffraction order efficiencies for red, green and blue wavelengths (638 nm, 521 nm and 445 nm), while minimizing the standard deviations of efficiencies for different incident angles, ranging from −5° to 5°, for these three wavelengths. We simplify the design process to one dimension by assuming x axis symmetry and account for fabrication tolerances of these large-area metasurfaces by adding Gaussian blur. The resulting design converged to a double-lined metasurface grating, as shown in Fig. 2c . This geometry yields metasurface couplers that steer the incident wave to high diffraction angles for red, green and blue wavelengths, as confirmed by the electric field profiles and overlaid Poynting vectors (Fig. 2b ). Importantly, the optimized asymmetric nanostructure not only enhances the diffraction efficiency in one direction but also improves uniformity over the angle of incidence.
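The optimization loop described above can be sketched schematically. In the snippet below, `toy_efficiency` is a stand-in differentiable surrogate, not an RCWA solver (and not the authors' code); only the overall structure — Adam, Gaussian blur for fabrication tolerance, and an efficiency-plus-uniformity loss over the RGB wavelengths and −5° to 5° incidence — follows the text:

```python
import torch

wavelengths = [638.0, 521.0, 445.0]             # nm, as in the text
angles = torch.linspace(-5.0, 5.0, 11)          # incidence angles in degrees

def gaussian_blur(profile: torch.Tensor, sigma: float = 2.0) -> torch.Tensor:
    """Smooth the profile to emulate fabrication tolerance."""
    k = torch.arange(-4, 5, dtype=torch.float32)
    g = torch.exp(-k ** 2 / (2 * sigma ** 2))
    g = g / g.sum()
    return torch.conv1d(profile[None, None], g[None, None], padding=4)[0, 0]

def toy_efficiency(profile, wavelength, angle):
    """Placeholder for a differentiable first-order diffraction efficiency."""
    x = torch.linspace(0.0, 2 * torch.pi, profile.numel())
    resonance = (profile * torch.sin(x * 384.0 / wavelength)).mean()
    return torch.sigmoid(10.0 * resonance - 0.01 * angle ** 2)

profile = torch.rand(128, requires_grad=True)   # 1D grating profile (x-axis symmetry)
opt = torch.optim.Adam([profile], lr=1e-2)
for step in range(200):
    blurred = gaussian_blur(profile)
    effs = torch.stack([torch.stack([toy_efficiency(blurred, w, a) for a in angles])
                        for w in wavelengths])  # shape (3 colours, 11 angles)
    # Maximize mean first-order efficiency; penalize angular non-uniformity.
    loss = -effs.mean() + effs.std(dim=1).sum()
    opt.zero_grad()
    loss.backward()
    opt.step()
    profile.data.clamp_(0.0, 1.0)
```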

Figure 2e shows the high see-through efficiency our inverse-designed metasurface couplers achieve, reaching approximately 78.4% in the visible spectrum. Figure 2f contains the transfer functions of our inverse-designed metasurfaces and typical gratings for red, green and blue wavelengths (full 2D transfer functions are shown in the Supplementary Information ). As opposed to conventional gratings, our metasurfaces exhibit uniform transmittance regardless of the angle of incidence, thanks to the optimized electromagnetic resonances in the nanostructures. Figure 2g quantifies the uniformity of the transfer function that is defined as the ratio of the minimum and maximum amplitudes within the viewing angle range. The inverse-designed metasurface has high uniformities of 61.7%, 91.2% and 98.3% for red, green and blue, respectively, whereas conventional gratings achieve much lower uniformities of 58.9%, 47.7% and 88.8%. These findings confirm that our inverse-designed all-glass metasurface couplers provide excellent angular uniformity and high see-through efficiency for full-colour operation.

A key challenge in fabricating holographic waveguides is their high sensitivity to surface irregularities and particle contamination, which directly degrade the observed image quality. For this reason, we fabricate our metasurface system directly on lead-containing high-index glass (SF6 glass, SCHOTT), without any additional constituent materials, using electron-beam (e-beam) lithography. To avoid the residual particle contamination and surface damage associated with lift-off, as well as the surface irregularities introduced by physical etching, we avoid the lithography processes commonly used for metasurface fabrication, namely positive e-beam resist with metal lift-off or negative e-beam resist as an etching mask. Instead, our method is based on reverse patterning with a positive e-beam resist (polymethyl methacrylate, PMMA) and multiple dry-etching steps, thus avoiding lift-off processes and ensuring that the glass surface remains protected throughout the fabrication process ( Methods ). Note that this method can also be applied to photolithography or nanoimprint lithography for mass production 38 , 39 .

Waveguide propagation model

To simulate the propagation of coherent light through our metasurface waveguide, we first derive a physically motivated model. We then show how this model can be parameterized by neural network components that can be automatically learned from camera feedback. As shown by our experiments, the unique combination of physical and artificial-intelligence components is crucial for accurately modelling the physical optics of such a waveguide and synthesizing high-quality holograms with it.

The wavefront u IC coupled into the waveguide can be computed as the product of the phase-only spatial light modulator (SLM) pattern, e i ϕ , the incident illumination and the in-coupler aperture a IC . Since we use a converging wavefront for illumination with focal length f illum , the in-coupled wavefront is expressed as

\(u_{\mathrm{IC}}(x, y) = a_{\mathrm{IC}}(x, y)\, e^{i\phi(x, y)}\, e^{-\frac{ik}{2 f_{\mathrm{illum}}}\left(x^{2} + y^{2}\right)},\)

where \(k = 2\pi/\lambda\) is the wavenumber and x and y are the transverse coordinates.

Next, this wavefront is propagated through the waveguide to compute the out-coupled field, u OC . A physically motivated model of the waveguide is adequately described by its frequency-dependent transfer function, H WG , and the aperture a OC of the out-coupler:

\(u_{\mathrm{OC}}(x, y) = a_{\mathrm{OC}}(x, y)\, \mathcal{F}^{-1}\left\{ H_{\mathrm{WG}}(f_x, f_y)\, \mathcal{F}\{u_{\mathrm{IC}}(x, y)\} \right\},\)

where \({\mathcal{F}}\) is the Fourier transform and f x and f y are the frequency coordinates. The transfer function H WG incorporates the reflection coefficients within the waveguide, the coupling efficiencies, the propagation of the first diffracted order and the translation between the in- and out-couplers. The contributions of each of these components are used to derive the full expression for H WG in our Supplementary Information . Note that we can set H WG to the identity operator, ignoring the transfer function, as a naive, non-physical baseline.

Finally, the 3D images observed by a user looking through the holographic AR glasses can be simulated by propagating the out-coupled field with a model of free-space propagation, f free , to different target distances, d target , in front of the viewer:

\(u_{\mathrm{target}}(x, y) = f_{\mathrm{free}}\left(u_{\mathrm{OC}}(x, y),\, d_{\mathrm{target}}\right).\)

With these equations, the full waveguide model f WG maps a phase pattern shown on the SLM to the image that a user would see through the waveguide while focusing at a particular depth, d target , and f free maps the wavefront in front of the user’s eye to the corresponding image at that depth.

Although a physical model, such as f WG , should accurately describe the wave propagation in a waveguide, in practice it is challenging to model all aspects of such a physical optical system at the required accuracy. Nanoscopic deviations, on the order of the wavelength of light, between the simulated model and the optical aberrations, fabrication errors, source beam or electro-optical response of the SLM strongly degrade the observed holographic image quality. To account for these small differences between the simulated model and the physical optics, we add learnable components in the form of convolutional neural networks (CNNs) to our model. Although related approaches have recently been proposed for bulky benchtop holographic virtual reality displays 26, 40, 41, 42, ours characterizes the propagation of full-colour coherent wavefronts through an OST waveguide using this emerging paradigm. Specifically, we propose to learn the parameters a IC and a OC as complex-valued fields, the spatially varying diffraction efficiencies and the CNNs at the in-coupler and target planes, to account for any mismatch between the simulated model and the physical optics. These learned components, which are illustrated with our full waveguide model in Fig. 3 , result in the following learnable physical waveguide model:

\(\hat{f}_{\mathrm{WG}}(\phi) = \mathrm{CNN}_{\mathrm{target}}\left( f_{\mathrm{free}}\left( a_{\mathrm{OC}} \cdot \mathcal{F}^{-1}\left\{ H_{\mathrm{WG}} \cdot \mathcal{F}\left\{ \mathrm{CNN}_{\mathrm{IC}}\left( a_{\mathrm{IC}}\, e^{i\phi}\, e^{-\frac{ik}{2 f_{\mathrm{illum}}}\left(x^{2} + y^{2}\right)} \right) \right\} \right\},\, d_{\mathrm{target}} \right) \right)\)
In Methods , we detail our training procedure and CNN architecture.
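
A hedged PyTorch sketch of this learnable forward model is given below. The class name, tensor shapes and calling convention are assumptions for illustration: cnn_ic and cnn_target stand in for the UNets described in Methods (mapping complex fields to complex fields and to intensities, respectively), and H_wg and H_free are precomputed transfer functions.

    import torch

    class LearnableWaveguide(torch.nn.Module):
        def __init__(self, num, cnn_ic, cnn_target):
            super().__init__()
            # learned complex-valued coupler efficiencies, stored as real/imag pairs
            self.a_ic = torch.nn.Parameter(0.01 * torch.randn(2, num, num))
            self.a_oc = torch.nn.Parameter(0.01 * torch.randn(2, num, num))
            self.cnn_ic, self.cnn_target = cnn_ic, cnn_target

        def forward(self, phi, illum, H_wg, H_free):
            a_ic = torch.complex(self.a_ic[0], self.a_ic[1])
            a_oc = torch.complex(self.a_oc[0], self.a_oc[1])
            slm = torch.exp(torch.complex(torch.zeros_like(phi), phi))   # e^{i phi}
            u_ic = self.cnn_ic(a_ic * slm * illum)                       # in-coupler CNN
            u_oc = a_oc * torch.fft.ifft2(H_wg * torch.fft.fft2(u_ic))   # waveguide
            u_target = torch.fft.ifft2(H_free * torch.fft.fft2(u_oc))    # f_free
            return self.cnn_target(u_target)   # CNN converts field to intensity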

Figure 3

We combine physical aspects of the waveguide (highlighted in green) with artificial-intelligence components that are learned from camera feedback (highlighted in orange). In our model, the input phase pattern (left) applies a per-pixel phase delay, from 0 to 2π, to the converging illumination before the wavefront is modulated by the learned in-coupler efficiency. This wavefront is then sent through a CNN at the in-coupler plane and propagated through the waveguide, using its physically motivated transfer function, before an additional learned out-coupler efficiency is used to determine the out-coupled wavefront (centre). The latter is propagated to the target scene at various distances from the user where a CNN is applied, converting the complex-valued field into observed intensities (right). When trained on a captured dataset, the learned parameters of the CNNs, the coupler efficiencies and the waveguide propagation enable this model to accurately predict the output of our holographic AR glasses. The model is fully differentiable, enabling simple gradient descent CGH algorithms to compute the phase pattern for a target scene at runtime. The bunny scene is from Big Buck Bunny , © 2008 Blender Foundation/ www.bigbuckbunny.org , under a Creative Commons licence CC BY 3.0 .

Experimental results

Our prototype AR display combines the fabricated metasurface waveguide with a HOLOEYE LETO-3 phase-only SLM. This SLM has a resolution of 1080 × 1920 pixels with a pitch of 6.4 μm. A FISBA READYBeam fibre-coupled module with optically aligned red, green and blue laser diodes, with wavelengths of 638, 521 and 445 nm, is used as the light source. Since our illumination comes through the back of the waveguide, we slightly tilt the SLM and illumination so that the digital content is not obscured by any unwanted light coupled into the waveguide before reaching the SLM. We capture both the calibration data for our artificial-intelligence-based wave propagation model and the experimental results using a FLIR Grasshopper3 12.3 MP colour USB3 sensor through a Canon EF 35 mm lens, with an Arduino controlling the focus of the lens. Following recent work 42 , our experimental setup operates in a partially coherent setting, in which a few coherent modes are multiplexed in time to achieve optimal 3D holographic image quality with realistic depth-of-field effects. All holograms are computed using a gradient-descent computer-generated holography (CGH) algorithm 26 that incorporates our camera-calibrated wave propagation model.
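
Conceptually, gradient-descent CGH inverts a calibrated forward model by optimizing the phase pattern directly. A minimal sketch follows, assuming a model closed over its illumination and transfer functions (as in the earlier sketch); it is not the authors' exact implementation.

    import torch

    def compute_hologram(model, target, steps=1000, lr=0.1):
        # `model` maps an SLM phase pattern to a predicted intensity image at
        # the target depth; `target` is the desired intensity image.
        phi = torch.zeros_like(target, requires_grad=True)
        opt = torch.optim.Adam([phi], lr=lr)
        for _ in range(steps):
            opt.zero_grad()
            loss = torch.nn.functional.mse_loss(model(phi), target)
            loss.backward()
            opt.step()
        return phi.detach() % (2 * torch.pi)   # wrap to the SLM's 0..2pi range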

We show experimentally captured results from our prototype in Fig. 4 . In Fig. 4a , we qualitatively and quantitatively assess the 2D image quality, comparing a naive free-space propagation model, a physically motivated wave propagation model using the RCWA-simulated transfer functions, and the proposed artificial-intelligence-based variant that combines the physical model with camera-calibrated learnable parameters. In all examples, the artificial-intelligence-based wave propagation model outperforms the baselines by a large margin of 3–5 dB in peak signal-to-noise ratio (PSNR). The full-colour 3D results shown in Fig. 4b validate the high image quality our system achieves for both in-focus and out-of-focus regions of the presented digital content. The accurate depiction of 3D defocus behaviour can mitigate the vergence–accommodation conflict and the associated discomfort for users of our display system. To our knowledge, no existing waveguide-based AR display has demonstrated full-colour 3D results of comparable quality 25 , 43 . Finally, we also show experimental full-colour 3D results in Fig. 4c , where we optically combine a physical scene with digitally overlaid content and capture the scene using different focus settings of the camera. Again, our approach outperforms the baseline models by a large margin.

Figure 4

a , Comparison of 2D holograms synthesized using several different wave propagation models, including free-space propagation, a physically motivated model and our proposed model combining physics and learnable parameters that are calibrated using camera feedback.  b , Comparison of two 3D holograms. Zoomed-in crops show the scene with the camera focused at different depths. Blue boxes highlight content that the camera is focused on while white boxes emphasize camera defocus. c , Comparison of a 3D hologram captured in an optical-see-through AR mode. The bird, fish and butterfly are digitally superimposed objects, and the elephant and letters are part of the physical environment. In all examples, the proposed wave propagation model represents the physical optics much more accurately, resulting in significant image quality improvements over alternative models. In a , the squirrel scene is from Big Buck Bunny , © 2008 Blender Foundation/ www.bigbuckbunny.org , under a Creative Commons licence CC BY 3.0. In  b , couch and market target scenes are, respectively, from the High Spatio-Angular Light Field dataset 49 and the Durian Open Movie project (© copyright Blender Foundation/ durian.blender.org ) under a Creative Commons licence CC BY 3.0 .

The co-design of a metasurface waveguide and artificial-intelligence-based holography algorithms facilitates a compact full-colour 3D holographic OST AR display system. To our knowledge, no system with comparable characteristics has previously been described and our experimental image quality far exceeds that demonstrated by related waveguide designs for non-see-through applications 25 .

The field of view of our waveguide design is currently limited to 11.7°. While this is comparable to many commercial AR systems, it would be desirable to enlarge it. This could be achieved using higher-refractive-index materials for the waveguide or by engineering an additional metasurface eyepiece into the out-coupler; related ideas have recently been explored for other optical AR system designs 23 and could be adapted to ours. Our waveguide is compact, but it would be interesting to further reduce its thickness d wg . In our Supplementary Information , we derive the relationship between the waveguide thickness, the SLM size L slm and the nasal field of view θ − ; this derivation shows that the thickness of the waveguide is directly proportional to the SLM size, among other factors. Therefore, the most promising path to reducing the thickness of the waveguide is to use a smaller SLM. There is a clear path to achieving this with emerging SLMs that provide very small pixel pitches, down to 1 μm (ref. 44 ), compared with the 6.4 μm of our SLM. Although not yet commercially available, such SLMs would enable ultrathin waveguides using our approach.

Similar to all holographic displays, the étendue of our display is limited by the space–bandwidth product of the SLM. Étendue expansion techniques 7 , 43 , 45 , 46 , 47 could be adapted to our setting, although no such technique has yet been demonstrated to support full-colour 3D waveguide holography. Another potential direction for future work would be to combine our design with an illumination waveguide, as shown in prior work, for a compact illumination path 25 . Finally, we have not attempted to optimize the runtime efficiency of our CGH algorithm. While hologram generation currently takes several minutes per phase pattern, recent methods have shown that real-time inversion of wave propagation models for hologram synthesis can be achieved using machine-learning approaches 26 , 27 , 29 , 48 .

The proposed co-design of nanophotonic hardware and artificial-intelligence-driven algorithms enables optical-see-through AR display modes in smaller form factors and with higher 3D image quality than any existing approach of which we are aware, charting a path towards true 3D holographic AR glasses.

Methods

Fabrication details

The fabrication procedure begins by coating the substrate with a 30-nm-thick chromium (Cr) film through e-beam evaporation (Kurt J. Lesker Company). After spin-coating a positive-tone e-beam resist layer (950 PMMA A4, 1000 rpm for 60 s), post-baking the PMMA layer (180 °C for 5 min) and spin-coating a charge-dissipation layer (e-spacer, Showa Denko), we pattern the metasurfaces by e-beam lithography (Raith Voyager, 50 kV beam), with dimensions of 6.5 mm × 6.5 mm for the in-coupler and 6.5 mm × 7.1 mm for the out-coupler. The patterns are then transferred onto the high-index glass substrate using multiple dry etching steps: an inductively coupled plasma reactive ion etcher (ICP-RIE, PlasmaTherm Metal Etcher) for Cr etching with the PMMA mask, and a reactive ion etcher (RIE, Oxford Dielectric Etcher) for glass etching with the Cr mask using a gas mixture of Cl 2 , O 2 , CHF 3 , CF 4 and Ar, aided by helium backside cooling. The remaining Cr mask is removed by an additional ICP-RIE process. Figure 2d presents scanning electron microscope images of the fabricated all-glass metasurface couplers.

Metasurface sample images are taken with a scanning electron microscope (FEI Nova NanoSEM 450). The representative samples are coated with a thin (3 nm) gold/palladium film to reduce charging during imaging. Images are acquired at an accelerating voltage of 10 kV.

CNN architecture

Our CNNs, CNN IC and CNN target , use a modified UNet architecture 50 to efficiently learn the residual aberrations in a physical optical system. The input wavefront is augmented by concatenating its real and imaginary values with the corresponding amplitude and phase components. After the input layer, both CNNs use 32 feature channels and perform five downsampling operations using strided convolutions, as well as five upsampling operations using transposed convolutions. The networks use instance normalization 51 , leaky rectified linear unit activations (negative slope 0.2) for the down blocks, rectified linear unit nonlinearities for the up blocks, and skip connections. CNN IC has a two-channel output representing the real and imaginary values, while CNN target directly outputs a single-channel amplitude. a IC and a OC are the binary aperture functions of the grating couplers in the physically motivated wave propagation model; in the artificial-intelligence-augmented model, these quantities are complex-valued fields learned per colour channel.
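
A compact PyTorch sketch of such a network, following the description above, might read as follows; kernel sizes and other unspecified layer details are illustrative assumptions, and inputs are expected to have spatial dimensions divisible by 32.

    import torch
    import torch.nn as nn

    class WaveUNet(nn.Module):
        # 4 input channels (real, imag, amplitude, phase); 32 base features;
        # five strided-conv down blocks and five transposed-conv up blocks
        # with skip connections; instance norm; leaky ReLU down, ReLU up.
        def __init__(self, out_channels):
            super().__init__()
            ch = [4, 32, 64, 128, 256, 512]
            self.downs = nn.ModuleList([
                nn.Sequential(
                    nn.Conv2d(ch[i], ch[i + 1], 4, stride=2, padding=1),
                    nn.InstanceNorm2d(ch[i + 1]),
                    nn.LeakyReLU(0.2))
                for i in range(5)])
            self.ups = nn.ModuleList([
                nn.Sequential(
                    nn.ConvTranspose2d(ch[i + 1] * (1 if i == 4 else 2),
                                       ch[i], 4, stride=2, padding=1),
                    nn.InstanceNorm2d(ch[i]),
                    nn.ReLU())
                for i in reversed(range(5))])
            self.out = nn.Conv2d(ch[0], out_channels, 1)

        def forward(self, u):
            # u: complex field of shape (batch, H, W)
            x = torch.stack([u.real, u.imag, u.abs(), u.angle()], dim=1)
            skips = []
            for down in self.downs:
                x = down(x)
                skips.append(x)
            skips.pop()   # the deepest activation is the current x
            for up in self.ups:
                x = up(x)
                if skips:
                    x = torch.cat([x, skips.pop()], dim=1)
            return self.out(x)

    # CNN_IC: two-channel (real/imag) output; CNN_target: single-channel amplitude
    cnn_ic = WaveUNet(out_channels=2)
    cnn_target = WaveUNet(out_channels=1)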

Training the waveguide model

We train our neural-network-parameterized wave propagation model using a dataset comprising a large number of pairs of SLM phase patterns and corresponding intensity images, captured by a camera focusing at different depths at the output of our prototype holographic display. The SLM phase patterns in our dataset are generated using our physical waveguide model to produce images from the DIV2K dataset at different virtual distances through the waveguide. The model is trained over four intensity planes, corresponding to 0 D (∞), 0.33 D (3 m), 0.67 D (1.5 m) and 1.0 D (1 m) in physical space. We perform the model training on a 48 GB NVIDIA RTX A6000 GPU with a batch size of 1 and a learning rate of \(3 \times 10^{-4}\). We note that the diversity of phase patterns is important for model training: a dataset generated using the gradient-descent CGH algorithm 26 typically consists of holographic images that cover only a narrow angular spectrum. We therefore generate phase patterns with a set of random parameters, including learning rates, initial phase distributions and propagation distances. We generate 10,000 patterns for each colour channel and capture the corresponding intensities. The dataset is divided into training, validation and test sets with a ratio of 8:1:1. The initially trained model can then be used to synthesize an additional phase dataset to refine the model; such a refinement stage improves the experimental quality, and we perform it twice for the best results. After this training procedure, we use the learned waveguide propagation model to synthesize holograms for new 2D and 3D scenes, enabling our holographic AR glasses to operate without any additional camera feedback.
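
The stated settings translate into a conventional supervised training loop. The sketch below is illustrative only; the dataset interface and loss are assumptions, not the authors' exact pipeline.

    import torch

    def train(model, dataset, epochs=10):
        # `dataset` yields (slm_phase, captured_intensity) pairs from the camera;
        # 8:1:1 train/validation/test split, batch size 1, learning rate 3e-4.
        n = len(dataset)
        n_train, n_val = int(0.8 * n), int(0.1 * n)
        train_set, val_set, test_set = torch.utils.data.random_split(
            dataset, [n_train, n_val, n - n_train - n_val])
        loader = torch.utils.data.DataLoader(train_set, batch_size=1, shuffle=True)
        opt = torch.optim.Adam(model.parameters(), lr=3e-4)
        for _ in range(epochs):
            for phi, captured in loader:
                opt.zero_grad()
                loss = torch.nn.functional.mse_loss(model(phi), captured)
                loss.backward()
                opt.step()
        return model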

Data availability

A full-colour captured dataset specific to our holographic AR glasses prototype is available upon request.

Code availability

Computer code supporting the findings of this study is available online at https://github.com/computational-imaging/holographic-AR-glasses.git .

Azuma, R. T. A survey of augmented reality. Presence: Teleoperators Virtual Environ. 6 , 355–385 (1997).

Xiong, J., Hsiang, E.-L., He, Z., Zhan, T. & Wu, S.-T. Augmented reality and virtual reality displays: emerging technologies and future perspectives. Light: Sci. Appl. 10 , 216 (2021).

Chang, C., Bang, K., Wetzstein, G., Lee, B. & Gao, L. Toward the next-generation VR/AR optics: a review of holographic near-eye displays from a human-centric perspective. Optica 7 , 1563–1578 (2020).

Kooi, F. L. & Toet, A. Visual comfort of binocular and 3D displays. Displays 25 , 99–108 (2004).

Shibata, T., Kim, J., Hoffman, D. M. & Banks, M. S. The zone of comfort: predicting visual discomfort with stereo displays. J. Vis. 11 , 11 (2011).

Cakmakci, O. & Rolland, J. Head-worn displays: a review. J. Disp. Technol. 2 , 199–216 (2006).

Kress, B. C. & Chatterjee, I. Waveguide combiners for mixed reality headsets: a nanophotonics design perspective. Nanophotonics 10 , 41–74 (2021).

Gabor, D. A new microscopic principle. Nature 161 , 777–778 (1948).

Sutherland, I. E. The ultimate display. In Proc. of the IFIP Congress (ed. Kalenich, W. A.) 2 , 506–508 (Spartan, 1965).

Tay, S. et al. An updatable holographic three-dimensional display. Nature 451 , 694–698 (2008).

Blanche, P.-A. et al. Holographic three-dimensional telepresence using large-area photorefractive polymer. Nature 468 , 80–83 (2010).

Smalley, D. E., Smithwick, Q., Bove, V., Barabas, J. & Jolly, S. Anisotropic leaky-mode modulator for holographic video displays. Nature 498 , 313–317 (2013).

Maimone, A., Georgiou, A. & Kollin, J. S. Holographic near-eye displays for virtual and augmented reality. ACM Trans. Graph. 36 , 85 (2017).

Molesky, S. et al. Inverse design in nanophotonics. Nat. Photon.   12 , 659–670 (2018).

Li, Z., Pestourie, R., Lin, Z., Johnson, S. G. & Capasso, F. Empowering metasurfaces with inverse design: principles and applications. ACS Photonics 9 , 2178–2192 (2022).

Jiang, J., Chen, M. & Fan, J. A. Deep neural networks for the evaluation and design of photonic devices. Nat. Rev. Mater. 6 , 679–700 (2021).

Genevet, P., Capasso, F., Aieta, F., Khorasaninejad, M. & Devlin, R. Recent advances in planar optics: from plasmonic to dielectric metasurfaces. Optica 4 , 139–152 (2017).

Lee, G.-Y., Sung, J. & Lee, B. Metasurface optics for imaging applications. MRS Bull. 45 , 202–209 (2020).

Lin, D. et al. Optical metasurfaces for high angle steering at visible wavelengths. Sci. Rep.   7 , 2286 (2017).

Song, J.-H., van de Groep, J., Kim, S. J. & Brongersma, M. L. Non-local metasurfaces for spectrally decoupled wavefront manipulation and eye tracking. Nat. Nanotechnol. 16 , 1224–1230 (2021).

Lawrence, M. et al. High quality factor phase gradient metasurfaces. Nat. Nanotechnol. 15 , 956–961 (2020).

Cordaro, A. et al. Solving integral equations in free space with inverse-designed ultrathin optical metagratings. Nat. Nanotechnol. 18 , 365–372 (2023).

Lee, G.-Y. et al. Metasurface eyepiece for augmented reality. Nat. Commun. 9 , 4562 (2018).

Joo, W.-J. & Brongersma, M. L. Creating the ultimate virtual reality display. Science 377 , 1376–1378 (2022).

Kim, J. et al. Holographic glasses for virtual reality. In ACM SIGGRAPH 2022 Conference Proc. (eds Nandigjav, M. et al.) 33 (ACM, 2022).

Peng, Y., Choi, S., Padmanaban, N. & Wetzstein, G. Neural holography with camera-in-the-loop training. ACM Trans. Graph. 39 , 185 (2020).

Shi, L., Li, B., Kim, C., Kellnhofer, P. & Matusik, W. Towards real-time photorealistic 3D holography with deep neural networks. Nature 591 , 234–239 (2021).

Peng, Y., Choi, S., Kim, J. & Wetzstein, G. Speckle-free holography with partially coherent light sources and camera-in-the-loop calibration. Sci. Adv. 7 , eabg5040 (2021).

Shi, L., Li, B. & Matusik, W. End-to-end learning of 3D phase-only holograms for holographic display. Light Sci. Appl. 11 , 247 (2022).

Yeom, H.-J. et al. 3D holographic head-mounted display using holographic optical elements with astigmatism aberration compensation. Opt. Express 23 , 32025–32034 (2015).

Jeong, J. et al. Holographically customized optical combiner for eye-box extended near-eye display. Opt. Express 27 , 38006–38018 (2019).

Yeom, J., Son, Y. & Choi, K. Crosstalk reduction in voxels for a see-through holographic waveguide by using integral imaging with compensated elemental images. Photonics 8 , 217 (2021).

Choi, M.-H., Shin, K.-S., Jang, J., Han, W. & Park, J.-H. Waveguide-type Maxwellian near-eye display using a pin-mirror holographic optical element array. Opt. Lett. 47 , 405–408 (2022).

Chen, W. T. et al. A broadband achromatic metalens for focusing and imaging in the visible. Nat. Nanotechnol. 13 , 220–226 (2018).

Li, Z. et al. Meta-optics achieves RGB-achromatic focusing for virtual reality. Sci. Adv. 7 , eabe4458 (2021).

Kim, C. & Lee, B. Torcwa: GPU-accelerated Fourier modal method and gradient-based optimization for metasurface design. Comput. Phys. Comm. 282 , 108552 (2023).

Kingma, D. P. & Ba, J. Adam: A method for stochastic optimization. In Proceedings of the 3rd International Conference on Learning Representations (2015).

Park, J.-S. et al. All-glass, large metalens at visible wavelength using deep-ultraviolet projection lithography. Nano Lett. 19 , 8673–8682 (2019).

Kim, J. et al. Scalable manufacturing of high-index atomic layer–polymer hybrid metasurfaces for metaphotonics in the visible. Nat. Mater. 22 , 474–481 (2023).

Chakravarthula, P., Tseng, E., Srivastava, T., Fuchs, H. & Heide, F. Learned hardware-in-the-loop phase retrieval for holographic near-eye displays. ACM Trans. Graph. 39 , 186 (2020).

Choi, S., Gopakumar, M., Peng, Y., Kim, J. & Wetzstein, G. Neural 3D holography: learning accurate wave propagation models for 3D holographic virtual and augmented reality displays. ACM Trans. Graph. 40 , 240 (2021).

Choi, S. et al. Time-multiplexed neural holography: a flexible framework for holographic near-eye displays with fast heavily-quantized spatial light modulators. In ACM SIGGRAPH 2022 Conference Proc. (eds Nandigjav, M. et al.) 32 (2022).

Jang, C., Bang, K., Chae, M., Lee, B. & Lanman, D. Waveguide holography for 3D augmented reality glasses. Nat. Commun. 15 , 66 (2024).

Hwang, C.-S. et al. 21-2: Invited paper: 1µm pixel pitch spatial light modulator panel for digital holography. Dig. Tech. Pap. SID Int. Symp. 51 , 297–300 (2020).

Park, J., Lee, K. & Park, Y. Ultrathin wide-angle large-area digital 3D holographic display using a non-periodic photon sieve. Nat. Commun. 10 , 1304 (2019).

Kuo, G., Waller, L., Ng, R. & Maimone, A. High resolution étendue expansion for holographic displays. ACM Trans. Graph. 39 , 66 (2020).

Jang, C., Bang, K., Li, G. & Lee, B. Holographic near-eye display with expanded eye-box. ACM Trans. Graph. 37 , 195 (2018).

Horisaki, R., Takagi, R. & Tanida, J. Deep-learning-generated holography. Appl. Optics 57 , 3859–3863 (2018).

Kim, C., Zimmer, H., Pritch, Y., Sorkine-Hornung, A. & Gross, M. Scene reconstruction from high spatio-angular resolution light fields. ACM Trans. Graph. 32 , 73 (2013).

Ronneberger, O., Fischer, P. & Brox, T. U-net: convolutional networks for biomedical image segmentation. In Medical Image Computing and Computer-Assisted Intervention – MICCAI 2015 (eds Navab, N., Hornegger, J., Wells, W. & Frangi, A.) 234–241 (Springer, 2015).

Ulyanov, D., Vedaldi, A. & Lempitsky, V. Improved texture networks: maximizing quality and diversity in feed-forward stylization and texture synthesis. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition 6924–6932 (2017).

Acknowledgements

M.G. is supported by a Stanford Graduate Fellowship in Science and Engineering. G.-Y.L. is supported by a Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education (2022R1A6A3A03073823). S.C. is supported by a Kwanjeong Scholarship and a Meta Research PhD Fellowship. B.C. is supported by a Stanford Graduate Fellowship in Science and Engineering and a National Science Foundation Graduate Research Fellowship. G.W. is supported by the ARO (PECASE Award W911NF-19-1-0120), Samsung and the Sony Research Award Program. Part of this work was performed at the Stanford Nano Shared Facilities (SNSF) and Stanford Nanofabrication Facility (SNF), supported by the National Science Foundation and the National Nanotechnology Coordinated Infrastructure under award ECCS-2026822. We also thank Y. Park for her ongoing support.

Author information

These authors contributed equally: Manu Gopakumar, Gun-Yeal Lee

Authors and Affiliations

Department of Electrical Engineering, Stanford University, Stanford, CA, USA

Manu Gopakumar, Gun-Yeal Lee, Suyeon Choi, Brian Chao & Gordon Wetzstein

Department of Electrical and Electronic Engineering, The University of Hong Kong, Hong Kong, China
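
Yifan Peng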

NVIDIA, Santa Clara, CA, USA

Jonghyun Kim

Contributions

M.G. developed the experimental setup and captured the measurements. G.-Y.L. designed and fabricated the metasurface waveguide and performed the theoretical analysis, numerical simulations and experimental measurements on metasurfaces. M.G. and S.C. developed and implemented the algorithmic procedures with input from G.-Y.L., B.C., Y.P. and J.K. G.W. conceived the method and supervised all aspects of the project. All authors took part in designing the experiments and writing the paper and the Supplementary Information .

Corresponding author

Correspondence to Gordon Wetzstein .

Ethics declarations

Competing interests.

The authors declare no competing interests.

Peer review

Peer review information.

Nature thanks Ni Chen, Lingling Huang and Tim Wilkinson for their contribution to the peer review of this work.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary information.

This file contains Supplementary Notes 1–5, Figs. 1–18, Table 1 and References.

Supplementary Video 1

Laser-synchronized 2D video results, 3D video results, 2D AR video results and 3D AR video results.

Supplementary Video 2

Metasurface optimization animation.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ .

About this article

Cite this article.

Gopakumar, M., Lee, GY., Choi, S. et al. Full-colour 3D holographic augmented-reality displays with metasurface waveguides. Nature (2024). https://doi.org/10.1038/s41586-024-07386-0

Received : 02 July 2023

Accepted : 04 April 2024

Published : 08 May 2024

DOI : https://doi.org/10.1038/s41586-024-07386-0

AI and holography bring 3D augmented reality to regular glasses

Combining advances in display technologies, holographic imaging, and artificial intelligence, engineers at Stanford say they have produced a leap forward for augmented reality.

Prototype of the compact augmented reality glasses. Through holography and AI, these glasses can display full-color, 3D moving images over an otherwise direct view of the real world. (Image credit: Andrew Brodhead)

Researchers in the emerging field of spatial computing have developed a prototype augmented reality headset that uses holographic imaging to overlay full-color, 3D moving images on the lenses of what would appear to be an ordinary pair of glasses. Unlike the bulky headsets of present-day augmented reality systems, the new approach delivers a visually satisfying 3D viewing experience in a compact, comfortable, and attractive form factor suitable for all-day wear.

“Our headset appears to the outside world just like an everyday pair of glasses, but what the wearer sees through the lenses is an enriched world overlaid with vibrant, full-color 3D computed imagery,” said Gordon Wetzstein , an associate professor of electrical engineering and an expert in the fast-emerging field of spatial computing. Wetzstein and a team of engineers introduce their device in a new paper in the journal Nature .

Though only a prototype now, such a technology, they say, could transform fields stretching from gaming and entertainment to training and education – anywhere computed imagery might enhance or inform the wearer’s understanding of the world around them.

“One could imagine a surgeon wearing such glasses to plan a delicate or complex surgery or airplane mechanic using them to learn to work on the latest jet engine,” Manu Gopakumar , a doctoral student in the Wetzstein-led Stanford Computational Imaging lab and co-first author of the paper said.

New holographic augmented reality system that enables more compact 3D displays (Image credit: Andrew Brodhead)

Barriers overcome

The new approach is the first to thread a complex maze of engineering requirements that have so far produced either ungainly headsets or less-than-satisfying 3D visual experiences that can leave the wearer visually fatigued, or even a bit nauseous at times.

“There is no other augmented reality system out there now with comparable compact form factor or that matches our 3D image quality,” said Gun-Yeal Lee , a postdoctoral researcher in the Stanford Computational Imaging lab and co-first author of the paper.

To succeed, the researchers have overcome technical barriers through a combination of AI-enhanced holographic imaging and new nanophotonic device approaches. The first hurdle was that the techniques for displaying augmented reality imagery often require the use of complex optical systems. In these systems, the user does not actually see the real world through the lenses of the headset. Instead, cameras mounted on the exterior of the headset capture the world in real time and combine that imagery with computed imagery. The resulting blended image is then projected to the user’s eye stereoscopically.

“The user sees a digitized approximation of the real world with computed imagery overlaid. It’s sort of augmented virtual reality, not true augmented reality,” explained Lee.

These systems, Wetzstein explains, are necessarily bulky because they use magnifying lenses between the wearer’s eye and the projection screens that require a minimum distance between the eye, the lenses, and the screens, leading to additional size.

“Beyond bulkiness, these limitations can also lead to unsatisfactory perceptual realism and, often, visual discomfort,” said Suyeon Choi , a doctoral student in the Stanford Computational Imaging lab and co-author of the paper.

Videos showing the researcher team’s AI-enhanced holography in action. (Stanford Computational Imaging lab)

To produce more visually satisfying 3D images, Wetzstein leapfrogged traditional stereoscopic approaches in favor of holography, a Nobel-winning visual technique developed in the late-1940s. Despite great promise in 3D imaging, more widespread adoption of holography has been limited by an inability to portray accurate 3D depth cues, leading to an underwhelming, sometimes nausea-inducing, visual experience.

The Wetzstein team used AI to improve the depth cues in the holographic images. Then, using advances in nanophotonics and waveguide display technologies, the researchers were able to project computed holograms onto the lenses of the glasses without relying on bulky additional optics.

A waveguide is constructed by etching nanometer-scale patterns onto the lens surface. Small holographic displays mounted at each temple project the computed imagery through the etched patterns, which bounce the light within the lens before it is delivered directly to the viewer’s eye. Looking through the glasses’ lenses, the user sees both the real world and the full-color, 3D computed images displayed on top.

Using nanophotonic technologies called metasurface optics, the researchers designed and fabricated a novel waveguide design that can relay 3D hologram information of RGB visible light into a single compact device with high transparency. These nanophotonic waveguide samples were fabricated in-house at Stanford Nanofabrication Facility and Stanford Nano Shared Facilities. (Image credit: Andrew Brodhead)

Life-like quality

The 3D effect is enhanced because it is created both stereoscopically, in the sense that each eye gets to see a slightly different image as they would in traditional 3D imaging, and holographically.

“With holography, you also get the full 3D volume in front of each eye increasing the life-like 3D image quality,” said Brian Chao , a doctoral student in the Stanford Computational Imaging lab and also co-author of the paper.

The ultimate outcome of the new waveguide display techniques and the improvement in holographic imaging is a true-to-life 3D visual experience that is both visually satisfying to the user without the fatigue that has challenged earlier approaches.

“Holographic displays have long been considered the ultimate 3D technique, but it’s never quite achieved that big commercial breakthrough,” Wetzstein said. “Maybe now they have the killer app they’ve been waiting for all these years.”

Additional authors are from The University of Hong Kong and NVIDIA. Wetzstein is also member of  Stanford Bio-X , the  Wu Tsai Human Performance Alliance , and the  Wu Tsai Neurosciences Institute .

This research was funded by a Stanford Graduate Fellowship in Science and Engineering, the National Research Foundation of Korea (NRF) funded by the Ministry of Education, a Kwanjeong Scholarship, a Meta Research PhD Fellowship, the ARO PECASE Award, Samsung, and the Sony Research Award Program. Part of this work was performed at the Stanford Nano Shared Facilities (SNSF) and Stanford Nanofabrication Facility (SNF) , supported by the National Science Foundation and the National Nanotechnology Coordinated Infrastructure.
