Unity camera in AR scenes

The presentation of AR effects in Unity relies on the camera. The following content explains the role of the camera in AR scenes and how the session controls the camera's properties to ensure a correct AR experience.

Before you begin

The role of the camera in AR scenes

In Unity, the camera is used to display the game world to the player, but in AR scenes, the role of the camera becomes even more critical. It not only renders virtual content but also needs to align with the real world to ensure virtual objects are correctly overlaid on the real scene.

This video demonstrates a simple AR scene. The left side shows the Scene view, and the right side shows the Game view. The video was recorded in the Unity editor's Play mode using simulated runtime data. The content in the Game view is identical to what a user would see on their mobile device in the real world.

As shown in the video, the camera representing the user (camera icon) moves according to the user's movement in the real world. The white cone traces the camera's position and orientation over a period of time. In the Game view, the camera not only displays the virtual content from the Scene view but also overlays the real-world image beneath the virtual content. This is the typical way a camera operates in an AR scene.

To ensure virtual objects are correctly overlaid on the real scene, certain properties of the camera need to be adjusted based on the AR runtime state. These properties include:

  • The camera's transform (position and orientation)
  • The camera's field of view (FOV), aspect ratio, and projection matrix
  • The camera's culling settings (GL.invertCulling)

Warning

Modifying these properties of the session camera during app development is not supported: it can break the alignment between virtual content and the real world and degrade the user experience. Even if these properties are changed by some means, the AR system will either overwrite the changes at runtime or behave unexpectedly because the data used for rendering no longer matches the data used in AR computations.
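
These properties can be read from script for debugging or for driving other content, as long as they are never written. The following is a minimal diagnostic sketch using only standard Unity APIs; it assumes the session camera is Camera.main, which may differ in your scene:

    using UnityEngine;

    // Minimal diagnostic sketch: read (never write) the properties the AR
    // session keeps in sync with the real world. Assumes the session camera
    // is the main camera; adjust the reference for your own scene setup.
    public class SessionCameraInspector : MonoBehaviour
    {
        void LateUpdate()
        {
            var cam = Camera.main;
            if (cam == null) { return; }

            // Transform (position and orientation), updated from tracking data.
            Debug.Log($"pose: {cam.transform.position}, {cam.transform.rotation.eulerAngles}");

            // Projection matrix, updated from the physical camera intrinsics.
            Debug.Log($"projection:\n{cam.projectionMatrix}");

            // Culling setting, updated from the session's mirroring configuration.
            Debug.Log($"GL.invertCulling: {GL.invertCulling}");
        }
    }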

Depending on what controls these properties, the cameras used by the session fall into two categories: cameras controlled by the session and cameras not controlled by the session.

Camera controlled by session

If the camera used by the session is not managed by any external system, such as a head-mounted display SDK or AR Foundation, the session automatically controls the properties listed above to keep virtual content aligned with the real world.

Transform

The transform (position and orientation) of the camera is adjusted by the session based on the operational state of AR functionalities. Generally, the session updates the camera's position and orientation according to motion tracking data and/or target tracking data to ensure that what the user sees aligns with the real world.
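
Other objects can safely read the session-updated pose and position themselves relative to it. Below is a small sketch using only standard Unity APIs; the Camera.main reference and the 1.5 m distance are illustrative choices, and LateUpdate is used because it runs after the per-frame updates (see Obtaining session runtime results for exact timing guarantees):

    using UnityEngine;

    // Keeps this object a fixed distance in front of the session camera.
    // Reading the pose in LateUpdate means the session's per-frame update
    // has normally already been applied before the pose is used here.
    public class FollowSessionCamera : MonoBehaviour
    {
        public float Distance = 1.5f; // meters in front of the camera (illustrative)

        void LateUpdate()
        {
            var cam = Camera.main; // assumes the session camera is the main camera
            if (cam == null) { return; }

            transform.position = cam.transform.position + cam.transform.forward * Distance;
            transform.rotation = cam.transform.rotation;
        }
    }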

In Unity, the central reference point for all AR tracking is called the session center, and the rules determining this center during the session's operation are referred to as the center mode. The behavior of the camera's transform varies under different center modes:

  • In the Camera center mode, the camera can move freely.

    The Camera mode is rarely used by applications.

  • In other center modes (such as FirstTarget), the camera cannot move freely.

    FirstTarget is the mode adopted by most AR applications.
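
An application typically selects the center mode once when setting up the session. As a rough illustration only: the type and member names used below (ARSession, CenterMode, ARCenterMode.FirstTarget) are assumptions for this sketch rather than a confirmed API; consult your plugin's API reference for the actual names and available values.

    using UnityEngine;

    // Illustrative only: "ARSession", "CenterMode" and "ARCenterMode.FirstTarget"
    // are assumed names; the real API depends on the plugin and version in use.
    public class CenterModeSetup : MonoBehaviour
    {
        void Start()
        {
            var session = FindObjectOfType<ARSession>();
            if (session != null)
            {
                // Use the first detected target as the session center (most common).
                session.CenterMode = ARSession.ARCenterMode.FirstTarget;
            }
        }
    }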

Warning

The scale value of the camera transform should always remain (1, 1, 1). Modifying the camera's scale may lead to unexpected behavior.

Projection matrix

The session updates the camera's projection matrix every frame based on the physical camera's intrinsic parameters to ensure virtual content is correctly overlaid on the real scene.
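
To make the relationship concrete, here is how a projection matrix can be built from pinhole intrinsics (focal lengths fx/fy and principal point cx/cy in pixels, image size, and near/far clip planes). This is a generic conversion sketch rather than the session's actual implementation, and the signs of the off-center terms depend on the image coordinate convention:

    using UnityEngine;

    public static class IntrinsicsToProjection
    {
        // Generic pinhole-intrinsics-to-projection conversion (sketch).
        // fx, fy: focal lengths in pixels; cx, cy: principal point in pixels;
        // width, height: image size in pixels; near, far: clip planes in meters.
        // Sign conventions for the p[0,2] / p[1,2] terms vary with the image origin.
        public static Matrix4x4 FromIntrinsics(
            float fx, float fy, float cx, float cy,
            float width, float height, float near, float far)
        {
            var p = Matrix4x4.zero;
            p[0, 0] = 2f * fx / width;
            p[1, 1] = 2f * fy / height;
            p[0, 2] = 1f - 2f * cx / width;
            p[1, 2] = 2f * cy / height - 1f;
            p[2, 2] = -(far + near) / (far - near);
            p[2, 3] = -2f * far * near / (far - near);
            p[3, 2] = -1f;
            return p;
        }
    }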

Culling settings

The culling settings of the camera (GL.invertCulling) will be adjusted according to the mirroring settings of the session to ensure that virtual content is rendered correctly in the real-world scene.

When the HorizontalFlip setting for the camera currently in use is World, GL.invertCulling is set to true. This is the default configuration for the front-facing camera.
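
GL.invertCulling flips the triangle winding Unity uses for back-face culling, which keeps geometry looking correct when the rendered image is mirrored. The sketch below illustrates the idea with standard Unity APIs (built-in render pipeline callbacks); it is not the session's own code:

    using UnityEngine;

    // Illustration of how inverted culling is typically scoped to one camera:
    // enable it before the camera renders a mirrored view, restore it afterwards.
    [RequireComponent(typeof(Camera))]
    public class MirroredCullingExample : MonoBehaviour
    {
        public bool HorizontallyMirrored = true; // e.g. a front-facing camera view

        bool previous;

        void OnPreRender()
        {
            previous = GL.invertCulling;
            GL.invertCulling = HorizontallyMirrored;
        }

        void OnPostRender()
        {
            GL.invertCulling = previous;
        }
    }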

AR background video stream

In AR scenes, the camera typically renders the video stream from the physical camera as the background to enhance user immersion. The session automatically handles the acquisition and rendering of the video stream and ensures proper alignment between the video stream and virtual content.
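
Conceptually, this amounts to drawing the latest video frame onto the camera's render target before any opaque virtual geometry. The sketch below shows that idea with a Unity CommandBuffer; CameraImage is a placeholder texture, and the real session additionally handles frame acquisition, color conversion, and alignment internally:

    using UnityEngine;
    using UnityEngine.Rendering;

    // Conceptual sketch of an AR background: blit the current video frame to
    // the camera target before opaque geometry so virtual content renders on
    // top. "CameraImage" is a placeholder; the session manages the real stream.
    [RequireComponent(typeof(Camera))]
    public class BackgroundVideoSketch : MonoBehaviour
    {
        public Texture CameraImage;

        CommandBuffer buffer;

        void OnEnable()
        {
            if (CameraImage == null) { return; }
            buffer = new CommandBuffer { name = "AR Background (sketch)" };
            buffer.Blit(CameraImage, BuiltinRenderTextureType.CameraTarget);
            GetComponent<Camera>().AddCommandBuffer(CameraEvent.BeforeForwardOpaque, buffer);
        }

        void OnDisable()
        {
            if (buffer == null) { return; }
            GetComponent<Camera>().RemoveCommandBuffer(CameraEvent.BeforeForwardOpaque, buffer);
            buffer.Release();
            buffer = null;
        }
    }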

Camera not controlled by session

When using a head-mounted display or AR Foundation, or when implementing a FrameSource with IsCameraUnderControl set to false, the session does not control the camera properties listed above. Instead, they are managed by an external system.

Warning

Although the session does not control the camera properties in this case, they will be managed by third-party systems (such as head-mounted display SDKs or AR Foundation). Modifying these properties during application development is still unsupported.

Considerations when copying cameras

Sometimes you may need to copy the parameters of a session camera to another camera. In this case, pay special attention to the following two points:

  • Property acquisition timing: For a camera controlled by the session, refer to Obtaining session runtime results to acquire these parameters at the correct time. For a camera not controlled by the session, refer to the external system's documentation.
  • Camera field of view (FOV), aspect ratio, and projection matrix: Use Camera.projectionMatrix to obtain the camera's projection matrix and copy it to another camera. Camera.fieldOfView and Camera.aspect describe only part of the projection matrix, so copying them alone is insufficient; see the sketch below.
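
A minimal sketch of copying these parameters between cameras with standard Unity APIs (when to call it is governed by the timing notes above):

    using UnityEngine;

    public static class CameraCopyUtil
    {
        // Copies the parameters relevant to AR alignment from one camera to another.
        // The full projection matrix is copied because fieldOfView and aspect alone
        // cannot represent off-center projections built from physical intrinsics.
        public static void CopySessionCameraParameters(Camera source, Camera target)
        {
            target.transform.SetPositionAndRotation(
                source.transform.position, source.transform.rotation);
            target.transform.localScale = Vector3.one; // keep scale at (1, 1, 1)
            target.projectionMatrix = source.projectionMatrix;
        }
    }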

Next steps

  • Read Camera configs to learn how to configure cameras for the best AR experience