Unity camera in AR scenes
The presentation of AR effects in Unity relies on the camera. The following content will help you understand the role of the camera in AR scenes and how the session controls the camera's properties to ensure a proper AR experience.
Before you begin
- Learn the basic concepts, components, and workflow of a session through Introduction to ARSession.
The role of the camera in AR scenes
In Unity, the camera is used to display the game world to the player, but in AR scenes, the role of the camera becomes even more critical. It not only renders virtual content but also needs to align with the real world to ensure virtual objects are correctly overlaid on the real scene.
This video demonstrates a simple AR scene. The left side shows the Scene view, and the right side shows the Game view. The video was recorded in the Unity editor's Play mode using simulated runtime data. The content in the Game view is identical to what a user would see on their mobile device in the real world.
As shown in the video, the camera representing the user (camera icon) moves according to the user's movement in the real world. The white cone traces the camera's position and orientation over a period of time. In the Game view, the camera not only displays the virtual content from the Scene view but also overlays the real-world image beneath the virtual content. This is how a camera typically operates in an AR scene.
To ensure virtual objects are correctly overlaid on the real scene, certain properties of the camera need to be adjusted based on the AR runtime state. These properties include:
- The camera's transform (position and orientation)
- The camera's field of view (FOV), aspect ratio, and projection matrix
- The camera's culling settings (GL.invertCulling)
Warning
Modifying these properties of the session camera during app development is not supported: it may cause incorrect alignment between virtual content and the real world and degrade the user experience. Even if these properties are altered by some means, the AR system will either overwrite the changes at runtime or behave unexpectedly because the rendering data no longer matches the underlying computation.
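Reading these properties for debugging is still fine, as long as nothing is written back. Below is a minimal read-only sketch, assuming the script is attached to the GameObject that holds the session camera:

```csharp
using UnityEngine;

// Read-only inspection of the session camera; nothing here writes to the camera.
[RequireComponent(typeof(Camera))]
public class SessionCameraInspector : MonoBehaviour
{
    Camera cam;

    void Awake()
    {
        cam = GetComponent<Camera>();
    }

    void LateUpdate()
    {
        // Pose driven by the session (or by an external system).
        Debug.Log($"position: {transform.position}, rotation: {transform.rotation.eulerAngles}");

        // Projection parameters; the full matrix carries more information
        // than fieldOfView and aspect alone.
        Debug.Log($"fov: {cam.fieldOfView}, aspect: {cam.aspect}\n{cam.projectionMatrix}");

        // Culling state, adjusted when the image is mirrored.
        Debug.Log($"GL.invertCulling: {GL.invertCulling}");
    }
}
```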
Based on which object controls these properties, the cameras used by the session can be divided into two categories: session-controlled cameras and non-session-controlled cameras.
Camera controlled by session
If the session's camera does not belong to any external system, such as a headset or AR Foundation, the session will automatically control the camera's aforementioned properties to ensure proper alignment with the real world.
Transform
The transform (position and orientation) of the camera is adjusted by the session based on the operational state of AR functionalities. Generally, the session updates the camera's position and orientation according to motion tracking data and/or target tracking data to ensure that what the user sees aligns with the real world.
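To observe this behavior, you can record the camera's pose over time, similar to the white trace in the video above. A minimal sketch, attached to the session camera, with an arbitrary sampling interval:

```csharp
using UnityEngine;

// Samples the session camera's pose at a fixed interval and draws the path
// in the Scene view, making the motion driven by tracking data visible.
public class CameraTrace : MonoBehaviour
{
    public float interval = 0.2f;  // seconds between samples (arbitrary choice)

    Vector3 lastPosition;
    float nextSampleTime;

    void Start()
    {
        lastPosition = transform.position;
    }

    void Update()
    {
        if (Time.time < nextSampleTime) { return; }
        nextSampleTime = Time.time + interval;

        // Draw a segment from the previous sample to the current pose.
        Debug.DrawLine(lastPosition, transform.position, Color.white, 30f);
        lastPosition = transform.position;
    }
}
```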
In Unity, the central reference point for all AR tracking is called the session center, and the rules determining this center during the session's operation are referred to as the center mode. The behavior of the camera's transform varies under different center modes:
- In the Camera center mode, the camera can move freely. The Camera mode is rarely used by applications.
- In other center modes (such as FirstTarget), the camera cannot move freely. FirstTarget is the mode adopted by most AR applications.
Warning
The scale value of the camera transform should always remain (1, 1, 1). Modifying the camera's scale may lead to unexpected behavior.
Projection matrix
The camera's projection matrix is updated by the session every frame, based on the physical camera's intrinsic parameters, to ensure virtual content is correctly overlaid on the real scene.
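Conceptually, this update maps the physical camera's intrinsics (focal lengths, principal point, image size) onto a Unity projection matrix. The sketch below is illustrative only, with placeholder parameters and a common OpenGL-style convention; the session performs an equivalent computation internally, so you never assign this matrix yourself.

```csharp
using UnityEngine;

public static class IntrinsicsProjection
{
    // Builds an OpenGL-style projection matrix from pinhole intrinsics.
    // Sign conventions for the principal point offset vary between systems;
    // this is an illustration, not the session's exact implementation.
    public static Matrix4x4 FromIntrinsics(
        float fx, float fy, float cx, float cy,
        float width, float height, float near, float far)
    {
        var p = Matrix4x4.zero;
        p[0, 0] = 2f * fx / width;
        p[1, 1] = 2f * fy / height;
        p[0, 2] = 1f - 2f * cx / width;   // horizontal principal point offset
        p[1, 2] = 2f * cy / height - 1f;  // vertical principal point offset
        p[2, 2] = -(far + near) / (far - near);
        p[2, 3] = -2f * far * near / (far - near);
        p[3, 2] = -1f;
        return p;
    }
}
```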
Culling settings
The culling settings of the camera (GL.invertCulling) will be adjusted according to the mirroring settings of the session to ensure that virtual content is rendered correctly in the real-world scene.
When the HorizontalFlip setting for the currently used camera is World, GL.invertCulling is set to true. This is the default configuration for the front-facing camera.
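The reason for inverting culling: mirroring the image reverses the winding order of every triangle, so back-face culling would otherwise remove the wrong faces. The following is a conceptual sketch of that relationship; the session applies the equivalent automatically, and you should not do this to the session camera yourself.

```csharp
using UnityEngine;

public static class MirrorExample
{
    // Conceptual only: mirror a camera horizontally and compensate the culling.
    public static void ApplyHorizontalMirror(Camera cam)
    {
        // A negative x scale in clip space mirrors the rendered image.
        cam.projectionMatrix = Matrix4x4.Scale(new Vector3(-1f, 1f, 1f)) * cam.projectionMatrix;

        // The mirror reverses triangle winding order, so culling is inverted
        // to keep front faces visible.
        GL.invertCulling = true;
    }
}
```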
AR background video stream
In AR scenes, the camera typically renders the video stream from the physical camera as the background to enhance user immersion. The session automatically handles the acquisition and rendering of the video stream and ensures proper alignment between the video stream and virtual content.
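As a rough illustration of what such a background pass involves, the sketch below draws a video image into the frame before any opaque virtual content is rendered. The videoTexture and backgroundMaterial fields are placeholders; the session's own implementation also handles texture acquisition, aspect fitting, timing, and alignment with tracking data.

```csharp
using UnityEngine;
using UnityEngine.Rendering;

// A minimal sketch of drawing a video image behind all virtual content.
[RequireComponent(typeof(Camera))]
public class BackgroundVideoSketch : MonoBehaviour
{
    public Texture videoTexture;        // image from the physical camera (placeholder)
    public Material backgroundMaterial; // unlit material that samples the image (placeholder)

    CommandBuffer buffer;

    void OnEnable()
    {
        buffer = new CommandBuffer { name = "AR Background" };
        // Fill the frame with the video image before opaque virtual content.
        buffer.Blit(videoTexture, BuiltinRenderTextureType.CameraTarget, backgroundMaterial);
        GetComponent<Camera>().AddCommandBuffer(CameraEvent.BeforeForwardOpaque, buffer);
    }

    void OnDisable()
    {
        GetComponent<Camera>().RemoveCommandBuffer(CameraEvent.BeforeForwardOpaque, buffer);
    }
}
```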
Camera not controlled by session
When using a head-mounted display or AR Foundation, or when implementing a FrameSource whose IsCameraUnderControl is false, the session does not control the camera properties listed above; instead, they are managed by an external system.
Warning
Although the session does not control the camera properties in this case, they will be managed by third-party systems (such as head-mounted display SDKs or AR Foundation). Modifying these properties during application development is still unsupported.
Considerations when copying cameras
Sometimes you may need to copy the parameters of a session camera to another camera. In this case, pay special attention to the following two points:
- Property acquisition timing: When using a controlled camera, refer to Obtaining session runtime results to acquire these parameters at the correct time. When using a camera not controlled by the session, refer to the third-party system's documentation to acquire them at the right time.
- Camera field of view (FOV), aspect ratio, and projection matrix: Use Camera.projectionMatrix to obtain the camera's projection matrix and copy it to the other camera. Camera.fieldOfView and Camera.aspect describe only part of the projection matrix (the principal point offset, for example, is not captured), so copying them alone is insufficient; see the sketch below.
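A minimal sketch of such a copy, assuming sourceCamera is the camera driven by the session and that copying in LateUpdate is an acceptable timing for your use case (check the timing guidance above):

```csharp
using UnityEngine;

// Copies the session camera's pose and projection to another camera each frame.
public class CameraCopySketch : MonoBehaviour
{
    public Camera sourceCamera; // the camera driven by the session (or external system)
    public Camera targetCamera; // the camera receiving the copied parameters

    void LateUpdate()
    {
        // Pose: position and orientation only; never copy or change scale.
        targetCamera.transform.SetPositionAndRotation(
            sourceCamera.transform.position, sourceCamera.transform.rotation);

        // Projection: copy the full matrix rather than fieldOfView/aspect, so
        // the principal point offset and other parameters are preserved.
        targetCamera.projectionMatrix = sourceCamera.projectionMatrix;
    }
}
```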
Next steps
- Read Camera configs to learn how to configure cameras for the best AR experience