Combining planar tracking with motion tracking

This article explains how to integrate planar image tracking with device motion tracking to enhance tracking stability and user experience in complex scenarios. It covers core principles, expected outcomes, and potential issues.

Basic principles

Motion Fusion combines the pose data from planar image tracking with the pose data from device motion tracking to achieve more robust pose estimation. The following is its core workflow:

Data synchronization and complementation

  • Visual tracking: Calculates the pose (position + rotation) of the current frame through image feature point matching, but it is susceptible to occlusion, blur, or rapid movement.
  • Motion tracking: Uses high-frequency IMU output together with camera images to estimate device motion, but it accumulates drift over time.
  • Fusion mechanism (a minimal switching sketch follows this list):
    • Align the coordinate systems of the visual tracking pose and the device motion tracking pose.
    • When the target image is clearly visible and the motion is stable: prioritize visual tracking. Continuously feed the visual tracking pose into the fusion module for correction to reduce the cumulative drift of the entire system.
    • When the target image is lost, occupies too small a portion of the frame, or moves rapidly: visual tracking fails, so prioritize motion tracking. Predict the fused pose based on the current motion tracking pose.
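
The switching and correction behavior described above can be illustrated with a minimal sketch. Everything in it (the Pose type, the fuse function, the hard switch on a single visual_ok flag) is an illustrative assumption rather than any SDK's actual interface; real fusion modules typically blend the two sources with a filter (for example an extended Kalman filter) instead of switching abruptly.

```python
from dataclasses import dataclass

import numpy as np


@dataclass
class Pose:
    """Rigid pose: 3x3 rotation matrix plus translation vector."""
    rotation: np.ndarray      # shape (3, 3)
    translation: np.ndarray   # shape (3,)

    def matrix(self) -> np.ndarray:
        m = np.eye(4)
        m[:3, :3] = self.rotation
        m[:3, 3] = self.translation
        return m


def fuse(visual_pose, visual_ok, motion_pose, correction):
    """Hard-switch fusion sketch (illustrative only).

    visual_pose : Pose from planar image tracking, or None when tracking is lost.
    visual_ok   : True when the target is clearly visible and the motion is stable.
    motion_pose : Pose from device motion tracking (IMU + camera); drifts over time.
    correction  : 4x4 transform mapping the motion-tracking frame into the
                  visual-tracking frame, refreshed whenever visual tracking is good.

    Returns (fused 4x4 pose matrix, updated correction).
    """
    if visual_ok and visual_pose is not None:
        # Visual tracking is reliable: use it directly and refresh the correction
        # so that motion tracking can later take over without a visible jump.
        correction = visual_pose.matrix() @ np.linalg.inv(motion_pose.matrix())
        return visual_pose.matrix(), correction
    # Visual tracking failed (occlusion, blur, target too small or out of frame):
    # predict the fused pose from motion tracking, re-expressed in the visual frame.
    return correction @ motion_pose.matrix(), correction
```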

Key technical points

  • Timestamp alignment: Align the timestamps of visual frames with motion tracking data to avoid jitter caused by latency (an interpolation sketch follows this list).
  • Coordinate system alignment: Estimate the transform between the visual-tracking and motion-tracking coordinate systems from their respective trajectories (a rigid-alignment sketch follows this list).
  • Relocalization: When the image reappears, visual tracking takes over to quickly correct potential accumulated errors and "pull" the virtual object back to the correct position.
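
To make timestamp alignment concrete, the sketch below interpolates two bracketing motion-tracking samples to the capture time of a visual frame. The sample layout and function names are assumptions for illustration; a real pipeline would also estimate and compensate the fixed time offset between camera and IMU.

```python
import numpy as np


def slerp(q0, q1, t):
    """Spherical linear interpolation between two unit quaternions (x, y, z, w)."""
    dot = float(np.dot(q0, q1))
    if dot < 0.0:                       # take the shorter arc
        q1, dot = -q1, -dot
    if dot > 0.9995:                    # nearly identical: linear interpolation is safe
        q = q0 + t * (q1 - q0)
        return q / np.linalg.norm(q)
    theta = np.arccos(dot)
    return (np.sin((1 - t) * theta) * q0 + np.sin(t * theta) * q1) / np.sin(theta)


def motion_pose_at(frame_time, samples):
    """Interpolate motion-tracking samples to a camera frame timestamp.

    samples: list of (timestamp, position np.ndarray(3), quaternion np.ndarray(4)),
             sorted by timestamp and bracketing frame_time.
    Returns (position, quaternion) aligned to the visual frame's capture time.
    """
    for (t0, p0, q0), (t1, p1, q1) in zip(samples, samples[1:]):
        if t0 <= frame_time <= t1:
            a = (frame_time - t0) / (t1 - t0)
            return (1 - a) * p0 + a * p1, slerp(q0, q1, a)
    raise ValueError("frame_time is not covered by the motion samples")
```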
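
For coordinate system alignment, a common choice is to fit a rigid transform between matched position samples of the two trajectories (the Kabsch/Horn method). The sketch below assumes both trajectories are already in metric scale and sampled at matched timestamps; it illustrates the idea rather than any specific SDK's implementation.

```python
import numpy as np


def align_trajectories(visual_pts, motion_pts):
    """Estimate the rigid transform (R, t) that maps motion-tracking positions
    onto the matching visual-tracking positions.

    visual_pts, motion_pts: arrays of shape (N, 3) with matched timestamps.
    Returns (R, t) such that visual ≈ R @ motion + t.
    """
    mu_v, mu_m = visual_pts.mean(axis=0), motion_pts.mean(axis=0)
    # Cross-covariance of the centered trajectories.
    H = (motion_pts - mu_m).T @ (visual_pts - mu_v)
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))   # guard against reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = mu_v - R @ mu_m
    return R, t
```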

Applicable scenarios and limitations

Motion fusion is not suitable for all scenarios. The motion fusion feature will not apply under the following circumstances:

  • The target device does not support motion tracking (e.g., via ARCore/ARKit). For a detailed list of supported devices, refer to: Motion tracking device support.
  • The target image/planar object is dynamic in the scene, such as a card held in the hand for interaction.

In all other scenarios, using motion fusion will significantly enhance the user experience of planar image tracking, including but not limited to the following use cases:

  • Fast motion: When the user moves the device rapidly, motion blur may cause image tracking to fail; motion fusion keeps the virtual content anchored while the image cannot be tracked.
  • Target disappearance: When the target leaves the frame or is occluded by dynamic objects (e.g., pedestrians), the virtual content in the scene remains stable.
  • Distant target: When the user moves the device away, causing the target image to occupy a small portion of the frame, tracking remains stable and continuous.
  • Low-light conditions: When visual tracking performance degrades, the experience is maintained.

Effect and expected results

Under applicable scenarios, using motion fusion will provide a more stable and smoother user experience compared to relying solely on planar image tracking.

Ideal effect

  • More stable tracking: Virtual objects do not shake or jump.
  • Smooth transition: When visual tracking fails, the fused pose changes continuously and naturally.
  • Anti-interference ability: In cases where the target image is lost or occluded, or the device moves rapidly, the virtual object can still follow the device's movement and continue tracking.

Suboptimal scenarios and countermeasures

  • Initial non-activation
    • Cause: Motion tracking requires some time to initialize.
    • User perception: Content disappears during the initial phase.
    • Solution: Provide UI prompts until system motion tracking initialization is complete.
  • Significant drift
    • Cause: System errors accumulate when visual correction is unavailable for a prolonged period.
    • User perception: Virtual objects deviate from their original positions.
    • Solution: Guide users to reduce occlusion time, or add prompts that encourage visual relocalization.
  • Performance degradation
    • Cause: Prolonged simultaneous operation of the two tracking functions.
    • User perception: Frame rate drops and stuttering occurs.
    • Solution: This is expected behavior; motion fusion can be disabled through the API if needed (see the sketch below).
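
A minimal sketch of how an application might handle the first row, assuming hypothetical tracker.motion_tracking_ready() and tracker.set_motion_fusion() calls (the actual names depend on the SDK in use):

```python
import time


def enable_fusion_when_ready(tracker, timeout_s=5.0, on_waiting=None):
    """Wait for system motion tracking to finish initializing, then enable fusion.

    tracker:    object exposing the hypothetical motion_tracking_ready() and
                set_motion_fusion() calls assumed by this sketch.
    on_waiting: optional callback used to show a UI prompt while initializing.
    Returns True if fusion was enabled before the timeout.
    """
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        if tracker.motion_tracking_ready():
            tracker.set_motion_fusion(True)
            return True
        if on_waiting is not None:
            on_waiting("Initializing motion tracking, please hold the device steady...")
        time.sleep(0.1)
    # Initialization never completed: fall back to plain planar image tracking.
    tracker.set_motion_fusion(False)
    return False
```

The same hypothetical toggle covers the last row: if the frame rate stays low for a sustained period, the application can call tracker.set_motion_fusion(False) and fall back to plain planar image tracking.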

Expected result verification method

Test with supported devices in real scenarios:

  1. Align the image and confirm the virtual object is stable.
  2. Cover the image with your hand for 2 seconds and move the device, observing whether the virtual object moves smoothly.
  3. Remove your hand and confirm the virtual object quickly returns to position without jumps.

Summary and best practices

Motion fusion significantly improves the robustness of planar image tracking in many scenarios, but it requires hardware support and sufficient performance on the device. Developers should selectively enable this feature based on the target user's device and provide a fallback solution for low-performance devices.
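
The gating decision above can be expressed as a small helper. The capability flags and the enable_motion_fusion callable are assumptions for illustration; in practice they would map to the SDK's device-support query and the application's own performance heuristics.

```python
def configure_tracking(supports_motion_tracking, target_is_static,
                       device_is_low_end, enable_motion_fusion):
    """Choose a tracking configuration and return a label describing it.

    All flags are application-supplied assumptions:
      supports_motion_tracking -- ARCore/ARKit-style motion tracking is available
      target_is_static         -- the tracked image does not move in the scene
      device_is_low_end        -- performance budget is too tight for fusion
      enable_motion_fusion     -- callable(bool) that toggles the feature
    """
    if supports_motion_tracking and target_is_static and not device_is_low_end:
        enable_motion_fusion(True)
        return "planar tracking + motion fusion"
    # Fallback: plain planar image tracking on unsupported or low-end devices,
    # or when the target itself moves (e.g. a hand-held card).
    enable_motion_fusion(False)
    return "planar tracking only"
```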

API reference for enabling/disabling motion fusion in real time: