Combining object tracking with motion tracking
This article describes how to integrate 3D object tracking with device motion tracking to improve tracking stability and user experience in complex scenarios. It covers the core principles, the expected results, and an analysis of potential issues.
Basic principles
Motion Fusion combines pose data from 3D object tracking and device motion tracking to achieve more robust pose estimation. Its core workflow is as follows:
Data synchronization and complementation
- Visual tracking: Computes the pose (position + rotation) of the target in the current frame by matching image feature points, but it is susceptible to occlusion, motion blur, and rapid movement.
- Motion tracking: Estimates device motion from high-frequency IMU output combined with camera images, but its error accumulates (drifts) over time without correction.
- Fusion mechanism (see the sketch after this list):
  - Aligns the coordinate systems of the visual tracking pose and the device motion tracking pose.
  - When the target object is clearly visible and moving steadily, visual tracking is prioritized. The visual tracking pose is continuously fed into the fusion module as a correction, reducing the cumulative drift of the whole system.
  - When the target object is lost, occupies too small a portion of the frame, or moves rapidly, visual tracking fails and motion tracking takes over. The fused pose is predicted from the current motion tracking pose.
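As a mental model, the per-frame decision can be pictured with the following minimal sketch, written in Python with 4x4 homogeneous pose matrices. It is an illustration only, not the SDK's internal implementation; the function and parameter names are assumptions made for this example.

```python
import numpy as np

def fuse_pose(visual_pose, motion_pose, cached_offset):
    """Pick the pose source for the current frame (illustrative only).

    visual_pose:   4x4 object pose from 3D object tracking, or None when the target is lost.
    motion_pose:   4x4 device pose from motion tracking, already aligned to the
                   visual tracking coordinate system.
    cached_offset: 4x4 object pose relative to the device, saved while visual
                   tracking was still valid.
    Returns (fused_pose, updated_offset).
    """
    if visual_pose is not None:
        # Target clearly visible: trust visual tracking and refresh the cached
        # device-to-object offset, which also corrects any accumulated drift.
        offset = np.linalg.inv(motion_pose) @ visual_pose
        return visual_pose, offset
    # Target lost, too small, or blurred: predict the object pose by propagating
    # the last known offset with the current motion tracking pose.
    return motion_pose @ cached_offset, cached_offset
```

The key point is that the cached offset is refreshed on every frame with a valid visual pose, so drift can only accumulate while the target is not visible.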
Key technical points
- Timestamp alignment: Align the timestamps of visual frames with motion tracking data (for example, by interpolating the high-rate motion pose to the frame timestamp) to avoid jitter caused by latency.
- Coordinate system alignment: Estimate the transform between the visual tracking and motion tracking coordinate systems from their respective trajectories. A sketch of both steps follows this list.
- Relocalization: When the target object reappears, visual tracking takes over to quickly correct any accumulated error and "pull" the virtual object back to the correct position.
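The first two points can be illustrated with a short sketch, assuming NumPy and SciPy are available. It is not the SDK's implementation; the trajectory alignment shown is a plain rigid (Kabsch/Umeyama-style) fit without scale estimation.

```python
import numpy as np
from scipy.spatial.transform import Rotation, Slerp

def motion_pose_at(frame_timestamp, motion_timestamps, motion_positions, motion_quaternions):
    """Interpolate the high-rate motion tracking stream to a visual frame timestamp:
    linear interpolation for position, SLERP for rotation."""
    position = np.array([
        np.interp(frame_timestamp, motion_timestamps, motion_positions[:, i])
        for i in range(3)
    ])
    slerp = Slerp(motion_timestamps, Rotation.from_quat(motion_quaternions))
    rotation = slerp([frame_timestamp])[0]
    return position, rotation

def align_coordinate_systems(visual_positions, motion_positions):
    """Estimate the rotation R and translation t that map motion tracking
    coordinates onto visual tracking coordinates, given two matched position
    trajectories (rigid Kabsch/Umeyama fit, scale omitted for brevity)."""
    mu_v = visual_positions.mean(axis=0)
    mu_m = motion_positions.mean(axis=0)
    H = (motion_positions - mu_m).T @ (visual_positions - mu_v)
    U, _, Vt = np.linalg.svd(H)
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])
    R = Vt.T @ D @ U.T
    t = mu_v - R @ mu_m
    return R, t
```

In a real system the alignment would typically be refined continuously as both trajectories grow, rather than computed once.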
Applicable scenarios and limitations
Motion fusion is not suitable for every scenario. It should not be used in the following situations:
- The target device does not support motion-tracking features like ARCore/ARKit. For a detailed list of supported devices, refer to: Motion-tracking device support.
- The target object is dynamic in the scene, such as a toy or figurine held in hand.
For all other scenarios, using motion fusion will significantly enhance the user experience of 3D object tracking, including but not limited to the following use cases:
- Fast motion: When the user moves the device quickly, motion blur may cause visual tracking to fail, but the fused pose keeps following the device.
- Target disappearance: When the target leaves the frame or is occluded by dynamic objects (e.g., pedestrians), the virtual content in the scene is still maintained.
- Large distance from the target: When the user moves the device away and the target object occupies only a small portion of the frame, tracking remains stable and continuous.
- Low-light conditions: When visual tracking performance degrades, the experience is still maintained.
Effects and expected results
Under applicable scenarios, using motion fusion will provide a more stable and smoother user experience compared to solely relying on 3D object tracking.
Ideal effect
- More stable tracking: Virtual objects do not jitter or jump.
- Smooth transition: When visual tracking fails, the fusion pose changes continuously and naturally.
- Anti-interference ability: In scenarios where the target object is lost or occluded, or the device moves rapidly, the virtual object can still follow the device's movement and maintain tracking.
Suboptimal scenarios and countermeasures
| Phenomenon | Cause | User perception | Solution |
|---|---|---|---|
| Not active initially | Motion tracking needs some time to initialize | Content disappears during the initial phase | Show a UI prompt and wait until system motion tracking has finished initializing |
| Significant drift | Errors accumulate when visual correction is unavailable for a long time | Virtual objects deviate from their original positions | Guide users to keep occlusion short, or prompt them to look at the target again so visual relocalization can correct the pose |
| Performance degradation | Running both functions simultaneously for a long time | Frame rate drops and stuttering occurs | Expected behavior; motion fusion can be disabled at runtime via the API |
How to verify the expected results
Test with supported devices in real-world scenarios:
- Aim at the target object and confirm the virtual object is stable.
- Cover the object with your hand for 2 seconds and move the device, observing whether the virtual object moves smoothly.
- Remove your hand and confirm the virtual object quickly returns to position without jumps.
Summary and best practices
Motion fusion significantly improves the robustness of 3D object tracking in many scenarios, but it requires hardware support and sufficient device performance. Developers should enable the feature selectively based on the target user's device and provide a fallback for low-performance devices; a minimal decision sketch follows the API reference below.
API reference for enabling/disabling motion fusion at runtime:
- Native: setResultPostProcessing
- Unity: EnableMotionFusion
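The enable/disable decision itself can be sketched in platform-agnostic pseudocode. The `DeviceInfo` type and the `set_enabled` callback below are hypothetical stand-ins introduced for illustration; in a real project the callback would map to the APIs listed above.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class DeviceInfo:
    supports_motion_tracking: bool  # e.g. ARCore/ARKit is available on this device
    is_low_end: bool                # frame-rate budget too tight to run both pipelines

def configure_motion_fusion(device: DeviceInfo, set_enabled: Callable[[bool], None]) -> None:
    # Fall back to pure 3D object tracking when the device cannot run motion
    # tracking, or when running both pipelines would cause stuttering.
    set_enabled(device.supports_motion_tracking and not device.is_low_end)

# Example: a device without ARCore/ARKit support falls back to plain object tracking.
configure_motion_fusion(
    DeviceInfo(supports_motion_tracking=False, is_low_end=False),
    lambda enabled: print(f"motion fusion enabled: {enabled}"),
)
```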