Input frame data requirements for external frame data sources
For an external frame data source to work properly, the most critical and most challenging task is guaranteeing the accuracy of the input data. This document outlines the input frame data requirements for external frame data sources.
Before you begin
- Understand basic concepts such as cameras and input frames.
- Understand the basic concepts and common types of external frame sources.
Input frame data types
In Unity, an external frame data source typically needs to receive different data at two different times. Based on input timing and data characteristics, we refer to these two sets of data as:
- Camera frame data
- Rendering frame data
Different types of external frame data sources have varying requirements for these two sets of data:
- Image and device motion data input extensions: require both camera frame data and rendering frame data
- Image input extensions: only require camera frame data
Camera frame data
Data requirements (collected in the sketch after this list):
- Timestamp
- Raw physical camera image data
- Intrinsics (image size, focal length, and principal point; the distortion model and parameters are also required if distortion exists)
- Extrinsics (Tcw or Twc: a calibrated matrix expressing the physical camera's offset relative to the device/head pose origin)
- Tracking status
- Device pose
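As a rough illustration, the data above can be grouped into a single per-frame record. This is a minimal sketch with hypothetical field names; the actual types come from the device SDK and the EasyAR API:
struct CameraFrameData
{
    public double Timestamp;                    // physical camera exposure midpoint
    public System.IntPtr ImageData;             // pointer to the raw camera image
    public int ImageDataSize;                   // image buffer size in bytes
    public CameraParameters Intrinsics;         // image size, focal length, principal point (and distortion, if any)
    public UnityEngine.Matrix4x4 Extrinsics;    // Tcw or Twc: camera offset from the device/head pose origin
    public MotionTrackingStatus TrackingStatus;
    public UnityEngine.Pose DevicePose;         // device pose at the same moment
}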
Data timing:
- Physical camera exposure midpoint
Data usage:
- API call timing: May vary based on external code design. A common approach, used on most devices, is to query the device during the 3D engine's rendering update and then decide whether to proceed with data processing based on the device data timestamp
- API call thread: The 3D engine's game thread, or any other thread if all external APIs used are thread-safe
API call examples in Unity are as follows:
void TryInputCameraFrameData()
{
    // Pseudo code: every value marked "from device SDK" must be obtained
    // from the actual device API; only the control flow is shown here.
    double timestamp;                           // from device SDK: exposure midpoint
    if (timestamp == curTimestamp) { return; }  // skip if no new camera frame has arrived
    curTimestamp = timestamp;

    PixelFormat format;                         // from device SDK
    Vector2Int size;                            // from device SDK: image size
    Vector2Int pixelSize;                       // from device SDK
    int bufferSize;                             // from device SDK: image data size in bytes

    var bufferO = TryAcquireBuffer(bufferSize); // acquire a reusable buffer from the pool
    if (bufferO.OnNone) { return; }
    var buffer = bufferO.Value;

    IntPtr imageData;                           // from device SDK: pointer to the raw image data
    buffer.tryCopyFrom(imageData, 0, 0, bufferSize);

    var historicalHeadPose = new Pose();        // from device SDK: device pose at the timestamp
    MotionTrackingStatus trackingStatus = (MotionTrackingStatus)(-1); // from device SDK

    using (buffer)
    using (var image = Image.create(buffer, format, size.x, size.y, pixelSize.x, pixelSize.y))
    {
        HandleCameraFrameData(deviceCamera, timestamp, image, cameraParameters, historicalHeadPose, trackingStatus);
    }
}
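In a typical integration, TryInputCameraFrameData is polled once per rendering update (or from another thread if the device APIs are thread-safe), and the timestamp comparison at the top ensures each camera frame is processed only once.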
Rendering frame data
Data requirements:
- Timestamp
- Tracking status
- Device pose
Data timing:
- The on-screen display moment, before TimeWarp is applied. The device pose at this same moment is used by external systems (e.g., the device SDK) to set the virtual camera transform for rendering the current frame.
Note
TimeWarp (sometimes called Reprojection, ATW, or PTW) is a common latency-reduction technique in VR/AR headsets. After rendering completes, it re-projects the rendered image based on the latest head pose to compensate for head movement during rendering. EasyAR requires the moment corresponding to the pose used to set the virtual camera at the start of rendering, not the actual on-screen moment after TimeWarp.
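On many Unity XR devices this pose can be sampled with Application.onBeforeRender, since Unity updates XR input poses just before that event fires. A minimal sketch, assuming the XR rig drives mainCamera:
using UnityEngine;

class RenderPoseSampler : MonoBehaviour
{
    public Camera mainCamera;
    public Pose renderHeadPose;

    void OnEnable() { Application.onBeforeRender += OnBeforeRender; }
    void OnDisable() { Application.onBeforeRender -= OnBeforeRender; }

    void OnBeforeRender()
    {
        // At this point the camera transform matches the pose the engine will
        // use to render this frame, before any TimeWarp correction is applied.
        var t = mainCamera.transform;
        renderHeadPose = new Pose(t.localPosition, t.localRotation);
    }
}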
Data usage:
- API call timing: Each rendering frame of the 3D engine
- API call thread: 3D engine's game thread
An API call example in Unity is sketched below (assuming a HandleRenderFrameData entry point that mirrors the HandleCameraFrameData call above; as before, values marked "from device SDK" must be obtained from the actual device API):
void TryInputRenderFrameData()
{
    double timestamp;                           // from device SDK: on-screen display moment, before TimeWarp
    var headPose = new Pose();                  // from device SDK: device pose at that moment
    MotionTrackingStatus trackingStatus = (MotionTrackingStatus)(-1); // from device SDK
    HandleRenderFrameData(timestamp, headPose, trackingStatus);
}
Details of data requirements
Physical camera image data:
- Image coordinate system: Data captured when the sensor is level should also be level. Data should be stored with the top-left corner as the origin, in row-major order; images must not be flipped or inverted (see the addressing sketch after this list).
- Image FPS: Normal 30 or 60 fps data is acceptable. Unless a high frame rate serves a special purpose, it is not required: the minimum frame rate at which the algorithm performs reasonably is 2 fps, a frame rate above 2 fps is recommended, and the original data frame rate is typically sufficient.
- Image size: For better computational results, the longest side should be 960 pixels or larger. Time-consuming image scaling in the data pipeline is discouraged; using raw data directly is recommended unless copying full-size data becomes unacceptably time-consuming. The image resolution must not be smaller than 640×480.
- Pixel format: Prioritizing tracking effectiveness and overall performance, the typical format priority order is YUV > RGB > RGBA > Gray (the Y component of YUV). When using YUV data, complete data definitions are required, including packing and padding details. Mega performs better with color images than with single-channel images, though other features are less affected.
- Data access: A data pointer or an equivalent implementation. Eliminate all non-essential copies in the data pipeline. In HandleCameraFrameData, EasyAR copies the data once for asynchronous use, and the image data is no longer used after this synchronous call completes. Note data ownership.
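To make the layout requirements concrete, the sketch below (hypothetical helper names) shows how pixels are addressed in a top-left-origin, row-major buffer. The row stride can exceed width × bytes-per-pixel when rows are padded, which is why packing and padding details must accompany YUV data:
// Byte offset of pixel (x, y) in a top-left-origin, row-major image;
// rowStride is the number of bytes per row, including any padding.
static int PixelOffset(int x, int y, int rowStride, int bytesPerPixel)
{
    return y * rowStride + x * bytesPerPixel;
}

// Example for NV12, a common YUV 4:2:0 layout: a full-resolution Y plane
// followed by an interleaved half-resolution UV plane, each with its own stride.
static (int yOffset, int uvOffset) Nv12Offsets(int x, int y, int yStride, int uvStride, int height)
{
    int yOffset = y * yStride + x;                                      // Y sample of pixel (x, y)
    int uvOffset = height * yStride + (y / 2) * uvStride + (x / 2) * 2; // U byte of the shared UV pair
    return (yOffset, uvOffset);
}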
Next steps
- Create an image and device motion data input extension
- Create an image input extension
- Create a headset extension package
Related topics
- EasyAR coordinate system
- Image input extension example Workflow_FrameSource_ExternalImageStream