Creating an image input extension

Creating an external frame data source class

Inherit from ExternalImageStreamFrameSource to create an image input extension. ExternalImageStreamFrameSource is a subclass of MonoBehaviour, so the script filename must match the class name.

For example:

public class MyFrameSource : ExternalImageStreamFrameSource
{
}

The sample Workflow_FrameSource_ExternalImageStream implements an image input extension that uses a prerecorded video as input. The video was captured with ARCore on a Pixel 2 through the camera callback (not screen recording).

Device definition

Override IsCameraUnderControl and return true.
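Following the expression-bodied property style of the other overrides shown in this article (the exact member form is an assumption), this can look like:

protected override bool IsCameraUnderControl => true;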

Override IsHMD to define whether the device is a head-mounted display.

For example, set to false when using video as input.

protected override bool IsHMD => false;

Override Display to define the device display.

For example, if running only on mobile phones, use Display.DefaultSystemDisplay, whose rotation value changes automatically with the operating system's current display state.

protected override IDisplay Display => easyar.Display.DefaultSystemDisplay;

Availability

Override IsAvailable to define whether the device is available.

For example, when using video as input, it is always available:

protected override Optional<bool> IsAvailable => true;

If IsAvailable cannot be determined during session assembly, override the CheckAvailability() coroutine to block the assembly process until availability is confirmed.
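A minimal sketch, assuming CheckAvailability() uses the usual Unity coroutine signature (the availabilityDetermined flag is a hypothetical placeholder your own code would set once the check completes):

protected override IEnumerator CheckAvailability()
{
    // Block session assembly until availability is known,
    // e.g. after a permission request or device query finishes.
    yield return new WaitUntil(() => availabilityDetermined);
}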

Virtual camera

Override Camera to provide a virtual camera.

For instance, Camera.main may be used as the virtual camera for the session:

protected override Camera Camera => Camera.main;

Physical camera

Override DeviceCameras with the FrameSourceCamera type to provide the device's physical camera information. This data is used when inputting camera frame data, and it must be populated by the time CameraFrameStarted becomes true.

For example, using the video from the sample Workflow_FrameSource_ExternalImageStream:

private bool started;
private FrameSourceCamera deviceCamera;
protected override List<FrameSourceCamera> DeviceCameras => new List<FrameSourceCamera> { deviceCamera };

// Called once the video source is ready (method name illustrative).
void SetupDeviceCamera()
{
    var size = new Vector2Int(640, 360);     // frame size of the sample video
    var cameraType = CameraDeviceType.Back;  // the video was captured with the back camera
    var cameraOrientation = 90;              // camera rotation in degrees
    deviceCamera = new FrameSourceCamera(cameraType, cameraOrientation, size, new Vector2(30, 30));
    started = true;
}
Caution

Several input parameters here need to be set according to the actual video used. The parameters in the code above are only applicable to the sample video.

Override CameraFrameStarted to indicate whether camera frame input has started.

For example:

protected override bool CameraFrameStarted => started;

Session start and stop

Override OnSessionStart(ARSession) and perform AR-specific initialization work. Be sure to call base.OnSessionStart first.

For example:

protected override void OnSessionStart(ARSession session)
{
    base.OnSessionStart(session);
    ...
}

This is the appropriate place to open the device camera, especially if these cameras aren't designed to remain constantly active. It's also suitable for obtaining calibration data that won't change throughout the lifecycle. Sometimes it may be necessary to wait for the device to become ready or for data updates before this data can be acquired.

Additionally, this is a suitable location to start the data input loop. You could also write this loop in Update() or other methods, particularly when data needs to be acquired at specific points in Unity's execution order. Do not input data before the session is ready.

If needed, you can skip the startup process and perform data checks during each update; this depends entirely on your specific requirements.
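If the input loop lives in Update() instead of a coroutine, its shape is roughly the following (the started flag mirrors the one used with CameraFrameStarted; frame acquisition is left as a placeholder):

void Update()
{
    // Skip until camera frame input has started.
    if (!started) { return; }
    // Acquire the latest frame from your source here and call
    // HandleCameraFrameData(...) only when new data is available.
}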

For example, when using video as input, you can start video playback and initiate the data input loop here:

protected override void OnSessionStart(ARSession session)
{
    base.OnSessionStart(session);
    ...
    player.Play();
    StartCoroutine(VideoDataToInputFrames());
}

Override OnSessionStop() and release resources. Be sure to call base.OnSessionStop first.

For example, when using video as input, you can stop video playback and release related resources here:

protected override void OnSessionStop()
{
    base.OnSessionStop();

    StopAllCoroutines();
    player.Stop();
    if (renderTexture) { Destroy(renderTexture); }
    cameraParameters?.Dispose();
    cameraParameters = null;
    frameIndex = -1;
    started = false;
    deviceCamera?.Dispose();
    deviceCamera = null;
}

Acquire camera frame data from device or file

Images can be obtained from any source—system cameras, USB cameras, video files, networks, etc.—as long as the data can be converted into the format required by Image. Methods to acquire data from these devices or files vary; consult their respective documentation.

For example, when using video as input, use Texture2D.ReadPixels(Rect, int, int, bool) to capture frame data from a video player’s RenderTexture. Then copy the data from Texture2D.GetRawTextureData() into Buffer:

IEnumerator VideoDataToInputFrames()
{
    ...
    RenderTexture.active = renderTexture;
    var pixelSize = new Vector2Int((int)player.width, (int)player.height);
    var texture = new Texture2D(pixelSize.x, pixelSize.y, TextureFormat.RGB24, false);
    texture.ReadPixels(new Rect(0, 0, pixelSize.x, pixelSize.y), 0, 0);
    texture.Apply();
    RenderTexture.active = null;
    ...
    CopyRawTextureData(buffer, texture.GetRawTextureData<byte>(), pixelSize);
}

// Requires: using System; using Unity.Collections.LowLevel.Unsafe;
static unsafe void CopyRawTextureData(Buffer buffer, Unity.Collections.NativeArray<byte> data, Vector2Int size)
{
    int oneLineLength = size.x * 3;  // RGB888: 3 bytes per pixel
    int totalLength = oneLineLength * size.y;
    var ptr = new IntPtr(data.GetUnsafeReadOnlyPtr());
    // Copy rows in reverse order to flip the image vertically,
    // restoring the normal top-down memory arrangement.
    for (int i = 0; i < size.y; i++)
    {
        buffer.tryCopyFrom(ptr, oneLineLength * i, totalLength - oneLineLength * (i + 1), oneLineLength);
    }
}
Caution

As shown above, data copied from the pointer of Texture2D must be vertically flipped to restore normal image memory arrangement.

While acquiring images, you must also obtain calibration data from the camera (or equivalent) and create a CameraParameters instance.

If the source is a mobile camera callback with uncropped data, use the device’s native calibration data directly. When using ARCore/ARKit APIs, refer to their documentation for camera intrinsics. For image/object tracking, CameraParameters.createWithDefaultIntrinsics(Vec2I, CameraDeviceType, int) can create intrinsics, though this may slightly reduce algorithm efficacy.
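When falling back to default intrinsics for image/object tracking, the call can look like this (reusing the size, camera type, and orientation values from the sample; a sketch, not the sample's actual code):

cameraParameters = CameraParameters.createWithDefaultIntrinsics(size.ToEasyARVector(), cameraType, cameraOrientation);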

For USB cameras, video files, or other non-native sources, calibrate the camera/video frames to obtain correct intrinsics.

Caution

Cropped camera callback data requires recalculating intrinsics. Screen-recorded images typically can’t use native calibration data—calibrate explicitly.

Incorrect intrinsics break AR functionality, causing misalignment between virtual and real content, unstable tracking, or frequent tracking failures.

For example, the video in sample Workflow_FrameSource_ExternalImageStream uses these intrinsics and CameraParameters creation process:

var size = new Vector2Int(640, 360);
var cameraType = CameraDeviceType.Back;
var cameraOrientation = 90;
cameraParameters = new CameraParameters(size.ToEasyARVector(), new Vec2F(506.085f, 505.3105f), new Vec2F(318.1032f, 177.6514f), cameraType, cameraOrientation);
Caution

The parameters above only apply to the sample video, as intrinsics were captured alongside it. Always acquire device intrinsics or manually calibrate for other videos/devices.

Input camera frame data

After obtaining camera frame data updates, call HandleCameraFrameData(double, Image, CameraParameters) to input camera frame data.

For example, when using video as input, the implementation is as follows:

IEnumerator VideoDataToInputFrames()
{
    yield return new WaitUntil(() => player.isPrepared);
    var pixelSize = new Vector2Int((int)player.width, (int)player.height);
    ...
    yield return new WaitUntil(() => player.isPlaying && player.frame >= 0);
    while (true)
    {
        yield return null;
        if (frameIndex == player.frame) { continue; }
        frameIndex = player.frame;
        ...
        var pixelFormat = PixelFormat.RGB888;
        var bufferO = TryAcquireBuffer(pixelSize.x * pixelSize.y * 3);
        if (bufferO.OnNone) { continue; }

        var buffer = bufferO.Value;
        CopyRawTextureData(buffer, texture.GetRawTextureData<byte>(), pixelSize);

        using (buffer)
        using (var image = Image.create(buffer, pixelFormat, pixelSize.x, pixelSize.y, pixelSize.x, pixelSize.y))
        {
            HandleCameraFrameData(player.time, image, cameraParameters);
        }
    }
}
Caution

Do not forget to execute Dispose() or release resources through mechanisms like using for Image, Buffer, and other related data after use. Otherwise, severe memory leaks may occur, and buffer pool acquisition might fail.