Update: the tutorial now works with ARCore 1.12.0, so we can determine whether the image is currently being tracked by the camera and pause or resume the video accordingly.

This is the second part of the series in which I cover the combination of ARCore and Sceneform. More specifically, I’m showing you how to use Sceneform to play a video on top of an image detected with ARCore’s Augmented Images capabilities.

At the end of the previous post, we ended up with a working demo of the concept described above. However, there were a couple of issues:

  1. Video aspect ratio is not maintained
  2. Some videos are displayed at the wrong angle

Well, it’s about time to clean this mess.

TL;DR Complete source code is available at this repository.

Let’s first figure out what went wrong.

As for the aspect ratio, that’s an easy one: we didn’t write any code to maintain it. The video Node (i.e. the texture on which a video is rendered) is scaled to exactly match the detected augmented image. But we don’t live in a perfect world, and oftentimes images and videos do not match. Hence my goal was to gracefully support all possible scenarios, where the image and its video can have either the same or different sizes and aspect ratios.

As for the rotation: videos recorded on mobile devices often contain rotation metadata. Video players use this data to display the video the way it was shot, portrait or landscape. For example, a video shot on a Google Pixel or iPhone in portrait orientation may have a resolution of 1920×1080 px. One may think, “Hey, the width is greater than the height, hence the video is landscape”, and one would be wrong, because that video file contains rotation metadata with the integer value 90 (or 270 if the device was held upside down). It’s really important to take video rotation metadata into consideration.
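To make the pitfall concrete, here is a minimal sketch that decides the orientation from the stored frame size plus the rotation value. The function name `isPortrait` is hypothetical and is not taken from the tutorial’s repository.

```kotlin
// Hypothetical helper: tells whether a video is displayed in portrait orientation.
// A rotation of 90 or 270 degrees means the stored frames lie on their side,
// so the stored width and height trade places when the video is shown.
fun isPortrait(storedWidth: Int, storedHeight: Int, rotationDegrees: Int): Boolean {
    val sideways = rotationDegrees % 180 != 0 // true for 90 and 270
    val displayedWidth = if (sideways) storedHeight else storedWidth
    val displayedHeight = if (sideways) storedWidth else storedHeight
    return displayedHeight > displayedWidth
}
```

With the numbers from the paragraph above, a 1920×1080 file with rotation 90 is reported as portrait, while the same file with rotation 0 is landscape.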


To solve both issues, we must know the video’s resolution and rotation. These properties can be extracted from a video file with the help of the MediaMetadataRetriever class.

Previously, we used FileDescriptor only to set the MediaPlayer data source:

Now, we will extend this code to also obtain video metadata:
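As a rough sketch of the parsing side only: `MediaMetadataRetriever.extractMetadata()` returns nullable strings, so the width, height and rotation values need converting and a fallback. The `VideoMetadata` class and the `parseVideoMetadata` function are hypothetical names, and the Android calls appear only in comments so that the snippet has no Android dependency.

```kotlin
// Hypothetical container for the three values the scaling logic needs.
data class VideoMetadata(val width: Float, val height: Float, val rotation: Float)

// On Android, the three strings would typically come from calls such as
//   retriever.extractMetadata(MediaMetadataRetriever.METADATA_KEY_VIDEO_WIDTH)
//   retriever.extractMetadata(MediaMetadataRetriever.METADATA_KEY_VIDEO_HEIGHT)
//   retriever.extractMetadata(MediaMetadataRetriever.METADATA_KEY_VIDEO_ROTATION)
// after setDataSource(...) has been called on the retriever.
// extractMetadata() may return null, hence the nullable parameters.
// Falling back to 0f is an assumption of this sketch.
fun parseVideoMetadata(width: String?, height: String?, rotation: String?): VideoMetadata =
    VideoMetadata(
        width = width?.toFloatOrNull() ?: 0f,
        height = height?.toFloatOrNull() ?: 0f,
        rotation = rotation?.toFloatOrNull() ?: 0f
    )
```

Keeping the parsing separate from the retriever makes the fallback behaviour easy to check without a device.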


The goal is to support the two most useful scale types, CenterCrop and CenterInside, both equivalent to their ImageView.ScaleType counterparts. Coincidentally, we already support the FitXY scale type: that’s the original logic of “squashing” a video to match the image size and aspect ratio.

Let’s define all the options with an enum.
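A minimal sketch of such an enum could look like this. The type name `VideoScaleType` and the three option names come from this post; the exact declaration in the repository may differ.

```kotlin
// The three scale types discussed in this post.
enum class VideoScaleType {
    FitXY,        // stretch each axis independently to match the image
    CenterCrop,   // scale uniformly until the image is fully covered
    CenterInside  // scale uniformly until the video fits inside the image
}
```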

Next, let’s create a VideoAnchorNode class that extends AnchorNode. It will encapsulate all the code responsible for the video Node scale and rotation.

A child node instance, namely videoNode, is used for better composition: in the future it may be easier, for example, to add a 3D frame object around the video as a child of the VideoAnchorNode.

Define a method that receives the image and video dimensions, the video rotation, and a scale type.

Now, depending on the VideoScaleType parameter, we’ll have different logic.


FitXY scale type example

Scale the video width and height independently, so that the video dimensions match the target image exactly. This may change the aspect ratio of the video.
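The FitXY rule can be sketched as two independent scale factors. The helper name `fitXyScale` is hypothetical, and the units are arbitrary as long as each pair (video, image) is consistent.

```kotlin
// Hypothetical helper: per-axis scale factors that stretch the video onto the image.
// Returns (horizontal factor, vertical factor).
fun fitXyScale(
    videoWidth: Float, videoHeight: Float,
    imageWidth: Float, imageHeight: Float
): Pair<Float, Float> {
    require(videoWidth > 0f && videoHeight > 0f) { "video dimensions must be positive" }
    val scaleX = imageWidth / videoWidth   // horizontal factor
    val scaleY = imageHeight / videoHeight // vertical factor, computed independently
    return scaleX to scaleY
}
```

Whenever the two factors differ, the video is distorted, which is exactly the “squashing” effect described earlier.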


CenterCrop scale type example

Scale the video uniformly (maintain aspect ratio) so that both dimensions (width and height) of the video will be equal to or larger than the corresponding dimension of the target image.
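One way to express the CenterCrop rule is to compare the two aspect ratios and match whichever dimension makes the video cover the image. This is a sketch under that assumption, with the hypothetical name `centerCropSize`; it is not the repository code.

```kotlin
// Hypothetical helper: size of the video after a uniform "cover" scale.
// Returns (width, height) in the same units as the image dimensions.
fun centerCropSize(
    videoWidth: Float, videoHeight: Float,
    imageWidth: Float, imageHeight: Float
): Pair<Float, Float> {
    require(videoWidth > 0f && videoHeight > 0f) { "video dimensions must be positive" }
    require(imageWidth > 0f && imageHeight > 0f) { "image dimensions must be positive" }
    val videoAspectRatio = videoWidth / videoHeight
    val imageAspectRatio = imageWidth / imageHeight
    return if (videoAspectRatio > imageAspectRatio) {
        // Video is relatively wider: match heights, the width overflows.
        (imageHeight * videoAspectRatio) to imageHeight
    } else {
        // Video is relatively taller (or equal): match widths, the height overflows.
        imageWidth to (imageWidth / videoAspectRatio)
    }
}
```

For a 16:9 video on a square 1×1 image, this yields roughly 1.78×1, so the video extends past the left and right edges of the image.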


CenterInside scale type example

Scale the video uniformly (maintain aspect ratio) so that both dimensions (width and height) of the video will be equal to or less than the corresponding dimension of the target image.
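The CenterInside rule can be sketched the same way, with the aspect ratio comparison pointing in the opposite direction. The name `centerInsideSize` is hypothetical, and the sketch always fits the video to the image, since video pixels and image extents are in different units anyway.

```kotlin
// Hypothetical helper: size of the video after a uniform "fit" scale.
// Returns (width, height) in the same units as the image dimensions.
fun centerInsideSize(
    videoWidth: Float, videoHeight: Float,
    imageWidth: Float, imageHeight: Float
): Pair<Float, Float> {
    require(videoWidth > 0f && videoHeight > 0f) { "video dimensions must be positive" }
    require(imageWidth > 0f && imageHeight > 0f) { "image dimensions must be positive" }
    val videoAspectRatio = videoWidth / videoHeight
    val imageAspectRatio = imageWidth / imageHeight
    return if (videoAspectRatio < imageAspectRatio) {
        // Video is relatively taller: match heights, the width stays inside.
        (imageHeight * videoAspectRatio) to imageHeight
    } else {
        // Video is relatively wider (or equal): match widths, the height stays inside.
        imageWidth to (imageWidth / videoAspectRatio)
    }
}
```

For a 16:9 video on a square 1×1 image, this yields 1×0.5625, leaving empty bands above and below the video.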

One can notice that both CenterCrop and CenterInside methods are almost identical except for two lines of code, namely this comparison:

(videoAspectRatio <> imageAspectRatio)

Indeed, that’s where the difference between these two scale types lies — either fit the source inside the target or cover the target with the source. Nevertheless, I haven’t merged them into a single method for clarity.


Now, let’s rotate the Node to match video metadata. Luckily, that’s a one-liner:

Basically, we compensate for the video rotation by rotating the Node in the opposite direction. You may want to change the affected axis if you use your own 3D model whose axes are aligned differently.
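In Sceneform, a rotation like this can be built with `Quaternion.axisAngle(axis, degrees)`. The sketch below reproduces the underlying math without the library so it can run anywhere. The `Quat` class and both function names are hypothetical, and the choice of the negative Y axis is an assumption about how the video model is oriented.

```kotlin
import kotlin.math.cos
import kotlin.math.sin

// Hypothetical stand-in for a quaternion type (x, y, z, w).
data class Quat(val x: Float, val y: Float, val z: Float, val w: Float)

// Builds a rotation of `degrees` around a unit-length axis (ax, ay, az).
fun axisAngle(ax: Float, ay: Float, az: Float, degrees: Float): Quat {
    val halfAngle = Math.toRadians(degrees.toDouble()) / 2.0
    val s = sin(halfAngle).toFloat()
    return Quat(ax * s, ay * s, az * s, cos(halfAngle).toFloat())
}

// Assumption: the video plane's normal is the Y axis, so rotating around -Y
// by the metadata angle turns the node opposite to the video rotation.
fun compensateVideoRotation(videoRotationDegrees: Float): Quat =
    axisAngle(0f, -1f, 0f, videoRotationDegrees)
```

If your model’s plane faces a different direction, only the axis passed to `axisAngle` would need to change.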

Wrapping up

Now that we have all the logic in place, it’s time to replace the basic AnchorNode with the VideoAnchorNode and pass the video parameters. Once again, we shall modify the ArVideoFragment#playbackArVideo() method:

Apart from simply passing information about video and image to the Node, we also have to add one important step — applying video rotation to the image dimensions. Simply put, we have to swap width and height in case video rotation is equal to 90 or 270 degrees so that the aspect ratio equations stay true.
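The swap described above can be sketched as a small pure function. The name `imageSizeForRotation` is hypothetical, and the repository may implement the same idea differently.

```kotlin
// Hypothetical helper: image dimensions as seen from the video's rotated frame.
// For a quarter-turn rotation (90 or 270 degrees) width and height are swapped,
// so the aspect ratio comparisons in the scale logic keep working.
fun imageSizeForRotation(
    imageWidth: Float, imageHeight: Float, videoRotation: Float
): Pair<Float, Float> =
    if (videoRotation == 90f || videoRotation == 270f) {
        imageHeight to imageWidth
    } else {
        imageWidth to imageHeight
    }
```

The swapped pair is what would then be passed to the scale calculation in place of the raw image dimensions.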

That’s pretty much it! Now let’s build and install the app on a device and see what will happen.

CenterInside & CenterCrop scale types demo 👇

During the first half of the demo, you can see that the aspect ratio of the videos is maintained and they all fit right inside the corresponding images.

The second half shows a similar result: the aspect ratio of the videos is maintained, but this time the images are completely covered by the corresponding videos.

Moreover, the last sample video (the one with rotation metadata) not only maintains its aspect ratio but is also properly rotated. Sweet!

But what if you want to use the CenterCrop scale type and at the same time not exceed the augmented image area (i.e. crop the video)? No worries folks, I’ve got you covered.

In the next post, we’ll modify the shader code to support video crop and add a finishing touch with the fade-in effect. Stay tuned!
