Remember the last time you opened an augmented reality (AR) application on your phone and closed it minutes later, giving up on the hope of seeing what the painting you wanted to buy would look like in your living room? You held your phone camera up to the wall and made sure there was enough light in the room, but that piece of art simply wouldn’t stick to the wall. Your frustration is understandable, especially because it’s the same problem you ran into the last time you tried an augmented reality experience.
Positional tracking and the placement of virtual objects are two of the biggest challenges developers face in augmented reality today. It doesn’t matter whether you’re using a home decor app, a jewelry try-on app or an AR showroom; the issues remain the same. AR-powered app development relies on technology that’s still far from perfect.
What positional tracking means
So what is positional tracking and why does it give developers such a headache?
Positional tracking is a technology that helps a device determine its position in space by evaluating the surrounding environment. Achieving this requires a combination of software and hardware. As an essential technology in AR, positional tracking can determine how an object is moving with six degrees of freedom (6DoF), which refers to the freedom of movement of a rigid body in three-dimensional space.
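To make 6DoF concrete, here is a minimal sketch (not tied to any particular AR SDK; all names and values are illustrative) of a pose as a rotation plus a translation, and how applying it maps a point from the device’s local frame into the world frame:

```python
import numpy as np

# A 6DoF pose: 3 rotational degrees of freedom (here simplified to yaw
# about the vertical axis) plus 3 translational degrees of freedom (t).
def make_pose(yaw_rad, t):
    c, s = np.cos(yaw_rad), np.sin(yaw_rad)
    R = np.array([[c, -s, 0.0],
                  [s,  c, 0.0],
                  [0.0, 0.0, 1.0]])
    return R, np.asarray(t, dtype=float)

def transform(pose, point):
    # Map a point from the device's local frame into the world frame.
    R, t = pose
    return R @ np.asarray(point, dtype=float) + t

# Device rotated 90 degrees and shifted 1 unit along x.
pose = make_pose(np.pi / 2, [1.0, 0.0, 0.0])
world_point = transform(pose, [1.0, 0.0, 0.0])
```

Tracking a full 6DoF pose means continuously re-estimating both the rotation and the translation as the device moves; losing either one is what makes virtual objects slide or drift.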
To calculate an exact position in space, we have to know the dimensions of the space. To get that information, we need to detect our position, and this is where the problem begins.
In some environments it’s really tricky for a camera to identify its own position in space, especially when all it sees is a plain wall and it’s trying to place a 3D object on that wall. The lack of patterns, the absence of other objects in the space and the barely visible texture of the walls make it very hard for a device to place anchors on surfaces and locate itself in the space.
Ensuring that AR functions well and virtual objects are displayed properly, regardless of whether you rotate your phone or headset, requires clear patterns or features that a virtual object can anchor to.
Let’s take a look at how the industry addresses this issue.
Common approaches to localization in AR
There are several common approaches that developers rely on to address the localization challenge.
Marker-based approach
A marker-based approach, also known as image recognition, requires a printed picture marker that the camera on the AR-powered device can recognize. When the camera reads the visual marker, the user sees relevant information or visuals on screen. The marker-based approach is considered old-school among developers, but it’s still popular, especially in museum installations.
Projection AR approach
This method uses visuals and light projected from a device onto a surface, sometimes with the help of sensor-powered lasers. It works best in dark spaces, since the projections can get washed out in well-lit rooms. This is a straightforward approach that is often applied in marketing campaigns for real-estate developments.
Superimposition-based AR
This technology relies heavily on object recognition. The augmented reality image replaces the original object in part or in full. This method is widely used in health care.
Markerless or feature-based approach
This approach requires no recognition marker; instead, it places virtual objects in the physical environment based on the environment’s real features. This is possible thanks to advanced cameras, sensors and algorithms trained to accurately map the real-world environment.
The feature-based approach leverages the power of trained machine learning (ML) algorithms that can find landmarks and accurately estimate the location of a device in space automatically. SIFT (Scale-Invariant Feature Transform), SURF (Speeded-Up Robust Features) and ORB (Oriented FAST and Rotated BRIEF) are examples of feature detection algorithms.
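SIFT, SURF and ORB themselves ship with computer vision libraries such as OpenCV. To show the underlying idea without pulling in a library, here is a toy NumPy sketch of a Harris-style corner response, a classic precursor that these detectors build on (a simplified illustration, not any of the named algorithms):

```python
import numpy as np

def harris_response(img, k=0.04, radius=2):
    """Harris corner response: high where gradients vary in two directions."""
    img = img.astype(float)
    Ix = np.zeros_like(img)
    Iy = np.zeros_like(img)
    Ix[:, 1:-1] = (img[:, 2:] - img[:, :-2]) / 2.0  # horizontal gradient
    Iy[1:-1, :] = (img[2:, :] - img[:-2, :]) / 2.0  # vertical gradient

    def box_sum(a):
        # Sum each pixel's neighborhood (crude box filter via shifts).
        out = np.zeros_like(a)
        for dy in range(-radius, radius + 1):
            for dx in range(-radius, radius + 1):
                out += np.roll(np.roll(a, dy, axis=0), dx, axis=1)
        return out

    # Structure tensor entries, smoothed over a window.
    Sxx, Syy, Sxy = box_sum(Ix * Ix), box_sum(Iy * Iy), box_sum(Ix * Iy)
    det = Sxx * Syy - Sxy * Sxy
    trace = Sxx + Syy
    return det - k * trace * trace  # positive at corners, negative on edges

# A blank image with one bright square: only its four corners should respond.
img = np.zeros((40, 40))
img[10:30, 10:30] = 1.0
R = harris_response(img)
corners = R > 0.1 * R.max()
```

On this synthetic image the mask is true only near the four corners of the square; flat regions and straight edges produce no response, which is exactly why a plain wall gives a tracker so little to hold on to.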
When it comes to using the features detected by these algorithms for spatial localization, the Simultaneous Localization and Mapping (SLAM) algorithm steps in. Actually, it’s not one algorithm but rather a common name for a family of algorithms that made a real breakthrough in positional tracking for AR.
Here is how it works:
The technology looks for reference points in the environment, determines the device’s position relative to them based on its own calculations and then builds its internal map. The underlying principle here is the use of feature points (the little green dots you may have seen in debug views) that SLAM pins to objects in the environment in order to remember them, together with keyframes, selected camera frames that anchor the map.
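The exact keyframe policy varies between SLAM implementations; as a hedged sketch, a system might promote a camera frame to a keyframe only when the device has moved far enough from the last one, which keeps the map compact (class and threshold below are illustrative):

```python
import math

class KeyframeSelector:
    """Toy keyframe policy: keep a frame only after sufficient motion."""

    def __init__(self, min_translation=0.5):
        self.min_translation = min_translation  # meters, illustrative
        self.keyframes = []                     # stored device positions

    def process(self, position):
        """Return True if this frame was promoted to a keyframe."""
        if not self.keyframes or self._dist(position, self.keyframes[-1]) >= self.min_translation:
            self.keyframes.append(position)
            return True
        return False

    @staticmethod
    def _dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Simulate a device moving 1 m along x in 10 cm steps.
selector = KeyframeSelector(min_translation=0.5)
for i in range(11):
    selector.process((i / 10, 0.0, 0.0))
```

Only the frames at 0.0, 0.5 and 1.0 m become keyframes here; the frames in between are tracked against the latest keyframe rather than stored, which is what keeps a sparse map cheap to maintain.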
To collect the data needed for accurate calculations, SLAM uses sensors, like a gyroscope, accelerometer, compass, and video from cameras, or some combination of these and other tools contained in the device.
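Production pipelines fuse these sensor streams with filters such as the extended Kalman filter. As a minimal illustration of the idea, here is a one-axis complementary filter that blends the gyroscope’s smooth-but-drifting integrated rate with the accelerometer’s noisy-but-absolute tilt estimate (the constants and scenario are illustrative):

```python
def complementary_filter(angle, gyro_rate, accel_angle, dt, alpha=0.98):
    """Blend gyro integration (fast, drifts) with accel tilt (noisy, absolute)."""
    gyro_estimate = angle + gyro_rate * dt   # integrate angular velocity
    return alpha * gyro_estimate + (1.0 - alpha) * accel_angle

# The device is actually tilted 1.0 rad, but the filter starts from 0 and
# the gyro reports no rotation, so the accelerometer term slowly pulls
# the estimate toward the true tilt at 100 Hz.
angle = 0.0
for _ in range(500):
    angle = complementary_filter(angle, gyro_rate=0.0, accel_angle=1.0, dt=0.01)
```

The high `alpha` means the estimate follows the gyro on short timescales (low jitter) while the accelerometer corrects the long-term drift, which is the basic trade-off behind all inertial fusion in AR tracking.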
Based on the calculations and estimations, SLAM generates either a dense or a sparse map of the scene. Dense maps are more detailed and mark every object and feature in the scene, whereas sparse maps are simpler, with only key objects marked. While dense maps are better for rendering and the precise calculations needed for light estimation and visualizing overlapping objects, sparse maps are perfect for position tracking since they contain fewer feature points and are easier to navigate.
Thanks to advanced SLAM algorithms and other tools mentioned above, 3D virtual objects in AR apps don’t bounce around—they stay still even if the user or the device moves.
There are several approaches to leverage the power of SLAM in AR:
Semantic SLAM not only localizes the device but also recognizes real objects in space. It’s computationally very demanding even for a program and is rarely used.
Feature-based SLAM looks for an object’s distinctive feature points, creates keyframes, and remembers them. It’s probably the most popular method these days.
Direct SLAM analyzes every single feature and pixel of an object, enabling the algorithm to create a dense map of the environment. It’s mostly used when high-precision tracking is needed.
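Whichever flavor is used, the map itself is built by triangulating features matched across two camera poses. Below is a minimal linear (DLT) triangulation sketch with NumPy, assuming normalized image coordinates and known 3x4 projection matrices (the camera setup is a made-up example):

```python
import numpy as np

def triangulate(P1, P2, x1, x2):
    """Linear (DLT) triangulation of one 3D point from two views.

    P1, P2: 3x4 camera projection matrices.
    x1, x2: the point's 2D coordinates in each view.
    """
    # Each view contributes two rows of the homogeneous system A @ X = 0.
    A = np.array([
        x1[0] * P1[2] - P1[0],
        x1[1] * P1[2] - P1[1],
        x2[0] * P2[2] - P2[0],
        x2[1] * P2[2] - P2[1],
    ])
    # The solution is the right singular vector with the smallest singular value.
    _, _, Vt = np.linalg.svd(A)
    X = Vt[-1]
    return X[:3] / X[3]  # back to inhomogeneous 3D coordinates

# First camera at the origin, second shifted 1 unit along x (no rotation).
P1 = np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = np.hstack([np.eye(3), np.array([[-1.0], [0.0], [0.0]])])
# A landmark at (0, 0, 5) projects to (0, 0) and (-0.2, 0) respectively.
point = triangulate(P1, P2, (0.0, 0.0), (-0.2, 0.0))
```

Feature-based SLAM runs this kind of triangulation on a sparse set of matched points, while direct SLAM effectively does it densely, per pixel, which is why it yields richer maps at a much higher computational cost.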
Takeaways
Despite solid advances in technology, positional tracking remains one of the biggest challenges in AR. To ensure a truly immersive experience for users of AR-powered apps, developers need to improve the device’s spatial localization ability, in part through more accurate calculations.
SLAM is a well-established technology with its own pros and cons, and it becomes more reliable as hardware advances. Despite its complexity, SLAM constantly evolves and grows more capable. More accurate and consistent estimation techniques that allow mapping of large areas translate directly into improved position tracking in AR apps.