Follow Me (Reinforcement Learning)

Technical Overview

Follow Me is an augmented reality (AR) navigation and guidance project. It combines visual positioning, a virtual guide trained with reinforcement learning, real-time scene understanding, and cloud-persisted anchors to give the user an intuitive and responsive experience.

Visual Positioning System

The system uses a Visual Positioning System (VPS) that combines GPS localization with features observed by the camera. In outdoor areas covered by Google Geospatial mapping, we use that service. Elsewhere we use a third-party solution, which requires the space to be mapped in advance. This mapping can be done with standard mobile devices or with specialized 360-degree cameras.
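The choice between the two positioning sources can be sketched as a small decision function. This is an illustrative sketch only: the names Provider, Location, and choose_provider are hypothetical, and the GPS-only fallback for unmapped areas is an assumption that the description above does not state.

```python
from dataclasses import dataclass
from enum import Enum


class Provider(Enum):
    GEOSPATIAL = "geospatial"  # outdoor area covered by Google Geospatial mapping
    PREMAPPED = "premapped"    # third-party VPS that needs a prior spatial map
    GPS_ONLY = "gps_only"      # assumed fallback, not described in the text


@dataclass(frozen=True)
class Location:
    outdoors: bool
    geospatial_coverage: bool
    has_prior_map: bool


def choose_provider(loc: Location) -> Provider:
    """Pick a positioning source for the user's current location."""
    # Prefer the Geospatial service wherever it covers an outdoor area.
    if loc.outdoors and loc.geospatial_coverage:
        return Provider.GEOSPATIAL
    # Otherwise use the third-party VPS, which only works after mapping.
    if loc.has_prior_map:
        return Provider.PREMAPPED
    # Assumption: with neither source available, fall back to raw GPS.
    return Provider.GPS_ONLY
```

For example, an indoor venue that has been scanned beforehand would resolve to Provider.PREMAPPED, while a covered street would resolve to Provider.GEOSPATIAL.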

Intelligent Agent Integration

The virtual guide is trained with Unity's ML-Agents framework for reinforcement learning, which lets the agent learn navigation behaviors for complex environments. Training takes place in dynamically generated virtual environments, so the agent learns to handle real-world navigation challenges while keeping its movement natural. For conversation, the system combines a locally running speech-to-text module with a Large Language Model enhanced by Retrieval Augmented Generation (RAG). An empathic voice interface can also respond to the user's facial expressions, which are detected through the front-facing camera.
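A reinforcement learning agent of this kind is shaped by its reward signal. The following is a minimal sketch of a per-step navigation reward, assuming the agent is rewarded for progress toward a goal and penalized for collisions and wasted time. The function name step_reward and every coefficient are hypothetical, not the project's actual values, and a real ML-Agents agent would assign rewards inside Unity rather than in Python.

```python
import math


def step_reward(prev_pos, new_pos, goal, collided, *,
                progress_scale=1.0, collision_penalty=1.0,
                step_penalty=0.01, goal_radius=0.5, goal_bonus=10.0):
    """Return (reward, done) for one agent step. All weights are illustrative."""
    prev_dist = math.dist(prev_pos, goal)
    new_dist = math.dist(new_pos, goal)

    # Reward the distance gained toward the goal on this step.
    reward = progress_scale * (prev_dist - new_dist)
    # A small constant cost per step discourages wandering.
    reward -= step_penalty
    # Walking into an obstacle is penalized.
    if collided:
        reward -= collision_penalty

    # The episode ends with a bonus once the agent is close enough to the goal.
    done = new_dist <= goal_radius
    if done:
        reward += goal_bonus
    return reward, done
```

A step that moves one meter closer to the goal without colliding earns slightly under 1.0 with these defaults, while the same step into an obstacle nets a small negative value.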

Environmental Awareness

Using computer vision and AI, the system maintains real-time awareness of its surroundings. It identifies and segments environmental features including buildings, trees, people, and vehicles, and it generates collidable meshes for the objects it observes. These meshes prevent the avatar from passing through real objects, which preserves the illusion of physical presence.
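The effect of collidable geometry on the avatar can be illustrated with a simplified check. This sketch assumes each observed object is approximated by an axis-aligned bounding box rather than a full mesh, and the names Box and resolve_move, along with the default avatar radius, are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned box standing in for a collidable mesh (a simplification)."""
    min_corner: tuple
    max_corner: tuple

    def contains(self, point, margin=0.0):
        # True if the point lies inside the box grown by `margin` on every side.
        return all(lo - margin <= c <= hi + margin
                   for c, lo, hi in zip(point, self.min_corner, self.max_corner))


def resolve_move(current, proposed, obstacles, avatar_radius=0.3):
    """Return the proposed position, or the current one if the move is blocked."""
    for box in obstacles:
        # Growing the box by the avatar radius treats the avatar as a sphere.
        if box.contains(proposed, margin=avatar_radius):
            return current
    return proposed
```

Only the destination is tested here, so a very large step could skip over a thin obstacle. A physics engine avoids this by sweeping the avatar's collider along the whole path.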

Realistic Integration

Depth synthesis lets virtual elements blend with the real world. The system also estimates environmental lighting conditions so that the avatar is rendered with matching shading, shadows, and reflections, which helps keep the experience immersive.
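One common use of a depth estimate is occlusion: a virtual pixel is drawn only where it is closer to the camera than the real surface behind it. The sketch below assumes this per-pixel depth test, which may differ from the project's actual blending method. The function name composite and the use of None for empty virtual pixels are hypothetical conventions.

```python
def composite(camera, virtual, real_depth, virtual_depth):
    """Blend a virtual layer over a camera image using a per-pixel depth test.

    All four arguments are equally sized 2D lists. A virtual pixel of None
    means nothing virtual is drawn at that location.
    """
    out = []
    for cam_row, virt_row, rd_row, vd_row in zip(
            camera, virtual, real_depth, virtual_depth):
        out_row = []
        for cam_px, virt_px, rd, vd in zip(cam_row, virt_row, rd_row, vd_row):
            # Show the virtual pixel only if it exists and is nearer
            # than the real-world surface at this pixel.
            visible = virt_px is not None and vd < rd
            out_row.append(virt_px if visible else cam_px)
        out.append(out_row)
    return out
```

With this test, an avatar standing 1.5 m away is hidden wherever a real object sits at 1.0 m and remains visible in front of a wall at 2.0 m.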

Cloud Infrastructure

The system uses Cloud Anchors to persist AR experiences at real-world locations. This makes shared experiences possible, in which multiple users interact with the same virtual objects in physical space, and it allows dynamic virtual content to be placed once and revisited over time.

Reinforcement learning by Profit0101 from Noun Project (CC BY 3.0)