
How to make real-time multiplayer WebXR experiences (part 1)

Using WebSockets, React Three Fiber and DynamoDB to allow multiple users to interact with each other simultaneously.


Intro

Real-time multiplayer WebXR experiences are a really exciting prospect for the future of the internet, and we are beginning to see how they could be used as the technology develops (e.g. the term ‘Metaverse’).

Having recently posted about how to create WebXR experiences that work across devices, I wanted to go a step further and explain the practical steps involved in connecting the users of those devices to each other.

I’ve made a gif of my partner (wearing an Oculus Quest 2), an example user (on a MacBook Pro) and myself (wearing a HoloLens 2) interacting with a 3D model in the shared XR space; see below.

WebXR Gif

As a side note, I’m writing a library called Wrapper.js that will hopefully make it easier to create collaborative WebXR experiences like the one above.

Overview

There is a lot to understand about how this web app is architected and the technologies used to develop it; this section explains that information.

To make the best use of this information, it’s important to understand three key points:

The Concept

This example works by creating an application that can render on different devices, creating a unique identifier for each device (e.g. through user login), storing that device’s positional data in a database in real time, and then rendering all users’ data in WebGL.

The below image shows how this concept works across a Mixed Reality headset (HoloLens 2), a Virtual Reality headset (Oculus Quest 2) and a laptop (MacBook Pro).

A topline description of how this works is:

  1. A user visits the website on their device and logs in using their credentials; the site renders as appropriate for their device (thanks to the WebXR API)
  2. As they move around in the environment, their movement co-ordinates are sent in real time to a database and stored against their username (a sketch of this payload follows the list below)
  3. They can see other users who are moving around the environment, because each user’s movement data is rendered into the 3D scene
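
To make the flow concrete, below is a minimal sketch of the kind of positional payload each client could send over the WebSocket connection. The field names (action, username, position, rotation) are illustrative assumptions rather than a fixed schema from Wrapper.js.

```javascript
// A hypothetical position update, sent whenever the user moves
const positionUpdate = {
  action: 'updatePosition', // route key the WebSocket API dispatches on (assumed name)
  username: 'hololens-user', // unique identifier obtained from the user's login
  position: { x: 0.4, y: 1.6, z: -2.1 }, // metres, in the shared scene's co-ordinate space
  rotation: { x: 0, y: 0.5, z: 0, w: 0.87 }, // quaternion describing head orientation
};
```

Each client sends its own payload as it moves and receives everyone else’s latest payloads back, which it then renders as avatars in the 3D scene.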

The Tech Stack

As this was originally written for Wrapper.js, it adheres to the use of Terraform, Serverless Framework and Next.js.

The below diagram details this further.

Below is an explanation of each of these three categories:

Terraform

  • Amazon API Gateway (both HTTP and WebSocket implementations), to which the Lambda functions from Serverless Framework are deployed
  • DynamoDB, used to store data that is submitted and retrieved from the Front End via HTTP and WebSocket requests to the API Gateway
  • Amazon S3, for deploying the front end static files that are exported by Next.js
  • CloudFront, the Content Delivery Network (CDN) that allows your content to be accessed at the edge and lets you serve it from a custom domain name
  • Route53, the DNS records that point your domain name to your CloudFront CDN and API Gateway
  • AWS Certificate Manager, for generating SSL certificates for your domain name
  • AWS Cognito, a service for managing user login information, used as a federated identity provider to authorise Front End entry to the web app and Back End access to data

Serverless Framework

  • Lambda functions, which are deployed to the API Gateways (created in Terraform) and execute backend logic written in Node.js (a minimal sketch of one of these handlers follows below)
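
As a rough illustration, here is a minimal sketch (using Node.js and the aws-sdk v2 DocumentClient) of what a Lambda handler behind the WebSocket API’s position route might look like. The table name, environment variable and field names are assumptions for illustration, not the exact Wrapper.js implementation.

```javascript
// Minimal sketch: save an incoming position update to DynamoDB
const AWS = require('aws-sdk');
const ddb = new AWS.DynamoDB.DocumentClient();

exports.handler = async (event) => {
  // The payload sent by the Front End over the WebSocket connection
  const { username, position, rotation } = JSON.parse(event.body);

  // Store the latest co-ordinates against the user's unique identifier
  await ddb
    .put({
      TableName: process.env.USERS_TABLE, // assumed table created via Terraform
      Item: { username, position, rotation, updatedAt: Date.now() },
    })
    .promise();

  return { statusCode: 200, body: 'Position saved' };
};
```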

Next.js

  • AWS Amplify, a library used in the front end which provides out-of-the-box components and helper functions that make user logins simple and secure
  • Zustand, a library that helps you implement easy state management across your application globally
  • React-Three-XR, a library that allows your React-Three-Fiber application to leverage the WebXR API
  • React-Three-Fiber, a library that makes integration of Three.js into React components simple, quick and efficient (a small sketch combining it with Zustand follows this list)
  • Three.js, a library that allows you to create 3D content in the browser through the use of WebGL
  • React.js, a library that allows you to componentise your JavaScript
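
To show how a couple of these pieces fit together, here is a minimal sketch of a Zustand store that holds every user’s latest position and a React-Three-Fiber component that renders those positions as simple meshes. The store shape and component names are assumptions for illustration (and assume zustand v4’s named create export), not Wrapper.js internals.

```jsx
import { create } from 'zustand';
import { Canvas } from '@react-three/fiber';

// Global store: a map of username -> { x, y, z }, updated from the WebSocket
export const useUsersStore = create((set) => ({
  users: {},
  setUser: (username, position) =>
    set((state) => ({ users: { ...state.users, [username]: position } })),
}));

// Render every known user as a simple sphere at their last reported position
function OtherUsers() {
  const users = useUsersStore((state) => state.users);
  return Object.entries(users).map(([username, { x, y, z }]) => (
    <mesh key={username} position={[x, y, z]}>
      <sphereGeometry args={[0.15]} />
      <meshStandardMaterial />
    </mesh>
  ));
}

export default function Scene() {
  return (
    <Canvas>
      <ambientLight />
      <OtherUsers />
    </Canvas>
  );
}
```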

The App Flow

If you were to map out when these technologies get used within this example, it would look something like the below diagram.

If you start at the bottom of the diagram, you can see that all the Front End files are exported statically and hosted on the S3 bucket.

These files are then distributed with the CloudFront CDN, assigned an SSL certificate by AWS Certificate Manager and given a custom domain name with Amazon Route 53.

At this point, Step 1 begins:

  • The user opens the website on their device (e.g. the HoloLens 2) by visiting the domain name
  • At this point React-Three-XR renders the website based on the device’s capability (in this case a mixed reality headset), as sketched below
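
As a rough sketch of what that looks like in code (assuming the v5-style @react-three/xr API, where the scene is wrapped in an XR provider), the same page can offer an immersive session on a headset and fall back to a flat canvas elsewhere:

```jsx
import { Canvas } from '@react-three/fiber';
import { XR, Controllers, VRButton } from '@react-three/xr';

export default function App() {
  return (
    <>
      {/* Only offers "Enter VR" when the device reports an immersive session;
          ARButton could be used instead for AR-capable devices */}
      <VRButton />
      <Canvas>
        <XR>
          <Controllers />
          <ambientLight />
          {/* Placeholder content standing in for the shared 3D model */}
          <mesh position={[0, 1.5, -1]}>
            <boxGeometry args={[0.3, 0.3, 0.3]} />
            <meshStandardMaterial />
          </mesh>
        </XR>
      </Canvas>
    </>
  );
}
```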


Next up is Step 2:

  • The user logs into their AWS Cognito account, with the use of the AWS Amplify library on the Front End
  • Having logged in, the website now has a unique identifier for the person using that device (a minimal sketch of this login step follows)
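
A minimal sketch of that login step, assuming the pre-v6 aws-amplify Auth module and placeholder Cognito configuration values, could look like this:

```javascript
import { Amplify, Auth } from 'aws-amplify';

// Placeholder configuration: in practice these values come from the Cognito
// resources that Terraform creates
Amplify.configure({
  Auth: {
    region: 'eu-west-2',
    userPoolId: 'YOUR_USER_POOL_ID',
    userPoolWebClientId: 'YOUR_CLIENT_ID',
  },
});

export async function logIn(username, password) {
  await Auth.signIn(username, password);

  // The Cognito username becomes the unique identifier that positional data
  // is stored against
  const user = await Auth.currentAuthenticatedUser();
  return user.username;
}
```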


Once the user has successfully logged in and uniquely identified themselves, this sets them up nicely for Step 3:

  • There are two kinds of API that the user then interacts with – HTTP and WebSocket.
  • The HTTP API is called once to get that user’s image, which can then be rendered in WebGL
    • Once the API is called, a lambda is triggered that uses the user’s login details to query the database for information
    • The Amazon DynamoDB database stores information related to that user’s login credentials, such as a profile image URL.
  • The WebSocket API is called whenever the Front End needs to update the database and whenever there is an update to the database
    • As a user moves in the 3D world, their position co-ordinates are submitted to the WebSocket API in real time
    • Once the API is triggered, a lambda function takes the submitted data and saves it to the DynamoDB database
    • Once the data is saved to the database, it is all returned to the Front End, where the 3D renderer visualises every object with positional data (a sketch of this client-side loop follows the list below)
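
To tie the loop together, here is a minimal sketch of the Front End side: a browser WebSocket that sends this user’s co-ordinates and feeds everyone’s returned positions into the store from the earlier sketch. The endpoint URL, action name, message shape and the './store' module path are illustrative assumptions.

```javascript
import { useUsersStore } from './store'; // hypothetical module exporting the store sketched earlier

const socket = new WebSocket('wss://your-websocket-api.example.com'); // assumed API Gateway endpoint

// Send this user's latest co-ordinates whenever they move
export function sendPosition(username, position, rotation) {
  if (socket.readyState === WebSocket.OPEN) {
    socket.send(
      JSON.stringify({ action: 'updatePosition', username, position, rotation })
    );
  }
}

// Whenever the database changes, every connected client receives the latest
// set of user positions and hands them to the renderer via the global store
socket.addEventListener('message', (event) => {
  const users = JSON.parse(event.data); // e.g. [{ username, position, rotation }, ...]
  users.forEach(({ username, position }) => {
    useUsersStore.getState().setUser(username, position);
  });
});
```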

Conclusion

I’ve tried my best to simplify the concepts in this post, to make the processes that enable real-time multiplayer WebXR experiences easier to digest.

In the next post, I will detail the practical side of the actual code used to render multiple users’ positions in real time.

Meanwhile, I hope you enjoyed this post and have fun 😀
