Masters of the Metaverse


“Masters of the Metaverse” may seem like the title of a new sci-fi thriller, but that’s actually what Learning and Development (L&D) professionals may need to become in the not-so-distant future.

“The metaverse is still very much in the early stages of its development, but I believe immersive and interactive learning experiences are going to quickly become the gold standard in the L&D space,” says Dr. Alex Young, founder of Virti. “I predict that immersive technologies and metaverse-hosted L&D programs are going to become the go-to solution for organizations looking to get ahead of the competition and attract top talent.”

With that in mind, here’s a look at some evolving and emerging technologies—from artificial intelligence, photorealistic avatars, and computer-generated imagery to voice technology and customized virtual spaces—that will have a significant impact on the way people learn and trainers train in the coming years.


What exactly does the “metaverse” mean? “A simple way to think of it is as a richer and more immersive version of everything the Web and mobile currently have to offer,” explains Justin Parry, co-founder and COO of Immerse. “It can open up a new realm of experience, engagement, and expression, which, in turn, will lead to new forms of social interaction and collaboration, new ways of defining ourselves, and ultimately a completely different perspective on day-to-day life as we know it.”

Dr. Young says the move to metaverse-hosted L&D programs is being driven by new research into the ways in which humans learn and retain new skills, “the conclusions of which are delivering proof that the most effective way of learning is to be immersed in a tailor-made environment where practice opportunities are infinite and on-demand.” He says studies have shown that when immersive simulations are used consistently for training new hires, learners experience a 61 percent higher retention of information during the first 30 days after initial onboarding and achieve a 27 percent faster time to productivity. “As the technology underpinning the metaverse improves, it’s getting easier for large and small organizations to build their own virtual hubs where training content can be hosted and where learners can interact with one another regardless of physical location,” Dr. Young says. “Because of this, it will become the norm for employees to enter a virtual space to, for example, practice their communication skills with an artificial intelligence-powered avatar, to participate in immersive unconscious bias training, or brush up on their delivery of complex product demos.”

Adds D-ID co-founder and CEO Gil Perry, “Being able to put your photorealistic avatar in all sorts of virtual situations allows for training scenarios to be played out at zero risk if they go wrong. We’ve seen this done so effectively with flight simulators for pilot training, but that could be extended to any real-world situation employees have to deal with.”


Virtual reality (VR) and immersive technology in general offer a whole new level of performance data for soft skills and training in general, capturing user behavior in ways not previously possible, according to Parry, whose company created the Immerse Platform and Immerse Marketplace. “We can track and capture data available within the current learning technology paradigm, such as semantic data, speech patterns, and decision-making. The difference is the heightened level of immersion VR provides, increasing the realism of responses and emotional engagement, plus the ability to track body language and movement through space. In the near future, mainstream headsets also will allow eye tracking, facial expression, and biometrics, which will only enhance this ability to elicit and measure realistic responses.”

Used by companies such as DHL, Shell, QinetiQ, and GE Healthcare, the Immerse Platform is a content aggregation, distribution, and reporting solution that allows companies to create, implement, scale, and measure virtual reality training solutions. Because it is an open platform with a free software development kit (SDK), content can come from any source.

The newly launched Immerse Marketplace offers companies an ever-growing library of 100-plus VR training applications. Topics are applicable to a wide range of industries and target training in people skills, technical skills, and health and safety.

Parry says Immerse’s newest product is what the company calls “VR-in-a-box.” It provides five headsets, five pieces of Immerse Marketplace content, access to the Immerse Platform for a year, a pre-installed Mobile Device Management (MDM) solution, and shipping.


Frameable provides a fully customizable virtual space platform that supports many virtual event types, including virtual training. “We offer three different room layouts, with our Stage and Hybrid configurations most frequently used for virtual training,” says Frameable founder and CEO Adam Riggs. “Our Stage layout allows hosts to showcase presentations, pre-recorded videos, and panels. The audience can participate by raising their hands to ask a question, joining on stage (with permission), messaging one another or hosts in the text chat, or reacting with emojis that appear on screen for all participants.”

Riggs says the Hybrid layout combines Stages and Tables. In this scenario, attendees can view what is happening on the Stage while discussing or collaborating with attendees privately at their table.

A customer in the manufacturing industry used Frameable to design virtual training events for its incoming summer interns. The company hosted two one-hour training sessions on the platform, with the main goals of reviewing the hiring and onboarding process and sharing additional training information the interns might need.

Welcoming and closing sessions were held on a stage. The customer used Hybrid layouts to separate trade marketing and non-trade marketing interns, with tables based on location to provide more specific training information. “This layout allowed interns to get to know other program participants from their geographic location and enabled small group conversations,” Riggs says.


The move to remote work and eLearning has put a tremendous strain on employee engagement, notes D-ID’s Perry. “Despite the need for a more relatable and welcoming workplace, the ‘human touch’ has been lost. Not only is video proven to engage more than the written word—especially among the younger generations of employees who are more accustomed to rich media—but moving human faces are the most compelling of all, increasing attention, engagement, and retention of knowledge.”

With D-ID’s new virtual instructor platform, videos can be created in minutes, using any front-facing image the company is licensed to use. This could be pictures of a company CEO, team members, or stock images found online. Once the script or audio is added, at the click of a button, D-ID’s AI video generator combines the images with the audio or text, creating photorealistic avatars with expression and speech.

If using text, the voice is generated using Microsoft Azure text-to-speech, which has 270 neural voices across 119 languages and variants, according to Perry. “These customizable videos allow companies with a distributed workforce to offer a diverse range of presenters in terms of age, ethnicity, gender, and language,” he says.

Japanese eLearning platform Skill Plus currently is utilizing the technology. According to Skill Plus, prior to using D-ID’s technology, a 30-minute eLearning course would require an allocation of four hours for travel, preparation, filming, and cleaning, as well as the instructor, a filming crew, and a director. Using D-ID’s technology, Perry says, Skill Plus developed Meta Speaker, a system that generates digital humans from photographs and audio. Now a single staff member can create a digital human learning instructor in about 30 minutes.

Perry says Skill Plus recently developed a service aimed specifically at owners of beauty salons, who face particular challenges keeping their employees up to date with regulations and best practices.


Scott Stachiw, director of Immersive Learning with Roundtable Learning Company, says he’s seeing more organizations invest in doing soft skills training using computer-generated imagery (CGI). “In the past, it’s always been soft skills training with 360° video, but with CGI, it’s much easier to swap out characters, languages, and other variables for organizations so they can customize their training to the demographics they are trying to reach,” he explains.

Roundtable Learning Company recently developed a training program with a retail company to help train its delivery personnel. “Their primary goal when they reached out to us was to replace their entire in-home delivery process with a CGI town—including different houses, the delivery vans, the retail building itself—to make the VR/AR experience as realistic as possible,” Stachiw says. “Then we created different scenarios that reflect real-life situations trainees might encounter in their day-to-day operations.”

After the success of the pilot program, the retailer decided to expand it; thousands of employees have participated so far.


Another trend Stachiw points to is the use of voice recording during training modules. “The technology is now advanced enough that it allows the trainee to speak directly to the characters in the simulation and record their voice,” he notes. “Then, at the end of the branched learning module, the trainee can watch the interaction and hear their own voice, which provides them the ability to evaluate how their interaction could be improved.”

Voice thrives in situations where hands and eyes are tied up, notes RAIN Technology CEO Nithya Thadani. “Humans can speak three times faster than they can type or text. For workers, voice is a natural solution for functions such as quick data entry or retrieval.”

That’s why RAIN Technology recently incorporated voice AI into its training technology for auto technicians. “With our Ortho technology, which launched this month to eight pilot shops, technicians can ask for a specification specific to their vehicle and get an answer in seconds, starting with the most common specs looked up across vehicle makes and models,” explains Thadani. “We are using voice AI to map what a technician says to extract their intent, so we can rapidly deliver the right answer to a repair question—and do this seamlessly through voice via a flexibly mounted tablet so they don’t have to interrupt their workflow.”
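The core pattern Thadani describes—mapping what a technician says to an intent, then answering from a spec database—can be illustrated with a toy sketch. This is not RAIN’s actual system (production voice AI relies on trained language models); it is a minimal, hypothetical keyword-rule version of intent extraction:

```python
# Toy illustration of voice-intent mapping (NOT RAIN's actual system):
# a transcribed utterance is matched against keyword rules to pick an
# intent, which would then drive a lookup in a spec database.
import re

# Hypothetical intents keyed by keyword patterns.
INTENT_PATTERNS = {
    "lookup_spec": re.compile(r"\b(torque|spec|pressure|capacity|gap)\b"),
    "lookup_procedure": re.compile(r"\b(procedure|steps|how do i)\b"),
}

def extract_intent(utterance: str) -> str:
    """Return the first intent whose pattern matches the utterance."""
    text = utterance.lower()
    for intent, pattern in INTENT_PATTERNS.items():
        if pattern.search(text):
            return intent
    return "unknown"

print(extract_intent("What's the torque spec for the rear axle nut?"))  # lookup_spec
print(extract_intent("How do I bleed the brakes?"))                     # lookup_procedure
```

A real system would replace the keyword rules with a trained intent classifier and add slot extraction (vehicle make, model, year), but the request-to-intent-to-answer flow is the same.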

Content is included in the product, and is sourced from industry databases that aggregate OEM (original equipment manufacturer) data. “In the future, we’re looking at integrating shop-specific data, and even data that is crowd-sourced from other technicians on clever fixes,” Thadani says.


“We see the evolution of VR/AR PC characters, combined with some sort of artificial intelligence, to allow soft skills training to be more natural,” Stachiw says. “This will allow trainees to experience more freedom in how they interact with the simulation and give them more natural responses in their real-life work.”

Stachiw also believes mixed reality (MR) training “will really take off in the next couple of years. The necessary technology is still in the development phase, but there is some hardware out there that will allow MR training to come to the fore. MR is going to open up an entirely new avenue for tactile interactions and change the way training is done in the future.”


The Vanderbilt University School of Nursing has had limited opportunities to train its growing enrollment toward ultrasound technology certification due to medical equipment costs and a strong focus on patient safety. Ultrasound devices, while continuing to fall in price, remain expensive, and taking existing scanners out of patient production to support the training mission is not an option.

Replacing much of the didactic training with virtual environment experiences allows students to view content at their leisure, orient themselves on the anatomical features being scanned, and view the ultrasound output in full field of view in real time, coinciding with the clinician’s interaction with the patient.

The Solution

Vanderbilt University School of Nursing purchased ELB Learning’s CenarioVR, a virtual reality course authoring application. Three different media components were combined to simulate the clinical environment: 360° video of the clinician and patient interactions shown from two distinct camera angles; a top-down view with graphic, anatomical overlays; and streaming video of the ultrasound output timed to coincide with exactly how the instructor is interacting with the patient.

Each video scenario was enhanced with additional instructional elements on the top-down view of the patient, such as drawings of anatomical elements that appear when selected. Questions, annotations, buttons, and hotspots were included to enforce key concepts and quiz the students on probe placement. Once each scenario was finalized with interactions and assessments, the course was published in two formats: HTML5 for viewing on a computer’s browser, as well as CenarioVR Live and mobile for viewing on a phone in a Google Cardboard or Oculus Go headset.


  • By preloading much of the instruction into an immersive format, the need for face-to-face instruction and the time spent physically touching the equipment were substantially reduced.
  • Vanderbilt University avoided the need to purchase five additional ultrasound machines valued at $425,000.
  • Since rolling out the immersive module in spring 2019, learners have reported a more efficient training experience with the ultrasound probes.


A multinational commodity trading company asked Immerse to help it improve safety procedures for gas station operators and truck drivers while promoting higher business performance.

The solution to this challenge was a VR training application that teaches service station operators and truck drivers safe fuel handling practices, delivered via VR headsets and distributed and tracked using the Immerse Platform.

Immerse conducted a pilot study of 76 employees during the last half of 2021 and found:

  • 88 percent of employees improved key behaviors on the job post-training
  • A 69 percent average overall performance improvement
  • A 76 percent average overall accident risk reduction
  • A $1,573 benefit per employee directly attributable to VR training
  • A 77 percent ROI based on benefits vs. costs after just three months
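The last two figures can be cross-checked with a quick back-of-the-envelope calculation, assuming the conventional definition ROI = (benefit − cost) / cost (the article does not spell out the formula or the per-employee cost):

```python
# Sanity check on the pilot figures above, assuming the conventional
# definition ROI = (benefit - cost) / cost. The per-employee cost is
# not stated in the article; this derives the cost implied by the
# $1,573 benefit and 77 percent ROI.
benefit_per_employee = 1573.0   # dollars, from the pilot study
roi = 0.77                      # 77 percent after three months

implied_cost = benefit_per_employee / (1 + roi)
print(round(implied_cost, 2))   # roughly 888.70 dollars per employee
```

In other words, the reported figures are consistent with a VR training cost of just under $900 per employee over the pilot period.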


To maximize the results of any training programs hosted in the metaverse, L&D professionals will need to keep three key pillars in mind:

1. Engagement: Excellent training content is content that makes learners want to keep learning—for today’s generation, that means high-quality, well-produced content that is intuitive to navigate. The learning objectives and the journey should be clear; elements of gamification can be drawn on; and multimedia resources should be integrated.

2. Interactivity: Training should encourage learners to make choices, apply their new knowledge, and take control of their own progression.

3. Personalization: When designing new training content for the metaverse, look to create bespoke immersive content tailored to solve specific organizational challenges and bridge the skills gaps that actually need bridging.

Lorri Freifeld
Lorri Freifeld is the editor/publisher of Training magazine. She writes on a number of topics, including talent management, training technology, and leadership development. She spearheads two awards programs: the Training APEX Awards and Emerging Training Leaders. A writer/editor for the last 30 years, she has held editing positions at a variety of publications and holds a Master’s degree in journalism from New York University.