The JPEG file format played a crucial role in the web’s transition from a textual world to a visual experience through an open, efficient container for sharing images. Now, the Graphics Language Transmission Format (glTF) promises the same for 3D objects in the metaverse and digital twins.
JPEG used various compression tricks to dramatically downsize images compared to other formats like GIF. The latest version of glTF similarly uses techniques to compress both the geometry of 3D objects and their textures. glTF already plays a central role in e-commerce, as evidenced by Adobe’s foray into the metaverse.
VentureBeat spoke to Neil Trevett, president of the Khronos Group, which governs the glTF standard, to learn more about what glTF means for businesses. He is also the VP of developer ecosystems at Nvidia, where his mission is to make GPUs easier for developers to use. He explains how glTF complements other digital twin and metaverse formats like USD, how it is used and where it is headed.
VentureBeat: What is glTF and how does it fit into the ecosystem of file formats related to metaverse and digital twins?
Neil Trevett: At Khronos we put a lot of effort into 3D APIs like OpenGL, WebGL and Vulkan. We’ve found that any application that uses 3D will eventually need to import assets. The glTF file format is widely used and very complementary to USD, which is becoming the standard for creating and authoring on platforms like Omniverse. USD is the place to go if you want to merge multiple tools into sophisticated pipelines and create very high quality content, including movies. Because of this, Nvidia is investing heavily in USD for the Omniverse ecosystem.
On the other hand, glTF focuses on being efficient and easy to use as a delivery format. It is a lightweight, streamlined and easy to work with format that can be used by any platform or device up to and including web browsers on mobile phones. The slogan we use as an analogy is “glTF is the JPEG of 3D”.
It also complements the file formats used in authoring tools. For example, Adobe Photoshop uses PSD files to edit images. No professional photographer would edit JPEGs, because a lot of information has been lost. PSD files are more sophisticated than JPEGs and support multiple layers. But you wouldn’t send a PSD file to my mother’s cell phone. You need JPEG to get it to a billion devices as efficiently and quickly as possible. USD and glTF complement each other in a similar way.
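The "JPEG of 3D" analogy holds at the file level too: a glTF asset is deliberately simple, a JSON manifest that points at GPU-ready binary buffers of geometry and texture data. A minimal sketch, built with Python's standard library (the buffer URI and byte counts are illustrative placeholders, not from the interview):

```python
import json

# A minimal glTF 2.0 manifest: a JSON scene description that references
# binary data laid out the way the GPU wants it. A real asset would ship
# an actual .bin file (or embed the bytes as a data: URI).
gltf = {
    "asset": {"version": "2.0"},
    "scenes": [{"nodes": [0]}],
    "nodes": [{"mesh": 0}],
    "meshes": [{"primitives": [{"attributes": {"POSITION": 0}}]}],
    "accessors": [{
        "bufferView": 0,
        "componentType": 5126,   # FLOAT
        "count": 3,              # three vertices -> one triangle
        "type": "VEC3",
        "min": [0.0, 0.0, 0.0],
        "max": [1.0, 1.0, 0.0],
    }],
    "bufferViews": [{"buffer": 0, "byteOffset": 0, "byteLength": 36}],
    "buffers": [{"uri": "triangle.bin", "byteLength": 36}],  # 3 * 3 floats
}

print(json.dumps(gltf, indent=2))
```

Everything a viewer needs to draw the asset is in this one manifest plus the binary payload it references, which is what makes the format so easy to deliver.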
VentureBeat: How do you go from one to the other?
Trevett: It is important to have a seamless distillation process from USD assets to glTF assets. Nvidia is investing in a glTF connector for Omniverse so we can seamlessly import and export glTF assets to and from Omniverse. The glTF working group at Khronos is pleased that USD is meeting the industry’s need for an authoring format, which is a tremendous amount of work. The goal is for glTF to be the perfect distillation target for USD to support ubiquitous use.
An authoring format and a delivery format have very different design requirements. USD’s design is all about flexibility. This helps in composing things to create a movie or VR environment. If you want to insert another asset and blend it into the existing scene, you must keep all of the design information. And you want everything at ground-truth levels of resolution and quality.
The design of a transmission format is different. For example, with glTF, the vertex information is not very flexible for re-authoring. But it is stored in exactly the form the GPU needs to execute that geometry as efficiently as possible via a 3D API like WebGL or Vulkan. glTF also puts a lot of design effort into compression to speed up download times. For example, Google has contributed its Draco 3D mesh compression technology and Binomial has contributed its Basis Universal texture compression technology. We’re also starting to put a lot of effort into managing level of detail (LOD) so you can download models very efficiently.
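The compression schemes Trevett mentions surface in a glTF file as ratified Khronos extensions: `KHR_draco_mesh_compression` for geometry and `KHR_texture_basisu` for textures. A sketch of how a manifest declares them (buffer-view and image indices are illustrative placeholders):

```python
import json

# Sketch of a glTF asset opting into compressed delivery.
manifest = {
    "asset": {"version": "2.0"},
    "extensionsRequired": ["KHR_draco_mesh_compression"],
    "extensionsUsed": ["KHR_draco_mesh_compression", "KHR_texture_basisu"],
    "meshes": [{
        "primitives": [{
            "attributes": {"POSITION": 0, "NORMAL": 1},
            "extensions": {
                # The geometry lives in a Draco-compressed buffer view
                # instead of raw vertex arrays.
                "KHR_draco_mesh_compression": {
                    "bufferView": 0,
                    "attributes": {"POSITION": 0, "NORMAL": 1},
                }
            },
        }]
    }],
    "textures": [{
        "source": 0,  # fallback PNG/JPEG for viewers without the extension
        "extensions": {
            # Preferred source: a Basis Universal texture in a KTX2 container.
            "KHR_texture_basisu": {"source": 1}
        },
    }],
}

print(json.dumps(manifest, indent=2))
```

Listing the Draco extension under `extensionsRequired` means a loader that can't decode it must reject the file; the Basis texture, by contrast, keeps a fallback image for older viewers.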
Distillation helps to switch from one file format to another. A big part of this is removing design and authoring information that you no longer need. But you don’t want to reduce the visual quality unless you really have to. With glTF you can maintain visual fidelity, but you also have the option to compress things if you’re aiming for low-bandwidth delivery.
VentureBeat: How much smaller can you make it without sacrificing too much fidelity?
Trevett: It’s like JPEG, where you have a dial to increase compression with an acceptable loss in image quality; glTF has the same dial for both geometry and texture compression. With a geometry-intensive CAD model, the geometry makes up the bulk of the data. With a more consumer-oriented model, the texture data can be much larger than the geometry.

With Draco, geometry data can be reduced by a factor of 5 to 10 without significant loss of quality. There is a similar trade-off for textures.
Another factor is the required storage space, which is a valuable resource on mobile phones. Before we implemented Basis Universal compression in glTF, people sent JPEGs, which is great because they’re relatively small. But unpacking them into full-size textures can take hundreds of megabytes for even a simple model, which can affect the performance of a mobile phone. glTF’s compressed textures let you take a JPEG-sized, super-compressed texture and directly unpack it into a GPU-native texture, so it never grows to full size. This reduces both data transfer and storage requirements by a factor of 5 to 10. This can be useful when downloading assets into a browser on a mobile phone.
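The storage argument is easy to quantify. A back-of-the-envelope sketch for a single 2048×2048 texture, assuming an ETC1S-class GPU target at 8 bytes per 4×4 block (the exact ratio depends on the format Basis transcodes to):

```python
# Rough memory math for one 2048x2048 texture.
width = height = 2048

# A JPEG decoded to uncompressed RGBA occupies 4 bytes per pixel in GPU memory.
uncompressed_bytes = width * height * 4                      # 16 MiB

# An ETC1S-class block-compressed format stores 8 bytes per 4x4 block,
# i.e. half a byte per pixel. Basis Universal transcodes straight into
# formats like this, so the texture never expands to full size.
block_compressed_bytes = (width // 4) * (height // 4) * 8    # 2 MiB

ratio = uncompressed_bytes / block_compressed_bytes
print(f"uncompressed: {uncompressed_bytes / 2**20:.0f} MiB, "
      f"compressed: {block_compressed_bytes / 2**20:.0f} MiB, "
      f"ratio: {ratio:.0f}x")
```

For a model with a handful of such maps, that difference is what separates "runs in a phone browser" from "doesn't".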
VentureBeat: How do you render the textures of 3D objects efficiently?
Trevett: Well, there are two basic classes of textures. The most common is an image-based texture, such as mapping a logo image onto a t-shirt. The other is a procedural texture, where you create a pattern like marble, wood or stone simply by running an algorithm.
There are several algorithms you can use. For example, Allegorithmic, recently acquired by Adobe, pioneered an interesting technique for generating textures that is now used in Adobe Substance Designer. These procedural textures are often baked into images because images are easier to process on client devices.
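A procedural texture, at its simplest, is just a function of pixel coordinates. A toy sketch (a checkerboard, not any algorithm from Substance Designer) that evaluates the procedure once into a plain image, the "baking" step described above:

```python
def checker(width, height, cell=8):
    """Generate a grayscale checkerboard as rows of 0/255 pixel values.

    The pattern exists only as this algorithm until it is evaluated
    ("baked") into concrete pixels.
    """
    return [
        [255 if ((x // cell) + (y // cell)) % 2 == 0 else 0
         for x in range(width)]
        for y in range(height)
    ]

# Baking: run the procedure once; the resulting image can be compressed
# and shipped to any client, which no longer needs the algorithm.
texture = checker(16, 16, cell=4)
print(len(texture), len(texture[0]))  # 16 16
```

Real procedural systems are vastly richer (noise functions, layered graphs), but the bake-to-image step works the same way.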
Once you have a texture, you can do more with it than just slap it onto the model like a piece of wrapping paper. You can use texture images to give the material a more sophisticated look. For example, physically based rendering (PBR) materials try to emulate the properties of real materials as closely as possible. Is it metallic, which makes it look shiny? Is it transparent? Does it refract light? Some of the more sophisticated PBR algorithms can use up to five or six different texture maps that feed in parameters characterizing how glossy or translucent it is.
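In glTF 2.0, those maps appear as slots on a material. A sketch of the core metallic-roughness material with the five common maps (the name, factor values and texture indices are illustrative):

```python
import json

# A glTF 2.0 PBR material: up to five texture maps feed the shading model.
material = {
    "name": "brushed_metal",
    "pbrMetallicRoughness": {
        "baseColorTexture": {"index": 0},          # surface color
        # Per the spec, metalness is read from the blue channel and
        # roughness from the green channel of one combined map.
        "metallicRoughnessTexture": {"index": 1},
        "metallicFactor": 1.0,
        "roughnessFactor": 0.4,
    },
    "normalTexture": {"index": 2},     # fine surface detail
    "occlusionTexture": {"index": 3},  # baked ambient occlusion
    "emissiveTexture": {"index": 4},   # self-illumination
    "emissiveFactor": [0.0, 0.0, 0.0],
}

print(json.dumps(material, indent=2))
```

Transparency and refraction effects of the kind Trevett mentions are handled by additional material extensions layered on top of this core model.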
VentureBeat: How has glTF evolved on the scene graph side to represent relationships within objects, such as how a car’s wheels turn, or to connect several things?
Trevett: This is an area where USD is way ahead of glTF. Most glTF use cases have so far been satisfied by a single asset in a single asset file. 3D commerce is a leading use case, where you want to pull up a chair, IKEA-style, and place it in your living room. That is a single glTF asset, and many of the use cases were happy with that. As we move toward the metaverse and VR and AR, people want to create multi-asset scenes for delivery. An active area of discussion in the working group is how best to implement and link multi-glTF scenes and assets. It won’t be as sophisticated as USD, as the focus is on transmission and delivery rather than creation. But glTF will have something in the next 12 to 18 months to allow multi-asset composition and linking.
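Within a single asset, glTF already has a node hierarchy that covers cases like articulated wheels; it is the cross-asset linking that is still being worked out. A sketch of a parent node with child wheels (names, positions and the 45° spin are illustrative; glTF stores node rotations as `[x, y, z, w]` unit quaternions):

```python
import json
import math

# A wheel rotated 45 degrees about the X axis, as a unit quaternion.
half = math.radians(45) / 2
spin = [math.sin(half), 0.0, 0.0, math.cos(half)]

# Child nodes inherit their parent's transform, so moving "car_body"
# carries the wheels along; animating each wheel's rotation turns it.
nodes = [
    {"name": "car_body", "mesh": 0, "children": [1, 2]},
    {"name": "wheel_front_left", "mesh": 1,
     "translation": [-0.8, 0.3, 1.2], "rotation": spin},
    {"name": "wheel_front_right", "mesh": 1,
     "translation": [0.8, 0.3, 1.2], "rotation": spin},
]
scene = {"asset": {"version": "2.0"}, "scenes": [{"nodes": [0]}], "nodes": nodes}

print(json.dumps(scene, indent=2))
```

Note that both wheels reference the same mesh (index 1), so the geometry is stored once and instanced through the node graph.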
VentureBeat: How will glTF evolve to support more metaverse and digital twin use cases?
Trevett: We need to start bringing in things that go beyond mere appearance. Today we have geometry, textures and animations in glTF 2.0. The current glTF says nothing about physical properties, sound or interactions. I think many of the next-generation extensions to glTF will include these kinds of behaviors and properties.
The industry is now settling on USD and glTF as the formats of the future. There are older formats like OBJ, but they are starting to show their age. There are popular formats like FBX that are proprietary. USD is an open-source project and glTF is an open standard. People can participate in both ecosystems and help evolve them to meet their customer and market needs. I think both formats will develop in parallel. Now the goal is to keep them aligned and maintain that efficient distillation process between the two.