EXT_mesh_gpu_instancing #1691

ultrafishotoy · 2019-10-23T17:49:44Z

start instancing discussion

ultrafishotoy · 2019-10-23T18:45:55Z

I'm not sure where the new attributes (TRANSFORM4x3 and ID) should get defined.

donmccurdy · 2019-10-23T19:49:12Z

We should also state that this extension's goal is to enable GPU instancing¹. The definition given in the Unity docs is quite good, I think:

Use GPU Instancing to draw (or render) multiple copies of the same Mesh at once, using a small number of draw calls. It is useful for drawing objects such as buildings, trees and grass, or other things that appear repeatedly in a Scene. GPU Instancing only renders identical Meshes with each draw call, but each instance can have different parameters (for example, color or scale) to add variation and reduce the appearance of repetition. GPU Instancing can reduce the number of draw calls used per Scene. This significantly improves the rendering performance of your project.

Of course the bit about "each instance can have different parameters" doesn't fully apply here, as we haven't yet defined (and may not define) ways to override arbitrary material parameters per-instance.

But otherwise I think those are the rough goals for this extension. Do others agree, or have I missed anything important? I suppose transmission size could also be mentioned.

¹ I just want to disambiguate "GPU instancing" from "instancing" because the latter could also mean simple reuse of transmitted data (accessor, texture, mesh, material, etc.) throughout a glTF asset. glTF already allows this, and the feature is widely used. It does not, however, directly enable efficient reduction of draw calls as this extension might.

donmccurdy · 2019-10-23T19:52:12Z

And... would you be able to share some example use cases? We've discussed them at various points, but I just want to add that context to the thread too.

If by any chance it would be easy to generate a sample asset using the proposed extension that would also be very neat to see, but no need for anything yet if that's a nontrivial amount of work.

donmccurdy · 2019-10-23T20:04:15Z

extensions/2.0/Khronos/KHR_instancing/extensions/2.0/Khronos/KHR_instancing/README.md

+{
+    "nodes": [
+        {
+            "mesh": 0,


One option we should consider is enabling the KHR_instancing extension to (optionally) override the node.mesh property. For example:

{
  name: "teapot",
  mesh: 0,
  extensions: {
    KHR_instancing: {
      mesh: 1,
      attributes: { ... }
    }
  }
}

This simply provides exporters with some flexibility to control how/if fallback happens for viewers that don't recognize the extension:

When the mesh override is omitted, clients that don't recognize the extension will render a single instance of the mesh with the node's transform.

When the mesh override is provided, it should reference a single instance of the mesh. The fallback node.mesh value could then point to anything the exporter chooses – a merged mesh of all instances, a single point, or a textured plane saying "404 Not Found".

When the KHR_instancing extension is marked as required, clients that don't recognize the extension are expected to fail fast without attempting to proceed.
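To make the three cases above concrete, here is a minimal sketch of how a loader might pick the mesh to draw. It assumes the draft name `KHR_instancing` and the optional `mesh` override proposed in this comment; `resolve_mesh` and its parameters are hypothetical names for illustration, not part of any existing loader.

```python
from typing import Optional

EXT = "KHR_instancing"  # draft name used in this thread; an assumption, it may change


def resolve_mesh(node: dict, extensions_required: list, client_supports_ext: bool) -> Optional[int]:
    """Return the mesh index a client would draw for this node."""
    ext = node.get("extensions", {}).get(EXT)
    if ext is None:
        # Ordinary node: nothing to resolve.
        return node.get("mesh")
    if not client_supports_ext:
        if EXT in extensions_required:
            # Required but unsupported: fail fast instead of rendering something wrong.
            raise ValueError(f"unsupported required extension: {EXT}")
        # Optional and unsupported: draw whatever fallback the exporter put in node.mesh.
        return node.get("mesh")
    # Supporting client: the optional override wins, otherwise node.mesh is instanced.
    return ext.get("mesh", node.get("mesh"))
```

With the teapot node above, a supporting client would get mesh 1 and a non-supporting client would get the fallback mesh 0, unless the extension is listed as required.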

ultrafishotoy · 2019-10-23T20:05:10Z

I chose this design based on what Octane allows: any given mesh can be referenced by multiple nodes, some of which may be instanced. While it is possible to collapse all references into a single group of instance transforms, I prefer to keep the Octane node structure intact so that it can be leveraged at run-time for things like hide/show, procedural animation, culling, etc.

On Wed, Oct 23, 2019 at 12:55 PM Don McCurdy wrote:

In extensions/2.0/Khronos/KHR_instancing/extensions/2.0/Khronos/KHR_instancing/README.md:

+For example, the following defines some instancing attributes to a mesh node in the graph. Instancing only applies to mesh nodes. Applying to nodes rather than meshes allows the
+same mesh to be attached as normal to nodes while also being instanced.
+
+```json
+{
+    "nodes": [
+        {
+            "mesh": 0,
+            "name": "teapot",
+            "extensions": {
+                "KHR_instancing": {
+                    "attributes": {
+                        "TRANSFORM4x3": 0,
+                        "ID": 1
+                    },
+                }

I suspect this is the right design, but just to mention an alternative, the KHR_instancing object could also be left empty and the attributes put into the mesh's primitive.attributes instead. The per-instance attributes would have different counts, compared to the native mesh attributes, and this would break implementations that don't recognize the extension, if they tried to use those attributes. Recommend keeping as-is.

ultrafishotoy · 2019-10-23T20:18:58Z

I think the optional mesh is a good idea. Another property I think that would be helpful is a bounding box for the entire transform group. It could be useful for culling and picking and calculating it at run-time would be very time consuming.
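As a rough illustration of what an exporter could precompute for such a bounding box, here is a sketch that transforms the eight corners of the mesh's own bounding box by every instance transform and takes the min/max. The storage layout (three rows of four values per transform) and the function names are assumptions made for this example; the draft does not define them.

```python
from itertools import product


def transform_point(m, p):
    """Apply a 4x3 affine transform, given as 3 rows of [a, b, c, t], to a point."""
    x, y, z = p
    return tuple(r[0] * x + r[1] * y + r[2] * z + r[3] for r in m)


def instance_group_bounds(mesh_min, mesh_max, transforms):
    """Axis-aligned box enclosing the mesh's bounding box under every instance transform."""
    # zip pairs up (min, max) per axis; product yields the 8 box corners.
    corners = list(product(*zip(mesh_min, mesh_max)))
    points = [transform_point(m, c) for m in transforms for c in corners]
    lo = tuple(min(p[i] for p in points) for i in range(3))
    hi = tuple(max(p[i] for p in points) for i in range(3))
    return lo, hi
```

The cost is eight point transforms per instance, which is cheap to do once at export time but adds up if every client has to repeat it at load time for very large groups.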

ultrafishotoy · 2019-10-23T20:26:08Z

I can definitely provide some sample assets. I've got some teapot scenes specifically for testing. I guess I just need to resolve how to deal with the new attributes.

lexaknyazev · 2019-10-23T21:37:26Z

extensions/2.0/Khronos/KHR_instancing/extensions/2.0/Khronos/KHR_instancing/README.md

+            "extensions": {
+                "KHR_instancing": {
+                    "attributes": {
+                        "TRANSFORM4x3": 0,


The spec should mention in which coordinate space those transforms are defined. Reasonable options are

local, like a regular mesh (full chain: instanced transforms, local node's TRS, all parents to the scene root);

scene, like a skinned mesh (all local and parent node transforms are ignored).

We're only using local space at otoy. Maybe that could be handled by alternate 'TRANSFORM' attributes that include the space? i.e. TRANSFORM_LOCAL, TRANSFORM_SCENE, with maybe TRANSFORM defaulting to LOCAL?

TRANSFORM_SCENE behavior can be achieved with a node positioned at the scene root. Supporting only local (with explicitly described interaction between JSON-stored local TRS and accessor-stored instanced TRS) seems enough to cover both cases from engine perspective.
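A small sketch of the "local" option may help here: each instance's world matrix is the parent chain times the node's local TRS times the per-instance transform. The helper names and the use of row-major 4x4 nested lists are assumptions for illustration only.

```python
def matmul4(a, b):
    """Multiply two 4x4 matrices given as row-major nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)] for i in range(4)]


def translation(x, y, z):
    """4x4 translation matrix (row-major, translation in the last column)."""
    return [[1, 0, 0, x], [0, 1, 0, y], [0, 0, 1, z], [0, 0, 0, 1]]


IDENTITY = translation(0, 0, 0)


def instance_world_matrices(parent_world, node_local, instance_transforms):
    """'Local' space: world = parents * node TRS * per-instance transform."""
    node_world = matmul4(parent_world, node_local)
    return [matmul4(node_world, inst) for inst in instance_transforms]
```

Passing `IDENTITY` for both `parent_world` and `node_local` corresponds to a node positioned at the scene root, in which case the instance transforms are effectively scene-space transforms.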

lexaknyazev · 2019-10-23T21:53:04Z

extensions/2.0/Khronos/KHR_instancing/extensions/2.0/Khronos/KHR_instancing/README.md

+                "KHR_instancing": {
+                    "attributes": {
+                        "TRANSFORM4x3": 0,
+                        "ID": 1


Are these IDs implementation-dependent, with no spec-defined usage? If so, they shouldn't be included here. glTF 2.0 has an established workflow for supporting app-specific attribute semantics - they just start with an underscore.

Otherwise, the spec should be more sound and include at least:

allowed datatypes (likely scalar uint8/16/32, uint32 is not available on WebGL 1.0);

uniqueness within a set of instances, within a scene, or within an asset (otherwise they should not be called IDs).

In shaders, GPU instances have built-in IDs. Should this attribute be somehow related to them?
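If the ID attribute were kept, the constraints listed above could be checked along these lines. This is only a sketch of the rules proposed in this comment, not anything the draft currently states; `check_instance_ids` is a hypothetical helper, and the component type codes are the standard glTF 2.0 accessor values.

```python
UNSIGNED_BYTE, UNSIGNED_SHORT, UNSIGNED_INT = 5121, 5123, 5125  # glTF componentType codes


def check_instance_ids(accessor: dict, ids: list, allow_uint32: bool = True) -> list:
    """Return a list of problems found; an empty list means the IDs pass these checks."""
    problems = []
    allowed = {UNSIGNED_BYTE, UNSIGNED_SHORT}
    if allow_uint32:
        # Set allow_uint32=False when targeting WebGL 1.0, as noted above.
        allowed.add(UNSIGNED_INT)
    if accessor.get("type") != "SCALAR":
        problems.append("ID accessor should be SCALAR")
    if accessor.get("componentType") not in allowed:
        problems.append("ID accessor should use an allowed unsigned integer type")
    if len(set(ids)) != len(ids):
        # Uniqueness is checked only within this one set of instances.
        problems.append("IDs are not unique within this set of instances")
    return problems
```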

takahirox · 2019-10-23T23:22:34Z

extensions/2.0/Khronos/KHR_instancing/extensions/2.0/Khronos/KHR_instancing/README.md

+            "extensions": {
+                "KHR_instancing": {
+                    "attributes": {
+                        "TRANSFORM": 0,


Thinking of instancing + key frame animation. Allowed values for animation.channel.target.path are

translation

rotation

scale

weights

https://github.com/KhronosGroup/glTF/tree/master/specification/2.0#targetpath-white_check_mark

Do we also need KHR_instancing extension in animation.channel to specify TRANSFORM attribute as target.path?

"channels": [ { ...., "KHR_instancing": { "sampler": 0, "target": { "node": 0, "path": "transform" } } } ]

(Or is instancing + key frame animation uncommon because of poor performance?)

Thanks! Replied on #1691 (comment) for wider visibility.

donmccurdy · 2019-10-24T17:50:36Z

Do we also need KHR_instancing extension in animation.channel to specify TRANSFORM attribute as target.path? ... Or is instancing + key frame animation uncommon because of poor performance?

@takahirox I'm not familiar with best practices for expressing animation of instance batches in a runtime-friendly way. If there are tools, engines, or formats that have set a good precedent on that, please let us know! The closest I can think of would be Point Caches in FBX (definition, example), and I don't know whether that approach would be appropriate for glTF. In particular, and similar to a known limitation of morph target animation, I expect users would want the ability to animate a single instance in the batch, rather than having to bake the transforms of each instance at each keyframe collectively.

Because I'm not really confident of the answers on this, I'm inclined to say animation should be left undefined by this extension, but perhaps added later through something like #1301.

However, I do think this extension should provide the ability to set application-specific, per-instance attributes on the batch. In addition to the application-specific, per-vertex attributes already allowed on the mesh. For example...

meshes: [{
  primitives: [{
    attributes: {
      POSITION: 0,
      _BARYCENTRIC: 1
    }
  }]
}]
nodes: [{
  name: "tree_person",
  mesh: 0,
  extensions: {
    KHR_instancing: {
      mesh: 1,
      attributes: {
        TRANSFORM: 2,
        ID: 3,
        _POSITION_KEYFRAME_1: 4,
        _POSITION_KEYFRAME_2: 5,
        _POSITION_KEYFRAME_3: 6
      }
    }
  }
}]

... in this case, the attributes prefixed with _ are application-specific, and when imported to a generic viewer they are likely to be ignored entirely. However, any user might set up an application to use them as follows:

_BARYCENTRIC: Defines barycentric coordinates of each vertex, useful in applying stylized wireframe effects in custom shaders. These coordinates are the same for all instances.
_POSITION_KEYFRAME_1–3: Defines positions of each instance, at three known keyframes. The application is responsible for determining interpolation between the keyframes; these are not associated with any entry in the glTF animations list.
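To illustrate the underscore convention, a generic viewer could separate the attributes it recognizes by name from the application-specific ones roughly like this. `split_attributes` is a hypothetical helper written for this example.

```python
def split_attributes(attributes: dict) -> tuple:
    """Split attribute semantics into (spec-defined, application-specific) dictionaries."""
    custom = {name: index for name, index in attributes.items() if name.startswith("_")}
    standard = {name: index for name, index in attributes.items() if not name.startswith("_")}
    return standard, custom


instance_attributes = {
    "TRANSFORM": 2,
    "ID": 3,
    "_POSITION_KEYFRAME_1": 4,
    "_POSITION_KEYFRAME_2": 5,
    "_POSITION_KEYFRAME_3": 6,
}
standard, custom = split_attributes(instance_attributes)
```

A viewer that knows nothing about the keyframe attributes would simply work with `standard` and skip `custom`, while a custom application could bind the accessors in `custom` to its own shader inputs.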

tl;dr — I'm hoping to allow some flexibility for content authors to apply custom effects to instanced glTF models. Custom attributes and instance IDs enable that without locking in the spec to a particular animation mechanism prematurely. If/when we identify best practices for this sort of animation, which can be implemented in a performant and portable way, suggestions for a formal extension for instancing animation would of course be welcome.

takahirox · 2019-10-24T23:54:04Z

@donmccurdy

Thanks for the comment. TBH I'm not familiar with this either. I'd be happy if someone could share best practices.

I expect users would want the ability to animate a single instance in the batch, rather than having to bake the transforms of each instance at each keyframe collectively.

Agreed.

I'm inclined to say animation should be left undefined by this extension, but perhaps added later through something like #1301.

Oh, I didn't know about #1301. Yeah, I think it's OK to keep animation out of this extension and define it somewhere else.

tl;dr — I'm hoping to allow some flexibility for content authors to apply custom effects to instanced glTF models. Custom attributes and instance IDs enable that without locking in the spec to a particular animation mechanism prematurely. If/when we identify best practices for this sort of animation, which can be implemented in a performant and portable way, suggestions for a formal extension for instancing animation would of course be welcome.

Sounds good to me!

takahirox · 2019-10-25T00:03:03Z

Are per-instance material parameters out of scope for this extension? (Sorry if I'm missing anything.) For example, a user may want to set a different baseColorFactor, roughnessFactor, and other material parameters for each instance.

baseColorFactor = baseColorFactor * instanceBaseColorFactor;
roughnessFactor = roughnessFactor * instanceRoughnessFactor;

zeux · 2019-10-29T04:55:54Z

Trying to understand the rationale behind this - glTF already allows referencing the same mesh from many nodes in the transform graph. glTF loader can thus gather, for each mesh, a set of nodes that instantiate it and create the transform buffer at runtime - either at load time, if animation is not used / supported, or at runtime.
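For reference, the gathering step described here could look roughly like the following sketch, which only groups node indices by the mesh they reference. The function names are hypothetical, and a real loader would additionally need each node's world transform to build the transform buffer.

```python
from collections import defaultdict


def group_nodes_by_mesh(gltf: dict) -> dict:
    """Map each mesh index to the indices of the nodes that reference it."""
    groups = defaultdict(list)
    for index, node in enumerate(gltf.get("nodes", [])):
        if "mesh" in node:
            groups[node["mesh"]].append(index)
    return dict(groups)


def instancing_candidates(gltf: dict, min_count: int = 2) -> dict:
    """Keep only meshes referenced by at least min_count nodes."""
    return {mesh: nodes for mesh, nodes in group_nodes_by_mesh(gltf).items() if len(nodes) >= min_count}
```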

Does this extension target scenes with thousands of instances of the same mesh and is thus intending to reduce the file sizes? If so, is 4x3 matrix an appropriate storage or should we use independent streams of T/R/S data instead?

lexaknyazev · 2019-10-29T07:25:39Z

Does this extension target scenes with thousands of instances of the same mesh and is thus intending to reduce the file sizes?

Yes. It also reduces processing time as the transform buffer can be copied to GPU as is.

If so, is 4x3 matrix an appropriate storage or should we use independent streams of T/R/S data instead?

Currently, it's based on an existing vendor-specific implementation. More efficient storage options (such as a lossless compression pass, sparse arrays, channels separation, etc) will certainly be considered.
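For a sense of scale, here is a back-of-the-envelope comparison of uncompressed per-instance storage, assuming float32 components throughout. The stream names are labels chosen for this example, and the "translation only" case assumes that unused streams could simply be omitted, which the draft does not state.

```python
FLOAT32_BYTES = 4


def bytes_per_instance(streams: dict) -> int:
    """Sum the size of one element from each per-instance stream.

    `streams` maps a stream name to its number of float32 components.
    """
    return sum(components * FLOAT32_BYTES for components in streams.values())


matrix_4x3 = {"TRANSFORM4x3": 12}                          # 3 rows x 4 columns
full_trs = {"TRANSLATION": 3, "ROTATION": 4, "SCALE": 3}   # rotation as a quaternion
translation_only = {"TRANSLATION": 3}                      # rotation and scale omitted

for name, layout in [("4x3 matrix", matrix_4x3), ("T/R/S", full_trs), ("T only", translation_only)]:
    size = bytes_per_instance(layout)
    print(f"{name}: {size} bytes/instance, {size * 100_000 / 1e6:.1f} MB per 100k instances")
```

That works out to 48, 40, and 12 bytes per instance respectively, so separate streams mostly pay off when some of them can be dropped or stored more compactly.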

donmccurdy · 2019-11-06T03:20:18Z

glTF loader can thus gather, for each mesh, a set of nodes that instantiate it...

@zeux While a glTF loader could do this, in practice they don't. At least for three.js, doing so would mean reducing the node hierarchy in an opinionated way that feels unexpected without some hint (either from the asset, or from the application) that the meshes are meant to be batched.

From that perspective, I see the rationale here as (1) a way for an asset to identify that meshes can and should be batched, and (2) efficient transmission+parsing of these batches.

I do recognize that specifying GPU batches is somewhat unusual territory for a 3D asset format. Based on the goals of glTF for runtime efficiency, I think this could be a reasonable choice, but would be glad to hear if you'd suggest something else or think this is a bad idea altogether. 🙂

ultrafishotoy · 2020-04-21T20:10:19Z

Oh, I didn't realize that we're waiting on gltfpack. I'll try to implement something this week.

One thing I'd like to see in the spec is a precise definition of the accessor component types for pos/rot/scale. I'd recommend matching the animation specification for this.

Thanks @zeux ! I'll have a look at the anim spec and update the schema.

zeux · 2020-04-22T01:32:13Z

One more spec clarification (this one isn't really relevant for gltfpack, but it's good to highlight) - is there a usecase for this extension being optional (used but not required), or does it have to be in the required list, since there's no alternate way to specify instancing information?

ultrafishotoy · 2020-04-22T02:26:51Z

One more spec clarification (this one isn't really relevant for gltfpack, but it's good to highlight) - is there a usecase for this extension being optional (used but not required), or does it have to be in the required list, since there's no alternate way to specify instancing information?

My feeling is that it should be required if used; if it were optional it wouldn't confer any of the transmission, loading, and runtime benefits that this extension is specifically designed for, and large instancing batches could get unwieldy really quickly. Anyway, let's try to have a quick discussion about this tomorrow.

donmccurdy · 2020-04-22T02:55:41Z

See #1691 (comment) for some previous thoughts on optional use. In general it is a good thing if extensions can be designed to enable optional use. In this case, I think that could be accomplished by adding a mesh property to the extension which overrides the default mesh. Then the default mesh property becomes a fallback for clients not supporting the extension, and could refer to (a) a single instance of the object, (b) a joined mesh containing all instances in an external .bin buffer, or (c) whatever the author feels is appropriate to show.

That said, few (no?) users have used runtime optimization extensions — like Draco — without requiring them, in practice. KHR_draco_mesh_compression supports uncompressed fallback in external buffers, but the fallback feature hasn't really been used. If we think the complexity of a fallback mesh is harmful, I could be persuaded to skip it.

juj · 2020-04-24T14:34:46Z

Heya, dropping some drive-by ponderings here, hope it is welcomed!

What is the value of having this extension? Isn't having this a bit redundant?

https://github.com/KhronosGroup/glTF-Sample-Models/blob/master/2.0/SimpleMeshes/glTF/SimpleMeshes.gltf#L13 already shows how to support instancing by sharing mesh references, unless I am mistaken?

It seems odd to have the file format specify what vertex attributes would be instanced. This is because engines have different feature sets that they can support in their instanced code paths, that are engine/shader/code layout specific, rather than input asset specific?

Some engines support instancing on static meshes that have the same texture only (i.e. only instance transforms), other engines might support instancing color data as well (modulate per-vertex colors by a color multiplication modifier, or modulate a texture by a color multiplication modifier). Certain engines might instance also animated meshes, etc.

Since the set of per-instance attributes that an engine can or cannot render instanced typically depends on the shader code paths it implements, having a file format specify the structure of the instance buffer seems like it would dictate how a renderer must be implemented?

Also different GLTF files will be authored with different instance formats, and an engine that wanted to support all would be expected to create a permutational combination of rendering shader paths to support all of them? Also note that the full possible combination of instance buffer structures is not being specified here?

It may be that I am missing something here, but this extension seems like it does not need to exist? If an engine is capable of collecting compatible nodes to be submitted instanced, it could do that already without this extension?

I would understand that more instancing opportunities would be enabled by having a spec that would help reduce duplicate data, so that e.g. a renderer would not need to identify when two duplicated materials have the identical values (though this is already achieved by a good exporter), but this spec does not seem to improve reducing data duplication/sharing of data either(?)

Does this extension help some renderers implement instancing, when before they would not have been able to? If it does, perhaps the spec would benefit from an intro paragraph explaining why it helps those renderers "solve" instancing?

Great work with GLTF on everyone involved, just scratching my head a bit on this one :)

ultrafishotoy · 2020-04-29T16:37:21Z

Hi @juj, this extension is specifically for really large instancing groups (thousands or even millions of trees, clumps of grass, etc.), with the goal of reducing file size, transmission time, and load time. Have a look at @donmccurdy's excellent analysis here : #1691 (comment).

juj · 2020-04-29T19:56:30Z

with the goal of reducing file size, transmission time, and load time.

Oh, thanks, gotcha. Missed that this was about conserving disk space! The description rationale at https://github.com/KhronosGroup/glTF/pull/1691/files#diff-d53ba0f3c9191f50ab2501ad1edcc6f1R21 does not mention reducing file size, transmission time, and load time, but makes it read as if it'd be to optimize GPU rendering efficiency.

ultrafishotoy · 2020-04-29T20:09:02Z

I'll add some language about that. Thanks!

zeux · 2020-04-29T21:37:53Z

For reducing transmission size, would it be possible to incorporate support for normalized 16-bit integers for quaternion storage?

glTF animation data currently allows 32-bit floating point, 16-bit normalized integer and 8-bit normalized integer values, but in my experience 8-bit quaternions aren't really adequate in terms of precision so we could either specify both 8-bit/16-bit for consistency with animations, or just 16-bit for pragmatic reasons. GPUs support decoding 16-bit normalized integers when the input is bound as an (instanced) vertex stream, so I'm assuming it's reasonably straightforward to implement and it can significantly reduce the transmission size impact.
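For context, a sketch of the 16-bit normalized encoding being suggested, using the signed-normalized conversion that glTF defines for accessor data (`f = max(c / 32767.0, -1.0)` to decode, `c = round(f * 32767.0)` to encode). The function names are made up for this example.

```python
def encode_snorm16(f: float) -> int:
    """Float in [-1, 1] -> signed 16-bit normalized integer."""
    clamped = max(-1.0, min(1.0, f))
    return int(round(clamped * 32767.0))


def decode_snorm16(c: int) -> float:
    """Signed 16-bit normalized integer -> float in [-1, 1]."""
    return max(c / 32767.0, -1.0)


def encode_quat(q):
    """Encode an (x, y, z, w) unit quaternion as four int16 values (8 bytes instead of 16)."""
    return tuple(encode_snorm16(v) for v in q)


def decode_quat(c):
    """Decode four int16 values back to floats; renormalize afterwards if exact unit length matters."""
    return tuple(decode_snorm16(v) for v in c)
```

The quantization step is 1/32767 (about 3e-5) per component, compared with 1/127 (about 8e-3) for 8-bit, which is consistent with the precision concern about 8-bit quaternions.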

Noteworthy is that three.js prototype implementation currently doesn't support that IIRC, but because it decodes quaternion data to matrix data I'm assuming it's a simple matter of adding a conversion step in JS code; @donmccurdy could comment on feasibility.

zeux · 2020-05-13T02:43:00Z

Well, "this week" arrived 3 weeks later but I did get to this eventually :) zeux/meshoptimizer#142 implements support for this extension in gltfpack.

Note that the output uses 16-bit normalized quaternions as per my suggestion above. It was trivial to implement in three.js and it already works in Babylon.JS so I'm assuming it's straightforward to do this; this saves transmission size by using a more efficient and more compressible quaternion encoding.

Attaching a few example models:

two basic models, simplemeshes (using KHR_mesh_quantization) and simplemeshes_noq (without KHR_mesh_quantization)
stadium model provided in Scene complexity limits #1699, stadium.glb (using KHR_mesh_quantization) and stadium_cc.glb (using KHR_mesh_quantization and MESHOPT_compression draft)

Using the prototype three.js implementation with a few fixes, the stadium model is very quick to download and render. It's pretty slow in Babylon.JS; I believe that's because that implementation decodes the node instances into the scene graph, which takes a while. I'll file an issue for that.

gltfpack right now has a limitation in that it outputs all instances in world space, which leads to meshes that had a skew transform (from accumulated relative transforms in the node graph) looking different. At some point I can rework the implementation to preserve that.

instancing.zip

ultrafishotoy · 2020-05-13T15:28:58Z

Well, "this week" arrived 3 weeks later but I did get to this eventually :) zeux/meshoptimizer#142 implements support for this extension in gltfpack.

Awesome! Thanks @zeux!

donmccurdy · 2020-05-17T23:24:26Z

so we could either specify both 8-bit/16-bit for consistency with animations, or just 16-bit for pragmatic reasons.

@zeux Either is fine with me. I might vote for consistency with the animation spec, since the implementation burden is low. Your suggested changes on the three.js implementation look good, as well.

If an engine is capable of collecting compatible nodes to be submitted instanced, it could do that already without this extension? ... Does this extension help some renderers implement instancing, when before they would not have been able to?

@juj That is what I hope, anyway, in addition to reducing filesize and parsing cost. I understand your perspective — why don't engines just batch everything already? Nothing prevents them from merging, instancing, or otherwise optimizing a scene loaded at runtime. But in the ~9 years that WebGL has been available, I'm not aware of any engine doing so reliably, for any standard format¹. I'm less familiar with native engines like Unity or Unreal, but my understanding is that they do this optimization offline and store data that is easier to batch than their input sources, e.g. FBX.

Your concern is fair, and others have been raised, which is partly why this is an EXT_ extension rather than an official KHR_ one. It is testing a hypothesis, that pre-processed data can make instancing more accessible for realtime transmission. I hope it is successful, but this is not an area that prior formats have explored particularly thoroughly. It is entirely possible that we'll learn more constraints are necessary to make a fully future-proof standard for GPU instancing, when/if an official KHR_ extension is considered.

¹There may be exceptions to this in special areas like AEC? In geospatial visualization heavy offline processing is also common.

juj · 2020-05-18T07:08:05Z

But in the ~9 years that WebGL has been available, I'm not aware of any engine doing so reliably, for any standard format¹.

This is surprising to read. In close to two decades of working with native APIs that support instancing, all of the engines I have worked with that supported instancing used it dynamically rather than statically from the input data files. When porting UE4 and Unity3D to asm.js/wasm, both of them extend that support to WebGL.

This is the first time I read that it would not be possible to do so reliably. What is preventing them from doing so?

This is also the first time I read that anyone would be attempting to store instancing information "pre-solved" into an input file.

The reason that static instancing is a bad idea is because of view frustum dependence. Even if you stored a forest of 100k trees as 100k instances in a file, it would be inefficient to use a single 100k instanced draw call at runtime to render them, because probably only some 1000 of them might fit in the view frustum at once - and which ones do at a given time, depends on the position/orientation of the frustum.

Engines would rather do one of the following:

a) first frustum+occlusion-cull all the instances, and then render whatever remain using instancing, or
b) divide the world into a grid, and for each cell that the view frustum intersects, render all the trees in that cell instanced. (each cell has a presolved list of instances in that cell)

The solution a) would not be able to leverage any instancing data present in an input file. One might argue that b) could, if b) pre-bucketed the instances to such cells, but the cell size will invariably be engine-specific, and then an engine would have a problem of reconstructing the cell boundaries from input instance data, just complicating things.
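For illustration, pre-bucketing as in b) could be sketched like this: instances are grouped by the grid cell containing their translation, and each cell would then become its own instanced batch. The cell size is exactly the engine- or application-specific parameter mentioned above; `bucket_instances` is a hypothetical helper, and the XZ ground plane follows from glTF's +Y up convention.

```python
from collections import defaultdict
from math import floor


def bucket_instances(translations, cell_size: float) -> dict:
    """Group instance indices by the XZ grid cell containing each instance's translation."""
    cells = defaultdict(list)
    for index, (x, _y, z) in enumerate(translations):
        cell = (floor(x / cell_size), floor(z / cell_size))
        cells[cell].append(index)
    return dict(cells)
```

A renderer could then test each cell's bounds against the view frustum and issue one instanced draw per visible cell, rather than one draw for the whole set.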

I'm less familiar with native engines like Unity or Unreal, but my understanding is that they do this optimization offline and store data that is easier to batch than their input sources, e.g. FBX.

Both engines employ both instancing and batching techniques. In neither engine does one save "pre-solved" instancing data to the optimized/baked input files. Unity utilizes strategy a) from above, and I believe that so does Unreal.

Unity GPU Instancing
Unreal Engine Draw Call Merging Dynamic Instancing

All this being said, if instancing helps reduce the size of glTF scene files, then it might be a worthy addition. However I would expect that renderers would as the first thing have to discard the instancing information after loading, i.e. it is just used as an indirection-based domain-specific compression technique.

To me it seems that renderers that would faithfully render the instance information stored in a file like this, would be ones that ignore frustum culling, and behave inefficiently/suboptimally because of that; or ones that would be expecting that the scene serializer would have done a suitable optimization b) above for them. Should glTF serializers be expected to apply optimization b)?

Also, it does worry me that with this kind of instancing extension, implementing a loader becomes more complicated as one will need to be able to recognize both non-instanced and instanced ways of loading input files?

vpenades · 2020-05-18T13:09:30Z

@juj I fully agree with you. I am not sure this extension's features are worth the cost of adding more complexity to the loaders.

At first I thought this extension would be really useful to implement things like hair or fur on animated characters, but since it's been limited to static instancing, that possibility is gone.

Then I thought about trees, forests, and I didn't realize about the issue of frustum culling... so, another possibility gone.

So, I would like to ask which are the real, practical use cases of this extension (beyond filling a small volume with thousands of teapots or cutting down file size), because the more I see it, the more it looks like a compression technique and nothing else.

Knowing this is important to me because I need to weigh the pros and cons to decide whether implementing this extension is worthwhile.

donmccurdy · 2020-05-22T18:26:39Z

@juj —

Also, it does worry me that with this kind of instancing extension, implementing a loader becomes more complicated as one will need to be able to recognize both non-instanced and instanced ways of loading input files?

This is a vendor extension — it is prefixed EXT_ because multiple vendors have expressed interest in implementing the same thing. General-purpose glTF tools are not expected to implement every custom vendor extension — that would certainly be a huge burden. We had discussed an official version, but came to the conclusion that there wasn't enough precedent or information to design it yet. A vendor extension allows tools to opt in, learn from the implementation, and perhaps (or perhaps not) an official extension is made someday.

...all of the engines that I ever worked with that supported instancing, use it dynamically rather than statically from the input data files. When porting UE4 and Unity3D to asm.js/wasm, both of them extend that support to WebGL.

I find this surprising as well. My understanding was that static batching does not usually work this way, but I understand dynamic batching is more flexible.

This is the first time I read that it would not be possible to [implement dynamic batching] reliably. What is preventing them from doing so?

It's not impossible, no. But it's quite complex, and the argument that Unity and Unreal have implemented it, so why can't everyone else, is a bit worrying to me. I don't mean that as a criticism — if anyone has advice about how smaller engines like three.js, babylon.js, et al could implement dynamic instancing with no pre-processing of the input files, I'm certainly interested. 🙂

b) pre-bucketed the instances to such cells, but the cell size will invariably be engine-specific, and then an engine would have a problem of reconstructing the cell boundaries from input instance data

A goal of this extension is (IMO) to make it possible to create these buckets in an application-appropriate way. For a standalone model (e.g. for architecture or retail product preview) perhaps a single batch is fine. For a larger game world you'd certainly want cells. I consider the flexibility to decide the cell structure as a good thing — there is probably not a one-size-fits-all choice here. Reconstructing cell boundaries does not seem overly complex to me, but separate extensions have been proposed to provide world bounds at the node level, and (if we needed that information here) I would prefer to keep the information in those extensions instead of conflating the information with instancing data.

However, I would expect that renderers would, as the first thing, have to discard the instancing information after loading, i.e. it is just used as an indirection-based domain-specific compression technique.

For three.js we are able to use the instancing information very nearly as it is provided; I can't comment on other renderers.

@vpenades —

At first I thought this extension would be really useful to implement things like hair or fur on animated characters, but since it's been limited to static instancing, that possibility is gone.

Could you say more about what you are looking for here? Do you mean animation of individual instances, like Houdini's VAT system?

Then I thought about trees, forests, and I didn't realize about the issue of frustum culling... so, another possibility gone.

Authoring a single large batch will defeat culling, in the same way that merging an entire game world into one large mesh would defeat culling. Creating well-chosen batch cells will enable efficient culling. I see that flexibility as a strength in this extension: it would potentially enable tools like gltfpack to take input options (e.g. gltfpack --instancing-cell-size 100) to optimize a model for the application. Without that flexibility, I'm not sure it's realistic to design an extension guaranteeing the data is optimized for any arbitrary application. Do you think that is possible?
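As a rough illustration of the batch cells described above, here is a minimal Python sketch that groups instance translations into a uniform grid on the XZ plane. The function name `bucket_instances`, the `cell_size` parameter, and the choice of a flat XZ grid are assumptions made for this example; they are not part of the extension or of any existing tool.

```python
from collections import defaultdict
from math import floor


def bucket_instances(translations, cell_size):
    """Group instance indices by the XZ grid cell that contains each translation.

    Hypothetical helper: an exporter could emit one instanced node per cell,
    so that a renderer can cull each cell independently.
    """
    cells = defaultdict(list)
    for index, (x, _y, z) in enumerate(translations):
        # floor() keeps negative coordinates in their own cells.
        key = (floor(x / cell_size), floor(z / cell_size))
        cells[key].append(index)
    return dict(cells)


# Four tree positions, bucketed into 100-unit cells.
translations = [
    (5.0, 0.0, 5.0),
    (95.0, 0.0, 20.0),
    (150.0, 0.0, 10.0),
    (-1.0, 0.0, 0.0),
]
cells = bucket_instances(translations, cell_size=100.0)
```

A single very large `cell_size` reproduces the "one big batch" case, while smaller values trade more draw calls for finer culling, which is the application-specific choice discussed above.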

So, I would like to ask which are the real, practical use cases of this extension...

Cases like trees and forests are, indeed, what I see as the practical uses of this extension.

I think use cases like hair and fur may also benefit from this extension's data structure, but the extension is not (alone) sufficient to fully enable those things. In the short term I would expect to do custom processing on the loaded data to do this animation. In the long term perhaps glTF should consider procedural materials and procedural animation, similar to VAT, to more fully enable that use case. Or maybe we should really wait until something like that is designed, but I see that as a very large endeavor.

vpenades · 2020-05-22T21:06:24Z

Could you say more about what you are looking for here? Do you mean animation of individual instances, like Houdini's VAT system?

I always thought that instancing would be a good opportunity to allow hair/fur on single characters. Think of Monsters, Inc. or Chewbacca. But also for realistic characters where you want to display actual hair over the whole body like in real life.

Crowd simulation is out of the question because of the previously mentioned frustum culling problem; the same goes for trees/forests.

I believe this instancing extension assumes that a glTF model is going to be a culling unit, that is, it's either rendered in full or not at all. The problem with this approach is that I believe most engines prefer to do a full run per mesh. Some years ago I had the opportunity to use SpeedTree. For every kind of tree (a single mesh), it ran a culling pass over the whole scene (many square kilometers) while filling an instancing buffer dynamically, so all the visible instances of a given mesh were rendered in a single call.

This is different than culling model "cells" and rendering its contents cell by cell.
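To make the contrast concrete, the per-mesh approach described above might be sketched as follows. This is a simplified illustration, not SpeedTree's actual implementation: the names `visible_instances` and `build_draw_list` are hypothetical, instances are reduced to translations with a shared bounding-sphere radius, and the frustum is assumed to be a list of planes `(nx, ny, nz, d)` with inward-pointing unit normals.

```python
def visible_instances(instances, planes, radius):
    """Return the translations whose bounding sphere intersects the frustum.

    planes: iterable of (nx, ny, nz, d) with inward-pointing unit normals,
    so a point is inside a plane when nx*x + ny*y + nz*z + d >= 0.
    """
    visible = []
    for (x, y, z) in instances:
        if all(nx * x + ny * y + nz * z + d >= -radius
               for (nx, ny, nz, d) in planes):
            visible.append((x, y, z))
    return visible


def build_draw_list(scene, planes):
    """Build one instance buffer per mesh from a scene-wide culling pass.

    scene: {mesh_name: (radius, [translations])}. Each returned entry stands
    for a single instanced draw call covering every visible copy of that mesh.
    """
    draws = {}
    for mesh, (radius, instances) in scene.items():
        buffer = visible_instances(instances, planes, radius)
        if buffer:
            draws[mesh] = buffer
    return draws


# A toy "frustum" made of two planes: keep 0 <= x <= 100.
planes = [(1.0, 0.0, 0.0, 0.0), (-1.0, 0.0, 0.0, 100.0)]
scene = {
    "oak": (1.0, [(10.0, 0.0, 0.0), (200.0, 0.0, 0.0), (-0.5, 0.0, 0.0)]),
    "pine": (1.0, [(500.0, 0.0, 0.0)]),
}
draws = build_draw_list(scene, planes)
```

Here the instance buffer is rebuilt whenever the view changes, whereas with pre-authored cells the buffers are fixed and only whole cells are accepted or rejected.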

My point is that, since glTF does not have any concept of culling, this instancing extension is limited to geometries that are assumed to be culled as a whole, or rendered as a whole. And this removes many of the use cases of instancing.

As a side note: can a glTF model that uses the instancing extension itself be instanced? I think an engine would have to check whether the model uses internal instancing and, if so, render the models in the scene one by one. So supporting instancing per model may prevent an engine from supporting instancing per scene.

In other words, my feeling is that instancing is a per-scene optimization, not a per-model optimization, so I think such an extension must help an engine do per-scene instancing, rather than enforce per-model instancing on the engine.

I did find a use case that might be interesting to some: molecule visualization, that is, defining simple colored sphere meshes as atoms and producing models with millions of atoms. It's a useful use case, especially these days, but it may be so specific that people working in the bio industry will probably have their own formats and viewers.

rsahlin · 2020-05-25T07:19:22Z

@juj & @vpenades - I agree with your concerns regarding this extension.

However, as @donmccurdy pointed out, this is a vendor extension, i.e. nothing that I foresee ending up in the coming versions of glTF (as included in the core spec). As such, I guess I'm fine with it, since there are a couple of clients/viewers asking for this.

My initial proposal was to name this with a WEB_GL prefix, e.g. WEB_GL_mesh_gpu_instancing, but apparently that prefix does not exist :-)

emackey · 2020-05-27T15:58:10Z

extensions/2.0/Vendor/EXT_mesh_gpu_instancing/README.md

@@ -0,0 +1,59 @@
+# EXT\_mesh\_gpu\_instancing
+
+## Khronos 3D Formats Working Group


Please change this line to ## Contributors

Done! Thanks Ed!

ultrafishotoy · 2020-05-27T16:44:39Z

@donmccurdy: it looks like I don't have write access, so I can't do the merge. Maybe you can do it when you add your comments? Thanks!

emackey · 2020-05-28T17:29:14Z

I'm merging this. If there are further comments, please open a new issue.

Thanks @ultrafishotoy!

ultrafishotoy · 2020-05-28T17:53:26Z

I'm merging this. If there are further comments, please open a new issue.

Thanks @ultrafishotoy!

Awesome, thanks to you @emackey!

zeux · 2020-05-28T18:08:11Z

I've noted quantized quaternions a few times on this thread; since this is merged now I've opened #1818 to specifically address that.

donmccurdy · 2020-06-03T04:54:50Z

I've tried to clarify the current answers to questions raised on this PR in #1821 — feedback welcome.

instancing extension

76587d9

ultrafishotoy changed the title ~~instancing extension~~ KHR_instancing Oct 23, 2019

donmccurdy reviewed Oct 23, 2019

extensions/2.0/Khronos/KHR_instancing/extensions/2.0/Khronos/KHR_instancing/README.md

donmccurdy reviewed Oct 23, 2019

extensions/2.0/Khronos/KHR_instancing/extensions/2.0/Khronos/KHR_instancing/README.md

cleanup

de031ed

overview statement

5d2dac5

lexaknyazev reviewed Oct 23, 2019

ultrafishotoy added 2 commits October 23, 2019 14:48

tweak statements

ed5f6e6

statement

a68dcf6

lexaknyazev reviewed Oct 23, 2019

ultrafishotoy added 2 commits October 23, 2019 15:19

description

82b3adb

typo

1b4f9c3

takahirox reviewed Oct 23, 2019

ultrafishotoy added 4 commits November 5, 2019 17:20

simple example

956b01d

syntax fix

62cf80a

teapots galore

2ed4903

oops

a833742

specify the types of the attributes

dbd7b63

emackey reviewed May 27, 2020

clean up readme.md

e0a458b

emackey merged commit 0c76176 into KhronosGroup:master May 28, 2020

zeux mentioned this pull request May 28, 2020

EXT_mesh_gpu_instancing should support quantized quaternions #1818

Closed

lilleyse mentioned this pull request Nov 12, 2020

Feature metadata extension (EXT_feature_metadata) CesiumGS/glTF#3

Merged

anupamdas2012 mentioned this pull request Nov 24, 2020

Extension EXT_mesh_gpu_instancing atteneder/glTFast#107

Closed

lyuma mentioned this pull request Oct 11, 2021

Add inverted normals when flipped model scale godotengine/godot#53642

Open
