glTF in Unity optimization - 4. New Mesh API - The Failed Attempt

06 Apr 2020

This is part 4 of a mini-series.

Goal

With version 2019.3, Unity introduced a new advanced Mesh API for creating meshes and announced the following advantages:

Faster (yay!)
Flexible vertex attribute data layout

It's faster because it skips all the validation checks that the simple API performs. In this post we will see whether that holds true for my use cases.

The plan I have in mind is to start from the rear end (mesh data submission) and gradually work back towards data retrieval from glTF buffers:

Replace simple API calls with advanced ones
Replace existing data structures (C# arrays) with Unity NativeArrays
Experiment with data types (instead of using floats for everything, use smaller types, especially if the original glTF type is not a float)
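To give a feeling for what the third step could save, here is a minimal sketch of the per-vertex arithmetic. It assumes an RGBA vertex color that glTF stores as normalized unsigned bytes; the vertex count and all names are hypothetical, and Python is used only because the calculation is engine-independent.

```python
# Bytes per component for a few glTF accessor component types.
COMPONENT_BYTES = {"float32": 4, "uint16": 2, "uint8": 1}


def attribute_bytes(component_type: str, dimension: int) -> int:
    """Size in bytes of one vertex attribute (e.g. an RGBA color has dimension 4)."""
    return COMPONENT_BYTES[component_type] * dimension


def buffer_bytes(vertex_count: int, component_type: str, dimension: int) -> int:
    """Total size in bytes of a tightly packed buffer holding one attribute."""
    return vertex_count * attribute_bytes(component_type, dimension)


VERTEX_COUNT = 100_000  # hypothetical mesh size, not a figure from the tests

as_float = buffer_bytes(VERTEX_COUNT, "float32", 4)  # colors converted to floats
as_bytes = buffer_bytes(VERTEX_COUNT, "uint8", 4)    # colors kept as in the glTF buffer

print(f"float32 x4: {as_float} bytes, uint8 x4: {as_bytes} bytes")
```

Keeping the original byte layout would need a quarter of the memory in this case, which is the kind of saving the experiment is meant to explore.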

Step 1: Advanced API calls

The first thing I did was replace the simple API calls with the advanced ones (see commit).

This is an approach from the rear end, where all vertex data is already retrieved from buffers in the form of arrays (e.g. Vector3[]) and ready to be pushed. I created one vertex stream per attribute.

Test 1: Full high resolution mesh

The first comparison loads a high-resolution mesh with UVs, normals and tangents (repeated 10 times):

API call	glTFast 0.11.0	glTFast dev	speedup
SetVertices	20.51 ms	3.17 ms	6.5 x
SetIndices	66.42 ms	29.34 ms	2.3 x
SetUVs	31.99 ms	2.07 ms	15.5 x
SetNormals	54.14 ms	3.16 ms	17.1 x
SetTangents	93.80 ms	4.42 ms	21.2 x
RecalculateBounds	34.63 ms	30.02 ms	1.2 x
UploadMeshData	117.70 ms	121.69 ms	1.0 x

Those are some great improvements 😀! Setting vertex data is 6x to 21x faster and setting indices is more than twice as fast. As a result, the total loading time for 10 huge meshes went down from 8.0 sec to 5.5 sec (45% faster).
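For reference, here is a small sketch of how these figures can be reproduced. The speedup column is the old time divided by the new one, and the "45% faster" figure appears to follow the same ratio reading. The helper names are mine, and the third function only shows the alternative "time saved" reading of the same numbers.

```python
def speedup(before: float, after: float) -> float:
    """Ratio of old time to new time, as in the speedup column."""
    return before / after


def percent_faster(before: float, after: float) -> float:
    """How much more work gets done per unit of time, in percent."""
    return (before / after - 1.0) * 100.0


def percent_time_saved(before: float, after: float) -> float:
    """How much shorter the new run is, in percent of the old run."""
    return (1.0 - after / before) * 100.0


# SetVertices row of the table (values in ms).
print(f"SetVertices speedup: {speedup(20.51, 3.17):.1f} x")

# Total loading time for the 10 meshes (values in seconds).
print(f"faster by: {percent_faster(8.0, 5.5):.0f}%")
print(f"time saved: {percent_time_saved(8.0, 5.5):.0f}%")
```

Both percentages describe the same measurement; they only differ in which run is used as the baseline.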

Test 2: High resolution mesh without normals and tangents

The second test uses the same mesh, but without normals and tangents.

Note: In previous posts of this mini-series it became clear that normal/tangent calculations are a bottleneck. Still, I'd like to see if the new Mesh API improves the situation.

API call	glTFast 0.11.0	glTFast dev	speedup
SetVertices	32.64 ms	3.09 ms	10.6 x
SetIndices	69.80 ms	27.38 ms	2.6 x
SetUVs	35.42 ms	2.27 ms	15.6 x
RecalculateNormals	130.61 ms	75.62 ms	1.7 x
RecalculateTangents	960.36 ms	892.66 ms	1.1 x
RecalculateBounds	35.81 ms	26.77 ms	1.3 x
UploadMeshData	134.27 ms	132.60 ms	1.0 x

The normal and especially the tangent calculations are still devastatingly slow, but they got a bit faster. Other than that, we see similar results; setting positions is now even 10 times faster. The total time went down from 16.4 sec to 14.5 sec (13% faster).

Test 3: Generic sample sets

Moving on to sample sets with more variation and practical, real-world features.

Sample set	glTFast 0.11.0	glTFast dev	speedup
glTF sample models	9.48 sec	8.82 sec	+7.5%
furniture set	9.08 sec	8.32 sec	+10%

This looked promising at first sight, but I saw that I had introduced regressions (like not supporting a second UV set or vertex colors). When I tried to fix those and re-run the tests, Unity suddenly crashed 😱.

I tracked the crash down to Mesh.SetVertexBufferParams. It turns out the troublesome mesh primitive uses positions, normals, tangents and two sets of texture coordinates. Reading the docs carefully, I found that Unity supports a maximum of four vertex streams; my "one stream per attribute" approach exceeded this limit and caused Unity to crash.
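The failing case can be modelled without Unity. The sketch below is an assumption-laden illustration, not engine code: it represents a layout as a plain mapping from attribute name to stream index and rejects it before it would reach Mesh.SetVertexBufferParams. The function and attribute names are hypothetical.

```python
MAX_VERTEX_STREAMS = 4  # the limit cited from the Unity docs


def check_layout(attribute_to_stream: dict) -> None:
    """Raise ValueError if the layout needs more streams than are available."""
    highest = max(attribute_to_stream.values())
    if highest >= MAX_VERTEX_STREAMS:
        raise ValueError(
            f"layout uses stream index {highest}, "
            f"but only {MAX_VERTEX_STREAMS} streams (0..{MAX_VERTEX_STREAMS - 1}) exist"
        )


# "One stream per attribute" for the troublesome primitive: five attributes.
one_stream_per_attribute = {
    "position": 0,
    "normal": 1,
    "tangent": 2,
    "texcoord0": 3,
    "texcoord1": 4,  # fifth stream, one too many
}

try:
    check_layout(one_stream_per_attribute)
    layout_ok = True
except ValueError as error:
    layout_ok = False
    print(f"rejected: {error}")
```

A guard like this turns the crash into an ordinary error, but it does not make the five-attribute primitive loadable.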

I filed a bug report and Unity fixed it in 2020.2. It still won't work, but at least fellow developers don't have to wonder why it crashes anymore 💯.

I investigated the code base a bit and came to the conclusion that, for the moment, this is a dead end. The experimental Mesh API support stops here 🛑.

Next up

In order to use the advanced Mesh API, I have to group vertex attributes in a way that results in four vertex streams or fewer, no matter how many attributes there are.

This totally screwed up my plan of starting at the "rear end" and improving from there in tiny steps. I'll have to refactor the data retrieval from start to end in order to support this. The positive initial results motivate me to do exactly that, so I decided to draw a line under the current version of glTFast and call it version 1.0 before I proceed with this major refactor.

On the plus side, I'll build the refactored version on NativeArrays from the start, so that's two things in one sweep.

Follow me on twitter or subscribe to the feed to not miss updates on this topic.
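One possible grouping, purely as a sketch: positions get their own stream, normals and tangents share a second one, and everything else is interleaved into a third. This particular split, the attribute names and the byte sizes (all float32) are my assumptions for illustration and not necessarily what the refactor will use.

```python
from collections import defaultdict

MAX_VERTEX_STREAMS = 4

# Hypothetical fixed assignment; any attribute not listed goes to the fallback stream.
STREAM_OF = {"position": 0, "normal": 1, "tangent": 1}
FALLBACK_STREAM = 2


def group_attributes(attributes):
    """Assign (name, byte_size) pairs to streams.

    Returns ({stream: [(name, byte_offset), ...]}, {stream: stride_in_bytes}).
    Attributes sharing a stream are interleaved in the order given.
    """
    layout = defaultdict(list)
    strides = defaultdict(int)
    for name, size in attributes:
        stream = STREAM_OF.get(name, FALLBACK_STREAM)
        layout[stream].append((name, strides[stream]))  # offset = bytes used so far
        strides[stream] += size
    return dict(layout), dict(strides)


# Six attributes, sizes in bytes assuming float32 components.
ATTRIBUTES = [
    ("position", 12),
    ("normal", 12),
    ("tangent", 16),
    ("texcoord0", 8),
    ("texcoord1", 8),
    ("color", 16),
]

layout, strides = group_attributes(ATTRIBUTES)
for stream in sorted(layout):
    print(stream, layout[stream], "stride:", strides[stream])
```

With a fixed assignment like this the stream count never exceeds three, however many texture coordinate sets or color attributes a primitive brings along.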

Next: New Mesh API - The Refactor

Overview of this mini-series

gltf unity performance