texture atlas usage most efficient?

I have a 24-card deck with 11 unique cards plus 1 unique back face, so in my Content I can have 12 textures and 11 static meshes. At run-time my game will then have 24 static mesh actors (one instance per card).

However, I'm guessing that it's more efficient to have just one texture (i.e. a simple texture atlas)?

Also I notice the 11 static meshes are exactly the same except for texture coordinates (used to sample from the texture atlas). So now I wonder, which of the following is faster?

A) one mesh, one material, modify UV in material based on const(s), or

B) 11 meshes, one material => per mesh tex coords, or

C) one mesh, 11 materials => each material hardcodes its tex coords

Meanwhile, I read this post (opengl - How to avoid texture bleeding in a texture atlas? - Game Development Stack Exchange), which says not to use texture atlases for 3D, so maybe my entire question is wrong?

Also, does UE4 do anything automatically for me? I mean should I just have 12 separate textures instead of a texture atlas? I was assuming it would improve performance to make my own simple texture atlas for this, but maybe I’m wrong?

So I know how to do it in OpenGL. But how to do it in UE4?

Update #1 - here’s my latest idea on how to do it with one static mesh, one material, and one texture atlas:

  • material has array of 24 boneIdx’s (or more)
  • C++ generate deck static mesh => generate mesh with 24 cards, set init 24 boneIdxs, one actor uses this static mesh
  • to translate a card, modify material const (uniform mat4)

Besides the initial generation of 24 (or more) cards in C++, cards are never added to or removed from the mesh - they are just translated by modifying the corresponding material constant (a uniform mat4 in the bone[24] array). Translation means moving the cards to the discard pile, or into a grid relative to the player's camera so the cards are easy to see.
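Roughly, the C++ mesh-generation step could look something like the sketch below. This assumes UProceduralMeshComponent (other mesh-building routes exist), the class and variable names are placeholders, and the per-card index is packed into the vertex color purely as one possible encoding:

```cpp
// Sketch: build all 24 cards into one procedural mesh section. ACardDeckActor,
// ProcMesh, and the card layout are placeholders; the per-card index ("boneIdx")
// is packed into the vertex color R channel as one possible encoding.
#include "ProceduralMeshComponent.h"

void ACardDeckActor::BuildDeckMesh()
{
    const int32 NumCards = 24;
    const float HalfW = 3.f;
    const float HalfH = 4.5f;

    TArray<FVector> Verts;
    TArray<int32> Tris;
    TArray<FVector> Normals;
    TArray<FVector2D> UVs;
    TArray<FLinearColor> Colors;        // R channel = card index / 255
    TArray<FProcMeshTangent> Tangents;

    for (int32 Card = 0; Card < NumCards; ++Card)
    {
        const int32 Base = Verts.Num();

        // One quad (front face) per card; every card starts at the origin and is
        // moved later via its per-card transform.
        Verts.Add(FVector(-HalfW, -HalfH, 0.f));
        Verts.Add(FVector( HalfW, -HalfH, 0.f));
        Verts.Add(FVector( HalfW,  HalfH, 0.f));
        Verts.Add(FVector(-HalfW,  HalfH, 0.f));

        Tris.Add(Base);     Tris.Add(Base + 1); Tris.Add(Base + 2);
        Tris.Add(Base);     Tris.Add(Base + 2); Tris.Add(Base + 3);

        UVs.Add(FVector2D(0.f, 1.f));
        UVs.Add(FVector2D(1.f, 1.f));
        UVs.Add(FVector2D(1.f, 0.f));
        UVs.Add(FVector2D(0.f, 0.f));

        for (int32 i = 0; i < 4; ++i)
        {
            Normals.Add(FVector(0.f, 0.f, 1.f));
            Colors.Add(FLinearColor(Card / 255.f, 0.f, 0.f, 1.f)); // recover with *255 in the material
            Tangents.Add(FProcMeshTangent(1.f, 0.f, 0.f));
        }
    }

    ProcMesh->CreateMeshSection_LinearColor(0, Verts, Tris, Normals, UVs, Colors,
                                            Tangents, /*bCreateCollision=*/false);
}
```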

One limitation is that the user can't click a card via an actor click, because there's only one actor (one static mesh for all 24+ cards). In other words, I can't take advantage of UE4's built-in system for detecting clicks on actors, and implementing that myself sounds complicated. However, a trick is to consider my use case. The cards are going to be in one of three places: the main deck, the discard pile, or the screen. And the only times we care about the user clicking are the main deck and the discard pile. So I can just have two invisible actors for the main deck and the discard pile. When the user clicks one of them, we get a grid view of the cards (in the main deck or in the discard pile) by translating the corresponding bones.
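A rough sketch of what those two invisible click targets could look like in C++ (all class, member, and function names here are placeholders, the handlers need to be UFUNCTIONs, the exact OnClicked delegate signature varies a bit between engine versions, and the player controller needs click events enabled):

```cpp
// Sketch: two invisible clickable actors for the main deck and the discard pile.
// ACardGameMode, DeckClickTarget, DiscardClickTarget, and ShowCardsAsGrid are
// placeholders; the handlers must be declared as UFUNCTIONs, and the player
// controller needs bEnableClickEvents = true for actor click events to fire.
void ACardGameMode::BindClickTargets()
{
    DeckClickTarget->OnClicked.AddDynamic(this, &ACardGameMode::HandleDeckClicked);
    DiscardClickTarget->OnClicked.AddDynamic(this, &ACardGameMode::HandleDiscardClicked);
}

void ACardGameMode::HandleDeckClicked(AActor* TouchedActor, FKey ButtonPressed)
{
    // Move the main deck's cards into a grid in front of the camera by updating
    // their per-card transforms (the "bone" matrices in the material).
    ShowCardsAsGrid(/*bDiscardPile=*/false);
}

void ACardGameMode::HandleDiscardClicked(AActor* TouchedActor, FKey ButtonPressed)
{
    ShowCardsAsGrid(/*bDiscardPile=*/true);
}
```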

So far I’m liking this idea because it sounds theoretically more optimal. Though I’m not sure if it really gains much performance, so I’m not really sure if it’s worth the extra effort to implement it. So I’m still open to other ideas…

Update #2 - I want to translate and rotate cards in the material based on a boneIdx (a group ID for the verts in that card so each card has a separate translation and rotation).

I considered doing this with a material "Custom" node, but it's unclear to me whether that's fully cross-platform. I noticed the material has a "World Position Offset" output. I'm able to do rotations with (Absolute World Position → World Position, Time → Rotation Amount) → RotateAboutWorldAxis_Cheap → X-Axis → World Position Offset. However, the lighting is broken - it looks as if lighting is done before World Position Offset is applied (the normals are still the un-rotated ones). But I want to apply my custom World Position Offset first, then let UE4 do its lighting. How?

Update #3 - I remembered the "inverse transpose" trick for transforming normals from a CG lecture.

I tried VertexNormalWS → InverseTransformMatrix → Normal. However, this is wrong because I need the matrix created by (RotateAboutWorldAxis_Cheap → X-Axis), and RotateAboutWorldAxis_Cheap doesn't output a matrix - it outputs the transformed position. Hmmm…
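One thing that might simplify this: for a pure rotation R (no scale or shear), the inverse transpose of R is just R again, so the normal can be rotated with exactly the same axis/angle as the position. A tiny illustrative sketch with UE4 math types (the function is hypothetical, and the real fix would of course have to live in the material graph):

```cpp
// For a pure rotation R (no scale or shear), (R^-1)^T equals R itself, so the
// normal can be rotated with the same axis/angle as the position - no separate
// inverse-transpose matrix is needed.
#include "CoreMinimal.h"

void RotateCardVertex(float AngleRadians, FVector& Position, FVector& Normal)
{
    const FQuat Rot(FVector(1.f, 0.f, 0.f), AngleRadians); // rotate about the world X axis
    Position = Rot.RotateVector(Position);
    Normal   = Rot.RotateVector(Normal);                   // same rotation as the position
}
```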

Hey Pemcode!

Some nice questions!
Texture atlases can be more optimal, but since UE4 does one draw call per mesh per material, the performance gain (depending on the case) is often very limited.

There are multiple ways to approach this, but imho the best one is to just have 12 separate textures, one Master material, 11 material instances, and one static mesh.

Alternatively, you could have it all as one atlas, and use a static panner to offset the atlas per card in the material instance.
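For illustration, one way the per-card offset could be computed and fed to a dynamic material instance from C++ (the atlas layout and the "AtlasUVOffsetScale" parameter name are made up here and would have to match whatever the master material actually exposes):

```cpp
// Sketch: compute where card texture index 0..11 lives in an (assumed) 4x3 atlas
// and push it into a dynamic material instance. "AtlasUVOffsetScale" is a made-up
// vector parameter name; in the master material the UVs would be transformed as
// UV * Scale + Offset before sampling the atlas.
#include "Materials/MaterialInstanceDynamic.h"

void ApplyAtlasRegion(UMaterialInstanceDynamic* MID, int32 CardTextureIndex)
{
    const int32 Cols = 4;
    const int32 Rows = 3;
    const float ScaleU = 1.f / Cols;
    const float ScaleV = 1.f / Rows;
    const float OffsetU = (CardTextureIndex % Cols) * ScaleU;
    const float OffsetV = (CardTextureIndex / Cols) * ScaleV;

    // Pack Offset in RG and Scale in BA of a single vector parameter (just a convention).
    MID->SetVectorParameterValue(TEXT("AtlasUVOffsetScale"),
                                 FLinearColor(OffsetU, OffsetV, ScaleU, ScaleV));
}
```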

Let me know if you have any more questions.

As Luos stated, pretty good question.
First, about mip-map texture bleeding: it should not be an issue in your case, since the texture on the cards is unlikely to be tiling.

I'd just like to clarify something. "Static mesh" and "card" in a card game don't go together all that well. I assume that your cards will actually be animated and moving in some way, which somewhat contradicts them being static. If I am wrong and your cards really are static, then atlasing the textures and using per-mesh UVs is probably the best option.

Otherwise:

In all options you’ve listed there is actually little performance difference, since it will always be 11 separate draws.

So overall, I gotta agree with Luos. Just go with 11 material instances.

@Deathrey Each card is an instance of a static mesh actor. For example, I may have 24 cards (or a different number); there are 11 types of cards that are identical except for the texture. Yes, the cards can be transformed (translated, rotated, scaled), but each is still a static mesh: the vertices within a static mesh don't move relative to each other. The model matrix for each card actor may change, but that applies to the entire actor. Individual vertices are not moving relative to each other (e.g., skeletal animation). So it's a static mesh.

I wonder if there is an easy way to do them all in one draw call. I should be able to dynamically batch the 8 verts of each card into a single "cards" mesh so they can be drawn with a single draw call. In that case I'd use a texture atlas, so all the cards would share the same material and the same texture, and the only per-card differences would be the positions and texture coordinates. I guess maybe UE4 doesn't make this easy or automatic?

I'd be interested to learn more about how UE4's rendering system works in terms of draw calls and performance… It's awesome that it (or some of it?) is open source, but I'd like to learn some higher-level things first, and I'm worried that reading the UE4 rendering engine source code might turn into a big rat hole.

Suppose I have two cards which are in a vertex buffer. Card A is verts 0-7, Card B is verts 8-15.

Suppose we want to translate Card A (but not card B). In OpenGL, two ways to do this:

A) Modify vertex buffer for Card A then re-upload the vertex buffer for Card A + Card B

B) In our GLSL VS, we can do hierarchical transforms. So Card A and Card B each have a separate transformation matrix (const) used by the GLSL VS. To translate Card A, modify Card A’s const matrix.

Edit: Actually, B doesn't make sense as written, because how does your VS know which matrix to use for vertices 0-7 versus vertices 8-15?

My deck has 24 cards (or more) with 11 unique cards. So isn't it 24 draw calls (rather than 11)? The problem is that I want each of the 24 (or more) cards to have an independent translation (so you can move one card without moving the other 23).

I did some reading… here’s how you do it in OpenGL:

https://www.opengl.org/wiki/Skeletal_Animation

Each vertex has an index saying which bone it belongs to. So for my cards, I'd have boneIdx=0 for verts 0-7 and boneIdx=1 for verts 8-15. In this way, I can use an array of uniform mat4s for my bone transforms, which gives each card a unique transform.

This allows me to put all the cards in a single vertex buffer and upload it once. Then, when I want to transform a card, I only modify its corresponding bone uniform mat4. This doesn't change once per frame; it changes only when a card translates.

So there's one vertex buffer for all the cards and one vertex shader. The only thing we change when a card translates is its corresponding uniform mat4.
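For reference, a minimal sketch of that OpenGL-side setup (plain OpenGL/C++, not UE4; all identifiers are placeholders):

```cpp
// Plain OpenGL sketch of the scheme above: every vertex carries a card index
// attribute, the vertex shader indexes uniform mat4 uCardXforms[24], and moving
// one card only re-uploads that one matrix.
//
// GLSL vertex shader, roughly:
//   layout(location = 2) in float aCardIdx;
//   uniform mat4 uCardXforms[24];
//   ...
//   gl_Position = uViewProj * uCardXforms[int(aCardIdx)] * vec4(aPosition, 1.0);

#include <GL/glew.h>
#include <cstdio>

void UploadCardTransform(GLuint Program, int CardIdx, const float* Mat4ColumnMajor)
{
    // Query the location of one element of the uniform array and upload just it.
    char Name[32];
    std::snprintf(Name, sizeof(Name), "uCardXforms[%d]", CardIdx);
    const GLint Loc = glGetUniformLocation(Program, Name);

    glUseProgram(Program);
    glUniformMatrix4fv(Loc, 1, GL_FALSE, Mat4ColumnMajor);
}
```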

So I know how to do it in OpenGL. But how to do it in UE4…

You are overthinking this, man; you are trying to reduce a mere 24 draw calls.
24 draw calls in an engine that can easily handle (tens of) thousands on a standard PC, and a few hundred on current-gen mobile.

You are sort of mixing three different terms here: dynamic batching, hardware instancing, and skinned meshes.

Potentially, you could modify the engine to get your own dynamic batching solution, and additionally you could store a texture ID from your atlas at each vertex. However, the effort is indubitably not worth the benefit.

And frankly, what you are up to is a pretty noble act, but you are really over-complicating things.

Then I must be doing something else wrong? Here are screenshots with 1 card vs. 24 cards vs. 96 cards. Each card is a separate actor, copy-pasted in the level editor. FPS is 27.82 vs. 21.92 vs. 13.39.

Edit: To clarify what I mean: there is only one static mesh, "card_Mesh", in the Content. To create one actor, I drag the static mesh "card_Mesh" from Content into the level editor. Then I copy-paste it so that I have 24 (or 96) actors. So there's only one static mesh, one material, and one texture in Content, but in the level there are 1, 24, or 96 actors that all reference the static mesh "card_Mesh".

Unless each card is a few hundred draw calls, a few million polygons' worth of vertices, a very, very poorly set up Blueprint (which I doubt, after your questions), or you have a very, very poor rig… I don't know what's going on there.

Gotta agree with Lu. Even for old mobiles, 96 draw calls is more than acceptable.

The card is an actor created by dragging a static mesh (8 verts) from Content into the level editor. Then I copy-pasted it so that we have 96 actors.

Based on what I read, I think maybe the level editor isn't smart enough to use instancing, so maybe I need to try 96 actors generated in C++ with instancing…
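A rough sketch of what that C++ instancing attempt could look like with a UInstancedStaticMeshComponent (class and member names are placeholders; per-card textures would still need an atlas plus some per-instance trick in the material):

```cpp
// Sketch: one actor with a UInstancedStaticMeshComponent drawing all 96 cards as
// instances of the same mesh/material. ACardDeckISMActor, CardInstances, CardMesh,
// and the layout are placeholders.
#include "Components/InstancedStaticMeshComponent.h"

ACardDeckISMActor::ACardDeckISMActor()
{
    CardInstances = CreateDefaultSubobject<UInstancedStaticMeshComponent>(TEXT("CardInstances"));
    RootComponent = CardInstances;
}

void ACardDeckISMActor::SpawnCards(UStaticMesh* CardMesh, int32 NumCards)
{
    CardInstances->SetStaticMesh(CardMesh);
    for (int32 i = 0; i < NumCards; ++i)
    {
        // Simple row layout; individual instances can later be moved with
        // UpdateInstanceTransform(InstanceIndex, NewTransform, ...).
        CardInstances->AddInstance(FTransform(FVector(i * 10.f, 0.f, 0.f)));
    }
}
```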

I wonder if there is something else besides draw calls slowing things down with 96 actors, such as lighting, collision, or physics. And I wonder if it would make sense to turn off CPU culling, since each actor is only 8 verts (edit: my .obj file has 8 verts, but in Content it says 24 verts per card mesh).

As another data point, when I use the editor's Window > Developer Tools > Merge Actors to create a new mesh out of the 96 cards and then drag that mesh from Content into the level editor, I get good FPS (unlike with 96 non-merged card actors).
