ProfileGPU shows a huge outlier in ShadowDepths

I have a dynamic sunlight in my level, and the ShadowDepths cost is huge. In ProfileGPU, ShadowDepths ranges from 4ms to 9ms, and nearly all of that cost is attributed to a single element (see attached image for an example).

The type of the element varies, but it is usually the second or third entry in the list. That entry also shows additional prims/ms and verts/ms figures, which seems to indicate that ProfileGPU thinks this one primitive is rendering very slowly for some reason.

Is this a bug in ProfileGPU, so that the reported times are wrong? Is the GPU stalling for some reason? What can I do? Help!

I’m still seeing the outlier problem, where one element takes orders of magnitude longer than similar elements. It’s not always in ShadowDepths. I’ve seen it in the BasePass as well, on instanced meshes and on plain ol’ vanilla static meshes. Oddly, it is usually the third element in a pass. It feels like a stall, or like ProfileGPU is misreporting the time it takes to render that element. Does anyone have advice on how to track this down? I’ve lost a lot of time on this issue and don’t know what to do next.
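
To narrow down whether the spike follows a particular asset or a particular position in the pass, one rough check is to copy the per-element times from several ProfileGPU captures by hand and compare them. Here is a minimal sketch in Python. The asset names and timings are made-up placeholders (not my real numbers), and `find_outliers` is a hypothetical helper, not part of any Unreal tooling:

```python
from statistics import median


def find_outliers(timings_ms, factor=10.0):
    """Return (index, name, ms) for entries costing more than `factor` times the median.

    `timings_ms` is a list of (element_name, milliseconds) tuples in the order
    they appear within one pass of a capture.
    """
    typical = median(ms for _, ms in timings_ms)
    return [
        (index, name, ms)
        for index, (name, ms) in enumerate(timings_ms)
        if typical > 0 and ms > factor * typical
    ]


# Placeholder data: two hypothetical captures of the same pass, with the
# elements drawn in a different order each time.
captures = [
    [("SM_Rock", 0.02), ("SM_Tree", 0.03), ("SM_Wall", 4.10), ("SM_Fence", 0.02)],
    [("SM_Tree", 0.03), ("SM_Fence", 0.02), ("SM_Rock", 5.80), ("SM_Wall", 0.03)],
]

for capture_number, capture in enumerate(captures):
    for index, name, ms in find_outliers(capture):
        print(f"capture {capture_number}: index {index} ({name}) took {ms} ms")
```

If the flagged index stays the same while the asset name changes between captures, that would point toward a stall or a timing artifact rather than a genuinely expensive mesh. If the same asset is flagged regardless of position, the asset itself is the more likely culprit.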