A few months ago I had some interesting performance problems with OpenGL on OSX. I identified the problem and made some workarounds so development could continue. This week I've properly fixed the issue, and I want to record it here so that I and others can avoid this mistake.

So here's a scene, rendering on OSX, at an abysmal frame rate of 14 on a MacBook Pro. I've got the game paused so there isn't any time spent on updates; this is just drawing. If I move the camera to a different location, the frame rate is 126. That's a difference of 63 or so milliseconds per frame.

So after much debugging I determined that rendering animated models was causing the slowdown. The image of just trees doesn't have any deer or people moving around. And if I remove the people from my original test scene, the frame rate is over 100. Since rendering houses and trees differs only slightly from rendering animated models, I disabled the shader code that animates the models, and the frame rate went back up to normal.

So here's the basic code that handles animation in GLSL. It looks pretty standard and is simple code. This isn't the entire shader, just enough to get an idea of how the animation part works.

    vec3 SkinPosition(vec3 position, ivec4 index, vec4 weight, BoneConstants bones)
    {
        return ((bones.transforms[index.x] * vec4(position, 1.0)) * weight.x +
                (bones.transforms[index.y] * vec4(position, 1.0)) * weight.y +
                (bones.transforms[index.z] * vec4(position, 1.0)) * weight.z +
                (bones.transforms[index.w] * vec4(position, 1.0)) * weight.w).xyz;
    }

    vec3 position = SkinPosition(inputPosition, inputIndex, inputWeight, bc);
    gl_Position = (gc.worldToProjection * (tc.transform * vec4(position, 1.0)));

What this code does is transform the position of a vertex by up to four bones in the model's structure. It then weights the results by how much influence each bone has on the vertex.

I stared at this code for a while (more than a while actually), and after messing about a bit, it finally dawned on me what's wrong with it. To fix it, instead of calling a function to animate the models, I manually inlined the code. And my frame rate returned to normal, with animated characters.

    vec3 position = ((bc.transforms[inputIndex.x] * vec4(inputPosition, 1.0)) * inputWeight.x +
                     (bc.transforms[inputIndex.y] * vec4(inputPosition, 1.0)) * inputWeight.y +
                     (bc.transforms[inputIndex.z] * vec4(inputPosition, 1.0)) * inputWeight.z +
                     (bc.transforms[inputIndex.w] * vec4(inputPosition, 1.0)) * inputWeight.w).xyz;
    gl_Position = (gc.worldToProjection * (tc.transform * vec4(position, 1.0)));

There are two ways to pass parameters to a function. When you pass a parameter by value, a copy of the variable is made so that any changes to the variable in the function don't affect its value in the calling function. When you pass a parameter by reference, any modifications to the variable change it directly.

In my case with animation, the entire array of bone transformations is being copied, because it's being passed by value. My suspicion is that the program running on the GPU doesn't have enough registers to make this copy, so the GLSL compiler generates code that copies the array bit by bit and then runs that code over and over to evaluate the final result. What's just a few matrix multiplies, scaling, and adding becomes many, many copies and conditionals. This possibly results in different execution paths per GPU thread, causing even more slowdown.

My first attempt, before manually inlining this code, was actually to pass the array by reference, but the OpenGL compiler yelled at me that you can't pass a uniform by reference. On Windows and Linux, I suspect the compiler is smart enough to see that the function doesn't modify the array, and optimizes the copy away. (Or my GTX 980 and 290X are just too fast for me to notice the slowdown…)

Most people reference the global list of uniform bone transformations directly and never run into this issue. But since my custom shader language that generates GLSL doesn't have a concept of globals, everything a function needs is passed to it. I don't want to have to manually repeat code in shaders; that's just bad programming practice.

Luckily, I control the compiler for my own shading language, so I can get it to generate different code. So I just recently added an 'inline' keyword for functions. The code gets inlined automatically, and any value passed by reference isn't copied when the GLSL is generated.

Previously my skinning function looked (in SRSL, not GLSL) like this:

    inline float3 SkinPosition(float3 position, int4 index, float4 weight, BoneConstants bc)

Now it looks like this:

    inline float3 SkinPosition(float3 position, inout int4 index, inout float4 weight,
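For comparison, here is a rough sketch of the more common approach mentioned in this post, where the skinning function reads the uniform bone array directly instead of receiving it as a parameter. This is plain GLSL, not SRSL output, and every name in it (`BoneBlock`, `boneTransforms`, `MAX_BONES`, `modelTransform`, the attribute locations) is an assumption made for illustration rather than something taken from the real shader.

```glsl
#version 330 core

// Hypothetical names and sizes throughout; none of these come from the
// original shader.
const int MAX_BONES = 64;

layout(std140) uniform BoneBlock {
    mat4 boneTransforms[MAX_BONES];
};

uniform mat4 worldToProjection;
uniform mat4 modelTransform;

layout(location = 0) in vec3 inputPosition;
layout(location = 1) in ivec4 inputIndex;
layout(location = 2) in vec4 inputWeight;

// Only small values are passed as parameters. The bone array is read
// straight from the uniform block, so there is no array argument to copy.
vec3 SkinPosition(vec3 position, ivec4 index, vec4 weight)
{
    vec4 p = vec4(position, 1.0);
    return ((boneTransforms[index.x] * p) * weight.x +
            (boneTransforms[index.y] * p) * weight.y +
            (boneTransforms[index.z] * p) * weight.z +
            (boneTransforms[index.w] * p) * weight.w).xyz;
}

void main()
{
    vec3 position = SkinPosition(inputPosition, inputIndex, inputWeight);
    gl_Position = worldToProjection * (modelTransform * vec4(position, 1.0));
}
```

Because the function never takes the array as an argument, the by-value copy described above cannot occur, whatever the driver's optimizer does. The trade-off is that the function is tied to one particular global, which is exactly what a language without globals is trying to avoid.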