Strange problem with GLSL

arras
Posts: 1622
Joined: Mon Apr 05, 2004 8:35 am
Location: Slovakia
Contact:

Post by arras »

I ran into a strange problem, which I can demonstrate with this test shader:

vertex shader code (file "shader_v.glsl"):

Code: Select all

uniform int Lighting;

void main()
{
   vec4 color = vec4(0,0,0,1);

   int a = 0;
   int b = 5;

   if(Lighting == 0)
   {

      for(int i=0; i<5; i++)
      {
         color.r += 0.1;
      }

   }

   gl_FrontColor = color;

   gl_Position = ftransform();
}
fragment shader code (file "shader_f.glsl"):

Code: Select all

void main()
{
   gl_FragColor = gl_Color;
}
And finally, simple Irrlicht code that applies the shader to a sphere:

Code: Select all

#include <irrlicht.h>
using namespace irr;

int main()
{
   IrrlichtDevice *Device = createDevice(video::EDT_OPENGL,
      core::dimension2d<s32>(640, 480), 16, false, false, false, NULL);

   video::IVideoDriver* driver = Device->getVideoDriver();
   scene::ISceneManager* smgr = Device->getSceneManager();
   gui::IGUIEnvironment* guienv = Device->getGUIEnvironment();

   video::IGPUProgrammingServices *services = driver->getGPUProgrammingServices();

   // No shader callback is passed (0), so the "Lighting" uniform is never set by the application.
   s32 CMT_SHADER = services->addHighLevelShaderMaterialFromFiles(
      "shader_v.glsl", "main", video::EVST_VS_1_1,
      "shader_f.glsl", "main", video::EPST_PS_1_1,
      0, video::EMT_SOLID);

   scene::ISceneNode *node = smgr->addSphereSceneNode(5, 16);
   node->setMaterialType((video::E_MATERIAL_TYPE)CMT_SHADER);

   scene::ICameraSceneNode* camera = smgr->addCameraSceneNode(0, core::vector3df(0,0,-20));

   gui::IGUIStaticText *text = guienv->addStaticText(L"", core::rect<s32>(10,10,200,200));

   while(Device->run())
   {
      driver->beginScene(true, true, video::SColor(0,100,100,150));

      core::stringw tmp("FPS: ");
      tmp += driver->getFPS();
      text->setText(tmp.c_str());

      smgr->drawAll();
      guienv->drawAll();

      driver->endScene();
   }

   Device->drop();

   return 0;
}
It does nothing special; it's just for testing my problem, which seems somehow related to combining a uniform variable with if() and for(). The code runs fine like this.

However, if I replace the vertex shader code with:

Code: Select all

uniform int Lighting;

void main()
{
   vec4 color = vec4(0,0,0,1);

   int a = 0;
   int b = 5;

   if(Lighting == 0)
   {

      for(int i=0; i<b; i++)
      {
         color.r += 0.1;
      }

   }

   gl_FrontColor = color;

   gl_Position = ftransform();
}
FPS suddenly drops from 1900 to 44! That is something I don't understand. I spent some hours reading the GLSL docs and testing, but did not find any reason for such behavior.
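To put those two numbers on a linear scale, here is a minimal C++ sketch that converts frame rates to time per frame. It is not part of the original test project, the function names are made up for illustration, and it assumes nothing beyond the two FPS figures quoted above.

```cpp
// Convert a frame rate (frames per second) to the time spent on one frame,
// in milliseconds.
double frameTimeMs(double fps)
{
    return 1000.0 / fps;
}

// Extra time per frame, in milliseconds, when the frame rate falls
// from fpsBefore to fpsAfter.
double addedFrameCostMs(double fpsBefore, double fpsAfter)
{
    return frameTimeMs(fpsAfter) - frameTimeMs(fpsBefore);
}
```

By this arithmetic, going from 1900 FPS to 44 FPS means each frame takes roughly 22 ms longer, which seems far more than a five-iteration loop per vertex should cost on a low-polygon sphere.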

Here is another combination of code which causes the same FPS drop:

Code: Select all

uniform int Lighting;

void main()
{
   vec4 color = vec4(0,0,0,1);

   int a = Lighting;
   int b = 5;

   if(a == 0)
   {

      for(int i=0; i<b; i++)
      {
         color.r += 0.1;
      }

   }

   gl_FrontColor = color;

   gl_Position = ftransform();
}
???!

On the other hand, this code runs fine again:

Code: Select all

uniform int Lighting;

void main()
{
   vec4 color = vec4(0,0,0,1);

   int a = 0;
   int b = 5;

   if(a == 0)
   {

      for(int i=0; i<b; i++)
      {
         color.r += 0.1;
      }

   }

   gl_FrontColor = color;

   gl_Position = ftransform();
}
I don't see any reason for such huge FPS differences. Does anybody have an idea of what's going on? ...Or can somebody at least verify the same behavior on their machine?
xDan
Competition winner
Posts: 673
Joined: Thu Mar 30, 2006 1:23 pm
Location: UK
Contact:

Post by xDan »

Hmm, is your graphics card quite old?

With my old card, if I put over a certain amount of code in it (e.g. too many loop iterations, as the loops apparently get unrolled) then it can't handle it and goes into software mode... Which is indeed very slow.

Maybe assigning a uniform variable to "a" takes some more instructions than assigning a constant 0...

You could try decreasing the number of iterations of the loop, see if that helps.
Last edited by xDan on Mon Feb 25, 2008 1:18 pm, edited 1 time in total.
BlindSide
Admin
Posts: 2821
Joined: Thu Dec 08, 2005 9:09 am
Location: NZ!

Post by BlindSide »

Very simple: you are using a non-const value for the for loop, so the driver has to recompile the shader EVERY FRAME. That's the reason for your performance drop.

Shader compilers like to do things like loop unrolling to improve shader performance. That's why, when a uniform value is changed, it has to recompile the entire shader.

I can see that in the last example "b" works fine without recompiling. That is probably because, for some reason, the compiler decided not to use loop unrolling on that particular shader. Compilers most likely decide whether to do this based on many complex factors. The "if(a == 0)" will probably be removed by the compiler as it is redundant, so maybe it decided that, since this is a less complex shader than the others, it does not need such serious optimisations.

Look into driver docs to see if any companies provide an extension to explicitly disable loop-unrolling in shader compilation.
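As a rough illustration of what unrolling means, and of how it could be done on the application side when the iteration count is known before the shader is compiled, here is a small C++ sketch. The function `unrollLoop` and the `$i` placeholder are inventions for this example, not part of Irrlicht or GLSL, and whether hand-unrolled source avoids the slowdown on a particular driver is untested.

```cpp
#include <string>

// Returns `body` repeated `count` times, with every "$i" replaced by the
// literal iteration index (0, 1, 2, ...). "$i" is a placeholder made up
// for this sketch; it has no meaning in GLSL itself.
std::string unrollLoop(const std::string& body, int count)
{
    const std::string placeholder = "$i";
    std::string result;
    for (int i = 0; i < count; ++i)
    {
        std::string iteration = body;
        const std::string index = std::to_string(i);
        std::string::size_type pos = 0;
        // Substitute each occurrence of the placeholder in this copy.
        while ((pos = iteration.find(placeholder, pos)) != std::string::npos)
        {
            iteration.replace(pos, placeholder.size(), index);
            pos += index.size();
        }
        result += iteration;
    }
    return result;
}
```

For the test shader, `unrollLoop("   color.r += 0.1;\n", 5)` would produce five copies of the statement, which could be spliced into the shader source in place of the for() loop before the source is handed to the driver.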

Cheers
ShadowMapping for Irrlicht!: Get it here
Need help? Come on the IRC!: #irrlicht on irc://irc.freenode.net

Post by arras »

xDan >> Yes, the card is an ATI Mobility Radeon X700, which is not the newest model out there, but decreasing the number of for() loop iterations does nothing. Basically there is no difference whether b = 5 or 1 or even 0.

BlindSide >> To hell with it! This really complicates things, since what I need is to check whether Lighting is on for the material and then run the light color computation code as many times as there are lights in the scene. I have to look for some workaround...

What is really strange is that this code works fine:

Code: Select all

uniform int Lighting;

void main()
{
   vec4 color = vec4(0,0,0,1);

   for(int i=0; i<Lighting; i++)
   {
      color.r += 0.1;
   }

   gl_FrontColor = color;

   gl_Position = ftransform();
}
It's only when the for() loop is inside the if() statement...

Also, I noticed that putting some code into a function causes a small drop in speed ...which is also strange, since the compiler should just replace the function call with the code ...at least I think.

Post by arras »

Now this whole thing makes me think that the best approach would be to make different shaders for every state ...like Lighting on/off or different fog settings, and then run the one which is appropriate for the rendered material's settings. Which may of course result in a dozen or more shader files.
Luben
Posts: 568
Joined: Sun Oct 09, 2005 10:12 am
Location: #irrlicht @freenode

Post by Luben »

just put the shader functions in the same file but with different names, like Lighting_Main, NoLighting_Main. Related code in the same file, yatta!
If you don't have anything nice to say, don't say anything at all.

Post by arras »

OK, I just came across a bunch of articles and forum threads which deal with this problem. Here are two representative ones: http://www.gamedev.net/community/forums ... 1&#2287633
http://www.opengl.org/discussion_boards ... ber=234684
So the way to go is to make a different shader for each situation, as I already thought ...this will result in a really monstrous solution if I want my material to be able to mimic every situation (number and type of lights, fog with different settings). That's really bad...
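One way to keep the number of files down might be to generate the variants from a single source by prepending #define lines and guarding the optional code with #ifdef in the GLSL. The sketch below is only an illustration under that assumption: `ShaderFeatures`, `buildDefines`, `buildVariant` and the macro names USE_LIGHTING, LIGHT_COUNT and USE_FOG are hypothetical, not taken from the linked threads or from Irrlicht.

```cpp
#include <string>

// Hypothetical description of what one material needs from the shader.
struct ShaderFeatures
{
    bool lighting;    // run the lighting code at all?
    int  lightCount;  // number of lights, only used when lighting is on
    bool fog;         // apply fog?
};

// Builds the block of #define lines for one variant.
std::string buildDefines(const ShaderFeatures& f)
{
    std::string defines;
    if (f.lighting)
    {
        defines += "#define USE_LIGHTING\n";
        defines += "#define LIGHT_COUNT " + std::to_string(f.lightCount) + "\n";
    }
    if (f.fog)
        defines += "#define USE_FOG\n";
    return defines;
}

// Returns the complete source of one variant: defines first, then the
// shared shader body that tests them with #ifdef.
std::string buildVariant(const ShaderFeatures& f, const std::string& sharedSource)
{
    return buildDefines(f) + sharedSource;
}
```

The resulting string could then be handed to IGPUProgrammingServices::addHighLevelShaderMaterial(), which accepts shader source strings instead of file names. If the shared source starts with a #version directive, the defines would have to be inserted after that line rather than in front of it.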

Post by BlindSide »

That's OK, I have a solution for you. I'll post it in the code snippets forum soon.

Post by arras »

OK, I'll look at it. Meanwhile I did some tests with a fixed number of lights and found a really large drop in performance. I tested with 3 lights in the scene: one point, one spot and one directional. While the test ran at 104 FPS with EMT_SOLID, the shader crawled at 33 FPS. And that's still only the same stuff the fixed pipeline does, but without fog and texture. If I wanted to run some more advanced stuff like per-pixel lighting or bump mapping, it would slow rendering down to death. Not to mention running the full complement of 8 lights in the scene.
I found that the fixed pipeline handles several lights quite well. For example, it dropped only 4-6 FPS between rendering the scene with 1 and then 3 lights (in the above scene). How the hell does the fixed pipeline do it? ...Lighting calculations are probably among the more expensive ones.


I am really starting to think that rendering my tiles with ONE_TEXTURE_BLEND several times (that is rendering geometry in few passes) would be cheaper than using shaders!

Has anybody ever made a complex shader which can actually be used in something other than a more or less simple demo?

Post by BlindSide »

I'm interested to see what this shader of yours looks like.

Post by arras »

Well, I have tried to optimize it up and down, so it has changed a little bit, but here it is:
vertex shader

Code: Select all

uniform int LightType0, LightType1, LightType2;

void main()
{
   int LightType[3];

   LightType[0] = LightType0;
   LightType[1] = LightType1;
   LightType[2] = LightType2;

   vec3 position = vec3(gl_ModelViewMatrix * gl_Vertex);
   vec3 normal = normalize(gl_NormalMatrix * gl_Normal);
   vec3 cameraVector = normalize(-position);

   gl_FrontColor = vec4(0,0,0,1);

   for(int i=0; i<3; i++)
   {
      vec3 lightVector = gl_LightSource[i].position.xyz - position;
      float distance = length(lightVector);
      lightVector = normalize(lightVector);
      float diffusePower = max(0.0, dot(normal, lightVector));
      vec3 halfVector = normalize(lightVector + cameraVector);
      float specularPower = max( 0.0, dot(normal, halfVector) );
      specularPower = pow( specularPower, gl_FrontMaterial.shininess );

      vec4 color = gl_FrontLightProduct[i].ambient;

      // point light
      if(LightType[i] == 0)
      {
         color += gl_FrontLightProduct[i].diffuse * diffusePower;
         color += gl_FrontLightProduct[i].specular * specularPower;
      }

      // spot light
      if(LightType[i] == 1)
      {
         float spotEffect = dot( normalize(gl_LightSource[i].spotDirection), -lightVector );
         if(spotEffect > gl_LightSource[i].spotCosCutoff)
         {
            spotEffect = pow(spotEffect, gl_LightSource[i].spotExponent);
            color += gl_FrontLightProduct[i].diffuse * diffusePower * spotEffect;
            color += gl_FrontLightProduct[i].specular * specularPower * spotEffect;
         }
      }

      // directional light
      if(LightType[i] == 2)
      {
         lightVector = normalize(gl_LightSource[i].position.xyz);
         distance = 0.0; // float literal: GLSL 1.10 has no implicit int-to-float conversion

         diffusePower = max(0.0, dot(normal, lightVector));
         color += gl_FrontLightProduct[i].diffuse * diffusePower;

         halfVector = normalize(lightVector + cameraVector);
         specularPower = max( 0.0, dot(normal, halfVector) );
         specularPower = pow( specularPower, gl_FrontMaterial.shininess );
         color += gl_FrontLightProduct[i].specular * specularPower;
      }

      // attenuation
      float attenuation = 1.0 / (gl_LightSource[i].constantAttenuation +
         gl_LightSource[i].linearAttenuation * distance +
         gl_LightSource[i].quadraticAttenuation * distance * distance);

      color = color * attenuation;
      gl_FrontColor += color;
   }

   // final color
   gl_FrontColor += gl_FrontLightModelProduct.sceneColor;

   gl_Position = ftransform();
}
fragment shader

Code: Select all

void main()
{
   gl_FragColor = gl_Color;
}
Here is the whole test project (source code) if you are interested:
ShaderTester.zip
ebo
Posts: 38
Joined: Sun Feb 19, 2006 5:39 pm

Post by ebo »

Try removing ifs ...
Branches are really slow, as they cause shader threads to diverge (see CUDA docs)
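To illustrate what removing the ifs could look like, here is a small sketch written in C++ as a stand-in for GLSL, since the arithmetic is the same (GLSL has abs() and min() as built-ins). The function names are made up for this example, and it assumes the light type encoding from the shader above (0 = point, 1 = spot, 2 = directional). Whether this is actually faster on a given GPU would have to be measured, since all three contributions are still computed.

```cpp
#include <algorithm>
#include <cmath>

// Returns 1.0f when type == wanted and 0.0f otherwise, for whole-number
// light types, using arithmetic only (no branch).
float typeMask(float type, float wanted)
{
    return 1.0f - std::min(1.0f, std::fabs(type - wanted));
}

// Picks one of three already computed contributions by weighting each with
// its mask, instead of testing the light type with three if() statements.
float selectContribution(float lightType, float point, float spot, float directional)
{
    return point       * typeMask(lightType, 0.0f)
         + spot        * typeMask(lightType, 1.0f)
         + directional * typeMask(lightType, 2.0f);
}
```

In the shader, the same pattern would multiply each light type's diffuse and specular terms by its mask and add them all to the color unconditionally.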

Post by arras »

I know, all branches are executed and then one result is picked.