Understanding Shaders

kevinsbro · Post by **kevinsbro** » Fri Jan 13, 2012 9:19 pm

I'm trying to get a quick grasp of how exactly shaders work in Irrlicht.

Is there a difference between putting constants in the vertex shader vs the pixel shader?
Is it costly for the graphics hardware to have a lot of high-level shaders initialized,
or would it be better to try and combine them and use constants to flag certain aspects?

The reason for the first question is that I noticed XEffects passes constants to the second-pass shadow map vertex shader, which passes them on to the pixel shader through MVar : TEXCOORD1. Is there a reason for this technique?

Any help would be much appreciated!

Granyte · Post by **Granyte** » Sat Jan 14, 2012 2:02 am

Pixel and vertex shaders should be considered two different programs. If you set a constant on the vertex shader, it won't be available to the pixel shader, and vice versa.

I don't know the XEffects specifics, but generally TEXCOORD(your number here) is a way to pass custom data from the vertex shader to the pixel shader.
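
To make the TEXCOORD route concrete, here is a rough CPU-side model in Python (not shader code, and not taken from XEffects). It assumes the rasterizer blends the three vertex outputs of a triangle with per-pixel weights that sum to 1; the function name `interpolate` and the sample values are made up for illustration. Under that assumption, a value that is identical at all three vertices arrives at every pixel unchanged, which is what forwarding a constant through a TEXCOORD slot amounts to.

```python
def interpolate(v0, v1, v2, w0, w1, w2):
    """Blend three per-vertex values with the weights of one pixel.

    Assumption: w0 + w1 + w2 == 1, as for barycentric weights.
    """
    return v0 * w0 + v1 * w1 + v2 * w2


if __name__ == "__main__":
    weights = (0.25, 0.25, 0.5)  # one arbitrary pixel inside the triangle

    # A real per-vertex attribute varies from pixel to pixel.
    print(interpolate(0.0, 1.0, 2.0, *weights))

    # The same constant written at all three vertices comes out unchanged.
    print(interpolate(3.0, 3.0, 3.0, *weights))
```

The sketch says nothing about which route is faster; it only shows that both deliver the same value to the pixel shader.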

kevinsbro · Post by **kevinsbro** » Sat Jan 14, 2012 4:25 am

My confusion was that the constant declared in the vertex shader was being passed with no change to the pixel shader using "TEXCOORD(your number here)". I was just wondering if this is more efficient than simply declaring the same constant in the pixel shader?

Does the pixel shader get run the same number of times as the vertex shader?

Granyte · Post by **Granyte** » Sat Jan 14, 2012 5:14 am

If I understand correctly how it works, the vertex shader is run for every vertex in your mesh, whereas the pixel shader is run for every pixel the object covers on your screen.
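
The difference in run counts can be seen with a toy software rasterizer. This is a minimal Python sketch, not how Irrlicht or any GPU is implemented: `draw_triangle`, the 8x8 target and the inclusive edge test are all assumptions chosen to keep the example small.

```python
def edge(a, b, p):
    """Signed area test: >= 0 when p lies on the inner side of edge a->b."""
    return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])


def draw_triangle(verts, width, height):
    """Rasterize one triangle and count how often each 'shader' runs."""
    counts = {"vertex": 0, "pixel": 0}

    def vertex_shader(v):
        counts["vertex"] += 1  # runs once per vertex
        return v

    def pixel_shader(x, y):
        counts["pixel"] += 1   # runs once per covered pixel

    a, b, c = (vertex_shader(v) for v in verts)
    for y in range(height):
        for x in range(width):
            p = (x + 0.5, y + 0.5)  # sample at the pixel centre
            if edge(b, c, p) >= 0 and edge(c, a, p) >= 0 and edge(a, b, p) >= 0:
                pixel_shader(x, y)
    return counts


if __name__ == "__main__":
    # Three vertices, but half of an 8x8 target is covered.
    print(draw_triangle([(0, 0), (8, 0), (0, 8)], 8, 8))
```

With these inputs the vertex function runs 3 times and the pixel function 36 times; the pixel count grows with the covered screen area while the vertex count stays fixed.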

hendu · Post by **hendu** » Sat Jan 14, 2012 8:30 am

It's usually much better to have separate shaders than an if branch. If the shader compiler is smart and your branch takes a constant, the end result might still be two different shaders (or one that does both things, slower because of the if).

In OpenGL, when you send a uniform it's accessible in both the vertex and fragment shaders.
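
The two options from the original question (one combined shader steered by a constant flag versus separate shaders) can be modelled on the CPU. This is only a Python analogy under loose assumptions: `uber_shader`, `make_shader` and the tint effect are invented names, and plain functions stand in for compiled shaders.

```python
def uber_shader(base, tint, use_tint):
    """One combined 'shader': the flag is tested again for every pixel."""
    if use_tint:
        return base * tint
    return base


def make_shader(use_tint):
    """Resolve the flag once, up front, and return a specialised 'shader'
    that contains no per-pixel test at all."""
    if use_tint:
        return lambda base, tint: base * tint
    return lambda base, tint: base


if __name__ == "__main__":
    tinted = make_shader(True)
    plain = make_shader(False)
    # Both routes produce the same colours; they differ in where the
    # decision is made (per pixel vs once, before drawing).
    print(uber_shader(0.5, 0.5, True), tinted(0.5, 0.5))
    print(uber_shader(0.5, 0.5, False), plain(0.5, 0.5))
```

Which variant is faster on real hardware is not something this model can answer; that still has to be measured.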

Granyte · Post by **Granyte** » Sat Jan 14, 2012 9:26 am

Many people say that if statements are slow, but I have a shader with 8 possible branches and it can still run at over 350 fps on my machine ?.? Is it possible that certain hardware is less optimised for going around if branches than others?

REDDemon · Post by **REDDemon** » Sat Jan 14, 2012 11:17 am

Well, they are not slow. But on the GPU, doing 4 multiply-and-add (MAD) operations takes the same time as one "if" statement (I never tested whether this is really true), so theoretically, from a CPU point of view, it is a waste of computing resources. But I think that if branching allows you to skip more than 4 MAD operations, it can still increase the performance of a shader. And sometimes certain effects are only possible with some branching. The best way is to try different shaders and see which is faster.
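
That break-even argument can be written down as a back-of-the-envelope cost model. The Python sketch below takes the "one if costs about 4 MADs" figure from the post as an untested assumption, measures everything in MAD-equivalents, and ignores every other hardware effect; the function names are hypothetical.

```python
def expected_cost_with_branch(mads_in_branch, fraction_taken, branch_cost=4.0):
    """Average per-pixel cost when the work sits behind an if.

    fraction_taken: share of pixels that actually enter the branch (0..1).
    branch_cost: assumed price of the if itself, in MAD-equivalents.
    """
    return branch_cost + fraction_taken * mads_in_branch


def branch_pays_off(mads_in_branch, fraction_taken, branch_cost=4.0):
    """True when branching is cheaper than always doing the work."""
    cost = expected_cost_with_branch(mads_in_branch, fraction_taken, branch_cost)
    return cost < mads_in_branch


if __name__ == "__main__":
    print(branch_pays_off(3, 0.0))   # too little work to skip
    print(branch_pays_off(20, 0.5))  # half the pixels skip 20 MADs
    print(branch_pays_off(20, 1.0))  # nothing is ever skipped
```

Under these assumptions the if only helps when the work it skips, averaged over all pixels, exceeds its own cost.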

Radikalizm · Post by **Radikalizm** » Sat Jan 14, 2012 12:58 pm

REDDemon wrote: Well, they are not slow. But on the GPU, doing 4 multiply-and-add (MAD) operations takes the same time as one "if" statement (I never tested whether this is really true), so theoretically, from a CPU point of view, it is a waste of computing resources. But I think that if branching allows you to skip more than 4 MAD operations, it can still increase the performance of a shader. And sometimes certain effects are only possible with some branching. The best way is to try different shaders and see which is faster.

The thing about if-statements on a GPU is that a GPU can't do branch prediction, while a CPU can.
Branch prediction causes the CPU to cache the instructions it expects to be run before the branch is executed. I'm not sure how it does the prediction exactly (I have a book about it here somewhere, should look it up), but it does make branching a lot more efficient overall.
Since the GPU does not do any prediction, it has to fetch them after the result of the if-statement is known. Throw this in a pixel shader which has to execute millions of times each frame (try 1920x1080 at 60fps, you'll get a pretty astonishing number) and you have a lot of wasted cycles which could perfectly well be used for other purposes.

So I take it as a general rule of thumb to avoid branching in shaders as much as possible; I only use branches when there's absolutely no way around them.

EDIT: I do however believe that more modern GPUs have some form of branch prediction, but it's absolutely not comparable to the prediction done on the CPU.
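
For the 1920x1080 at 60fps figure, the arithmetic works out as follows. This small Python helper assumes every screen pixel is shaded exactly once per frame; the `overdraw` parameter is an added, hypothetical knob for scenes where pixels are shaded more than once.

```python
def pixel_invocations(width, height, fps=1, overdraw=1):
    """Pixel shader runs per frame (fps=1) or per second (fps=frame rate).

    Assumption: each pixel is shaded `overdraw` times per frame.
    """
    return width * height * fps * overdraw


if __name__ == "__main__":
    print(pixel_invocations(1920, 1080))      # per frame
    print(pixel_invocations(1920, 1080, 60))  # per second at 60 fps
```

That is 2,073,600 runs per frame and 124,416,000 per second, so even a few wasted cycles per pixel add up quickly.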

mongoose7 · Post by **mongoose7** » Sat Jan 14, 2012 1:44 pm

The reason a CPU does branch prediction is that its pipeline is so long. (The Pentium 4 was just stupid, but that is in the past now.) So it must prepare the instructions from one side of the branch or the other. (To do nothing means that, after the branch, the pipeline is empty.) So it is a good idea to predict which side of the branch has the higher probability. This is simply a matter of caching.

The moral of the story is that, for a branch to be expensive in a shader, the GPU must have a pipeline. Yes?

hendu · Post by **hendu** » Sat Jan 14, 2012 2:10 pm

Having a branch there can invalidate the hierarchical optimizations, and I remember reading how texture reads inside a branch were more costly too (it makes them dependent).

mongoose7 · Post by **mongoose7** » Sat Jan 14, 2012 2:21 pm

I think I hear you saying that, if there are no branches, the compiler can optimise the entire unit. If there are branches, it can't.

Granyte · Post by **Granyte** » Sat Jan 14, 2012 4:31 pm

I just thought about posting my code for the lulz after reading what you guys are saying.

Code:

map = tex2D( tex0, TexCoord);  // sample heightmap
if (map[2] >= 240.0f/255.0f)
{
    col2 = tex2D( tex5, TexCoord*scale );  // sample color map
    col = tex2D( tex4, TexCoord*scale );
    colfinal = lerp(col,col2,((map[2]-(240.0f/255.0f))*15.0f));
}
else if (map[2] >= 208.0f/255.0f && map[2] < 240.0f/255.0f)
{
    colfinal = tex2D( tex4, frac(TexCoord*scale) );
}
else if (map[2] >= 188.0f/255.0f && map[2] < 208.0f/255.0f)
{
    col2 = tex2D( tex4, TexCoord*scale );  // sample color map
    col = tex2D( tex3, TexCoord*scale );
    colfinal = lerp(col,col2,((map[2]-(188.0f/255.0f))*12.75f));
}
else if (map[2] >= 156.0f/255.0f && map[2] < 188.0f/255.0f)
{
    colfinal = tex2D( tex3, TexCoord*scale );
}
else if (map[2] >= 136.0f/255.0f && map[2] < 156.0f/255.0f)
{
    col2 = tex2D( tex3, TexCoord*scale );  // sample color map
    col = tex2D( tex2, TexCoord*scale );
    colfinal = lerp(col,col2,((map[2]-(136.0f/255.0f))*12.75f));
}
else if (map[2] >= 104.0f/255.0f && map[2] < 136.0f/255.0f)
{
    colfinal = tex2D( tex2, TexCoord*scale );
}
else if (map[2] >= 84.0f/255.0f && map[2] < 104.0f/255.0f)
{
    col2 = tex2D( tex2, TexCoord*scale );  // sample color map
    col = tex2D( tex1, TexCoord*scale );
    colfinal = lerp(col,col2,((map[2]-(84.0f/255.0f))*12.75f));
}
else
{
    colfinal = tex2D( tex1, TexCoord*scale );
}

But that explains why my other shader with a single branch works fine: both branches use the same data, and the only difference is how they are multiplied or added together.

ACE247 · Post by **ACE247** » Sat Jan 14, 2012 4:59 pm

I think this article talks quite a bit about branch prediction, or rather flow control, on GPUs.
http://http.developer.nvidia.com/GPUGem ... ter34.html

Granyte · Post by **Granyte** » Sat Jan 14, 2012 5:27 pm

GPU Gems 2 ......
NVIDIA 6000 series ........
2005

Do I need to remind you guys at which rate the technology evolves? XD

So if I get it right: if someone targets legacy hardware, don't use branches, but if someone targets more recent hardware, use them wisely.

hendu · Post by **hendu** » Sat Jan 14, 2012 5:27 pm

@Granyte

- you have two conditions per branch when you could have just one; use the fall-through of the else-if chain
- you're computing TexCoord*scale multiple times even though it doesn't change
- magic numbers
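
The three points above can be tried out on the CPU. The following Python model of the posted selection logic is only a sketch, not a replacement shader: `BANDS`, `select_layers`, `shade` and the `sample(texture_index, uv)` callback are hypothetical names, colours are reduced to single floats, the frac() from one branch is left out, and the thresholds and gains are copied exactly as posted.

```python
# (band start in 0..255 units, lower texture, upper texture, blend gain)
# Naming the values in one table replaces the magic numbers.
BANDS = [
    (240, 4, 5, 15.0),   # gain kept exactly as posted
    (208, 4, 4, 0.0),
    (188, 3, 4, 12.75),
    (156, 3, 3, 0.0),
    (136, 2, 3, 12.75),
    (104, 2, 2, 0.0),
    (84, 1, 2, 12.75),
]


def select_layers(height):
    """Return (lower texture, upper texture, blend weight) for a height in 0..1."""
    for start_255, tex_a, tex_b, gain in BANDS:
        start = start_255 / 255.0
        # Fall-through: all higher bands were already rejected, so one
        # comparison per band is enough (no upper-bound check needed).
        if height >= start:
            return tex_a, tex_b, (height - start) * gain
    return 1, 1, 0.0  # below the lowest band


def lerp(a, b, t):
    return a + (b - a) * t


def shade(height, uv, scale, sample):
    """Pick and blend layers; `sample` is a caller-supplied texture lookup."""
    scaled_uv = (uv[0] * scale, uv[1] * scale)  # computed once, reused below
    tex_a, tex_b, t = select_layers(height)
    colour = sample(tex_a, scaled_uv)
    if tex_a != tex_b:
        colour = lerp(colour, sample(tex_b, scaled_uv), t)
    return colour


if __name__ == "__main__":
    # Fake textures: texture N is a flat colour of value N.
    print(shade(94 / 255.0, (0.25, 0.5), 4.0, lambda index, uv: float(index)))
```

Because the bands are tested from highest to lowest, each one needs only its lower bound, which is the fall-through the first point refers to.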
