I'm afraid that if your PS2.0 card isn't "2.0a" or "2.0b" then this demo won't work for you. Standard 2.0 just doesn't have the instruction count. Then again, I was amazed I was able to fit the PS3.0 version into PS2.x at all.
Nice work. Shame I can't run it; I'm too lazy to get the newest DX, but I'll get it to see this.
I had a look at the asm shader in your exe and it has billions of compare commands. Maybe that's why the instruction-"lighter" PS2 version is slower: it has to do all the operations even if the compare tells it not to (no full dynamic branching in PS2).
I was reading a thesis by Timothy John Purcell on GPU raytracing (triangle structures), and it was interesting to see that octrees didn't actually speed things up on the GPU (since the octree code used "heavier" instructions than the rendering ones); he just used a uniform grid instead.
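For anyone wondering why a uniform grid can be cheap, here is a rough Python sketch of a cell lookup. It is not taken from the thesis; `grid_cell` and its parameters are hypothetical names made up for illustration. The point is only that finding a cell is a fixed handful of arithmetic operations per axis, with no tree descent.

```python
import math

def grid_cell(point, grid_min, cell_size, cells_per_axis):
    """Map a 3D point to integer cell indices in a uniform grid.

    Hypothetical helper: assumes a cubic grid starting at grid_min with
    cells_per_axis cells of side cell_size along each axis.
    """
    indices = []
    for p, lo in zip(point, grid_min):
        i = int(math.floor((p - lo) / cell_size))
        # Clamp so points outside the grid map to the nearest edge cell.
        indices.append(min(max(i, 0), cells_per_axis - 1))
    return tuple(indices)
```

An octree lookup would instead loop over tree levels, with a compare at each one, which is the kind of work that was awkward on hardware of that generation.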
Maybe all these compares are slowing things down? That's if they are sorting things into groups or something. So ungrouped rendering may be faster in high-density scenes (with so many groups you might have the same number of groups as primitives, lol).
I have to say the reflections are pretty good (and according to the framerates they are very fast). I would be interested in how you did them so fast.
"Irrlicht is obese"
If you want modern rendering techniques learn how to make them or go to the engine next door =p
omaremad wrote: Nice work. Shame I can't run it; I'm too lazy to get the newest DX, but I'll get it to see this.
I had a look at the asm shader in your exe and it has billions of compare commands. Maybe that's why the instruction-"lighter" PS2 version is slower: it has to do all the operations even if the compare tells it not to (no full dynamic branching in PS2).
Yeah, the PS2 version has to execute all the instructions even if the camera is pointing at nothing. The PS3 version has two "if"s that cut out lots of instructions.
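To make the difference concrete, here is a small Python sketch (not the actual shader). It counts how often a stand-in "expensive" shading step runs when the result is merely selected by a compare, versus when an `if` skips the work. All names (`hit_sphere`, `shade`, `ps2_style`, `ps3_style`, `run_demo`) and the toy scene are assumptions for illustration only.

```python
import math

def hit_sphere(origin, direction, centre, radius):
    """Distance along a normalised ray to the sphere, or None on a miss."""
    oc = [o - c for o, c in zip(origin, centre)]
    b = sum(d * x for d, x in zip(direction, oc))
    c = sum(x * x for x in oc) - radius * radius
    disc = b * b - c
    if disc < 0.0:
        return None
    t = -b - math.sqrt(disc)
    return t if t > 0.0 else None

def shade(t, counter):
    """Stand-in for the costly lighting code; counts how often it runs."""
    counter["shade_calls"] += 1
    return max(0.0, 1.0 - 0.1 * t)

def ps2_style(origin, direction, centre, radius, counter):
    # No dynamic branching: shading always runs, a compare picks the result.
    t = hit_sphere(origin, direction, centre, radius)
    lit = shade(t if t is not None else 0.0, counter)
    return lit if t is not None else 0.0

def ps3_style(origin, direction, centre, radius, counter):
    # Dynamic branching: a miss skips the shading work entirely.
    t = hit_sphere(origin, direction, centre, radius)
    if t is None:
        return 0.0
    return shade(t, counter)

def run_demo():
    origin, centre, radius = (0.0, 0.0, 0.0), (0.0, 0.0, 5.0), 1.0
    rays = [(0.0, 0.0, 1.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, -1.0)]
    c2, c3 = {"shade_calls": 0}, {"shade_calls": 0}
    out2 = [ps2_style(origin, d, centre, radius, c2) for d in rays]
    out3 = [ps3_style(origin, d, centre, radius, c3) for d in rays]
    return out2, c2["shade_calls"], out3, c3["shade_calls"]
```

Both styles return the same colours, but only one of the four rays hits the sphere, so the branching version runs the shading step once instead of four times.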
On my PS3 card the PS3 version is two to three times as fast as the PS2 version. I guess SM3 instructions are just faster on this card.
I just about managed to fit the shader into PS2.0a, but my SM3 card easily has enough instructions; it reports 4096, but I'd hate to see what the performance is like if I approach that (currently approx 327).
a_haouchar wrote: This is cool! Did you use Linux to code it to ps3 and ps2?
I'm not sure what Linux has to do with ps3 (Pixel Shader 3.0) and ps2 (Pixel Shader 2.0). Pixel shaders are pieces of code that run on graphics hardware to determine the colour of each fragment of a rasterised triangle.
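As a loose CPU-side analogy (a sketch, not how the hardware is actually programmed), a pixel shader behaves like a small function that is called once per fragment and returns a colour. The names `pixel_shader` and `render` are hypothetical, and the sketch pretends the triangle covers the whole image.

```python
def pixel_shader(x, y, width, height):
    """Return an (r, g, b) colour for one fragment."""
    u = x / (width - 1)   # 0.0 at the left edge, 1.0 at the right
    v = y / (height - 1)  # 0.0 at the top row, 1.0 at the bottom
    return (u, v, 0.5)

def render(width, height):
    """Call the shader once for every pixel, as the rasteriser would."""
    return [[pixel_shader(x, y, width, height) for x in range(width)]
            for y in range(height)]
```

On a real card the same idea runs in parallel across many fragments, and the function body is the compiled shader.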
On topic: Tested this on an Nvidia 7950 something (GT? I think). Gave around 300-450 as expected.
Good. Keep that card ready. My current raytracing shader is taking about a minute to compile and is 1223 ps3 instructions. Forget three spheres - I currently have 16 of them. All reflective and shadowed. My card (7800) supports 4096 instructions; I'll use as many of them as possible (until I run out of the 226 constant registers, probably). Shame I haven't yet got the shader compiler to utilise the SM3 aL loop register, though.
I'll run out of one of these first: constant registers (~100 spheres); ps3 instructions (4096); fps; or patience as I wait for the darn HLSL to compile.
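A back-of-envelope way to guess which limit bites first, using the figures quoted above (1223 instructions for 16 spheres, a 4096 instruction limit, 226 constant registers). This is only a sketch: it assumes the instruction count grows linearly with the number of spheres, which is unlikely to hold exactly, and it assumes two constant registers per sphere. The function names and the `reserved` parameter are hypothetical.

```python
def spheres_by_instructions(limit, measured_instructions, measured_spheres):
    """Max spheres if instruction cost scaled linearly (an assumption)."""
    per_sphere = measured_instructions / measured_spheres
    return int(limit // per_sphere)

def spheres_by_registers(total_registers, registers_per_sphere, reserved=0):
    """Max spheres if each one needs a fixed number of constant registers."""
    return (total_registers - reserved) // registers_per_sphere

# Figures from the post; registers_per_sphere=2 is an assumed layout
# (for example one register for centre and radius, one for colour).
by_instr = spheres_by_instructions(4096, 1223, 16)
by_regs = spheres_by_registers(226, 2)
```

Under these assumptions the instruction limit would be reached well before the register limit, but the real answer depends on how the compiler unrolls the shader.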
You will need a very fast SM3 card for this next demo.