Should the external libraries such as libPNG be optimised?

Ulf
Posts: 281
Joined: Mon Jun 15, 2009 8:53 am
Location: Australia

Should the external libraries such as libPNG be optimised?

Post by Ulf »

The external C libraries are not optimised.

I don't know how much can be done, but for starters, I searched for every variable++ and changed it to ++variable.
Obviously not where the value of the expression was being used at the same time, because in that case we want the old value first and the increment afterwards.
I did the same for variable--.
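
To make the distinction above concrete, here is a minimal C sketch. The function name copy_bytes is made up for illustration and is not taken from libPNG. In the loop condition and the copy statement the old value of each post-increment or post-decrement is used, so swapping in the prefix form there would change behaviour. Only the stand-alone counter update could be written either way.

```c
#include <stddef.h>

/* Hypothetical helper, for illustration only: copies n bytes from src to dst
   and returns how many bytes were copied. */
size_t copy_bytes(unsigned char *dst, const unsigned char *src, size_t n)
{
    size_t copied = 0;
    while (n-- > 0) {        /* old value of n is tested: must stay postfix */
        *dst++ = *src++;     /* old pointer values are dereferenced: must stay postfix */
        ++copied;            /* result unused: copied++ would behave the same */
    }
    return copied;
}
```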

There may be other inefficiencies too. I'm not that advanced at optimising, but I can spot inefficient code.

Isn't it worth optimising?
I can hear birds chirping
:twisted:

I live in the Eye of Insanity.
BlindSide
Admin
Posts: 2821
Joined: Thu Dec 08, 2005 9:09 am
Location: NZ!

Post by BlindSide »

Well what kind of performance improvements did your optimizations bring? Let's talk numbers.
ShadowMapping for Irrlicht!: Get it here
Need help? Come on the IRC!: #irrlicht on irc://irc.freenode.net
Ulf
Posts: 281
Joined: Mon Jun 15, 2009 8:53 am
Location: Australia

Post by Ulf »

Hehe. :roll:

No, don't get me wrong.

I was just asking.. :shock:
Ulf
Posts: 281
Joined: Mon Jun 15, 2009 8:53 am
Location: Australia

Post by Ulf »

But if you are interested and want me to get some results, I need a bit of time. I have other priorities.

But I did think that something as simple as variable++ should all be changed to ++variable, because it's usually in loops. Extra operations every loop.

It definitely is more efficient, but I don't know how much difference it makes. Maybe it mostly matters when loading images etc., which is done at the start of application loading and so doesn't affect the running.
I just thought, it's easy to do, so why not?

As far as efficiency is concerned, if x is an int:

++x is equivalent to

Code: Select all

return x = x + 1;

x++ is equivalent to

Code: Select all

int i = x;
x = x + 1;
return i;
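
Written out as real C functions, the two snippets above might look like the sketch below. The names pre_increment and post_increment are invented here; the built-in operators are not actually function calls. The only difference is which value comes back, and that difference matters only when the caller uses it.

```c
/* Hypothetical stand-ins for the built-in operators, for illustration only. */

/* Behaves like ++(*x): increments, then yields the new value. */
int pre_increment(int *x)
{
    *x = *x + 1;
    return *x;
}

/* Behaves like (*x)++: increments, but yields the value from before. */
int post_increment(int *x)
{
    int old = *x;
    *x = *x + 1;
    return old;
}
```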
Ulf
Posts: 281
Joined: Mon Jun 15, 2009 8:53 am
Location: Australia

Post by Ulf »

Call me pedantic.

It is just obvious to me and it gets to me! haha.
Halifax
Posts: 1424
Joined: Sun Apr 29, 2007 10:40 pm
Location: $9D95

Post by Halifax »

Well I know where BlindSide is going as well, but let me just say something. Compilers are extremely good at optimization, and you also have to remember the definition. I am going to show you all the types of post-increment I found in the five files I searched:

Code: Select all

variable++; // case 1
for(...;...; variable++) // case 2
*ptr++ = assignment; // case 3
Now let's analyze this first case with GCC, and optimization at full power (-O3). Here are the two code cases.

case 1 (post-increment):

Code: Select all

int main()
{
    volatile int var = 2;
    var++;
    return var;
}
case 2 (pre-increment):

Code: Select all

int main()
{
    volatile int var = 2;
    ++var;
    return var;
}
Now you're probably asking yourself, "What's that volatile keyword there for?" Simple: it tells the compiler, "Hey, this variable can be read or changed in ways you can't see from this program, so every access has to actually happen and can't be optimized away!" Both of the code cases above produce this code with GCC 4 (I took the boilerplate code out):

Code: Select all

movl	$2, -4(%ebp)
movl	-4(%ebp), %eax
incl	%eax
movl	%eax, -4(%ebp)
movl	-4(%ebp), %eax
(Note: No optimization is taken on the 'volatile' variable which is why those last two lines were not reduced.)
At any rate, take this code:

Code: Select all

int main()
{
    int var = 2;
    var++;
    return var;
}
produces:

Code: Select all

movl	$3, %eax
With optimization, GCC constant-folds the variable: it works out the value at compile time, which takes out a significant amount of code.

When the result of the increment isn't used, GCC and MSVC generate the same code for pre-increment/pre-decrement and post-increment/post-decrement on built-in types like int. You don't need to explicitly type it out.

And to answer your question directly, even if the libraries aren't fully optimized, they don't really need to be. They're for image loaders, which I'm almost sure isn't the most intensive part of a user's program, and if it was, then that would be the time to profile and look for optimizations.
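
As a rough way to get the kind of numbers asked for earlier in the thread, one could time the code in question with the standard clock() function. This is only a sketch: time_call and dummy_work are hypothetical names, and dummy_work stands in for whatever you actually want to measure (for example an image load).

```c
#include <time.h>

/* Stand-in workload; replace with the code you want to measure. */
volatile unsigned long sink = 0;

void dummy_work(void)
{
    for (unsigned long i = 0; i < 100000UL; ++i) {
        sink = sink + i;   /* volatile access keeps the loop from being removed */
    }
}

/* Returns the CPU time, in seconds, used by calling fn() 'repeats' times. */
double time_call(void (*fn)(void), int repeats)
{
    clock_t start = clock();
    for (int i = 0; i < repeats; ++i) {
        fn();
    }
    clock_t end = clock();
    return (double)(end - start) / CLOCKS_PER_SEC;
}
```

Running time_call on a build with and without a change gives a number to talk about. A real profiler gives far more detail about where the time goes.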

EDIT:

You ninja'd me. :lol: Let me shorten the above just to make it a little more terse. It won't make a bit of difference. GCC/MSVC already optimizes it for you.
TheQuestion = 2B || !2B
hybrid
Admin
Posts: 14144
Joined: Wed Apr 19, 2006 9:20 pm
Location: Oldenburg(Oldb), Germany
Contact:

Post by hybrid »

Moreover, the libpng and zlib guys know their business. They'll have an eye on such things, and also provide optimized versions which you can simply choose instead of the standard ones we use. This will employ SSE optimizations and other stuff. And there you'll get a significant speedup. For the rest, just remember that it's a good idea to write pre-increment, but that it won't help in the case of loop increments as those are properly handled by all known compilers just as Halifax said.
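
As a small illustration of the loop case, the two hypothetical functions below differ only in the form of the increment. For a built-in type whose result is unused, optimizing compilers are expected to generate the same code for both. Compiling with gcc -O3 -S and comparing the assembly, as done earlier in the thread, is one way to check this on your own setup.

```c
#include <stddef.h>

/* Hypothetical example: sums n ints using a post-increment loop counter. */
long sum_post(const int *v, size_t n)
{
    long total = 0;
    for (size_t i = 0; i < n; i++) {
        total += v[i];
    }
    return total;
}

/* Same loop, but with a pre-increment loop counter. */
long sum_pre(const int *v, size_t n)
{
    long total = 0;
    for (size_t i = 0; i < n; ++i) {
        total += v[i];
    }
    return total;
}
```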
Ulf
Posts: 281
Joined: Mon Jun 15, 2009 8:53 am
Location: Australia

Post by Ulf »

Ok. Thanks for the comments BlindSide, Halifax, and hybrid.

So Halifax, loops are already optimised in all compilers? That is great.

Halifax, thanks for the excellent analysis.

Also,
Halifax wrote:And to answer your question directly, even if the libraries aren't fully optimized, they don't really need to be. They're for image loaders, which I'm almost sure isn't the most intensive part of a user's program,
Agreed. That is what I was thinking anyway. Though I'd still think it's worth optimising. Well, it won't get slower if you do!
Though, having said that, I'll also say that most of the changes I made were in loops, if I remember correctly. So, I guess it won't affect it.

hybrid,
hybrid wrote:just remember that it's a good idea to write pre-increment, but that it won't help in the case of loop increments as those are properly handled by all known compilers just as Halifax said
I agree, it's good to always use pre-increment anyway, rather than relying on the compiler.
hybrid wrote:Moreover, the libpng and zlib guys know their business.
I'm sure you're right. The code has been out for ages.
hybrid wrote:They'll have an eye on such things, and also provide optimized versions which you can simply choose instead of the standard ones we use. This will employ SSE optimizations and other stuff. And there you'll get a significant speedup.
Interesting. It may not be necessary as it's mostly for image loading etc as Halifax said, but it makes me wonder, what is SSE? I'll look it up later.

I'm not going to get stuck on this for small gains. I was just curious.

Thanks again for the comments.
Ulf
Posts: 281
Joined: Mon Jun 15, 2009 8:53 am
Location: Australia

SSE and optimising

Post by Ulf »

Halifax,

I looked up SSE. It's about intrinsic functions that can hold 128 bit values and do operations on larger data packets. Something like that?

It seems to be like unrolling loops. Actually, I think it is faster than loop unrolling. Well, I can't say much, I only skimmed to get an idea of what it is.

Does it only work on Intel, or AMD also?

Anyway, I'm not crazy. I know it's no use to me until I have a great game or something. 8)

But it looks interesting. Are there any tutorials that demonstrate how to incorporate it into code? I mean, with code example and showing how to set up the compiler to use it?

Have you used it? Do you know how? I don't want you to teach me, don't worry. Not yet anyway, hehe :twisted:
Ulf
Posts: 281
Joined: Mon Jun 15, 2009 8:53 am
Location: Australia

Post by Ulf »

Ulf wrote:Anyway, I'm not crazy.
Not really true. But I'm not stupid.
Halifax
Posts: 1424
Joined: Sun Apr 29, 2007 10:40 pm
Location: $9D95

Post by Halifax »

SSE is a pretty advanced topic. It will work on both AMD and Intel processors if you use the intrinsics. At any rate, the name stands for (S)treaming (S)IMD (E)xtensions, and SIMD stands for (S)ingle (I)nstruction (M)ultiple (D)ata.

Basically the difference is that your processor generally works in a scalar fashion, while SSE unlocks features that allow you to use it in a vector processor fashion. Meaning that you can issue 4 additions in 1 instruction instead of 4 additions in 4 instructions, i.e. it's quite an optimization if used correctly.
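
As a rough sketch of the "4 additions in 1 instruction" idea, the hypothetical function add4 below uses the SSE intrinsics _mm_loadu_ps, _mm_add_ps and _mm_storeu_ps from <xmmintrin.h>. The preprocessor check, the HAVE_SSE macro and the scalar fallback are additions made here so the example still compiles on targets without SSE.

```c
#include <stddef.h>

/* Assumption: these macros are how the compiler signals SSE support
   (__SSE__ on GCC/Clang, _M_X64 or _M_IX86_FP on MSVC). */
#if defined(__SSE__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 1)
#include <xmmintrin.h>
#define HAVE_SSE 1
#else
#define HAVE_SSE 0
#endif

/* Hypothetical helper: out[i] = a[i] + b[i] for i = 0..3. */
void add4(const float *a, const float *b, float *out)
{
#if HAVE_SSE
    __m128 va = _mm_loadu_ps(a);              /* load 4 floats (unaligned is fine) */
    __m128 vb = _mm_loadu_ps(b);
    _mm_storeu_ps(out, _mm_add_ps(va, vb));   /* one packed add for all 4 lanes */
#else
    for (size_t i = 0; i < 4; ++i) {          /* scalar fallback: 4 separate adds */
        out[i] = a[i] + b[i];
    }
#endif
}
```

With GCC on 32-bit x86 the -msse flag may be needed to enable the intrinsics; on x86-64 SSE is available by default.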

Not all code can be SIMD-ified though, so it's not good in all cases. And you probably shouldn't waste your time using SSE optimizations for libPNG, because I have heard that the speed increase is negligible.

If you want to learn more about SSE, check out the MSDN docs (for some reason, linking to the URL causes the forum to erase my post completely, so just google "SSE2 MSDN"). Knowing how to think in terms of vector processing is a great skill, and can come in handy when working with the SPUs in the PS3 (if you ever intend to get into the industry as an engine programmer).
Ulf
Posts: 281
Joined: Mon Jun 15, 2009 8:53 am
Location: Australia

Post by Ulf »

Thanks Halifax,
Halifax wrote:SSE unlocks features that allow you to use it in a vector processor fashion. Meaning that you can issue 4 additions in 1 instruction instead of 4 additions in 4 instructions, i.e. it's quite an optimization if used correctly.
Wow, sounds fascinating.. in a way, and dead boring in a mathematics genius way :lol:. I'm ok with that, I am good at Maths :wink:.

A programmer should be good at Maths? Yes I think so.
Halifax wrote:Knowing how to think in terms of vector processing is a great skill, and can come in handy when working with the SPUs in the PS3. (If you ever intend to get into the industry as an engine programmer.)
Ok. I will definitely look into it.. eventually. I'm not making a deadline for myself, I'm too kind to myself to put myself through those bullshit dreams like, tomorrow I will have SSE optimisation in my code.. hahahaha.

Hey, you seem to know a lot about these complex issues, or are you just well read on it?

Thanks again.
Halifax
Posts: 1424
Joined: Sun Apr 29, 2007 10:40 pm
Location: $9D95

Post by Halifax »

Ulf wrote:Hey, you seem to know a lot about these complex issues, or are you just well read on it?
Many years of programming and academic work. I don't really consider that complex though, to be honest, and you'll learn to not consider things complicated either, as you continue to program.
Ulf
Posts: 281
Joined: Mon Jun 15, 2009 8:53 am
Location: Australia

Post by Ulf »

Ok. yes, it may not be too complex in reality, but it looks very technical. That's what I meant.
Thanks for the advice.