Should the external libraries such as libPNG be optimised?
The external C libraries are not optimised.
I don't know how much can be done, but for starters, I searched for every variable++ and changed them to ++variable.
Obviously not when the variable was being referenced at the same time, as in this case we want to get the value and increment after.
I did the same for variable--.
There may be other inefficiencies. I'm not so advanced at optimising, but I can see inefficient code.
Isn't it worth optimising?
I can hear birds chirping

I live in the Eye of Insanity.
Well what kind of performance improvements did your optimizations bring? Let's talk numbers.
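For anyone who wants to put numbers on it, here is a rough sketch of how the two spellings could be timed with `std::chrono::steady_clock`. The names `sum_post`, `sum_pre`, `time_ms` and `compare_increments` are made up for this illustration, and no particular result is claimed: the timings depend entirely on compiler, flags and machine.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>

// Sum 0..n-1 with a post-incremented loop counter.
std::uint64_t sum_post(std::uint64_t n) {
    std::uint64_t total = 0;
    for (std::uint64_t i = 0; i < n; i++) total += i;
    return total;
}

// Same loop with a pre-incremented loop counter.
std::uint64_t sum_pre(std::uint64_t n) {
    std::uint64_t total = 0;
    for (std::uint64_t i = 0; i < n; ++i) total += i;
    return total;
}

// Hypothetical helper: run f(n) once and return the elapsed milliseconds.
// The result goes into a volatile sink so the call is not discarded.
template <typename F>
double time_ms(F f, std::uint64_t n) {
    volatile std::uint64_t sink = 0;
    const auto start = std::chrono::steady_clock::now();
    sink = f(n);
    const auto stop = std::chrono::steady_clock::now();
    (void)sink;
    return std::chrono::duration<double, std::milli>(stop - start).count();
}

// Usage sketch: both variants must agree on the result;
// only the two timings are of interest.
void compare_increments(std::uint64_t n, double& post_ms, double& pre_ms) {
    assert(sum_post(n) == sum_pre(n));
    post_ms = time_ms(sum_post, n);
    pre_ms = time_ms(sum_pre, n);
}
```

A single run like this is noisy, so it would make sense to repeat the measurement several times and compare optimized builds only.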
ShadowMapping for Irrlicht!: Get it here
Need help? Come on the IRC!: #irrlicht on irc://irc.freenode.net
But if you are interested and want me to get some results, I need a bit of time. I have other priorities.
But, I did think that something as simple as variable++ should all be changed to ++variable, because it's usually in loops. Extra operations every loop.
It definitely is more efficient, but I don't know how much difference it makes. Maybe it is mostly when loading images etc. Done at the start of application loading and so doesn't affect the running.
I just thought, it's easy to do, so why not?
As far as efficiency is concerned:
if x is an int
++x
```cpp
return x = x + 1;
```
x++
```cpp
int i = x;
x = x + 1;
return i;
```
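To make the difference in the returned value visible, the two snippets can be wrapped into functions. This is only a sketch: `pre_increment`, `post_increment` and `increment_demo` are names invented for the illustration, not anything from a library.

```cpp
#include <cassert>

// Mirrors ++x: increment first, then yield the new value.
int pre_increment(int& x) {
    x = x + 1;
    return x;
}

// Mirrors x++: remember the old value, increment, yield the old value.
int post_increment(int& x) {
    int old = x;
    x = x + 1;
    return old;
}

// Usage: both leave the variable at 6; only the returned value differs.
void increment_demo() {
    int a = 5;
    assert(pre_increment(a) == 6 && a == 6);
    int b = 5;
    assert(post_increment(b) == 5 && b == 6);
}
```

When the returned value is thrown away, as in a bare `x++;` statement, the temporary has no observable effect.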
Well I know where BlindSide is going as well, but let me just say something. Compilers are extremely good at optimization, and you also have to remember the definition. I am going to show you all the types of post-increment I found in the five files I searched:
```cpp
variable++; // case 1
for(...;...; variable++) // case 2
*ptr++ = assignment; // case 3
```
Now let's analyze this first case study with GCC, with optimization at full power (-O3). Here are the two versions of the code.
case 1 (post-increment):
```cpp
int main()
{
    volatile int var = 2;
    var++;
    return var;
}
```
case 2 (pre-increment):
```cpp
int main()
{
    volatile int var = 2;
    ++var;
    return var;
}
```
Now you're probably asking yourself, "What's that volatile keyword there for?" Simple: it tells the compiler, "Hey, this variable can be read or changed in ways you can't see from this program, so perform every access exactly as written and don't optimize any of them away!" Now both versions of the code above produce this code with GCC 4 (I took the boilerplate code out):
```asm
movl $2, -4(%ebp)
movl -4(%ebp), %eax
incl %eax
movl %eax, -4(%ebp)
movl -4(%ebp), %eax
```
(Note: No optimization is performed on the 'volatile' variable, which is why those last two lines were not reduced.)
At any rate, take this code:
```cpp
int main()
{
    int var = 2;
    var++;
    return var;
}
```
produces:
```asm
movl $3, %eax
```
With optimization, GCC constant-folds the variable and works out its value at compile time, taking out a significant amount of code.
GCC and MSVC can predict when they should use pre-increment/pre-decrement versus post-increment/post-decrement. You don't need to explicitly type it out.
And to answer your question directly, even if the libraries aren't fully optimized, they don't really need to be. They're for image loaders, which I'm almost sure isn't the most intensive part of a user's program, and if it was, then that would be the time to profile and look for optimizations.
EDIT:
You ninja'd me.
Let me shorten the above just to make it a little more terse. It won't make a bit of difference. GCC/MSVC already optimizes it for you.
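Cases 2 and 3 from the list of post-increment types were not disassembled in this post, so here is a hedged source-level sketch of them. In case 2 the value of the increment expression is discarded, so `i++` and `++i` describe the same loop. In case 3 the old pointer value is used for the store, so a version without post-increment has to split the store and the advance into two statements. All function names are invented for the illustration.

```cpp
#include <cassert>
#include <cstddef>

// Case 2: the value of the increment expression is discarded,
// so i++ and ++i express the same loop.
int sum_post_loop(const int* data, std::size_t n) {
    int total = 0;
    for (std::size_t i = 0; i < n; i++) total += data[i];
    return total;
}

int sum_pre_loop(const int* data, std::size_t n) {
    int total = 0;
    for (std::size_t i = 0; i < n; ++i) total += data[i];
    return total;
}

// Case 3: the old pointer value is used for the store,
// then the pointer advances.
void fill_post(int* ptr, std::size_t n, int value) {
    for (std::size_t i = 0; i < n; ++i) *ptr++ = value;
}

// Same effect without post-increment: store and advance
// become two separate statements.
void fill_split(int* ptr, std::size_t n, int value) {
    for (std::size_t i = 0; i < n; ++i) {
        *ptr = value;
        ++ptr;
    }
}

// Usage: the paired functions give identical results.
void cases_demo() {
    const int data[4] = {1, 2, 3, 4};
    assert(sum_post_loop(data, 4) == sum_pre_loop(data, 4));
    int a[3] = {0, 0, 0};
    int b[3] = {0, 0, 0};
    fill_post(a, 3, 7);
    fill_split(b, 3, 7);
    for (int i = 0; i < 3; ++i) assert(a[i] == 7 && b[i] == 7);
}
```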
TheQuestion = 2B || !2B
hybrid
Moreover, the libpng and zlib guys know their business. They'll have an eye on such things, and also provide optimized versions which you can simply choose instead of the standard ones we use. This will employ SSE optimizations and other stuff. And there you'll get a significant speedup. For the rest, just remember that it's a good idea to write pre-increment, but that it won't help in the case of loop increments, as those are properly handled by all known compilers, just as Halifax said.
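The reason pre-increment is a reasonable default habit shows up with class types rather than ints: the conventional post-increment operator returns the old value by copy. A minimal sketch, using a made-up `Counter` type that counts its own copy constructions; nothing here comes from Irrlicht, libpng or zlib.

```cpp
#include <cassert>

// Hypothetical type that records how often it is copy-constructed.
struct Counter {
    int value = 0;
    static int copies;  // number of copy constructions so far

    Counter() = default;
    Counter(const Counter& other) : value(other.value) { ++copies; }
    Counter& operator=(const Counter&) = default;

    // Pre-increment: modify in place, return a reference, no copy.
    Counter& operator++() {
        ++value;
        return *this;
    }

    // Post-increment: keep a copy of the old state and return it.
    Counter operator++(int) {
        Counter old(*this);
        ++value;
        return old;
    }
};

int Counter::copies = 0;

// Usage: only the post-increment call triggers a copy.
void counter_demo() {
    Counter c;
    const int before = Counter::copies;
    ++c;
    assert(Counter::copies == before);  // no copy made
    c++;
    assert(Counter::copies > before);   // at least one copy made
    assert(c.value == 2);
}
```

For a plain int, or a small type without side effects in its copy constructor, an optimizing compiler can normally drop the unused copy, so the habit mostly matters for heavier iterator-like types.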
Ok. Thanks for the comments Blindside, Halifax, and hybrid.
So Halifax, loops are already optimised in all compilers? That is great.
Halifax, thanks for the excellent analysis.
Also,
Halifax wrote: And to answer your question directly, even if the libraries aren't fully optimized, they don't really need to be. They're for image loaders, which I'm almost sure isn't the most intensive part of a user's program,
Agreed. That is what I was thinking anyway. Though I'd still think it's worth optimising. Well, it won't get slower if you do!
Though, having said that, I'll also say that most of the changes I made were in loops, if I remember correctly. So, I guess it won't affect it.
hybrid,
hybrid wrote: just remember that it's a good idea to write pre-increment, but that it won't help in the case of loop increments as those are properly handled by all known compilers just as Halifax said
I agree, it's good to always use pre-increment anyway, rather than relying on the compiler.
hybrid wrote: Moreover, the libpng and zlib guys know their business.
I'm sure you're right. The code has been out for ages.
hybrid wrote: They'll have an eye on such things, and also provide optimized versions which you can simply choose instead of the standard ones we use. This will employ SSE optimizations and other stuff. And there you'll get a significant speedup.
Interesting. It may not be necessary as it's mostly for image loading etc as Halifax said, but it makes me wonder, what is SSE? I'll look it up later.
I'm not going to get stuck on this for small gains. I was just curious.
Thanks again for the comments.
SSE and optimising
Halifax,
I looked up SSE. It's about intrinsic functions that can hold 128 bit values and do operations on larger data packets. Something like that?
It seems to be like unrolling loops. Actually I think it is faster than loop unrolling. Well, I can't say much, I only skimmed to get an idea of what it is.
Does it only work on Intel, or AMD also?
Anyway, I'm not crazy. I know it's no use to me until I have a great game or something.
But it looks interesting. Are there any tutorials that demonstrate how to incorporate it into code? I mean, with code example and showing how to set up the compiler to use it?
Have you used it? Do you know how? I don't want you to teach me, don't worry. Not yet anyway, hehe
SSE is a pretty advanced topic. It will work on both AMD and Intel processors if you use the intrinsics. At any rate, the name stands for (S)treaming (S)IMD (E)xtensions, and SIMD stands for (S)ingle (I)nstruction (M)ultiple (D)ata.
Basically the difference is that your processor generally works in a scalar fashion while SSE unlocks features that allow you to use it in a vector processor fashion. Meaning that you can issue 4 additions in 1 instruction instead of 4 additions in 4 instructions, so it's quite an optimization if used correctly.
Not all code can be SIMD-ified though, so it's not good in all cases. And you probably shouldn't waste your time using SSE optimizations for libPNG, because I have heard that the speed increase is negligible.
If you want to learn more about SSE, check out the MSDN docs: (for some reason, linking to the URL causes the forum to erase my post completely, so just google "SSE2 MSDN"). Knowing how to think in terms of vector processing is a great skill, and can come in handy when working with the SPUs in the PS3. (If you ever intend to get into the industry as an engine programmer.)
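As a small taste of the "4 additions in 1 instruction" idea, here is a hedged sketch using the SSE intrinsics from `<xmmintrin.h>` (`_mm_loadu_ps`, `_mm_add_ps`, `_mm_storeu_ps`). The function names `add4_sse` and `add4_scalar` are invented for the illustration, and the preprocessor check is an assumption about how to detect SSE support; on targets without it the function falls back to the scalar loop. With GCC on 32-bit x86 this typically needs `-msse`; on x86-64 SSE is enabled by default.

```cpp
#include <cassert>

#if defined(__SSE__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 1)
#include <xmmintrin.h>  // SSE intrinsics (x86 / x86-64 only)
#define DEMO_HAVE_SSE 1
#else
#define DEMO_HAVE_SSE 0
#endif

// Scalar reference: four separate additions.
void add4_scalar(const float* a, const float* b, float* out) {
    for (int i = 0; i < 4; ++i) out[i] = a[i] + b[i];
}

// Vector version: one _mm_add_ps adds four floats at once.
void add4_sse(const float* a, const float* b, float* out) {
#if DEMO_HAVE_SSE
    const __m128 va = _mm_loadu_ps(a);  // unaligned load of 4 floats
    const __m128 vb = _mm_loadu_ps(b);
    _mm_storeu_ps(out, _mm_add_ps(va, vb));
#else
    add4_scalar(a, b, out);             // fallback when SSE is unavailable
#endif
}

// Usage: both versions must produce the same four sums.
void add4_demo() {
    const float a[4] = {1.0f, 2.0f, 3.0f, 4.0f};
    const float b[4] = {10.0f, 20.0f, 30.0f, 40.0f};
    float v[4];
    float s[4];
    add4_sse(a, b, v);
    add4_scalar(a, b, s);
    for (int i = 0; i < 4; ++i) assert(v[i] == s[i]);
}
```

The unaligned load and store are used here so the sketch works on any float array; aligned variants exist but require 16-byte aligned data.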
Thanks Halifax,
Halifax wrote: SSE unlocks features that allow you to use it in a vector processor fashion. Meaning that you can issue 4 additions in 1 instruction instead of 4 additions in 4 instructions, so it's quite an optimization if used correctly.
Wow, sounds fascinating.. in a way, and dead boring in a mathematics genius way. I'm ok with that, I am good at Maths.
A programmer should be good at Maths? Yes I think so.
Halifax wrote: Knowing how to think in terms of vector processing is a great skill, and can come in handy when working with the SPUs in the PS3. (If you ever intend to get into the industry as an engine programmer.)
Ok. I will definitely look into it.. eventually. I'm not making a deadline for myself, I'm too kind to myself to put myself through those bullshit dreams like, tomorrow I will have SSE optimisation in my code.. hahahaha.
Hey, you seem to know a lot about these complex issues, or are you just well read on it?
Thanks again.
Ulf wrote: Hey, you seem to know a lot about these complex issues, or are you just well read on it?
Many years of programming and academic work. I don't really consider that complex though, to be honest, and you'll learn to not consider things complicated either, as you continue to program.