Backstory:
Howdy, so I have a scene with a terrain, skybox, mesh (character), 3rd person camera, and a light. Nothing special, but I get 10-20 FPS. DirectX 9 and OpenGL give roughly the same results. My first thought was, oh man, I have some horrible code somewhere that is eating up all the CPU time. But after I did some quick profiling I noticed that my scene manager's drawAll() call was the one eating up most of the time in my main loop. That of course doesn't narrow it down much, but I don't have any custom scene nodes being rendered, so clearly something else is going on. So anyway, I was digging around in the Irrlicht code because I figured I would need a better idea of how it works to narrow this thing down. I started thinking about optimization and wondered if intrinsics had ever been considered.
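For anyone wanting to repeat the quick profiling step described above, here is a minimal sketch of timing one phase of a main loop. PhaseTimer is a hypothetical helper, not part of Irrlicht, and it assumes a C++11 compiler (newer than the MSVC versions listed in this post).

```cpp
#include <chrono>

// Hypothetical helper (not an Irrlicht class): accumulates the wall-clock
// time spent in one phase of the main loop across many frames.
struct PhaseTimer {
    double totalMs = 0.0;   // total time measured so far, in milliseconds
    unsigned calls = 0;     // number of measured invocations

    // Runs fn() once and adds its duration to the running total.
    template <typename Fn>
    void measure(Fn&& fn) {
        const auto start = std::chrono::steady_clock::now();
        fn();
        const auto stop = std::chrono::steady_clock::now();
        totalMs += std::chrono::duration<double, std::milli>(stop - start).count();
        ++calls;
    }

    // Average time per measured call, or 0 if nothing was measured yet.
    double averageMs() const { return calls ? totalMs / calls : 0.0; }
};

// Assumed usage inside a render loop, with smgr being the scene manager:
//   PhaseTimer drawTimer;
//   drawTimer.measure([&] { smgr->drawAll(); });
```

Wrapping each phase (scene drawing, GUI drawing, game logic) in its own timer and comparing the averages is one way to see which call dominates the frame.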
Actual Question:
So here is my question. Have intrinsics ever been considered for use in Irrlicht? For vector math and the like. Would they yield any benefit, or would it just be a waste of time?
My Stuff:
Irrlicht 1.6
MSVC (2003,2008) //tried on both no difference
AMD Phenom 9750 Quad-Core Processor 2.40 GHz
8 GB DDR2 (4 GB usable)
ATI Radeon HD 4850
Intrinsics?
-
- Admin
- Posts: 14143
- Joined: Wed Apr 19, 2006 9:20 pm
- Location: Oldenburg(Oldb), Germany
I think vector optimization is the last thing you have to worry about. Even though things are not really comparable, my 5-year-old laptop has far better performance in scenes like the one you described. Maybe post some numbers you get when starting the default examples from the Irrlicht SDK.
Regarding intrinsics: We have some of those in our math lib, but it's usually only relevant for really low-end systems, which require every bit of optimization. Most apps and systems will benefit more from default compiler optimization.
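As an illustration of leaving the work to the compiler, here is a minimal sketch of vector-style math written as a plain loop. scaleAdd is a hypothetical helper, not a function from the Irrlicht math lib. An optimizing compiler may turn a loop like this into SIMD instructions on its own; whether it actually does depends on the compiler version and flags.

```cpp
#include <cstddef>

// Hypothetical helper (not from Irrlicht): computes out[i] = a[i] * scale + b[i].
// No intrinsics are used; the loop is kept simple so the optimizer is free
// to unroll or vectorize it if the target and flags allow.
void scaleAdd(float* out, const float* a, const float* b,
              float scale, std::size_t count) {
    for (std::size_t i = 0; i < count; ++i) {
        out[i] = a[i] * scale + b[i];
    }
}
```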
huh?
Yeah, sorry my performance issue is limited to my test scene. I get great performance from the sample applications. 2000 fps on one of the simple ones. I wasn't trying to blame the engine for my performance issue I was just relaying how I came to my question. I'm pretty sure my issue is either a bug or just something weird with my scene.
Anyway on to the meat of the discussion. I take issue with one aspect of your response. I'm not very familiar with intrinsics but I know they allow you to use special processor instructions without inline assembly.
Quote: "Regarding intrinsics: We have some of those in our math lib, but it's usually only relevant for really low-end systems, which require every bit of optimization."
That statement sends up red flags like crazy in my admittedly inexperienced mind. One reason is that SSE, for example, wasn't added until the Pentium III. Later it was added to AMD processors. So if your target for intrinsics is "low-end systems", then depending on what you call low-end, they may not support it. Also, if using these new instructions isn't relevant on high-end systems, why do they keep adding new extensions (SSE3, SSE4, etc.)? My final objection: why would they even provide headers for this stuff if the compiler can optimize better?
I guess you could just be saying that it isn't needed on high-end systems because they perform fine without them, but that doesn't mesh with the compiler statement. I probably look like a jerk for asking for people's opinions and then arguing with the first one I get, but I'm having difficulty meshing what you're saying with what I have read.
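To make "special processor instructions without inline assembly" concrete, here is a minimal sketch using SSE intrinsics from xmmintrin.h. addFloats and the SKETCH_HAS_SSE macro are hypothetical names chosen for this example, not Irrlicht code. SSE support is decided at compile time here, and the code falls back to a scalar loop when SSE is not enabled; it does not check the CPU at run time.

```cpp
#include <cstddef>

// Compile-time check for SSE on GCC/Clang (__SSE__) and MSVC (_M_X64, _M_IX86_FP).
#if defined(__SSE__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 1)
#include <xmmintrin.h>
#define SKETCH_HAS_SSE 1
#else
#define SKETCH_HAS_SSE 0
#endif

// Hypothetical helper: out[i] = a[i] + b[i], four floats at a time where
// SSE is available, with a scalar loop for the remainder.
void addFloats(float* out, const float* a, const float* b, std::size_t count) {
    std::size_t i = 0;
#if SKETCH_HAS_SSE
    for (; i + 4 <= count; i += 4) {
        const __m128 va = _mm_loadu_ps(a + i);       // unaligned load of 4 floats
        const __m128 vb = _mm_loadu_ps(b + i);
        _mm_storeu_ps(out + i, _mm_add_ps(va, vb));  // add all 4 lanes, then store
    }
#endif
    // Scalar tail; this is also the whole loop when SSE is not enabled.
    for (; i < count; ++i) {
        out[i] = a[i] + b[i];
    }
}
```

Each intrinsic such as _mm_add_ps typically maps to a single SSE instruction, but the compiler still handles register allocation and scheduling, which is the main difference from inline assembly.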
I'm not random you just don't get the &.
-
- Admin
- Posts: 14143
- Joined: Wed Apr 19, 2006 9:20 pm
- Location: Oldenburg(Oldb), Germany
Well, there are many different ways to write such things. You can also use compiler-specific extensions, which later on compile down to code such as SSE, or even inline assembler. But unless there are really large SIMD-like areas in the code, it's not really that useful.
Regarding low-end hardware: yes, my netbook is also low-end hardware, with only SSE (2?) and a slow processor (Atom or another very low class). These systems could really benefit from manual tweaking. But for most others it's often not worth the effort. Not that I'm absolutely against such things, but we have other TODOs with higher priority.
alright
Alright, that makes sense. Thank you for explaining.