Irrlicht Occlusion Culling (Pre-alpha)
Introduction
About a week ago I got pretty froggy and decided to give occlusion culling a shot. After a little research and testing, I have implemented simple hardware-based bounding box occlusion culling. I am getting some pretty good numbers with this simple demo, but I hope to improve it even more, as I discuss below. Right now I'm releasing this and opening up this thread to show the progress. Currently this only works with OpenGL; more on that below. My plan is to have a fully functioning, configurable occlusion culling system built into Irrlicht.
Download
Download Link
Results
The main thing this occlusion culling system is good for right now is culling non-visible objects and overdraw that isn't already frustum culled. At any rate, here are my results on an ATI Radeon Xpress 200M:
58 FPS with occlusion enabled; 1,1,15 and 14 occluded; 400 polys
17 FPS with occlusion enabled; 1,1,15 and 10 occluded; 400 polys
6 FPS with occlusion enabled; 1,1,15 and 0 occluded; 400 polys
6 FPS with occlusion disabled; 1,1,15 and 0 occluded; 400 polys
Now mind you, I was in the same position (start) with occlusion culling enabled and disabled. So as you can see, there is an increase of about 52 FPS (from 6 to 58) on my crappy card. At any rate, here are some screenshots:
Improvements/To-Do/Checklist
- First off, I would like to get this ported over to DX9, which shouldn't be a problem and will be done by the time I fully release this with patches and everything. I don't know about the software renderers, but I could always give it a try just for academic purposes.
- There is one major improvement that will be first on my list: rendering the occlusion pass to a texture instead. As we all probably know, the main thing to deal with is fill-rate limitations, so it would be nice to expose some methods to the user so they can set the render texture size based on their needs.
- Bounding box occlusion is good for simple things, but gets kind of out of hand once you have really big object bounding boxes where the actual object is mostly empty space. So my intention is for a user to be able to specify a bounding mesh of their liking instead. That means they could model a mesh, then model a bounding mesh, and attach that for occlusion culling.
- Improving the occlusion culling pipeline wouldn't be too hard, and would probably yield better results once you have tons of objects in a scene. Basically you would just issue the queries early in the pipeline, which is definitely doable; I just haven't tried it yet.
- Make a better demo.
If anyone else has improvements they would like to mention, feel free, and I will look into them.
Thanks
Thanks go to BlindSide, and hybrid too.
TheQuestion = 2B || !2B
Halifax wrote: Thanks goes to BlindSide, and hybrid too.
Umm, what for? I don't remember being of any help.
In any case, great work, I'm going to try this out now on MY ATI 200M.
OK, I tried it. It works great, but there are some cases (even with the polygon count set to 64) where slightly visible things are being culled. I'm thinking the bounding box should assume the worst case and render some things that are not visible, rather than not render things that are supposed to be visible. (I guess an easy fix for non-arbitrary shaped objects would be to just make the bounding box bigger.)
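To make the "bigger box" idea concrete, here is a minimal C++ sketch. It is only an illustration: `Box` and `inflate` are hypothetical names, not taken from Halifax's code or from Irrlicht (whose own box type is `core::aabbox3df`), and the margin is an assumed tuning value. The idea is to pad the box that gets submitted to the occlusion query, so borderline objects err toward being drawn.

```cpp
#include <cassert>

// Hypothetical minimal axis-aligned bounding box, for illustration only.
struct Box {
    float minX, minY, minZ;
    float maxX, maxY, maxZ;
};

// Grow the box by `margin` world units on every side. Testing the padded
// box against the depth buffer is more conservative: a barely visible
// object is more likely to produce visible fragments and so get rendered.
Box inflate(const Box& b, float margin) {
    assert(margin >= 0.0f); // shrinking would make false culls worse
    return Box{b.minX - margin, b.minY - margin, b.minZ - margin,
               b.maxX + margin, b.maxY + margin, b.maxZ + margin};
}
```

The trade-off is that a larger margin means fewer objects get culled, so it would probably need tuning per scene.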
I agree a nicer demo would do great; just pop open IrrEdit and litter the castle level with high-poly furniture.
Anyway, I agree with christian: this kind of feature is essential for Irrlicht. A lot of people simply do what I just mentioned, and then complain about performance problems when millions of unseen polys get rendered because of the lack of a proper culling algorithm.
Oh yeah, you should try this with something more fill-rate intensive, e.g. using large textures on the spheres, with some normal mapping or something. That might show an even better performance improvement when things get occluded.
Cheers
ShadowMapping for Irrlicht!: Get it here
Need help? Come on the IRC!: #irrlicht on irc://irc.freenode.net
Yeah, you helped me on the channel with the NULL material.
But yes, I will get a better demo for the next release, and I am working on the RTT part tomorrow, seeing as my time is abnormally free.
Oh yeah, one thing that I neglected to mention is a poor design decision I made when performing the occlusion queries. In the worst case, where the fragment count wasn't available, I skipped rendering the scene node. This is fixed now: when the query result is not available, it just renders the object. I don't think that will happen often, though, because I have moved the occlusion culling pass up in the pipeline, and that by itself seemed to remove the problem on my system.
One thing I have to stress even more, if occlusion culling were to be included in the Irrlicht engine, is the use of scene nodes. Anyone trying to take advantage of this occlusion culling is best off splitting their levels into multiple scene nodes. If you only have one scene node, you won't get any benefit.
Oh yeah, another thing I also forgot about is the fact that solid objects can cull transparent objects, but not vice versa. So that will be included next time as well.
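The fallback and the solid/transparent rule described above could be sketched roughly like this. It is a minimal C++ sketch under assumptions, not the actual patch: `QueryResult`, `shouldDraw`, and `canOcclude` are hypothetical names, and the fragment threshold is an assumed tuning parameter.

```cpp
#include <cassert>

// Hypothetical result of one hardware occlusion query for a scene node.
struct QueryResult {
    bool available;     // false if the driver has not produced a result yet
    unsigned fragments; // bounding-box fragments that passed the depth test
};

// Decide whether to draw the real node. If the query result is missing,
// draw anyway: a wasted draw is better than a visible object vanishing.
bool shouldDraw(const QueryResult& q, unsigned minVisibleFragments = 1) {
    assert(minVisibleFragments >= 1);
    if (!q.available) {
        return true; // worst case: no result, so render the object
    }
    return q.fragments >= minVisibleFragments;
}

// Solid nodes may hide other nodes; transparent ones may not, because
// whatever sits behind them can still be seen.
bool canOcclude(bool isTransparent) {
    return !isTransparent;
}
```

With this shape, transparent nodes would still be tested as occludees with `shouldDraw`, they would just never be treated as occluders.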
Very nice, Halifax!
For the problem of bounding boxes/spheres which may not give a correct occlusion result, I was thinking maybe we could invent a new term: the inverted box/sphere, i.e. the maximal box/sphere that fits within the object. That way there would be no such errors.
Edit: there is one small issue. I get a very low frame rate with 125 spheres * 30 polys = 3750 polys: about 13 FPS, which is really weird.
If I'm reading your sentence correctly, for the inverted bounding box, it appears you are suggesting calculating a bounding box that is contained by the scene node? I don't really understand, and I doubt that would help much, if that is what you're saying.
By the way, a better description of your FPS problem would help me. What is the FPS when occlusion culling is off, when all are occluded, when none are occluded, when the poly count is increased, when it is decreased, etc. Also, what is your graphics card, because possibly it doesn't even support hardware occlusion queries. And if it doesn't support it, then it would probably fall back to software.
By the way, there will be a flag exposed in scene nodes to eliminate it from occlusion culling if the user chooses.
Halifax wrote: If I'm reading your sentence correctly, for the inverted bounding box, then it appears you are suggesting that you calculate a bounding box that is contained by the scene node?
Yes, what I meant is something like a maximal BOUNDED (not bounding) box. This way the culling would be done against something smaller than the actual mesh, so there would be no errors resulting from boxes larger than the mesh, which could lead to culling meshes that would be visible.
About the FPS: with culling OFF I get 8 FPS with 5 spheres per axis (125 in total) at 30 polys each (3750 polys in total). With culling on it is slightly better (about 13). The card is an NVIDIA 6600. It just seems weird to me to get 8 FPS with only 3750 polys.
Hi! Very cool thing, keep it up!
Here are my test results with 10x10x10 spheres (10 polygons each), OpenGL, ATI Radeon HD 2600 Pro:
culling on: 19-22 FPS
culling off: 61-62 FPS
As you can see, currently without the culling it's faster for me, but I hope you'll make progress on this.
Thumbs up!
Cheers,
PI
@Mirror: That's quite weird; maybe it's still a problem in my SVN version. I guess I'll update and compile against the latest SVN next time I release a demo.
@PI: How many objects are being occluded? There are so many possibilities with the occlusion culling that could lead to that, but there's also the fact that it's 1,000 scene nodes in the graph. And last I remember, Irrlicht wasn't too good with many scene nodes.
It could also be one of the common pitfalls with my process, as I explained before. That's already fixed. And by the way, how many objects are occluded when you get that lower 19-22 FPS?
Halifax wrote: And by the way, how many objects are occluded when you get that lower 19-22 FPS?
Well, I've run it again: still about 19 FPS, and there were 760-790 objects occluded.
Yeah it could be the method itself, or maybe it's because you don't use VBOs? Have you tried occlusion culling with them?
Anyway, keep it up, it will be really great!
Cheers,
PI
I did 10x10x10 spheres (20 polygons each) on an NVIDIA 8800 GT, with 500 occluded:
DirectX 9.0c:
culling on: 280-285 FPS
culling off: 275-280 FPS
OpenGL:
culling on: 40-50 FPS
culling off: 35-45 FPS
I got very strange differences...
Compete or join in irrlichts monthly screenshot competition!
Blog/site: http://rex.4vandorp.eu
Company: http://www.islandworks.eu/, InCourse
Twitter: @strong99