3000th commit - IrrlichtBAW (GIT repo, v 0.3.0-gamma1)

devsh · Post by **devsh** » Mon Nov 12, 2018 10:31 am

Nice stuff, but seems that your Blender Exporter or something hates your smoothing groups.

mant · Post by **mant** » Mon Nov 12, 2018 10:33 am

I haven't smoothed the exported model, will try.

devsh · Post by **devsh** » Mon Nov 19, 2018 7:37 pm

New version is out.

Stable-ish custom memory allocators, and a proper implementation of GPU memory streaming.

There will be a new version sometime this week that will touch up a few things with Resizable Address Allocators, but nothing major.

This is the last version before the asset-pipeline branch (which is actually already done: thread-safe texture and mesh loading) and the color branch (support for all Vulkan and OpenGL texture formats) are merged.

Also, a new extension has been contributed by manh for emulating the legacy draw3DLine (but with batching, for high performance with lots of lines), as well as a PR for OpenCL device creation (with interop with your GL contexts).

devsh · Post by **devsh** » Tue Jan 15, 2019 1:37 pm

Asset-pipeline and Vulkan format branches are getting merged soon.

devsh · Post by **devsh** » Tue Jan 22, 2019 10:04 pm

There's an example Bullet 3 integration being worked on by a 3rd party.

devsh · Post by **devsh** » Wed Jan 23, 2019 2:28 pm

Latest stable tag before merge:
https://github.com/buildaworldnet/Irrli ... .0-alpha11

The merge is happening:
https://github.com/buildaworldnet/IrrlichtBAW/pull/213

devsh · Post by **devsh** » Sat Jan 26, 2019 5:10 am

Threw out the old unsafe, global, static and imprecise timer.

robmar · Post by **robmar** » Tue Jan 29, 2019 8:27 am

I had a quick look at BAW. Is the Vulkan driver implemented? I couldn't find the driver, just a few structures named ...Vulkan.

devsh · Post by **devsh** » Tue Jan 29, 2019 10:19 am

Vulkan is not yet implemented; it requires the following projects to be completed first:
https://github.com/buildaworldnet/Irrli ... projects/7
https://github.com/buildaworldnet/Irrli ... projects/3

We have OpenGL that works like Vulkan (Persistently Mapped Buffers).

robmar · Post by **robmar** » Wed Jan 30, 2019 3:23 pm

Has anyone considered using Vulkan Ez, or Ez Vulkan API? It seems to offer all the functionality at a lower level but with faster start-up.
I guess there is considerable work to do restructuring for CG multitasking to make Vulkan really worthwhile.
Still, Ez Vulkan seems interesting.

devsh · Post by **devsh** » Wed Jan 30, 2019 4:30 pm

Yes, but we decided to fix the API problems first.

It's Irrlicht's (and my fork's) APIs that are preventing good performance and threading in any API.
One thing you see in all of the other ports/works on this forum is that they hide DX11, Vulkan, etc. behind the unchanged facade of Irrlicht 1.8.x or 1.9.x

The whole E_MATERIAL_TYPE system of shaders makes the use of compute shader pretty much impossible, this is why none of the above (despite 2 separate Vulkan ports) actually exposes compute shaders for you to use.

Also V-EZ has a whole host of structural problems that make it less reliable than dealing with Vulkan directly (that is, if you know what you are doing).

Finally V-EZ does not absolve you of needing to provide shaders in VK GLSL or SPIR-V. Whereas my ultimate goal right now is for IrrBAW to accept SPIR-V shaders as input and cross compile the SPIR-V into OpenGL Shaders, Metal Shaders or Vulkan Shaders as needed.

robmar · Post by **robmar** » Fri Feb 01, 2019 4:14 pm

I guess bGFX also has its problems, though it seemed like a good solution, i.e. going with a major driver project.

I wonder how much real benefit users will get with a full multitasking driver until we get GPU with multiple cores?

My problem, for my usage, is that the GPU gets overloaded with scenes with 5+ million polygons, and the calculations to determine occluded sub-meshes when handled on the CPU slows things down rather than helping.

I did see nVidia had a sample for occlusion detection running on the GPU, but again that can load the GPU so the loss + gain might produce little benefit.

Are you getting this ready for multi-core GPUs, or do you know that multitasking video drivers will really yield a good performance gain?

devsh · Post by **devsh** » Fri Feb 01, 2019 9:05 pm

Irrlicht would actually be pretty decent if implemented on top of bGFX and didn't wrap its APIs in old dx8/dx9 style interfaces and didn't hide the extra things that bGFX does.
However, I didn't go with bGFX in the first place because it did not (and probably still does not) support Vulkan; as far as I can remember, the GitHub issue says "now just have to fill-in the gaps with Vulkan". This is a major problem: if you read any of the presentations and resources about porting to Vulkan, you will see that hammering Vulkan into a framework/engine designed around wrapping DX11 will not be a success for performance, usability or maintenance. And I really wanted an API centered around Vulkan that is able to leverage the "precomputation"/"baking" that Vulkan requires.

> I wonder how much real benefit users will get with a full multitasking driver until we get GPU with multiple cores?

We already do, and have had them since the GeForce 8800. Are you familiar with the concept of SM/SMX, or Async Compute?
The question of C++ API parallelism is separate from the question of GPU parallelism (of which it has plenty, your draws actually execute out-of-order and simultaneously).
A small problem is when your GPU has to sync up and stop executing things asynchronously.

The real problem with Irrlicht is that everything that touches ISceneManager and IVideoDriver needs to be synchronised externally (mutexed) because of global state that is kept around (texture caches, states, etc.) instead of being per-thread or clearly limited. Also, the way OpenGL is designed and used by Irrlicht demands that all your IVideoDriver calls take place from the main thread. This has several disadvantages that you can probably work out for yourself.
Simplest example is that if you do everything from one thread, it will kick off clock boost which means higher power usage (Vulkan saves battery on mobile).

Also, your driver takes up the majority of the CPU time when rendering; if it weren't so, there would be no noise about AZDO and NV_command_list. The GPU simply cannot be fed raw data from something like glDraw: the driver needs to validate the parameters you use (to not explode your OS) as well as repackage the work into the GPU's internal command queue representation, and this takes up the majority of the time.
For example, when drawing thousands of objects I can gain 100% perf by using GPU Draw Indirect:
https://github.com/buildaworldnet/Irrli ... tVSCPUCull

Finally, in DX11 and OpenGL 4.3 you get those things called Persistently Mapped Buffers, which you can access simultaneously from the CPU and the GPU, and you can use the extra threads to upload and download data to the GPU. In fact, I've recently made an `std::vector` over GPU accessible memory.

The point is that you can always benefit from the same work

> My problem, for my usage, is that the GPU gets overloaded with scenes with 5+ million polygons, and the calculations to determine occluded sub-meshes when handled on the CPU slows things down rather than helping.

The sample I linked above is something you should run on your computer. I'm sure the desktop 1050 can draw many more triangles (BaW draws 32 million on a mobile 980M).

> I did see nVidia had a sample for occlusion detection running on the GPU, but again that can load the GPU so the loss + gain might produce little benefit.

We use that in BaW for water; it's one of the OpenGL 4.3 GPGPU SSBO Culling Samples. You could modify ex26 to do the same thing with Draw Indirect.

> Are you getting this ready for multi-core GPUs, or do you know that multitasking video drivers will really yield a good performance gain?

It already yields performance gains although not where you think it will.

robmar · Post by **robmar** » Wed Feb 06, 2019 11:22 am

It's a nightmare, so many things to consider, so many compromises!
I don't think much will help my model's rendering speed. I already have it loaded in GPU memory and just render the same stream each frame, but the 1050 mobile struggles with 5 million polys when rendered with floor, environmental reflections and shader shadows, which then creates 15 million primitives per frame, and the frame rate drops to around 25 FPS.
Culling the submeshes that are occluded would help, maybe give a 30% boost in FPS, but then there's the cost of checking occlusions, so maybe the gain drops a few FPS.
So does BAW Irrlicht have the animation system working with skinning done in the vertex shader, can I compile and test it?

devsh · Post by **devsh** » Wed Feb 06, 2019 6:52 pm

> So does BAW Irrlicht have the animation system working with skinning done in the vertex shader, can I compile and test it?

Yes, it's been working for over 2 years. Make sure you use MSVC 2017 toolset v141. Use CMake to generate the Visual Studio files, and make sure your INSTALL directory is the lib/Win64 (or similar).
Use vcpkg to install openssl.
Skinning example is number 07 or 08.
Beware: the examples build to the wrong directories (https://github.com/buildaworldnet/Irrli ... issues/195), so you'll have to manually copy the EXE files into the "bin" folder next to the relevant example's source.

> I don't think much will help my model's rendering speed. I already have it loaded in GPU memory and just render the same stream each frame, but the 1050 mobile struggles with 5 million polys when rendered with floor, environmental reflections and shader shadows, which then creates 15 million primitives per frame, and the frame rate drops to around 25 FPS.
> Culling the submeshes that are occluded would help, maybe give a 30% boost in FPS, but then there's the cost of checking occlusions, so maybe the gain drops a few FPS.

Well, packing your vertices into a struct tighter than S3DVertex (which is 30+ bytes) will definitely help you.

You're definitely not drawing your shadows the optimized way unless you're using a depth only FBO (and MSAA if using VSM).

Depending on how many submeshes you have, using Draw Indirect might give you a 100% FPS boost, and it can be combined with GPGPU culling (google "OpenGL 4.4 culling techniques"). You can test that with example 26.
