3000th commit - IrrlichtBAW (GIT repo, v 0.3.0-gamma1)
Re: To The Rescue of Your FPS - BAW Irrlicht (GIT repo, ver
"target specific option mismatch" means that you likely did not enable the required SSE level.
Re: To The Rescue of Your FPS - BAW Irrlicht (GIT repo, ver
The SHA256 stuff is not our fault, maybe GCC 6.3 is indeed too recent
the Makefile doesn't work except for the BAW_SERVER target or something like that, we only build from the codeblocks project for Linux and Visual Studio for Windows
Re: To The Rescue of Your FPS - BAW Irrlicht (GIT repo, ver
Added non-recursive versions of updateAbsolutePosition and needsAbsolutePositionUpdate which don't run in O(n^2) where n is the depth of the SceneNode hierarchy
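The engine's actual implementation is not shown in this thread, so the following is only a minimal sketch of one way to make the update run in O(n) over the depth. It assumes a simplified, hypothetical `Node` whose transform is a single float offset, and it uses per-node version counters (my choice, not necessarily what the commit does) so that siblings still notice when a shared ancestor has moved.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, simplified scene node: a 1D offset stands in for a full matrix.
struct Node {
    Node* parent = nullptr;
    float relative = 0.f;            // position relative to the parent
    float absolute = 0.f;            // cached world position
    bool dirty = true;               // set when `relative` has been changed
    unsigned version = 0;            // bumped every time `absolute` is recomputed
    unsigned parentVersionSeen = 0;  // parent's version at our last recompute
};

// One walk up to collect the ancestor chain, one walk down to refresh it.
// Every ancestor is visited exactly twice, so the cost is linear in the depth.
inline void updateAbsolutePosition(Node* node) {
    std::vector<Node*> chain;
    for (Node* n = node; n != nullptr; n = n->parent)
        chain.push_back(n);

    // Iterate from the root (back of the chain) down to `node` (front).
    for (std::size_t i = chain.size(); i-- > 0;) {
        Node* n = chain[i];
        const bool parentMoved =
            n->parent != nullptr && n->parent->version != n->parentVersionSeen;
        if (!n->dirty && !parentMoved)
            continue;  // cached value is still valid
        n->absolute = (n->parent ? n->parent->absolute : 0.f) + n->relative;
        if (n->parent)
            n->parentVersionSeen = n->parent->version;
        n->dirty = false;
        ++n->version;
    }
}
```

Changing `relative` on any node and setting its `dirty` flag is enough; descendants pick the change up the next time they are updated because their stored parent version no longer matches.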
Re: To The Rescue of Your FPS - BAW Irrlicht (GIT repo, ver
Abandoned the above idea
Carrying out first tests of Hardware Skinning
Current version should be fully OpenGL core compatible
Re: To The Rescue of Your FPS - BAW Irrlicht (GIT repo, ver
The repository has moved to a new address
0.2 is being committed
For the 0.3 Release:
1) Create the OpenGL context in Core Profile
2) Up the minimum OpenGL version to 4.0 (Sandy Bridge Intel, Fermi Nvidia, Radeon HD5000)
3) Abandon the Linux, Mac OS X and possibly Windows devices in favour of SDL2
4) remove all s8,u8,s16,u16,s32,u32,s64,u64 and f32 and replace with float and types from stdint.h
5) CPU culling workarounds for small instance counts to avoid GPU idling
6) Bounding Box culling to avoid animating/boning meshes which are guaranteed to be off-screen
7) Better CPU to GPU Mesh conversion modes (making sure vertex attributes are interleaved)
8) Native irrlicht mesh format save and load + encryption (index and attribute buffers)
9) InstancedSkinnedMeshSceneNode with LoDs (together with a re-skin function for cpu meshes to reduce the number of bone weights per vertex)
10) 1D Textures
11) Compute Shaders
12) Quantization optimization post-load for CPU mesh vertex attributes
13) MultiDraw{Indirect} and some other Vulkan like features
14) My own render state update system, as some render commands/materials don't need to check and update every render state
15) Assimp for model format import/export which will load textures without putting them in GPU memory, this will enable multi-threading of mesh loading
Sometime Later:
A) Atomic Reference Counting
B) STL vector and list used in place of irrlicht's containers
C) C++11 mutexes and maybe some other stuff
D) GPU Boning
E) SSE3 SIMD Dual Quaternion Class
F) Dual Quaternion Skinning
G) SSE3 SIMD classes used for all 2D/3D math
H) Multisample renderbuffers and textures
Version 1.0 Roadmap:
1) AVX/AVX2 versions of all SIMD stuff with separate library builds
2) Vulkan Renderer
3) Android builds
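To illustrate what item 7 of the 0.3 list means by interleaved vertex attributes, here is a minimal sketch that packs three separate attribute streams into one interleaved stream. The `Vertex` layout and the `interleave` function are hypothetical and are not taken from the engine; its real conversion modes may use different formats.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical interleaved layout: 8 floats (32 bytes) per vertex.
struct Vertex {
    float position[3];
    float normal[3];
    float uv[2];
};

// Pack three separate (planar) attribute streams into one interleaved stream.
// Returns an empty vector if the streams do not describe the same vertex count.
inline std::vector<Vertex> interleave(const std::vector<float>& positions,
                                      const std::vector<float>& normals,
                                      const std::vector<float>& uvs) {
    const std::size_t count = positions.size() / 3;
    if (positions.size() != count * 3 || normals.size() != count * 3 ||
        uvs.size() != count * 2)
        return std::vector<Vertex>();

    std::vector<Vertex> out(count);
    for (std::size_t i = 0; i < count; ++i) {
        for (int c = 0; c < 3; ++c) {
            out[i].position[c] = positions[3 * i + c];
            out[i].normal[c] = normals[3 * i + c];
        }
        for (int c = 0; c < 2; ++c)
            out[i].uv[c] = uvs[2 * i + c];
    }
    return out;
}
```

With this layout all attributes of one vertex sit next to each other in memory, so a single buffer with a 32-byte stride can feed every attribute.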
Re: To The Rescue of Your FPS - BAW Irrlicht (GIT repo, ver
I'm guessing my gcc version is too recent for irr 1.8.3. Latest Irrlicht trunk revision compiles fine - I will try to look into this later.
We (with bkeys) found out that it was due to GCC 6 using -std=gnu++14 as its default dialect, so you need to set -std=c++03 as an extra CFLAGS.
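A quick way to confirm which dialect a file is really being compiled under is the standard `__cplusplus` macro. The helper names below are hypothetical and only meant as a small diagnostic sketch, not something that exists in the repo.

```cpp
#include <string>

// Hypothetical helper: maps a __cplusplus value to a readable dialect name.
inline std::string dialectName(long value) {
    if (value >= 201402L) return "C++14 or newer";
    if (value >= 201103L) return "C++11";
    if (value >= 199711L) return "C++98/03";
    return "pre-standard";
}

// Dialect this translation unit was actually compiled under.
inline std::string currentDialect() {
    return dialectName(__cplusplus);
}
```

Printing `currentDialect()` from a test program built with the same flags as the engine should show the difference between a plain GCC 6 build and one with -std=c++03 added.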
Re: To The Rescue of Your FPS - BAW Irrlicht (GIT repo, ver
I threw in an OpenCL device manager for teh lulz today
Re: To The Rescue of Your FPS - BAW Irrlicht (GIT repo, ver
devsh wrote: I threw in an OpenCL device manager for teh lulz today
Sounds awesome! Maybe we can hang out on IRC at some point and tap up an example/documentation for this.
devsh wrote: I'm guessing my gcc version is too recent for irr 1.8.3. Latest Irrlicht trunk revision compiles fine - I will try to look into this later.
devsh wrote: We (with bkeys) found out that it was due to GCC 6 using std=gnu++14 as default dialect, so you need to set the -std=c++03 as an extra CFLAGS
Someone at some point should try to bump up the C++ version, perhaps we can move a few standards ahead in time and the engine still compile as it did before. If no one else has the time I may do this in a few weeks.
- Brigham Keys, Esq.
Re: To The Rescue of Your FPS - BAW Irrlicht (GIT repo, ver
Hi Devsh!
Decided today to try out the engine by compiling one of the demos (Hardware instancing). I must be doing something really wrong!
Here is a picture of it running on my 4k screen with my GTX 1080:
It gives only 19 fps?! I know there are lots of models on screen, but with the hardware I have I was not expecting this (I play most of my games at max quality on a 4k screen and get 60fps+ most of the time)
EDIT: Changed some values in your demo to get only one "cow" and the primitive count is really high (like 5000 primitives). Are these "primitives" tris?
Re: To The Rescue of Your FPS - BAW Irrlicht (GIT repo, ver
the primitive count is really high, plus make sure you compile with -O3 optimizations
the slowdown is most probably due to your CPU and PCIe bus, it needs to read back the occlusion results before drawing the next batch so there is a stall
Next version will determine LoD instance arrays in one pass (4 at a time) and the one after that will use indirect draws
Re: To The Rescue of Your FPS - BAW Irrlicht (GIT repo, ver
Found and straightened out some bugs in ver 0.2 (yet to commit):
1) bug in window format finding GL core profile Linux+Nvidia giving transparent windows (pretty cool feature though)
2) bug in setTexture where if old texture was removed it would crash
3) bad quantization optimization in the OBJ loader resulting in messed up meshes
4) CPU culling workarounds for small instance counts to avoid GPU idling
5) Normal quantization to 30bit in x meshes which use Skinning
Stuff left to do for the 0.3 release:
1) Abandon the Linux, Mac OS X and possibly Windows devices in favour of SDL2
2) remove all s8,u8,s16,u16,s32,u32,s64,u64 and f32 and replace with float and types from stdint.h
3) Remove core::stringc, array and list and replace with SSE3,AVX and 4096bit alignment (page-locked) friendly std allocators
4) Assimp for model format import/export which will load textures without putting them in GPU memory, this will enable multi-threading of mesh loading
Roadmap for 0.4 release:
1) Global mesh optimization function (do forsyth index optimization and re-quantization into integers)
2) Better CPU to GPU Mesh conversion modes (making sure vertex attributes are interleaved)
3) Native irrlicht mesh format save and load + encryption (index and attribute buffers)
4) Shader Subroutines -> My own render state update system, as some render commands/materials don't need to check and update every render state
5) Bounding Box culling to avoid animating/boning meshes which are guaranteed to be off-screen
6) InstancedSkinnedMeshSceneNode with LoDs (together with a re-skin function for cpu meshes to reduce the number of bone weights per vertex)
7) 1D Textures
8) Compute Shaders
9) MultiDrawIndirect and some other render-command-list like features
Roadmap for 0.5:
1) Super-fast per-thread malloc/new pool allocator
2) Super-fast thread-safe malloc/new operators
Roadmap for 0.6:
1) SSE3 SIMD classes used for all 2D/3D math
2) AVX/AVX2 versions of all SIMD stuff with separate library builds
3) Vulkan Renderer
4) Android builds
Sometime Later:
A) Atomic Reference Counting
B) C++11 mutexes and maybe some other stuff
C) GPU Boning
D) SSE3 SIMD Dual Quaternion Class
E) Dual Quaternion Skinning
G) Multisample renderbuffers and textures
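For readers wondering what "Normal quantization to 30bit" in the bug list can look like, here is a minimal sketch that packs a unit normal into three signed 10-bit fields of one 32-bit word. The exact format the engine uses is not stated in this thread, so the bit layout, the scale factor of 511 and all function names are assumptions.

```cpp
#include <cmath>
#include <cstdint>

// Quantize one component from [-1, 1] to a signed 10-bit integer in [-511, 511]
// and return its low 10 bits (two's complement).
inline std::uint32_t quantizeComponent(float v) {
    if (v > 1.f) v = 1.f;
    if (v < -1.f) v = -1.f;
    const std::int32_t q = static_cast<std::int32_t>(std::lround(v * 511.f));
    return static_cast<std::uint32_t>(q) & 0x3FFu;
}

// Pack x into bits 0-9, y into bits 10-19 and z into bits 20-29.
// The top two bits stay zero in this sketch.
inline std::uint32_t packNormal30(float x, float y, float z) {
    return quantizeComponent(x) | (quantizeComponent(y) << 10) |
           (quantizeComponent(z) << 20);
}

// Recover component `index` (0 = x, 1 = y, 2 = z) as a float in [-1, 1].
inline float unpackComponent(std::uint32_t packed, int index) {
    const std::uint32_t bits = (packed >> (10 * index)) & 0x3FFu;
    // Sign-extend the 10-bit value.
    const std::int32_t q = (bits & 0x200u) ? static_cast<std::int32_t>(bits) - 1024
                                           : static_cast<std::int32_t>(bits);
    const float f = static_cast<float>(q) / 511.f;
    return f < -1.f ? -1.f : f;
}
```

Each component loses at most about 0.001 of precision, while the normal shrinks from 12 bytes (three floats) to 4 bytes.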
Last edited by devsh on Sat Feb 25, 2017 3:45 pm, edited 1 time in total.
Re: To The Rescue of Your FPS - BAW Irrlicht (GIT repo, ver
As soon as we cap GL version to 4.2 or require ARB_shader_image_load_store/EXT_shader_image_load_store, we will use a unified shader for culling all instances (or at least batches of instances) at once.
The above approach could also be used for updating the transformations of the scenegraph, skeletal animation, culling and indirect drawing of everything.
P.S. GL_ARB_shader_atomic_counters seems to be present on all Intels with GL 4.0+, so it's a possibility to do the instance culling like that (or image load store)
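The idea of culling every instance in one pass and appending the survivors through an atomic counter can be sketched on the CPU with `std::atomic`. This is only an illustration under stated assumptions: the `Instance` and `Plane` types, the bounding-sphere test and the function names are hypothetical, and in the real thing `cullOne` would be the body of a shader invocation with the counter living in a GL atomic counter buffer.

```cpp
#include <atomic>
#include <cstddef>
#include <cstdint>
#include <vector>

struct Instance { float x, y, z, radius; };  // hypothetical bounding sphere
struct Plane { float nx, ny, nz, d; };       // inside when n.p + d >= 0

// A sphere is rejected as soon as it lies fully behind any one plane.
inline bool sphereVisible(const Instance& s, const std::vector<Plane>& frustum) {
    for (const Plane& p : frustum)
        if (p.nx * s.x + p.ny * s.y + p.nz * s.z + p.d < -s.radius)
            return false;
    return true;
}

// Stand-in for one shader invocation: test one instance and, if it survives,
// reserve an output slot with an atomic increment and write its index there.
inline void cullOne(std::uint32_t index, const std::vector<Instance>& instances,
                    const std::vector<Plane>& frustum,
                    std::atomic<std::uint32_t>& counter,
                    std::vector<std::uint32_t>& visible) {
    if (!sphereVisible(instances[index], frustum))
        return;
    const std::uint32_t slot = counter.fetch_add(1, std::memory_order_relaxed);
    visible[slot] = index;
}

// Runs the invocations serially here; they are independent of each other,
// so they could equally run in parallel (output order would then vary).
inline std::vector<std::uint32_t> cullAll(const std::vector<Instance>& instances,
                                          const std::vector<Plane>& frustum) {
    std::vector<std::uint32_t> visible(instances.size());
    std::atomic<std::uint32_t> counter(0);
    for (std::uint32_t i = 0; i < instances.size(); ++i)
        cullOne(i, instances, frustum, counter, visible);
    visible.resize(counter.load());
    return visible;
}
```

The final counter value is exactly the instance count an indirect draw command would need, which is why the same pattern lends itself to the indirect drawing mentioned above.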
Re: To The Rescue of Your FPS - BAW Irrlicht (GIT repo, ver
Version 0.2.2 will appear shortly, as soon as the bugs marked with * are resolved and the Windows project is updated
Bugs Fixed:
1) Transparent Windows on AMD/NVidia in GL Core Profile under Linux
2) Bad Mesh Quantization for the OBJ loader leading to slightly misplaced UVs and Vertices
3) Bug in setTexture where if an old texture was removed it would crash
4) Bug where Irrlicht wouldn't compile with GCC 6.3 (needs the option -std=c++03) because of some SHA256 code
5) Bug where some meshes in a Skinned X-Format Mesh attached to a bone but without weights would be in the wrong place (actually a quaternion bug)
New Features:
1) OpenCL device stub, finds associated device to the rendering GPU and can report number of GPU cores
2) CPU culling workarounds for small instance counts to avoid GPU idling
3) Normal quantization to 30bit in x meshes, even ones which use Skinning
4) MultiDrawIndirect and DrawIndirect handles to GL functions (can use this OpenGL feature if desired)
5) Minimum GL version is now 4.0 plus a few ubiquitous extensions, until Intel retires all Bay-Trail SOCs and Ivy Bridge CPUs
The engine minimum requirements will be for Nvidia GeForce 400 series, Radeon HD 5000 series, and Intel HD Graphics bundled with Ivy Bridge CPUs and up
Roadmap for the Immediate Next Release (In the order Features Will appear):
1) Bounding Box culling to avoid animating/boning meshes which are guaranteed to be off-screen
2) Uniform Buffer Objects and getting rid of setShaderConstant
3) New Material State Tracking system
4) Separation of BaseMaterial (essentially Blend State) from actual shader programs
5) Shader Subroutines
6) InstancedSkinnedMeshSceneNode with LoDs (together with a re-skin function for cpu meshes to reduce the number of bone weights per vertex)
7) GPU Boning
8) Expensive Operation (Cass Everitt's "OpenGL Beyond Porting") Profiling and Tracking (MRT change, Shader Program, ROP, etc.)
Roadmap for Version 0.3:
1) SDL 2 Device
2) removal of irrlicht types like s8,u8,c8 etc.
3) Migrating to std::vector and list instead of core::array
4) Compute Shaders and SSBOs
5) Quaternion only rotations
6) SIMD only vector math
Roadmap for Version 0.4:
1) ASSIMP for model format import/export which will load textures without putting them in GPU memory, this will enable multi-threading of mesh loading
2) Global mesh optimization function (do forsyth index optimization and re-quantization into vertex attribute formats with less bit-depth)
3) Better CPU to GPU Mesh conversion modes (making sure vertex attributes are interleaved)
4) 1D Textures
Roadmap for 0.5:
1) Super-fast per-thread malloc/new pool allocator
2) Super-fast thread-safe malloc/new operators
3) Full-GPU compute shader scenegraph update and drawcommand generation
4) Nvidia Bindless, Sparse Textures and NV command-list
Roadmap for 0.6:
1) Bumping Minimum GL Version to 4.3
2) AVX/AVX2 versions of all SIMD stuff with separate library builds
3) Vulkan Renderer
4) Android builds
Sometime Later:
A) Atomic Reference Counting
B) Bumping C++ version to C++11 and use C++11 mutexes and maybe some other stuff
C) SSE3 SIMD Dual Quaternion Class
D) Dual Quaternion Skinning
E) Multisample renderbuffers and textures
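As a rough idea of what the per-thread pool allocator on the 0.5 roadmap could build on, here is a minimal fixed-size free-list pool with one instance per thread. Nothing here comes from the engine: `FixedPool`, `threadPool` and the block sizes are hypothetical, and a real replacement for malloc/new would also need multiple size classes and a fallback when the pool runs dry.

```cpp
#include <cstddef>

// Hypothetical fixed-size block pool. A free block stores the pointer to the
// next free block inside its own storage, so no extra bookkeeping is needed.
template <std::size_t BlockSize, std::size_t BlockCount>
class FixedPool {
    union Block {
        Block* next;
        alignas(std::max_align_t) unsigned char storage[BlockSize];
    };

public:
    FixedPool() : freeList_(nullptr) {
        // Thread every block onto the free list.
        for (std::size_t i = 0; i < BlockCount; ++i) {
            blocks_[i].next = freeList_;
            freeList_ = &blocks_[i];
        }
    }

    // O(1): pop the head of the free list, or return nullptr when exhausted.
    void* allocate() {
        if (freeList_ == nullptr)
            return nullptr;
        Block* block = freeList_;
        freeList_ = block->next;
        return block;
    }

    // O(1): push the block back onto the free list.
    void deallocate(void* pointer) {
        if (pointer == nullptr)
            return;
        Block* block = static_cast<Block*>(pointer);
        block->next = freeList_;
        freeList_ = block;
    }

private:
    Block blocks_[BlockCount];
    Block* freeList_;
};

// One pool per thread, so allocate/deallocate need no locks.
inline FixedPool<64, 1024>& threadPool() {
    thread_local FixedPool<64, 1024> pool;
    return pool;
}
```

The catch with this design is that a block must be returned on the thread that allocated it; handing memory across threads is exactly where the separate thread-safe operators of item 2 would come in.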
Last edited by devsh on Wed Mar 15, 2017 5:33 pm, edited 2 times in total.