ms3d (milkshape 3d) loader fail on BIG_ENDIAN

kas1e · Post by **kas1e** » Thu Oct 24, 2019 6:24 pm

Should we add anything to the "general" irrlicth's includes ? Maybe just for that loader ? (as seems done for others). Later, if i will use the same for halflife loader, then we can made one in general ones, but for time being maybe keep it just for one loader ?

At moment i think that this 2 lines:

Code: Select all

 
    f32 framesPerSecond = get_unaligned_le_float(pPtr);
    framesPerSecond = os::Byteswap::byteswap(framesPerSecond);

Can be replaced via addiing to CMS3DMeshFileLoader.cpp.h that:

Code: Select all

 
static inline float get_unaligned_le_float(const u8 *ptr)
{
    union {
        u8 u[4];
        float f;
    } tmp;
        
    tmp.u[0] = ptr[3];
    tmp.u[1] = ptr[2];
    tmp.u[2] = ptr[1];
    tmp.u[3] = ptr[0];
    
    return tmp.f;
}

And instead of those 2 lines in the source code do that:

Code: Select all

 
    f32 framesPerSecond = get_unaligned_le_float(pPtr);

That will be fix for that unaligned floated pointer.

Then for all the structs, we need some clean way to handle all platforms. Sure creating structs with the same name is bad idea..

AlignFloat32 is also a bad name. As it's about having an unaligned float.

But because when we will call it , in return we will have aligned float. So imho AlignFloat32 sounds right ?

CuteAlien · Post by **CuteAlien** » Thu Oct 24, 2019 6:36 pm

kas1e wrote:Should we add anything to the "general" irrlicth's includes ? Maybe just for that loader ? (as seems done for others). Later, if i will use the same for halflife loader, then we can made one in general ones, but for time being maybe keep it just for one loader ?

If simple change works, always good. But adding to general headers is also not a problem (like adding my template for having structs with pack(1)). And especially the union for float is also fine to have in a general header. If we find a good name :-)

kas1e wrote: At moment i think that this 2 lines:
Code: Select all
 
    f32 framesPerSecond = get_unaligned_le_float(pPtr);
    framesPerSecond = os::Byteswap::byteswap(framesPerSecond);
 
Can be replaced via addiing to CMS3DMeshFileLoader.cpp.h that:
Code: Select all
 
static inline float get_unaligned_le_float(const u8 *ptr)
{
    union {
        u8 u[4];
        float f;
    } tmp;
        
    tmp.u[0] = ptr[3];
    tmp.u[1] = ptr[2];
    tmp.u[2] = ptr[1];
    tmp.u[3] = ptr[0];
    
    return tmp.f;
}
 
And instead of those 2 lines in the source code do that:
Code: Select all
 
    f32 framesPerSecond = get_unaligned_le_float(pPtr);
 
That will be fix for that unaligned floated pointer.

Sure if that's good enough for now, then we can do that.

kas1e wrote:
AlignFloat32 is also a bad name. As it's about having an unaligned float.
But because when we will call it , in return we will have aligned float. So imho AlignFloat32 sounds right ?

I thought that as well at first. But we really never have the float in there aligned. Not until we copy it to some other (already aligned) struct or float. The idea of that union is really to work when it's unaligned. So I guess UnalignedFloat32 is a better name (even when it's longer, but what do we have ctrl+c for...).

But function get_unaligned_le_float is also good. Just despite the coolness, don't mix english-french in the name ;-)

kas1e · Post by **kas1e** » Thu Oct 24, 2019 8:04 pm

@CuteAlien

But function get_unaligned_le_float is also good. Just despite the coolness, don't mix english-french in the name

By "le" i mean there little endian

What i mostly have problems now with, is how to deal with those structs. I mean, how you can avoid a copy in a structure that add an abstraction level, as the initial structure is badly aligned and playing with fields and padding won't give the expected result if not by creating copy of structure with other name ?

I just want to make it clean by code of coures, so on top will be structs , that new function, and later just some code changes there and there. Making second block of the same structs with copied name kind of suck for sure...

CuteAlien · Post by **CuteAlien** » Fri Oct 25, 2019 9:18 am

kas1e wrote:@CuteAlien
But function get_unaligned_le_float is also good. Just despite the coolness, don't mix english-french in the name ;-)
By "le" i mean there little endian :)

Ooooh, haha, yeah... I missed that :-)

kas1e wrote: What i mostly have problems now with, is how to deal with those structs. I mean, how you can avoid a copy in a structure that add an abstraction level, as the initial structure is badly aligned and playing with fields and padding won't give the expected result if not by creating copy of structure with other name ?

Yeah, that was the idea behind the TPack1 template I posted above. Let the compiler create the copy, don't do it yourself. So we only have the structs with padding, no more PACK_STRUCT and irrPack includes. And for loading we create a new variable of that struct with TPack1 which creates a copy of the same struct without padding . After loading we copy that one back to the padded struct. On big endian just with = copy and on little endian with the swapping stuff and get_unaligned_le_float. And floats in the structs are replaced by the new union.

edit: OK, forget about the TPack1 template. Why would you still need it? Just replace all float's with UnalignedFloat32. Assume I'll put that in some header (once you tell me it works) and that it's in namespace irr. I'd probably make it look like this:

Code: Select all

 
union UnalignedFloat32 
{
    u32 uval;
    u8 cval[4];
    f32 fval;
};

After you did that you'll get errors in any place that uses those floats. And in that place you can fix it. The byteswap can be done with the uval then. And for the assignment to the real structures use some function which either creates a copy or copies the bytes to the float.

When the float is used directly like with swapping the sign (+-) then you'll need to use a copy of UnalignedFloat32 in that place. Then copy it back with the uval. Or alternatively move the sign-swapping code completely to the place where it's assigned to the real mesh-vertices later.

edit2: In case TPack1 is still needed, the header could look like this: https://bitbucket.org/mzeilfelder/irr-p ... t/tpack1.h

kas1e · Post by **kas1e** » Mon Oct 28, 2019 7:45 pm

@CuteAlien

Ok, i firstly just want to go "easy" route and made it just works by the way i did with first structure. Once i will made whole things compiles with -O3 and works as expected, then probabaly we can go futher with reducing code and make it all looks cleaner and better.

Now i have some problem (beware as i not programmer just some slasher who put pieces in one place, so some lame stuff very possible come out from me). Problem is that i want to fix that structure now:

Code: Select all

 
struct MS3DTriangle
{
    u16 Flags;
    u16 VertexIndices[3];
    float VertexNormals[3][3];
    float S[3], T[3];
    u8 SmoothingGroup;
    u8 GroupIndex;
}

So i create copy of it (just for make it all works of course, dind't worry that its all ugly for now):

Code: Select all

 
union HybridFloat {
    float fvalue;
    char cvalue[4];
};
 
// Triangle information
struct MS3DTrianglePacked
{
    u16 Flags;
    u16 VertexIndices[3];
    HybridFloat VertexNormals[3][3];
    HybridFloat S[3], T[3];
    u8 SmoothingGroup;
    u8 GroupIndex;
} PACK_STRUCT;
 
 
struct MS3DTriangle
{
    u16 Flags;
    u16 VertexIndices[3];
    float VertexNormals[3][3];
    float S[3], T[3];
    u8 SmoothingGroup;
    u8 GroupIndex;
}

Now, there is original code which use that structure:

Code: Select all

 
pPtr += sizeof(u16);    
    MS3DTriangle *triangles = (MS3DTriangle*)pPtr;
    pPtr += sizeof(MS3DTriangle) * numTriangles;
    if (pPtr > buffer+fileSize)
    {
        delete [] buffer;
        os::Printer::log("Loading failed. Corrupted data found.", file->getFileName(), ELL_ERROR);
        return false;
    }
#ifdef __BIG_ENDIAN__
    for (u16 tmp=0; tmp<numTriangles; ++tmp)
    {
        triangles[tmp].Flags = os::Byteswap::byteswap(triangles[tmp].Flags);
        for (u16 j=0; j<3; ++j)
        {
            triangles[tmp].VertexIndices[j] = os::Byteswap::byteswap(triangles[tmp].VertexIndices[j]);
        
            triangles[tmp].VertexNormals[j][0] = os::Byteswap::byteswap(triangles[tmp].VertexNormals[j][0]);
            triangles[tmp].VertexNormals[j][1] = os::Byteswap::byteswap(triangles[tmp].VertexNormals[j][1]);
            triangles[tmp].VertexNormals[j][2] = -os::Byteswap::byteswap(triangles[tmp].VertexNormals[j][2]);
        
            triangles[tmp].S[j] = os::Byteswap::byteswap(triangles[tmp].S[j]);
            triangles[tmp].T[j] = os::Byteswap::byteswap(triangles[tmp].T[j]);
        }
    }
#endif

And that how i tried to "align" it , but which is just crashes on me currently:

Code: Select all

 
    pPtr += sizeof(u16);
    HybridFloat tmpfloat2;
    char *pVPtr2 = (char *)pPtr; // address of the first packed vertex in memory
    
    MS3DTriangle *triangles = new MS3DTriangle[numTriangles]; 
    pPtr += sizeof(MS3DTrianglePacked) * numTriangles;
    if (pPtr > buffer+fileSize)
    {
        delete [] buffer;
        os::Printer::log("Loading failed. Corrupted data found.", file->getFileName(), ELL_ERROR);
        return false;
    }
#ifdef __BIG_ENDIAN__
    for (u16 tmp=0; tmp<numTriangles; ++tmp)
    {
        // because fist filed in structure "u16 Flags", do nothing new there:
        triangles[tmp].Flags = os::Byteswap::byteswap(triangles[tmp].Flags);
                
        // because second field in structure "u16 VertexIndices[3];", do also nothing new (just repeat it 3 times, as we get rid of "for (u16 j=0; j<3; ++j)" loop:
        triangles[tmp].VertexIndices[0] = os::Byteswap::byteswap(triangles[tmp].VertexIndices[0]);
        triangles[tmp].VertexIndices[1] = os::Byteswap::byteswap(triangles[tmp].VertexIndices[1]);
        triangles[tmp].VertexIndices[2] = os::Byteswap::byteswap(triangles[tmp].VertexIndices[2]);
 
// Our floats start, let's align them.
//
// Read per byte with swapping and fill 9 VertexNormals ([3][3]), 3 S and 3 T , which mean 15 floats in whole
 
// pVPtr[0] and pVPtr[1] is ignored, as it contains u16 Flags;
// pVPtr[2] and pVPtr[3], pVPtr[4] and pVPtr[5], pVPtr[6] and pVPtr[7] is ignored, as it "u16 VertexIndices[3]"
// So we start from pVPtr[8]
                
        // hybridfloat VertexNormals
        // x.[0]
        tmpfloat2.cvalue[3] = pVPtr2[8];
        tmpfloat2.cvalue[2] = pVPtr2[9];
        tmpfloat2.cvalue[1] = pVPtr2[10];
        tmpfloat2.cvalue[0] = pVPtr2[11];
        triangles[tmp].VertexNormals[0][0] = tmpfloat2.fvalue;
 
        tmpfloat2.cvalue[3] = pVPtr2[12];
        tmpfloat2.cvalue[2] = pVPtr2[13];
        tmpfloat2.cvalue[1] = pVPtr2[14];
        tmpfloat2.cvalue[0] = pVPtr2[15];
        triangles[tmp].VertexNormals[0][1] = tmpfloat2.fvalue;
 
        tmpfloat2.cvalue[3] = pVPtr2[16];
        tmpfloat2.cvalue[2] = pVPtr2[17];
        tmpfloat2.cvalue[1] = pVPtr2[18];
        tmpfloat2.cvalue[0] = pVPtr2[19];
        triangles[tmp].VertexNormals[0][2] = - tmpfloat2.fvalue;
        
 
        // x.[1]
        tmpfloat2.cvalue[3] = pVPtr2[20];
        tmpfloat2.cvalue[2] = pVPtr2[21];
        tmpfloat2.cvalue[1] = pVPtr2[22];
        tmpfloat2.cvalue[0] = pVPtr2[23];
        triangles[tmp].VertexNormals[1][0] = tmpfloat2.fvalue;
 
        tmpfloat2.cvalue[3] = pVPtr2[24];
        tmpfloat2.cvalue[2] = pVPtr2[25];
        tmpfloat2.cvalue[1] = pVPtr2[26];
        tmpfloat2.cvalue[0] = pVPtr2[27];
        triangles[tmp].VertexNormals[1][1] = tmpfloat2.fvalue;
 
        tmpfloat2.cvalue[3] = pVPtr2[28];
        tmpfloat2.cvalue[2] = pVPtr2[29];
        tmpfloat2.cvalue[1] = pVPtr2[30];
        tmpfloat2.cvalue[0] = pVPtr2[31];
        triangles[tmp].VertexNormals[1][2] = - tmpfloat2.fvalue;
 
 
        // x.[2]
        tmpfloat2.cvalue[3] = pVPtr2[32];
        tmpfloat2.cvalue[2] = pVPtr2[33];
        tmpfloat2.cvalue[1] = pVPtr2[34];
        tmpfloat2.cvalue[0] = pVPtr2[35];
        triangles[tmp].VertexNormals[2][0] = tmpfloat2.fvalue;
 
        tmpfloat2.cvalue[3] = pVPtr2[36];
        tmpfloat2.cvalue[2] = pVPtr2[37];
        tmpfloat2.cvalue[1] = pVPtr2[38];
        tmpfloat2.cvalue[0] = pVPtr2[39];
        triangles[tmp].VertexNormals[2][1] = tmpfloat2.fvalue;
 
        tmpfloat2.cvalue[3] = pVPtr2[40];
        tmpfloat2.cvalue[2] = pVPtr2[41];
        tmpfloat2.cvalue[1] = pVPtr2[42];
        tmpfloat2.cvalue[0] = pVPtr2[43];
        triangles[tmp].VertexNormals[2][2] = - tmpfloat2.fvalue;
 
        
        // hybridfloat S 
        tmpfloat2.cvalue[3] = pVPtr2[44];
        tmpfloat2.cvalue[2] = pVPtr2[45];
        tmpfloat2.cvalue[1] = pVPtr2[46];
        tmpfloat2.cvalue[0] = pVPtr2[47];
        triangles[tmp].S[0] = tmpfloat2.fvalue;
        
        tmpfloat2.cvalue[3] = pVPtr2[48];
        tmpfloat2.cvalue[2] = pVPtr2[49];
        tmpfloat2.cvalue[1] = pVPtr2[50];
        tmpfloat2.cvalue[0] = pVPtr2[51];
        triangles[tmp].S[1] = tmpfloat2.fvalue;
        
        tmpfloat2.cvalue[3] = pVPtr2[52];
        tmpfloat2.cvalue[2] = pVPtr2[53];
        tmpfloat2.cvalue[1] = pVPtr2[54];
        tmpfloat2.cvalue[0] = pVPtr2[55];
        triangles[tmp].S[2] = tmpfloat2.fvalue;
        
        
        // hybridfloat T
        tmpfloat2.cvalue[3] = pVPtr2[56];
        tmpfloat2.cvalue[2] = pVPtr2[57];
        tmpfloat2.cvalue[1] = pVPtr2[58];
        tmpfloat2.cvalue[0] = pVPtr2[59];
        triangles[tmp].T[0] = tmpfloat2.fvalue;
 
        tmpfloat2.cvalue[3] = pVPtr2[60];
        tmpfloat2.cvalue[2] = pVPtr2[61];
        tmpfloat2.cvalue[1] = pVPtr2[62];
        tmpfloat2.cvalue[0] = pVPtr2[63];
        triangles[tmp].T[1] = tmpfloat2.fvalue;
 
        tmpfloat2.cvalue[3] = pVPtr2[64];
        tmpfloat2.cvalue[2] = pVPtr2[65];
        tmpfloat2.cvalue[1] = pVPtr2[66];
        tmpfloat2.cvalue[0] = pVPtr2[67];
        triangles[tmp].T[2] = tmpfloat2.fvalue;
                
        pVPtr2 += sizeof(struct MS3DTrianglePacked);
 
    }
#endif

Did you see something very weird and wrong there ? I probabaly can mess with sizes which i should skip from begining (probabaly it can be not 8, but 4 , and i maye very well suck in whole logic of all that alignment .. Is there something obviously wrong ?:)

Maybe for those [x][x] things it should be like pVptr2[x][x] too, not just +1 for each one ?

CuteAlien · Post by **CuteAlien** » Tue Oct 29, 2019 10:25 am

Sorry, on quick view can't tell. Except that you swap stuff like Flags which doesn't contain anything yet at that point (as you no longer start out with a pointer to the memory you loaded). So it contains random values. Which still shouldn't crash.

It's probably easier replacing float with HybridFloat (or whatever name we use for it) directly in MS3DTriangle. Then you don't need a copy of the struct. And then move the endian handling and the swapping of the minus sign down to the point where it's assigned to the other structures. As the union needs the .fvalue and .cvalue the compiler will tell you all places you need to change. So don't swap the bytes in loading at all. Do it on assignment later.

Sorry I'm currently working on other TODO's, otherwise I could move the minus sign assignments down already (that should probably be the first step so those are out of the way and no longer mixed up with the endian troubles).

edit: One thing that may be wrong (not sure as you maybe just left it out) - be careful about PACK_STRUCT. As mentioned before that one is likely not relevant to packing. The #include "irrpack.h" and #include "irrunpack.h" are on most platforms the real tricky. Those change packing to 1 ("irrpack.h") and back to default ("irrunpack.h"). So unless you add those it's likely both your structs have the same packing right now.

kas1e · Post by **kas1e** » Tue Oct 29, 2019 11:35 am

@CuteAlien

Sorry, on quick view can't tell. Except that you swap stuff like Flags which doesn't contain anything yet at that point (as you no longer start out with a pointer to the memory you loaded). So it contains random values. Which still shouldn't crash.

Ha .. It didn't crash by usuall crash, it just give me :

assertion "!"(index>=used)" failed : file "../../include/irrArray.h", line 317.

Which is : _IRR_DEBUG_BREAK_IF(index>=used) // access violation , but not sure if it about that ?

But its not only Flags, but probabaly also VertexIndices (which are u16[3] ) , both in trun can cause issues maybe , and both need to be accessed from that pVptr2 thing ?

edit: One thing that may be wrong (not sure as you maybe just left it out) - be careful about PACK_STRUCT. As mentioned before that one is likely not relevant to packing. The #include "irrpack.h" and #include "irrunpack.h" are on most platforms the real tricky. Those change packing to 1 ("irrpack.h") and back to default ("irrunpack.h"). So unless you add those it's likely both your structs have the same packing right now.

Yeah i have all structures i add/change inside of irrpack.h/irrunpack.h. Just those some of them with PACK_STRUCT; at end, some without. I.e:

Code: Select all

 
// byte-align structures
#include "irrpack.h"
 
 
union HybridFloat {
    float fvalue;
    char cvalue[4];
};
 
struct MS3DVertexPacked
{
    u8 Flags;
    HybridFloat Vertex[3];
    char BoneID;
    u8 RefCount;
} PACK_STRUCT;
 
struct MS3DVertex
{
    u8 Flags;
    float Vertex[3];
    char BoneID;
    u8 RefCount;
};
 
// Triangle information
struct MS3DTrianglePacked
{
    u16 Flags;
    u16 VertexIndices[3];
    HybridFloat VertexNormals[3][3];
    HybridFloat S[3], T[3];
    u8 SmoothingGroup;
    u8 GroupIndex;
} PACK_STRUCT;
 
 
 
 
struct MS3DTriangle
{
    u16 Flags;
    u16 VertexIndices[3];
    float VertexNormals[3][3];
    float S[3], T[3];
    u8 SmoothingGroup;
    u8 GroupIndex;
};
 
#include "irrunpack.h"

I think those which without "PACK_STRUCT" still can stay inside of irrpack/irrunpack things ?

kas1e · Post by **kas1e** » Tue Oct 29, 2019 12:00 pm

@CuteAlien
I just tried in original code to comment out those lines:

Code: Select all

 
#ifdef __BIG_ENDIAN__
    for (u16 tmp=0; tmp<numTriangles; ++tmp)
    {
//      triangles[tmp].Flags = os::Byteswap::byteswap(triangles[tmp].Flags);
        for (u16 j=0; j<3; ++j)
        {
//          triangles[tmp].VertexIndices[j] = os::Byteswap::byteswap(triangles[tmp].VertexIndices[j]);
        
            triangles[tmp].VertexNormals[j][0] = os::Byteswap::byteswap(triangles[tmp].VertexNormals[j][0]);
            triangles[tmp].VertexNormals[j][1] = os::Byteswap::byteswap(triangles[tmp].VertexNormals[j][1]);
            triangles[tmp].VertexNormals[j][2] = -os::Byteswap::byteswap(triangles[tmp].VertexNormals[j][2]);
        
            triangles[tmp].S[j] = os::Byteswap::byteswap(triangles[tmp].S[j]);
            triangles[tmp].T[j] = os::Byteswap::byteswap(triangles[tmp].T[j]);
        }
    }
#endif

And it bring me exactly the same error in irrArrays.h at line 317

Which mean that you probabaly right : issue is exactly that i do not point them on my new pVptr2.

So those first 8 bytes (u16 glags and u16 VertexIndicies[3]) , should be also set to pVptr2 with byte swapping. Something like this maybe ?:

Code: Select all

 
        *((char*)&triangles[tmp].Flags+1) = pVPtr2[0];
        *((char*)&triangles[tmp].Flags+0) = pVPtr2[1];
 
        *((char*)&triangles[tmp].VertexIndices[0]+1) = pVPtr2[2];
        *((char*)&triangles[tmp].VertexIndices[0]+0) = pVPtr2[3];
        
        *((char*)&triangles[tmp].VertexIndices[1]+1) = pVPtr2[4];
        *((char*)&triangles[tmp].VertexIndices[1]+0) = pVPtr2[5];
        
        *((char*)&triangles[tmp].VertexIndices[2]+1) = pVPtr2[6];
        *((char*)&triangles[tmp].VertexIndices[2]+0) = pVPtr2[7];

I.e. we set int's from structure still per bytes and do byteswap. Not sure through if that *(char *) thing is correct there , or if that block is sane enough at all

CuteAlien · Post by **CuteAlien** » Tue Oct 29, 2019 3:50 pm

kas1e wrote: assertion "!"(index>=used)" failed : file "../../include/irrArray.h", line 317.

Which is : _IRR_DEBUG_BREAK_IF(index>=used) // access violation , but not sure if it about that ?

Means you access an array outside of it's borders. If that happens later it might be because you did use random values into VertexIndices.

kas1e wrote: Yeah i have all structures i add/change inside of irrpack.h/irrunpack.h. Just those some of them with PACK_STRUCT; at end, some without. I.e:
<... snip ... >
I think those which without "PACK_STRUCT" still can stay inside of irrpack/irrunpack things ?

No, you got that wrong. The PACK_STRUCT does likely nothing in your case. The includes themself do the stuff here. It's a bit ugly and confusing ... this should use 2 defines instead probably (sorry, don't know why it was done that way). Think of those #includes as code which is run in the precompiler. Everything in between those 2 includes will on most compilers have a packing of 1. The PACK_STRUCT is only on top of that to support some very old gcc versions.

kas1e · Post by **kas1e** » Tue Oct 29, 2019 10:37 pm

One of our developers come up with another idea, check this out:

Code: Select all

 
static inline float GetFloat(void *ptr)
{
register union {
u8 u[4];
float f;
} tmp;
 
#ifdef __BIG_ENDIAN__
tmp.u[0] = ((u8 *)ptr)[3];
tmp.u[1] = ((u8 *)ptr)[2];
tmp.u[2] = ((u8 *)ptr)[1];
tmp.u[3] = ((u8 *)ptr)[0];
#else
tmp.f = *(float*)ptr;
#endif
return tmp.f;
}
 
static inline float GetWord(void *ptr)
{
register union {
u8 u[4];
u16 w;
} tmp;
 
#ifdef __BIG_ENDIAN__
tmp.u[0] = ((u8 *)ptr)[1];
tmp.u[1] = ((u8 *)ptr)[0];
#else
tmp.w = *(u16 *)ptr;
#endif
return tmp.w;
}

And then, on example with triangles structure (we didn't change structure at all, keep it as it):

Code: Select all

 
    pPtr += sizeof(u16);    
    MS3DTriangle *triangles = (MS3DTriangle*)pPtr;
#ifdef __BIG_ENDIAN__
    register struct MS3DTriangle *tri=triangles;
#endif
    pPtr += sizeof(MS3DTriangle) * numTriangles;
    if (pPtr > buffer+fileSize)
    {
        delete [] buffer;
        os::Printer::log("Loading failed. Corrupted data found.", file->getFileName(), ELL_ERROR);
        return false;
    }
 
#ifdef __BIG_ENDIAN__
    for (u16 n=0; n<numTriangles; ++n)
    {
        tri->Flags = GetWord(&(tri->Flags));
        for (u16 j=0; j<3; ++j)
            {
                tri->VertexIndices[j] = GetWord(&(tri->VertexIndices[j]));
 
                tri->VertexNormals[j][0] = GetFloat(&(tri->VertexNormals[j][0]));
                tri->VertexNormals[j][1] = GetFloat(&(tri->VertexNormals[j][1]));
                tri->VertexNormals[j][2] = - GetFloat(&(tri->VertexNormals[j][2]));
 
                tri->S[j] = GetFloat(&(tri->S[j]));
                tri->T[j] = GetFloat(&(tri->T[j]));
            }
    tri++;
    }
    
#endif

I tried that , and it seems to works (at least didn't crashes there anymore, crashes later, when tried to use material's structure). And it also looks more clean of course.. What you think about ? Did it looks like something sane which we can later incorporate to irrlicht ?

CuteAlien · Post by **CuteAlien** » Wed Oct 30, 2019 5:41 pm

"register" got deprecated recently in c++. And it never was more than a hint to the compiler. So if that really helps, it's probably a short-term solution as newer compilers should start ignoring it.

And I have no idea why that solution works. Don't do lines like tri->VertexNormals[j][0] still write to a float which
doesn't care about alignment? (don't know where this specific float sits now in memory, but basically - not sure if it failed before why it works now).
Or is assignment fine? Are floating point exceptions only on reading the float, but you are allow to write?

kas1e · Post by **kas1e** » Thu Oct 31, 2019 12:32 pm

Probabaly that "register" thing can be ruled out, yeah, its indeed just for older compiler with bad optimisation about hit-vars-to-registers, and newer ones probabaly smart enough to doing this.

As for writing to floats, i also first time meet with all this, so trying to understand the rulz of those big-endian restrictions. I read in https://developer.ibm.com/articles/pa-dalign/ , that:

When asked to load an unaligned floating-point number from memory, modern PowerPC processors will throw an exception and have the operating system perform the alignment chores in software. Performing alignment in software is much slower than performing it in hardware.

Yeah, our kernel not that modern to emulate it all , but then in that block writen "when asked to _load_ an unaligned floating-point number from memory". Maybe its indeed issue only with load, but not with store ? Trying to find truth bit by bit

kas1e · Post by **kas1e** » Thu Oct 31, 2019 8:31 pm

@CuteAlien
I also learn today, that on ARM (so Pandora, Pyra, etc) there the same allignment rules restrictions. I.e. the ARM itself can access unaligned data, but coprocessor (so Floats and SIMD) cannot. So this kind of error is common in ARM world too.
We don't test at moment exactly irrlicht and that .ms3d loader there , but at least code of that loader probabaly indeed need to be modified to keep in mind those alignment issues. But we will see once we build ARM version.

CuteAlien · Post by **CuteAlien** » Fri Nov 01, 2019 12:21 am

Yeah, makes sense, might even happen on Android then (which also uses ARM usually). Maybe I can find some time running a quick test for that there once I'm through with my current mipmap troubles.

And yeah, with SIMD you got such alignment requirements also on PC.

kas1e · Post by **kas1e** » Fri Nov 01, 2019 8:57 am

And the reassons why for example it works when we build loader without optimisation, but give us alignment issues when we build with optimisation probably are: -O0 probably use general register to access everything, and then copy to the FPU when actually needeing to math, where in higher optim level, it tries to use FPU register as much as possible, and mainly try to avoid General Register <-> FPU register transfert, as this things tend to be slow. At least that may explain things.

Irrlicht Engine

ms3d (milkshape 3d) loader fail on BIG_ENDIAN

Re: ms3d (milkshape 3d) loader fail on BIG_ENDIAN

Re: ms3d (milkshape 3d) loader fail on BIG_ENDIAN

Re: ms3d (milkshape 3d) loader fail on BIG_ENDIAN

Re: ms3d (milkshape 3d) loader fail on BIG_ENDIAN

Re: ms3d (milkshape 3d) loader fail on BIG_ENDIAN

Re: ms3d (milkshape 3d) loader fail on BIG_ENDIAN

Re: ms3d (milkshape 3d) loader fail on BIG_ENDIAN

Re: ms3d (milkshape 3d) loader fail on BIG_ENDIAN

Re: ms3d (milkshape 3d) loader fail on BIG_ENDIAN

Re: ms3d (milkshape 3d) loader fail on BIG_ENDIAN

Re: ms3d (milkshape 3d) loader fail on BIG_ENDIAN

Re: ms3d (milkshape 3d) loader fail on BIG_ENDIAN

Re: ms3d (milkshape 3d) loader fail on BIG_ENDIAN

Re: ms3d (milkshape 3d) loader fail on BIG_ENDIAN

Re: ms3d (milkshape 3d) loader fail on BIG_ENDIAN