The big problem with modern game engines

Noiecity
Posts: 365
Joined: Wed Aug 23, 2023 7:22 pm

Post by Noiecity »

This post starts with a small observation based on an experience:
A few years ago, when I was testing Irrlicht on an old notebook to see how it behaved, I noticed in one of the tests that a program can completely skip a condition (if) when the CPU is under heavy stress. It happened to me even with the Irrlicht engine: objects passed through certain collisions, ignoring my collision check.

Well, on the surface this looks similar to a "branch misprediction", but it is a different thing. A misprediction is a normal event that the processor detects and corrects on its own, at a cost in performance only. What happens when the CPU is thermally stressed is a physical effect known as a "hot-spot".

A "hot-spot" occurs when an area of the semiconductor gets too hot, which can produce errors in the logic, such as skipping a condition or producing a result that differs from the expected one.

When the processor temperature is measured, the reading is a general temperature, but the real temperature varies by zone. One zone may be much hotter than another, and this is very difficult to manage under high thermal stress.
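
To get a feel for how much the readings differ from zone to zone, here is a minimal sketch that assumes a Linux machine exposing the kernel thermal zones under /sys/class/thermal. The helper names (read_zone_temps_celsius, spread_celsius) and the max_zones limit are made up for this example. These sensors are coarse, so they only hint at on-die differences and will not show a true hot-spot.

```cpp
#include <algorithm>
#include <cassert>
#include <fstream>
#include <string>
#include <vector>

// Reads every /sys/class/thermal/thermal_zoneN/temp that can be opened (Linux only).
// The kernel reports millidegrees Celsius; the result is in degrees Celsius.
// max_zones is an arbitrary upper bound chosen for this sketch.
inline std::vector<double> read_zone_temps_celsius(int max_zones = 32) {
    assert(max_zones >= 0);
    std::vector<double> temps;
    for (int i = 0; i < max_zones; ++i) {
        std::ifstream f("/sys/class/thermal/thermal_zone" + std::to_string(i) + "/temp");
        long millideg = 0;
        if (f >> millideg) {
            temps.push_back(millideg / 1000.0);
        }
    }
    return temps;
}

// Difference between the hottest and the coolest reading, 0 if there are none.
inline double spread_celsius(const std::vector<double>& temps) {
    if (temps.empty()) {
        return 0.0;
    }
    const auto mm = std::minmax_element(temps.begin(), temps.end());
    return *mm.second - *mm.first;
}
```

Calling spread_celsius(read_zone_temps_celsius()) while the machine is under load gives the gap between the hottest and coolest sensor. On systems without that sysfs path the list is simply empty and the spread is 0.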

This can make a modern game on a modern engine such as Unreal Engine 5, where power consumption is higher, prone to crashes, glitchy animations, glitchy collisions, etc., no matter how many safeguards you implement.

There is a study in this regard, which can be summarized as "can cause serious problems, such as drastic loss of performance, incorrect circuit operation and reduced device lifetime. Traditional cooling solutions and design tools are no longer sufficient to mitigate these effects.":

https://sites.tufts.edu/tcal/files/2021 ... C_2021.pdf

The following study addresses "Thermally induced soft errors or delay faults capable of causing data corruption or incorrect execution.":

https://scholarworks.umass.edu/bitstrea ... 2/download


It is worth noting that these studies propose solutions, but the problem persists, as seen in current chips.

A thermal hot-spot can cause a logic condition to be "skipped", even though there is no apparent error in the code or design.


At higher temperatures, MOSFETs:

- Conduct worse (lower electron/carrier mobility).

- Take longer to change state (0→1 or 1→0).

This results in internal signal delays, which can cause a signal to:

- Arrive late at a logic gate,
- Miss the clock edge,
- So that the circuit registers erroneous data.
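
The timing argument above can be sketched as a simple slack calculation: a register captures the right value only if the clock-to-output delay, the logic delay and the setup time together fit inside one clock period. The struct, the function names, the numbers and the thermal derating factor below are all invented for illustration. Real chips are designed with guard bands precisely so that this margin stays positive.

```cpp
#include <cassert>

// All times are in picoseconds. Every number used here is made up.
struct PathTiming {
    double clk_to_q_ps;  // clock edge to output of the launching flip-flop
    double logic_ps;     // combinational logic delay at the reference temperature
    double setup_ps;     // setup time required by the capturing flip-flop
};

// Slack = clock period minus the time the signal actually needs.
// thermal_derate scales the logic delay (1.0 = reference temperature).
inline double slack_ps(const PathTiming& p, double period_ps, double thermal_derate) {
    return period_ps - (p.clk_to_q_ps + p.logic_ps * thermal_derate + p.setup_ps);
}

// Negative slack means the signal misses the clock edge.
inline bool meets_timing(const PathTiming& p, double period_ps, double thermal_derate) {
    return slack_ps(p, period_ps, thermal_derate) >= 0.0;
}

// Usage example: a hypothetical path on a 4 GHz clock (250 ps period).
inline void timing_example() {
    const PathTiming path{30.0, 180.0, 20.0};
    assert(meets_timing(path, 250.0, 1.0));    // 230 ps needed, 20 ps of margin
    assert(!meets_timing(path, 250.0, 1.15));  // logic 15% slower, edge is missed
}
```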

For example:

if (x > 5) {
    doSomething();
}

If the logic that evaluates x sits in a hot-spot area, the comparator may:

- Misevaluate the result.

- Fail to trigger the control signal.

- As a consequence, the doSomething() block is not executed, even though x was greater than 5.
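
One software-side mitigation for this kind of transient fault is to evaluate the condition twice and compare the answers. Below is a minimal sketch of that idea. The names Check and checked_greater are made up for this example, and this is not something Irrlicht provides. It only lowers the chance of a silent skip and cannot help if both evaluations are corrupted the same way.

```cpp
#include <cassert>

enum class Check { False, True, Disagree };

// Evaluates "x > threshold" twice. The volatile copies force separate loads,
// so the compiler cannot fold the two evaluations into a single comparison.
inline Check checked_greater(int x, int threshold) {
    volatile int a = x;
    volatile int b = threshold;
    const bool first = a > b;
    const bool second = !(b >= a);  // same condition, written the other way round
    if (first != second) {
        return Check::Disagree;     // the two evaluations contradict each other
    }
    return first ? Check::True : Check::False;
}

// Usage example: on healthy hardware the two evaluations always agree.
inline void checked_greater_example() {
    assert(checked_greater(6, 5) == Check::True);
    assert(checked_greater(5, 5) == Check::False);
    assert(checked_greater(4, 5) == Check::False);
}
```

A caller would run doSomething() on Check::True and treat Check::Disagree as a fault to log or retry.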

This is not a software bug. It is a physical bug: the logical signal did not change in time.

Processors use synchronous logic driven by clock edges.
If a signal arrives late because of the hot-spot:

- The result of the logical condition is not captured in time.

- A comparison, an instruction, a conditional operation, etc. is effectively skipped.

A skipped condition is not the only risk. In some cases the values themselves can change at the moment they are read, which is known as a "bit flip", i.e. a register cell changes its value due to heat (a soft error):

https://semiengineering.com/heat-relate ... c-designs/

https://dramsec.ethz.ch/papers/mathur-dramsec22.pdf

https://arxiv.org/abs/2110.10291
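
A bit flip in an important value can at least be detected in software by storing the value together with its bitwise complement and checking that the two still match. This is a minimal sketch of that idea. Guarded32, guard, intact and flip_bit are names invented for this example, and flip_bit only simulates a fault for demonstration.

```cpp
#include <cassert>
#include <cstdint>

// A value stored next to its bitwise complement.
struct Guarded32 {
    std::uint32_t value;
    std::uint32_t inverse;
};

inline Guarded32 guard(std::uint32_t v) {
    return Guarded32{v, static_cast<std::uint32_t>(~v)};
}

// True while value and inverse are still exact complements of each other.
inline bool intact(const Guarded32& g) {
    return (g.value ^ g.inverse) == 0xFFFFFFFFu;
}

// Simulates a soft error by flipping one bit (0..31) of the stored value.
inline Guarded32 flip_bit(Guarded32 g, unsigned bit) {
    g.value ^= (std::uint32_t{1} << bit);
    return g;
}

// Usage example: a single flipped bit breaks the complement relation.
inline void guarded_example() {
    const Guarded32 g = guard(42u);
    assert(intact(g));
    assert(!intact(flip_bit(g, 7)));
}
```

This detects the corruption but does not repair it. Correcting it would need something like three copies and a majority vote, or ECC memory on the hardware side.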
Irrlicht is love, Irrlicht is life, long live to Irrlicht
chronologicaldot
Competition winner
Posts: 699
Joined: Mon Sep 10, 2012 8:51 am

Re: The big problem with modern game engines

Post by chronologicaldot »

As you may know, back in the old days, programmers tried to be more efficient because they were conscious of the limitations of their hardware. Alas, how we've been spoiled by great tech.