fix(gta-core-five): optimize distant lod lights render#3984
Co-Authored-By: divocbn <divocbn@gmail.com>
fps at night go 🚀📈
I will mortgage my house to bribe someone to merge this, what a huge improvement 🫡
MERGE
Merge!
Co-Authored-By: DaniGP17 <dani.dcshop.gg@gmail.com>
Gogsi was able to demonstrate locally that some lights in Vinewood weren't displaying; this was because we were dereferencing the wrong object (lightdata instead of bucket) for the mapdata slot index. It now works perfectly fine, as tested by various users, with a minimum improvement of 80-100 fps on AMD and around 30-40 fps in the city on Intel.
Huge find, good job!
radium-cfx
left a comment
The performance improvements mostly come down to the reimplementation leaving out the prefetch instructions, which behave poorly on modern CPUs (especially on AMD's Zen architecture).
Could you just no-op the 6 prefetchnta instructions in this function instead of reimplementing the entire function and then compare the performance again?
That should be enough to achieve similar results with less code changes.
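To illustrate what "no-op the prefetchnta instructions" means at the byte level, here is a minimal sketch. It works on an ordinary byte buffer rather than live game code; the helper names `IsPrefetchNta` and `NopBytes` are hypothetical, and a real patch would use the project's own patching helpers and the actual instruction addresses and lengths in the function. The only hard fact used is the x86 encoding: PREFETCHNTA with a memory operand is `0F 18 /0`, and `90` is a one-byte NOP.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper: returns true if the bytes at `code` begin a
// PREFETCHNTA instruction (opcode 0F 18, ModRM reg field 0, memory operand).
// The caller must guarantee that at least 3 bytes are readable.
inline bool IsPrefetchNta(const std::uint8_t* code)
{
    if (code[0] != 0x0F || code[1] != 0x18)
        return false;

    const std::uint8_t modrm = code[2];
    const bool regIsZero = ((modrm >> 3) & 7) == 0; // /0 selects the NTA hint
    const bool isMemory = (modrm >> 6) != 3;        // mod == 3 is not a memory form
    return regIsZero && isMemory;
}

// Hypothetical helper: overwrites `length` bytes with single-byte NOPs (0x90).
// `length` must be the full encoded length of the instruction being removed.
inline void NopBytes(std::uint8_t* code, std::size_t length)
{
    std::memset(code, 0x90, length);
}
```

For example, `prefetchnta [rcx+40h]` encodes as `0F 18 41 40`, so it would be replaced with four NOPs. The instruction length depends on the addressing mode, which is why the length is passed in explicitly instead of being guessed.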
    if (!outputBuffer)
        break;

    // NOTE: using raw pointers here avoids atArray::operator[] overhead in this hot loop
We're not supporting SPU-powered platforms so atArray is accessed without bound checks, no overhead I can see.
Didn't know the bounds checks are only there in debug builds, which is what was causing the overhead for me. Doesn't matter anyway since we abandoned that version.
        lightCount += (totalLights - numStreetLights);
    }

    auto* outputBuffer = g_ReserveRenderBufferSpace(lightCount);
Should use __restrict here, otherwise some optimization gets lost.
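A rough sketch of why `__restrict` matters in a copy loop like this one: without it, the compiler has to assume that every write to the output buffer might alias the source data and so reloads values it could otherwise keep in registers or vectorize. The `PackedLight` struct and `CopyVisibleLights` function below are hypothetical stand-ins, not the game's real types or the PR's actual code.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical, simplified light record; not the game's real layout.
struct PackedLight
{
    float x, y, z;
    std::uint32_t rgbi; // colour in the low 24 bits, intensity in the top byte
};

// Copies every light with a non-zero intensity byte from `src` to `dst`
// and returns how many were written. Marking both pointers __restrict
// promises the compiler that the two ranges never overlap, so stores to
// `dst` cannot change what is later read from `src`.
std::size_t CopyVisibleLights(const PackedLight* __restrict src,
                              std::size_t count,
                              PackedLight* __restrict dst)
{
    std::size_t written = 0;
    for (std::size_t i = 0; i < count; ++i)
    {
        if ((src[i].rgbi >> 24) != 0)
            dst[written++] = src[i];
    }
    return written;
}
```

The promise is only safe when the buffers really are disjoint, which should hold for a freshly reserved render buffer; passing overlapping ranges to a function declared this way is undefined behaviour.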
Co-Authored-By: DaniGP17 <dani.dcshop.gg@gmail.com>
radium-cfx
left a comment
Looks good now!
Thanks for your contribution, that is a great change! ❤️
Cool, didn't know that. Basically, we just noticed that the function takes longer than it should and tried to optimize it. Learned something new again: performance now looks the same, updated the PR.
Goal of this PR
Optimize CLODLights::RenderDistantLODLights to reduce CPU overhead caused by distant LOD light rendering during scenes with many active lights, especially at night. The changes remove unnecessary per-frame work and improve hot-path efficiency, significantly reducing CPU time and improving overall frame rate. The impact appears to be more noticeable on AMD hardware, where in some cases performance improved by up to 80%, while Intel systems showed an average FPS increase of around 30-40%.
This has been a collaboration with @divocbn and @DaniGP17.
Video showcase: https://www.youtube.com/watch?v=G4YqP0g5LoE
How is this PR achieving the goal
Rewrites and optimizes the distant LOD light rendering path used by CLODLights::RenderDistantLODLights. The implementation reduces unnecessary per-frame processing by improving culling logic, minimizing expensive math operations, avoiding unnecessary container overhead in hot loops, and improving memory access patterns when iterating and uploading light data to the render buffer.
This PR applies to the following area(s)
FiveM
Successfully tested on
Game builds: 3258
Platforms: Windows
Checklist
Fixes issues
https://www.reddit.com/r/GrandTheftAutoV_PC/comments/chcwzq/gta_v_insane_fps_drops_when_looking_towards_city/