Skip to content

fix(gta-core-five): optimize distant lod lights render#3984

Open
DaniGP17 wants to merge 3 commits into
citizenfx:masterfrom
DaniGP17:fix/distant-lod-lights-performance
Open

fix(gta-core-five): optimize distant lod lights render#3984
DaniGP17 wants to merge 3 commits into
citizenfx:masterfrom
DaniGP17:fix/distant-lod-lights-performance

Conversation

@DaniGP17
Copy link
Copy Markdown
Contributor

Goal of this PR

Optimize CLODLights::RenderDistantLODLights to reduce CPU overhead caused by distant LOD light rendering during scenes with many active lights, especially at night. The changes remove unnecessary per-frame work and improve hot-path efficiency, significantly reducing CPU time and improving overall frame rate. The impact appears to be more noticeable on AMD hardware, where in some cases performance improved up to 80%, while Intel systems showed an average FPS increase of around 30-40%.

This has been a collaboration with @divocbn and @DaniGP17.

Video showcase: https://www.youtube.com/watch?v=G4YqP0g5LoE

How is this PR achieving the goal

Rewrites and optimizes the distant LOD light rendering path used by CLODLights::RenderDistantLODLights. The implementation reduces unnecessary per-frame processing by improving culling logic, minimizing expensive math operations, avoiding unnecessary container overhead in hot loops, and improving memory access patterns when iterating and uploading light data to the render buffer.

This PR applies to the following area(s)

FiveM

Successfully tested on

Game builds: 3258
Platforms: Windows

Checklist

  • Code compiles and has been tested successfully.
  • Code explains itself well and/or is documented.
  • My commit message explains what the changes do and what they are for.
  • No extra compilation warnings are added by these changes.

Fixes issues

https://www.reddit.com/r/GrandTheftAutoV_PC/comments/chcwzq/gta_v_insane_fps_drops_when_looking_towards_city/

Co-Authored-By: divocbn <divocbn@gmail.com>
@github-actions github-actions Bot added the invalid Requires changes before it's considered valid and can be (re)triaged label May 13, 2026
@divocbn
Copy link
Copy Markdown
Contributor

divocbn commented May 13, 2026

fps at night go 🚀📈

@ook3D
Copy link
Copy Markdown
Contributor

ook3D commented May 13, 2026

I will mortgage my house to bribe someone to merge this, what a huge improvement 🫡

@qalle-git
Copy link
Copy Markdown

MERGE

@Whisperisgod
Copy link
Copy Markdown

Merge!

@radium-cfx radium-cfx added the needs manual verification PRs that need manual verification by testing the change locally label May 14, 2026
Co-Authored-By: DaniGP17 <dani.dcshop.gg@gmail.com>
@divocbn
Copy link
Copy Markdown
Contributor

divocbn commented May 14, 2026

Gogsi was able to demonstrate locally that some lights in Vinewood weren't displaying; this was because we were dereferencing the wrong object (lightdata instead of bucket) for the mapdata slot index. It now works perfectly fine, as tested by various users, with a minimum improvement of 80-100 fps on AMD and around 30-40 fps in the city on Intel.

@mickeygeecom
Copy link
Copy Markdown
Contributor

Huge find, good job!

Copy link
Copy Markdown
Contributor

@radium-cfx radium-cfx left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The performance improvements mostly come down to the missing prefetch instructions which behave wrong on modern CPUs (especially on AMDs Zen architecture).

Could you just no-op the 6 prefetchnta instructions in this function instead of reimplementing the entire function and then compare the performance again?

That should be enough to achieve similar results with less code changes.

if (!outputBuffer)
break;

// NOTE: using raw pointers here avoids atArray::operator[] overhead in this hot loop
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We're not supporting SPU-powered platforms so atArray is accessed without bound checks, no overhead I can see.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

didnt knew the bound checks are only for debug builds, which was causing overhead for me, doesn't matter anyway since we abandoned that version

lightCount += (totalLights - numStreetLights);
}

auto* outputBuffer = g_ReserveRenderBufferSpace(lightCount);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should use __restrict here, otherwise some optimization gets lost.

Co-Authored-By: DaniGP17 <dani.dcshop.gg@gmail.com>
Copy link
Copy Markdown
Contributor

@radium-cfx radium-cfx left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good now!

Thanks for your contribution, that is a great change!❤️

@divocbn
Copy link
Copy Markdown
Contributor

divocbn commented May 14, 2026

The performance improvements mostly come down to the missing prefetch instructions which behave wrong on modern CPUs (especially on AMDs Zen architecture).

Could you just no-op the 6 prefetchnta instructions in this function instead of reimplementing the entire function and then compare the performance again?

That should be enough to achieve similar results with less code changes.

cool, didnt know that. basically, we just noticed that the function takes longer than it should and tried to optimize it; learned something new again: performance now looks the same, updated the PR

@radium-cfx radium-cfx added ready-to-merge This PR is enqueued for merging and removed invalid Requires changes before it's considered valid and can be (re)triaged needs manual verification PRs that need manual verification by testing the change locally labels May 14, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

ready-to-merge This PR is enqueued for merging

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants