Conversation
The API in #18371 sure is prettier...
Any other ideas for names instead of ...
Why not add the volume to the scene like in #18371?
Hmm, that'll be better indeed. I guess the way to do it would be by checking which volume the object being rendered is inside of.
if ( _shMaterial !== null ) _shMaterial.dispose();
_shMaterial = new ShaderMaterial( {
Instead of using a new shader material, could you use the existing method LightProbeGenerator.fromCubeRenderTarget() like in #18371? The input render target would be the cube camera render target, as in the existing code.
LightProbeGenerator.fromCubeRenderTarget() computes only one light probe and it's on the CPU.
This PR computes many of them on the GPU. The sponza one computes 147 light probes (7x7x3).
> LightProbeGenerator.fromCubeRenderTarget() computes only one light probe and it's on the CPU.
Ah right, it indeed evaluates the data on the CPU side.
The light probe volume iterates through all its light probes, updates the cube camera and then extracts the light probe data into the batch render target. There is a render call per light probe, and the logic for that could be placed in LightProbeGenerator. However, given how the data are organized in the batch render target, it's maybe too use-case-specific for LightProbeGenerator.
I wonder how useful LightProbe and LightProbeGenerator are... 🤔
I think without something like LightProbeVolume, their utility is indeed questionable. If they are not integrated in this PR, they may become obsolete.
That aside, #18371 was never finished because baking in LDR was considered incorrect in the context of PBR, see #18371 (comment). Hence a baking solution in Blender or an external tool was intended, so LightProbeVolume would just read in the exported baked lighting.
A lot of discussion happened around https://github.com/gillesboisson/threejs-probes-test but I'm not sure about its state.
In general, asking users to do the baking in an external tool isn't ideal, imo. It would fit three.js better if this could be done directly with LightProbeVolume.
This PR has a different approach but I'm not yet sure if HDR is correctly implemented. The feedback of @donmccurdy and @WestLangley is important here.
> This PR has a different approach but I'm not yet sure if HDR is correctly implemented.
I've searched around and also asked a colleague from my former university. He recommended reading An Efficient Representation for Irradiance Environment Maps.
This paper is some sort of gold standard for SH-based diffuse light probes. The link to the PDF is: https://graphics.stanford.edu/papers/envmap/envmap.pdf
One outcome of this analysis: the implementation of this PR uses L1 spherical harmonics, which is insufficient for HDR (and diffuse light in general). The industry standard for diffuse GI with SHs is based on the above paper and requires L2 spherical harmonics, as in our SphericalHarmonics3 math class and thus our current LightProbe implementation. L2 can retain almost all characteristics of an irradiance environment map while L1 can't.
If you ask Gemini why L1 does not properly work with HDR, you get this:
Gemini
While L1 Spherical Harmonics (SH) can store HDR data and provide a basic directional sense to your lighting, they are generally considered insufficient for high-quality real-time rendering, even for indirect diffuse light.
Here is the breakdown of why L2 is the industry standard and where L1 falls short.
1. The "Ramamoorthi" Standard
In their seminal 2001 paper, Ravi Ramamoorthi and Pat Hanrahan demonstrated that the diffuse reflection (Lambertian BRDF) acts as a low-pass filter on the environment.
L0 (1 coefficient): Represents the average ambient color (DC term).
L1 (3 additional coefficients): Represents the "linear" directionality. It tells you which side is brighter.
L2 (5 additional coefficients): Adds the quadratic "shape."
Ramamoorthi showed that L2 SH (9 coefficients total) captures roughly 99% of the energy of the diffuse irradiance map. Dropping down to L1 loses the subtle "clamping" and shaping that makes indirect light look grounded.
2. Why L1 is usually "Not Enough"
The "Ringing" and Negative Light Problem
L1 SH effectively models lighting as a constant term plus a vector. If your HDR light probe has a very bright localized light source (like a window or a sun-bounce), the L1 representation can often result in negative radiance on the opposite side of the sphere. This leads to "black holes" or weird artifacts in your shadows unless you apply heavy-handed (and often inaccurate) rescaling or windowing.
Lack of "Flatness"
Diffuse irradiance isn't just a directional gradient; it has a specific "lobed" shape caused by the cosine term in the rendering equation.
L1 is too "pointy." It struggles to represent the way light wraps around a curved surface, often making objects look more like they are lit by a single weak directional light rather than a rich indirect environment.
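The "negative light" effect Gemini describes can be checked with a few lines of plain JS. This is only an illustrative sketch, not code from the PR: it projects a single bright directional sample into the raw L1 basis (constants from the Ramamoorthi/Hanrahan paper) and evaluates the truncated expansion on the opposite side. The sign flip survives the cosine-convolution weights as well.

```javascript
// Numeric check of the "negative light" claim: project a single bright
// directional sample into L1 SH, then evaluate on the opposite side.
// Real-valued SH basis constants (Ramamoorthi & Hanrahan 2001).
const Y00 = 0.282095; // l = 0
const Y1 = 0.488603;  // l = 1, scaled by y, z, x

// Project radiance R arriving from direction d into 4 L1 coefficients.
function projectL1( R, d ) {

	return [ R * Y00, R * Y1 * d.y, R * Y1 * d.z, R * Y1 * d.x ];

}

// Reconstruct radiance in direction d from the coefficients.
function evalL1( c, d ) {

	return c[ 0 ] * Y00 + c[ 1 ] * Y1 * d.y + c[ 2 ] * Y1 * d.z + c[ 3 ] * Y1 * d.x;

}

const sun = { x: 0, y: 0, z: 1 };                 // bright localized source
const c = projectL1( 10.0, sun );                 // HDR intensity 10
const back = evalL1( c, { x: 0, y: 0, z: - 1 } ); // opposite hemisphere

console.log( back ); // negative radiance (≈ -1.59): the "black hole" artifact
```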
I'll give it a go!
Asked Claude about performance implications:
The cost breaks down into three parts:
Bake time: ~same. The dominant cost is rendering cubemaps at each probe position — that's unchanged. The SH projection shader does 9 multiply-adds instead of 4 per texel, but that's trivial compared to the cubemap renders. Repacking writes 7 slices instead of 3, also negligible.
Memory: ~2.3x more texture memory, but tiny in absolute terms. 7 RGBA float 3D textures instead of 3. For a 6×6×6 grid, that's ~24KB vs ~10KB. Even a 12×12×12 grid would be under 200KB. Not a concern.
Per-frame shader cost: the only real difference. The evaluation shader goes from 3 texture() calls + 4 multiply-adds to 7 texture() calls + 9 multiply-adds. The 3D texture samples with LinearFilter (trilinear interpolation) are the bottleneck — going from 3 to 7 samples per fragment. That said, this runs only on surfaces inside a probe volume, and it's just one part of the overall lighting calculation.
In practice, the per-frame cost increase is modest. The 4 extra 3D texture samples are the only thing that matters, and trilinear 3D lookups are well-optimized on modern GPUs. Most game engines (Unreal, Unity) use L2 for their probes without concern — the quality improvement far outweighs the cost.
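For reference, the memory figures in the quote check out with quick arithmetic, assuming RGBA float32 textures (4 channels × 4 bytes = 16 bytes per texel):

```javascript
// Back-of-envelope check of the memory figures quoted above,
// assuming RGBA float32 3D textures (16 bytes per texel).
const bytesPerTexel = 4 * 4;

function gridBytes( dim, textureCount ) {

	return dim * dim * dim * textureCount * bytesPerTexel;

}

console.log( gridBytes( 6, 7 ) );  // L2, 6×6×6    → 24192 bytes (~24 KB)
console.log( gridBytes( 6, 3 ) );  // L1, 6×6×6    → 10368 bytes (~10 KB)
console.log( gridBytes( 12, 7 ) ); // L2, 12×12×12 → 193536 bytes (<200 KB)
```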
Sure does look better:
L1
https://raw.githack.com/mrdoob/three.js/gi-l1/examples/webgl_lightprobes.html
https://raw.githack.com/mrdoob/three.js/gi-l1/examples/webgl_lightprobes_complex.html
https://raw.githack.com/mrdoob/three.js/gi-l1/examples/webgl_lightprobes_sponza.html
L2
https://raw.githack.com/mrdoob/three.js/gi/examples/webgl_lightprobes.html
https://raw.githack.com/mrdoob/three.js/gi/examples/webgl_lightprobes_complex.html
https://raw.githack.com/mrdoob/three.js/gi/examples/webgl_lightprobes_sponza.html
Performance seems pretty good and sponza looks much better, yep!
|
GPU-resident L1 SH probe grid with hardware trilinear interpolation. Cubemap rendering, SH projection and texture packing run entirely on the GPU with zero CPU readback. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Make LightProbeVolume extend Object3D so it can be added to the scene graph with scene.add(), enabling multiple volumes per scene and per-object volume lookup based on bounding box containment. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
src/renderers/shaders/ShaderChunk/lights_fragment_begin.glsl.js
I still need to clean up the code and tweak the API design too.
uniform highp sampler3D probeGridSH3;
uniform highp sampler3D probeGridSH4;
uniform highp sampler3D probeGridSH5;
uniform highp sampler3D probeGridSH6;
One thing we have to consider is this block.
The number of active texture units in shaders is quite restricted. On my Mac mini, it's 16 in the fragment shader. If a light probe volume is used, it requires seven 3D textures to store the packed SH data (that's over 40% of the available texture units).
We should at least discuss the option to store the SH data just in one texture and do the interpolation manually.
Right now, I fear in more complex scenarios with multiple shadow casting lights and a 3D asset with multiple textures we could hit the max texture units limit rather quickly.
TL;DR
- L1: 3 textures, no HDR
- L2: 7 textures, HDR but could reach max 16 units
- L2: 7 textures could be packed to 3 but we would lose trilinear hardware interpolation
🤷♂️
WebGL 2.0 guarantees a minimum of 16 fragment texture units (MAX_TEXTURE_IMAGE_UNITS). Most mobile GPUs offer exactly 16.
A typical PBR scene might use:
- Material maps: ~5-6 (diffuse, normal, roughness, metalness, ao, emissive)
- Environment: 1-2
- Shadow maps: 1-4
- Probe volume L1: 3 / L2: 7
With L1 (3 textures): ~12 total — comfortable.
With L2 (7 textures): ~16 total — right at the limit.
Can we pack into a single texture? Yes, but with tradeoffs:
- We currently rely on hardware `LinearFilter` for free trilinear interpolation (one `texture()` call = 8 texel fetches + lerps internally)
- With a single texture, we'd need to pack data along one axis (e.g., width × 3 or × 7) and do `texelFetch` + manual trilinear interpolation in the shader
- That's 4 fetches + bilinear math per data slot × 3 or 7 slots = significantly more shader work
For the gi branch (L1, 3 textures), I wouldn't worry — 3 slots is very reasonable. For L2, it could be tight on low-end phones with complex materials. One middle-ground option: pack L2's 7 textures into 3 by using wider texels with manual addressing, keeping hardware interpolation on the spatial axes.
Want me to explore a packed single-texture implementation?
Tell me more about the idea of packing 7 textures into 3 and keeping hardware interpolation. Are there any downsides?
Actually, on closer thought, packing 7 into 3 while keeping hardware trilinear doesn't work. Here's why:
Hardware LinearFilter on a 3D texture interpolates between adjacent texels in all 3 axes. If we extend a dimension (e.g., making width N×3 to pack 3 coefficient groups side by side), the filter will bleed between coefficient groups at the boundaries — producing incorrect values.
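The boundary bleed is easy to demonstrate in 1D with a toy linear filter (an illustrative sketch, not renderer code): two unrelated coefficient groups packed side by side along one axis get blended by the filter near the seam.

```javascript
// 1D illustration of why hardware linear filtering bleeds across packed
// coefficient groups: two unrelated groups packed side by side.
const groupA = [ 1.0, 1.0, 1.0 ]; // coefficient group A (constant 1)
const groupB = [ 9.0, 9.0, 9.0 ]; // unrelated group B (constant 9)
const packed = groupA.concat( groupB );

// Emulate LinearFilter: sample at a fractional texel coordinate.
function linearSample( data, u ) {

	const i = Math.min( Math.floor( u ), data.length - 2 );
	const t = u - i;
	return data[ i ] * ( 1 - t ) + data[ i + 1 ] * t;

}

// Sampling at the seam between the groups mixes A and B:
console.log( linearSample( packed, 2.5 ) ); // 5 — neither A's 1 nor B's 9
```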
The only ways to use fewer textures are:
1. Single texture + manual trilinear — Pack everything into one 3D texture with extended dimensions, use `texelFetch` (no filtering), and compute trilinear interpolation manually in the shader. That's 8 fetches × 7 data slots = 56 `texelFetch` calls + lerp math. Expensive.
2. Single texture + manual bilinear on XY, nearest on Z — Cheaper (4 fetches × 7 = 28), but produces visible banding along the Z axis.
3. Stay with L1 — 3 textures is very comfortable within the 16-unit budget. L1 captures ~90% of diffuse irradiance energy and is often good enough visually.
So the real tradeoff is: L2 quality (7 textures) vs. L1 compatibility (3 textures). There isn't a clean middle ground that keeps both hardware interpolation and low texture count.
Given that the gi branch already uses L1 with 3 textures, I'd say that's the pragmatic choice for broad device support. L2 could be offered as an opt-in for desktop/high-end.
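For reference, the manual trilinear of option 1 per data slot would look roughly like this in plain JS (the `fetch` callback stands in for `texelFetch`; all names are illustrative, not from the branch):

```javascript
// Sketch of manual trilinear interpolation: 8 texelFetch-style lookups
// plus lerps, i.e. what LinearFilter does for us in hardware.
function trilinear( fetch, x, y, z ) {

	const x0 = Math.floor( x ), y0 = Math.floor( y ), z0 = Math.floor( z );
	const tx = x - x0, ty = y - y0, tz = z - z0;
	const lerp = ( a, b, t ) => a * ( 1 - t ) + b * t;

	// 8 corner fetches.
	const c000 = fetch( x0, y0, z0 ),         c100 = fetch( x0 + 1, y0, z0 );
	const c010 = fetch( x0, y0 + 1, z0 ),     c110 = fetch( x0 + 1, y0 + 1, z0 );
	const c001 = fetch( x0, y0, z0 + 1 ),     c101 = fetch( x0 + 1, y0, z0 + 1 );
	const c011 = fetch( x0, y0 + 1, z0 + 1 ), c111 = fetch( x0 + 1, y0 + 1, z0 + 1 );

	// Bilinear on XY at both Z slices, then lerp along Z.
	const s0 = lerp( lerp( c000, c100, tx ), lerp( c010, c110, tx ), ty );
	const s1 = lerp( lerp( c001, c101, tx ), lerp( c011, c111, tx ), ty );
	return lerp( s0, s1, tz );

}

// Sanity check: trilinear interpolation of a linear field is exact.
const field = ( x, y, z ) => x + 2 * y + 3 * z;
console.log( trilinear( field, 0.25, 0.5, 0.75 ) ); // 3.5
```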
> L1 captures ~90% of diffuse irradiance energy and is often good enough visually.
Just for the reference, Gemini provides different numbers (only 75% for L1 and 99% for L2):
Comparison: L1 vs. L2
| Feature | L1 Spherical Harmonics | L2 Spherical Harmonics |
|---|---|---|
| Coefficients | 4 per color channel (12 total) | 9 per color channel (27 total) |
| Memory | Low (Fits in a single float4 per channel) | Moderate |
| Accuracy | ~75% of irradiance energy | ~99% of irradiance energy |
| Visuals | Basic "dark-to-light" gradient | Natural, "wrapped" soft lighting |
| Usage | Mobile, very old hardware, or tiny dynamic objects | Standard for PC/Console light probes |
Personally, I would not vote for the L1 approach (at least if we have to decide on a single approach) because of the quality tradeoffs. If we can't find a different packing strategy, I vote to start with L2 as currently implemented and wait for user feedback. If developers actually complain, maybe we can make the SH type configurable ("performance" (L1) vs. "quality" (L2) mode, with "quality" as the default).
How does that sound?
I asked again and it seems like we may be able to bring it down to 5 textures.
I'll give it a try and compare the colors.
However, there's a common game engine trick: store L2 coefficients as greyscale (luminance only). L1 provides most of the color variation; L2 mainly adds directional sharpness.
- L1 color: 4 × vec3 = 12 floats → 3 textures (same as now)
- L2 greyscale: 5 × float = 5 floats → packed into 2 more textures
Total: 5 textures for L1 color + L2 luminance. The evaluation would look like:
```glsl
// L1 (color)
vec3 result = c0 * 0.886227;
result += c1 * 2.0 * 0.511664 * y;
result += c2 * 2.0 * 0.511664 * z;
result += c3 * 2.0 * 0.511664 * x;

// L2 (greyscale, applied uniformly to RGB)
float l2 = c4g * 2.0 * 0.429043 * x * y;
l2 += c5g * 2.0 * 0.429043 * y * z;
l2 += c6g * ( 0.743125 * z * z - 0.247708 );
l2 += c7g * 2.0 * 0.429043 * x * z;
l2 += c8g * 0.429043 * ( x * x - y * y );
result += vec3( l2 );
```
So the options are:
- 3 textures — L1 color
- 5 textures — L1 color + L2 greyscale
- 7 textures — full L2 color
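As a sanity check of the packing idea, here is a plain-JS port of the shader math above (coefficient values are made up for illustration): whenever the full-L2 coefficients c4..c8 happen to be greyscale, the 5-texture variant reproduces the full 7-texture result.

```javascript
// Plain-JS port of the evaluation above. When the full-L2 coefficients
// c4..c8 are greyscale, "L1 color + L2 greyscale" matches full L2 color.
const add = ( a, b ) => a.map( ( v, i ) => v + b[ i ] );
const mul = ( a, s ) => a.map( ( v ) => v * s );

// The 9 basis terms with the irradiance convolution weights baked in.
function basis( x, y, z ) {

	return [
		0.886227,
		2.0 * 0.511664 * y, 2.0 * 0.511664 * z, 2.0 * 0.511664 * x,
		2.0 * 0.429043 * x * y, 2.0 * 0.429043 * y * z,
		0.743125 * z * z - 0.247708,
		2.0 * 0.429043 * x * z, 0.429043 * ( x * x - y * y )
	];

}

// c: 9 vec3 coefficients (full L2 color, 7 textures).
function evalFullL2( c, x, y, z ) {

	const b = basis( x, y, z );
	return c.reduce( ( acc, ci, i ) => add( acc, mul( ci, b[ i ] ) ), [ 0, 0, 0 ] );

}

// c: 4 vec3 (L1 color), g: 5 scalars (L2 greyscale, 5 textures).
function evalL1ColorL2Grey( c, g, x, y, z ) {

	const b = basis( x, y, z );
	let result = [ 0, 0, 0 ];
	for ( let i = 0; i < 4; i ++ ) result = add( result, mul( c[ i ], b[ i ] ) );

	let l2 = 0;
	for ( let i = 0; i < 5; i ++ ) l2 += g[ i ] * b[ i + 4 ];

	return add( result, [ l2, l2, l2 ] );

}

// Made-up coefficients and a unit direction.
const c = [ [ 0.8, 0.7, 0.6 ], [ 0.1, 0.2, 0.1 ], [ 0.3, 0.2, 0.2 ], [ 0.05, 0.1, 0.15 ] ];
const g = [ 0.02, - 0.01, 0.04, 0.03, - 0.02 ];
const full = c.concat( g.map( ( v ) => [ v, v, v ] ) );

console.log( evalFullL2( full, 0.2673, 0.5345, 0.8018 ) );
console.log( evalL1ColorL2Grey( c, g, 0.2673, 0.5345, 0.8018 ) ); // same vec3
```

The color shift only appears when the real c4..c8 have chroma, which is exactly the information the greyscale packing discards.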
The sponza demo is a tiny bit more blue but it may be okay?
L1
https://raw.githack.com/mrdoob/three.js/gi-l1/examples/webgl_lightprobes.html
https://raw.githack.com/mrdoob/three.js/gi-l1/examples/webgl_lightprobes_complex.html
https://raw.githack.com/mrdoob/three.js/gi-l1/examples/webgl_lightprobes_sponza.html
L2 color
https://raw.githack.com/mrdoob/three.js/gi/examples/webgl_lightprobes.html
https://raw.githack.com/mrdoob/three.js/gi/examples/webgl_lightprobes_complex.html
https://raw.githack.com/mrdoob/three.js/gi/examples/webgl_lightprobes_sponza.html
L1 color + L2 greyscale
https://raw.githack.com/mrdoob/three.js/gi-l2lum/examples/webgl_lightprobes.html
https://raw.githack.com/mrdoob/three.js/gi-l2lum/examples/webgl_lightprobes_complex.html
https://raw.githack.com/mrdoob/three.js/gi-l2lum/examples/webgl_lightprobes_sponza.html
I'll continue at some point. I got overwhelmed with the back and forth.
There's also quite a bit more to solve.
Now that it's part of the scene graph, I need to work on the editor integration, JSON format integration, and figure out when the baking should occur. Also need to finish the API.
And that's just what I can think of.
Indeed, it's a complex task. But I think we're investing in the right spot since it's an awesome feature and a great alternative to the more expensive dynamic approaches like SSGI.
When my AI quotas are refreshed next week, I'll check out this branch and continue with the improved packing as described in my earlier post.
Editor and serialization/deserialization support is something we should tackle in a different PR. Once we get the number of required texture units for the GI down to 1 and polish the API, the PR is ready to merge, imo.
Okay, I have updated the branch with the texture atlas approach, meaning we allocate just a single 3D texture unit for the GI. The visual result should not be affected by that. For comparison:
Previous:
https://raw.githack.com/mrdoob/three.js/5f513a4ff52875914f03fb1585ecac1a208f10dc/examples/webgl_lightprobes.html
https://raw.githack.com/mrdoob/three.js/5f513a4ff52875914f03fb1585ecac1a208f10dc/examples/webgl_lightprobes_complex.html
https://raw.githack.com/mrdoob/three.js/5f513a4ff52875914f03fb1585ecac1a208f10dc/examples/webgl_lightprobes_sponza.html
Texture Atlas:
https://raw.githack.com/mrdoob/three.js/22b2ff6699e6480d6c0e83671128ba3aec5f9714/examples/webgl_lightprobes.html
https://raw.githack.com/mrdoob/three.js/22b2ff6699e6480d6c0e83671128ba3aec5f9714/examples/webgl_lightprobes_complex.html
https://raw.githack.com/mrdoob/three.js/22b2ff6699e6480d6c0e83671128ba3aec5f9714/examples/webgl_lightprobes_sponza.html
We only use L2 SHs for best quality, like Unreal. Now with just one texture unit, I don't think we need a different mode as suggested earlier.
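For readers wondering how a single-texture atlas can address per-coefficient data, here is a purely hypothetical sketch (the actual layout in the branch may differ): the 9 coefficient slots are stacked as slabs along one axis of a single 3D texture, and manual interpolation must clamp samples within a slab to avoid the cross-slot bleeding discussed earlier.

```javascript
// Hypothetical atlas layout: coefficient slot i occupies the Z slab
// [ i * depth, ( i + 1 ) * depth ) of one 3D texture. Illustrative only.
function atlasCoord( grid, ix, iy, iz, slot ) {

	return { x: ix, y: iy, z: slot * grid.depth + iz };

}

const grid = { width: 7, height: 7, depth: 3 }; // e.g. the sponza 7×7×3 grid

console.log( atlasCoord( grid, 2, 4, 1, 0 ) ); // { x: 2, y: 4, z: 1 }
console.log( atlasCoord( grid, 2, 4, 1, 8 ) ); // { x: 2, y: 4, z: 25 }
```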
Looks good!
I'll have a proper look tomorrow 👌
During development, I've noticed that ... We could move things like ...
I wonder now: if the bake is done, could we consider freeing the baking resources? They should not be needed anymore once the final 3D render target with the SH grids is created. Unless the volume might be regenerated at a certain point or more volumes are added to the scene. Alternatively, we could free them in ...
There are multiple options, all valid to a certain extent, so I assume it depends on our preferences. I vote to clear the module-scope materials and render targets in ...
I did some clean-up commits and also removed the useless check for ...
That aside, the cube-map to light probe conversion logic seems to match ...
Are you okay with merging this PR? It would be nice to have this in for ...
I would also suggest waiting with a port for ...
I'll finish the PR tomorrow 👌
I'm excited about the new LightProbeVolume! It's like getting a new toy as a kid 😊
Glad you like it! Yeah, I felt like I still needed to do some idle thinking on this one. Going to change the API a bit...
Okay, the API fits a bit more with ... I'll rename the uniforms later.
Related issues: #16228, #18371
Description
Adds `LightProbeVolume`, a 3D grid of Spherical Harmonic irradiance probes that provides position-dependent diffuse global illumination for `WebGLRenderer`.

- `LinearFilter` for hardware trilinear interpolation
- `scene.add( volume )`

Usage
Examples
http://raw.githack.com/mrdoob/three.js/gi/examples/webgl_lightprobes.html
http://raw.githack.com/mrdoob/three.js/gi/examples/webgl_lightprobes_complex.html
http://raw.githack.com/mrdoob/three.js/gi/examples/webgl_lightprobes_sponza.html