Lpips metric by Yuze-e20 · Pull Request #1474 · modelscope/DiffSynth-Studio

Yuze-e20 · 2026-06-01T09:45:02Z

"feat(metrics): add LPIPS image quality metric

diffsynth/models/lpips.py: AlexNet/VGG16/SqueezeNet1.1 backbones, ScalingLayer, NetLinLayer, LPIPSModel, LPIPSCompute
diffsynth/metrics/lpips.py: LPIPSMetric.from_pretrained(net='alex'|'vgg'|'squeeze') with auto-derived ModelConfig
examples/image_quality_metric/lpips.py: img-vs-img and dir-vs-dir examples
Register 3 entries in image_metrics_series + identity state_dict converter
Numerically bit-exact with the official lpips package (verified on PerceptualSimilarity/imgs/ex_dir{0,1})"

Copilot

Pull request overview

Note

Copilot was unable to run its full agentic suite in this review.

Adds an LPIPS (Learned Perceptual Image Patch Similarity) image-quality metric to diffsynth.metrics, including model registrations and an example script.

Changes:

Introduce LPIPSModel + LPIPSCompute to run LPIPS on single images or stem-matched directory pairs.
Add LPIPSMetric.from_pretrained(...) and export it from diffsynth.metrics.
Register three LPIPS backbones (alex/vgg/squeeze) in model_configs, plus an example script.
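To make "stem-matched directory pairs" concrete, here is a minimal sketch of how two directories could be paired by file stem. The PR's own `_pair_directories_by_stem` is not shown in this conversation, so the function name, the extension list, and the error handling below are all assumptions for illustration only.

```python
from pathlib import Path
from typing import List, Tuple

# Hypothetical extension filter; the PR's actual filter is not shown.
IMAGE_EXTS = {".png", ".jpg", ".jpeg", ".bmp", ".webp"}


def pair_directories_by_stem(dir_a, dir_b) -> List[Tuple[Path, Path]]:
    """Pair image files whose names match once the extension is dropped."""

    def index(directory):
        # Map "0001" -> Path(".../0001.png") for every image file in the directory.
        return {
            p.stem: p
            for p in sorted(Path(directory).iterdir())
            if p.is_file() and p.suffix.lower() in IMAGE_EXTS
        }

    files_a, files_b = index(dir_a), index(dir_b)
    common = sorted(files_a.keys() & files_b.keys())
    if not common:
        raise ValueError(f"No stem-matched images found in {dir_a} and {dir_b}.")
    return [(files_a[stem], files_b[stem]) for stem in common]
```

With this approach `a/0001.png` pairs with `b/0001.jpg`, while files present in only one directory are silently skipped; a stricter variant could raise on unmatched files instead.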

Reviewed changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 7 comments.

File	Description
examples/image_quality_metric/lpips.py	Example usage downloading a dataset and computing LPIPS for image-vs-image and dir-vs-dir.
diffsynth/utils/state_dict_converters/image_metrics.py	Adds an LPIPS state-dict converter for ImageMetrics weights.
diffsynth/models/lpips.py	Implements LPIPS backbones, scaling/linear layers, and compute wrapper handling files/dirs and resizing.
diffsynth/metrics/lpips.py	Adds `LPIPSMetric` with `from_pretrained` integration into the model download/load flow.
diffsynth/metrics/__init__.py	Exports `LPIPSMetric` in the package API.
diffsynth/configs/model_configs.py	Registers three LPIPS model entries (alex/vgg/squeeze) with hashes and extra kwargs.
PR_LPIPS.md	PR documentation describing API/behavior/weights/test plan.

+def _open_rgb(image: ImageInput) -> Image.Image:
+    if isinstance(image, (str, os.PathLike)):
+        image = Image.open(image)
+    if not isinstance(image, Image.Image):
+        raise TypeError(f"LPIPS expects PIL images or image paths, got {type(image)}.")
+    return image.convert("RGB")


+    def _compute_pairs(self, pairs, do_resize: bool) -> float:
+        scores = []
+        batch_size = max(1, self.batch_size)
+        for start in range(0, len(pairs), batch_size):
+            chunk = pairs[start : start + batch_size]
+            xs0 = torch.stack([self._to_tensor(_open_rgb(a), do_resize) for a, _ in chunk]).to(self.device)
+            xs1 = torch.stack([self._to_tensor(_open_rgb(b), do_resize) for _, b in chunk]).to(self.device)
+            scores.append(self.model(xs0, xs1).detach().cpu())


+    def __init__(
+        self,
+        model: LPIPSModel,
+        device: Union[str, torch.device] = "cpu",
+        batch_size: int = 16,
+        num_workers: int = 0,
+        target_size: int = 512,
+    ):


+def ImageMetricsLPIPSStateDictConverter(state_dict):
+    return {key: state_dict[key] for key in state_dict}


+
+from ..core import ModelConfig
+from ..core.device.npu_compatible_device import get_device_type
+from ..models.lpips import LPIPSModel, LPIPS_NET_CHOICES, LPIPSCompute


+dataset_snapshot_download(
+    "DiffSynth-Studio/diffsynth_example_dataset",
+    allow_file_pattern=["flux/FLUX.1-dev/*", "flux2/FLUX.2-dev/*"],
+    local_dir="./data/diffsynth_example_dataset",
+)
+metric = LPIPSMetric.from_pretrained(
+    net="alex",
+    device="cuda",
+)


+        if a_is_dir:
+            pairs = _pair_directories_by_stem(image_a, image_b)
+            sizes = set()
+            for path_a, path_b in pairs:
+                with Image.open(path_a) as ia, Image.open(path_b) as ib:
+                    sizes.add(ia.size)
+                    sizes.add(ib.size)
+            do_resize = len(sizes) > 1
+            return self._compute_pairs(pairs, do_resize=do_resize)


gemini-code-assist

Code Review

This pull request introduces the LPIPS (Learned Perceptual Image Patch Similarity) image-quality metric to diffsynth.metrics, supporting AlexNet, VGG, and SqueezeNet backbones. Key feedback includes a critical issue where the model is not set to evaluation mode (.eval()), which causes non-deterministic scores due to active dropout layers during inference. Additionally, the num_workers parameter is currently unused, and it is recommended to add a force_resize option to optimize directory comparisons by skipping the expensive image size-checking loop.

gemini-code-assist · 2026-06-01T09:47:04Z

+        self._raw_transform = transforms.ToTensor()
+        self.to(device)


The model is not set to evaluation mode (.eval()). Since LPIPSModel contains _NetLinLayer which uses nn.Dropout, running the metric in default training mode will cause dropout to randomly zero out activations during inference. This makes the LPIPS score non-deterministic and incorrect. Setting the module to evaluation mode disables dropout.

Suggested change

-        self._raw_transform = transforms.ToTensor()
-        self.to(device)
+        self._raw_transform = transforms.ToTensor()
+        self.to(device)
+        self.eval()
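The effect described in this comment can be reproduced in isolation. The sketch below uses a stand-in head (a dropout layer followed by a 1x1 convolution), not the PR's `_NetLinLayer`, whose exact definition is not shown here; the channel count and input shape are arbitrary assumptions.

```python
import torch
import torch.nn as nn

# Stand-in for a dropout-bearing linear head; NOT the PR's _NetLinLayer.
head = nn.Sequential(
    nn.Dropout(p=0.5),
    nn.Conv2d(8, 1, kernel_size=1, bias=False),
)
x = torch.ones(1, 8, 4, 4)

# A freshly constructed nn.Module starts in training mode.
starts_in_training_mode = head.training

torch.manual_seed(0)
with torch.no_grad():
    # Training mode: each forward pass samples a new dropout mask.
    train_a, train_b = head(x), head(x)

head.eval()
with torch.no_grad():
    # Evaluation mode: dropout is the identity, so outputs repeat exactly.
    eval_a, eval_b = head(x), head(x)

train_matches = torch.equal(train_a, train_b)
eval_matches = torch.equal(eval_a, eval_b)
```

Note that `torch.no_grad()` only disables gradient tracking; it does not switch dropout off, which is why the explicit `.eval()` call matters.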

gemini-code-assist · 2026-06-01T09:47:04Z

+        model: LPIPSModel,
+        device: Union[str, torch.device] = "cpu",
+        batch_size: int = 16,
+        num_workers: int = 0,


The num_workers parameter is accepted in the constructor but is completely unused. The batch loop in _compute_pairs loads and processes images sequentially on the main thread. If parallel loading is not planned, consider removing this parameter to avoid confusion, or document that it is currently unused.
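If parallel loading were wanted, one possible way to give `num_workers` a meaning is to decode images on a thread pool before stacking each batch. This is only a sketch under assumptions: `load_pairs` and its `load_fn` argument are hypothetical names, and nothing here reflects how the PR ultimately handled the parameter.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Sequence, Tuple, TypeVar

P = TypeVar("P")  # path-like input
T = TypeVar("T")  # loaded result, e.g. a tensor


def load_pairs(
    pairs: Sequence[Tuple[P, P]],
    load_fn: Callable[[P], T],
    num_workers: int = 0,
) -> List[Tuple[T, T]]:
    """Apply load_fn to both sides of every pair, optionally on worker threads."""
    flat = [item for pair in pairs for item in pair]
    if num_workers <= 0:
        # Sequential fallback, equivalent to loading on the main thread.
        loaded = [load_fn(item) for item in flat]
    else:
        with ThreadPoolExecutor(max_workers=num_workers) as pool:
            # Executor.map returns results in input order.
            loaded = list(pool.map(load_fn, flat))
    # Re-zip the flat list back into (a, b) pairs.
    return list(zip(loaded[0::2], loaded[1::2]))
```

Inside a batch loop, `load_fn` could be a function that opens an image and converts it to a tensor; threads tend to help when decoding is I/O-bound, and the benefit should be measured rather than assumed.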

gemini-code-assist · 2026-06-01T09:47:04Z

+    def __init__(
+        self,
+        model: LPIPSModel,
+        device: Union[str, torch.device] = "cpu",
+        batch_size: int = 16,
+        num_workers: int = 0,
+        target_size: int = 512,
+    ):


Comparing large directories requires opening every image file twice (once to check sizes, and once to load pixel data), which can be a significant performance bottleneck. Adding a force_resize parameter would allow users to skip the size-checking loop entirely when they already know they want to resize all images to target_size.

Suggested change

-    def __init__(
-        self,
-        model: LPIPSModel,
-        device: Union[str, torch.device] = "cpu",
-        batch_size: int = 16,
-        num_workers: int = 0,
-        target_size: int = 512,
-    ):
+    def __init__(
+        self,
+        model: LPIPSModel,
+        device: Union[str, torch.device] = "cpu",
+        batch_size: int = 16,
+        num_workers: int = 0,
+        target_size: int = 512,
+        force_resize: bool = False,
+    ):

gemini-code-assist · 2026-06-01T09:47:04Z

+        if a_is_dir:
+            pairs = _pair_directories_by_stem(image_a, image_b)
+            sizes = set()
+            for path_a, path_b in pairs:
+                with Image.open(path_a) as ia, Image.open(path_b) as ib:
+                    sizes.add(ia.size)
+                    sizes.add(ib.size)
+            do_resize = len(sizes) > 1
+            return self._compute_pairs(pairs, do_resize=do_resize)


Use the force_resize attribute to skip the expensive size-checking loop when comparing directories.

Suggested change

-        if a_is_dir:
-            pairs = _pair_directories_by_stem(image_a, image_b)
-            sizes = set()
-            for path_a, path_b in pairs:
-                with Image.open(path_a) as ia, Image.open(path_b) as ib:
-                    sizes.add(ia.size)
-                    sizes.add(ib.size)
-            do_resize = len(sizes) > 1
-            return self._compute_pairs(pairs, do_resize=do_resize)
+        if a_is_dir:
+            pairs = _pair_directories_by_stem(image_a, image_b)
+            if self.force_resize:
+                do_resize = True
+            else:
+                sizes = set()
+                for path_a, path_b in pairs:
+                    with Image.open(path_a) as ia, Image.open(path_b) as ib:
+                        sizes.add(ia.size)
+                        sizes.add(ib.size)
+                do_resize = len(sizes) > 1
+            return self._compute_pairs(pairs, do_resize=do_resize)
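A complementary variation, not part of the suggestion itself, is to stop scanning as soon as a second distinct size appears, since `len(sizes) > 1` is already decided at that point. The sketch below assumes a hypothetical helper `needs_resize` and an injected `get_size` callable (for example, one that opens a file with PIL and returns `image.size`).

```python
from typing import Callable, Iterable, Optional, Tuple, TypeVar

P = TypeVar("P")
Size = Tuple[int, int]


def needs_resize(
    pairs: Iterable[Tuple[P, P]],
    get_size: Callable[[P], Size],
    force_resize: bool = False,
) -> bool:
    """Return True if images must be resized to a common size."""
    if force_resize:
        # Caller already decided; skip every size lookup.
        return True
    first: Optional[Size] = None
    for path_a, path_b in pairs:
        for path in (path_a, path_b):
            size = get_size(path)
            if first is None:
                first = size
            elif size != first:
                # Two distinct sizes seen: same answer as len(sizes) > 1.
                return True
    return False
```

When all images share one size the full scan is still required, so this only shortens the mixed-size case; `force_resize=True` remains the way to avoid the scan entirely.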

gemini-code-assist · 2026-06-01T09:47:04Z

+    def from_pretrained(
+        cls,
+        net: str = "alex",
+        model_config: ModelConfig = None,
+        device: torch.device = get_device_type(),
+        batch_size: int = 16,
+        num_workers: int = 0,
+        target_size: int = 512,
+        vram_limit: float = None,
+    ):


Expose the force_resize parameter in from_pretrained to allow skipping the expensive size-checking loop.

    def from_pretrained(
        cls,
        net: str = "alex",
        model_config: ModelConfig = None,
        device: torch.device = get_device_type(),
        batch_size: int = 16,
        num_workers: int = 0,
        target_size: int = 512,
        vram_limit: float = None,
        force_resize: bool = False,
    ):

gemini-code-assist · 2026-06-01T09:47:05Z

+        compute_model = LPIPSCompute(
+            model=backbone,
+            device=device,
+            batch_size=batch_size,
+            num_workers=num_workers,
+            target_size=target_size,
+        )


Pass force_resize to LPIPSCompute.

Suggested change

-        compute_model = LPIPSCompute(
-            model=backbone,
-            device=device,
-            batch_size=batch_size,
-            num_workers=num_workers,
-            target_size=target_size,
-        )
+        compute_model = LPIPSCompute(
+            model=backbone,
+            device=device,
+            batch_size=batch_size,
+            num_workers=num_workers,
+            target_size=target_size,
+            force_resize=force_resize,
+        )

Yuze-e20 added 2 commits June 1, 2026 17:24

try

29d8b79

Copilot AI review requested due to automatic review settings June 1, 2026 09:45

Copilot AI reviewed Jun 1, 2026

gemini-code-assist Bot reviewed Jun 1, 2026

Yuze-e20 added 6 commits June 1, 2026 22:52

add default target size

4dbbbc5

remove .gitignore and md

3329421

fix target

5db7301

fix example, converter and numworker

414dfc7

merged

0a82d1e

.gitignore

2835af4

Artiprocher merged commit e5f88f0 into modelscope:main Jun 2, 2026

Artiprocher merged 8 commits into modelscope:main from Yuze-e20:lpips-metric

3 participants
