
Lpips metric #1474

Merged
Artiprocher merged 8 commits into modelscope:main from Yuze-e20:lpips-metric
Jun 2, 2026

@Yuze-e20 (Contributor) commented Jun 1, 2026

feat(metrics): add LPIPS image quality metric

  - diffsynth/models/lpips.py: AlexNet/VGG16/SqueezeNet1.1 backbones, ScalingLayer, NetLinLayer, LPIPSModel, LPIPSCompute
  - diffsynth/metrics/lpips.py: LPIPSMetric.from_pretrained(net='alex'|'vgg'|'squeeze') with auto-derived ModelConfig
  - examples/image_quality_metric/lpips.py: img-vs-img and dir-vs-dir examples
  - Register 3 entries in image_metrics_series + identity state_dict converter
  - Numerically bit-exact with the official lpips package (verified on PerceptualSimilarity/imgs/ex_dir{0,1})
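
For readers unfamiliar with what the listed layers compute, here is a rough, dependency-free sketch of the LPIPS distance for features that have already been extracted. It is not the code in this PR: the function names, the nested-list feature layout, and the `eps` value are assumptions made for illustration. Each pixel's channel vector is unit-normalized, the squared difference is weighted per channel (the role a NetLinLayer plays), the result is averaged over pixels, and the layers are summed.

```python
import math

def unit_normalize(vec, eps=1e-10):
    """Scale one pixel's channel vector to (approximately) unit L2 norm."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / (norm + eps) for v in vec]

def layer_distance(feats_a, feats_b, channel_weights):
    """Channel-weighted squared difference of normalized features, averaged over pixels.

    feats_a / feats_b: list of pixels, each pixel a list of channel activations.
    channel_weights: one non-negative weight per channel (hypothetical values).
    """
    total = 0.0
    for pix_a, pix_b in zip(feats_a, feats_b):
        na, nb = unit_normalize(pix_a), unit_normalize(pix_b)
        total += sum(w * (x - y) ** 2 for w, x, y in zip(channel_weights, na, nb))
    return total / len(feats_a)

def lpips_like(layers_a, layers_b, weights_per_layer):
    """Sum the per-layer distances across all backbone layers."""
    return sum(
        layer_distance(a, b, w)
        for a, b, w in zip(layers_a, layers_b, weights_per_layer)
    )
```

Identical inputs give a distance of zero, and the score grows as the normalized activations diverge. The real model additionally applies a ScalingLayer to the input image and runs a pretrained backbone to obtain the features.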

Yuze-e20 added 2 commits June 1, 2026 17:24
Copilot AI review requested due to automatic review settings June 1, 2026 09:45

Copilot AI left a comment

Pull request overview

Note

Copilot was unable to run its full agentic suite in this review.

Adds an LPIPS (Learned Perceptual Image Patch Similarity) image-quality metric to diffsynth.metrics, including model registrations and an example script.

Changes:

  • Introduce LPIPSModel + LPIPSCompute to run LPIPS on single images or stem-matched directory pairs.
  • Add LPIPSMetric.from_pretrained(...) and export it from diffsynth.metrics.
  • Register three LPIPS backbones (alex/vgg/squeeze) in model_configs, plus an example script.
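
As an illustration of what "stem-matched directory pairs" could mean in practice, the sketch below pairs files from two directories whose names match once the extension is dropped. It is a guess at the behaviour, not the PR's `_pair_directories_by_stem`: the helper name `pair_by_stem` and the `IMAGE_SUFFIXES` set are hypothetical.

```python
from pathlib import Path

# Assumed set of image extensions; the PR may accept a different list.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".bmp", ".webp"}

def pair_by_stem(dir_a, dir_b):
    """Return (path_a, path_b) pairs for files that share a stem in both directories."""
    def index(directory):
        # Map stem -> path; on a stem collision the later file in sorted order wins.
        return {
            p.stem: p
            for p in sorted(Path(directory).iterdir())
            if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
        }

    files_a, files_b = index(dir_a), index(dir_b)
    common = sorted(files_a.keys() & files_b.keys())
    if not common:
        raise ValueError(f"No matching image stems between {dir_a} and {dir_b}.")
    return [(files_a[stem], files_b[stem]) for stem in common]
```

With this scheme `a/cat.png` pairs with `b/cat.jpg`, while files present in only one directory are silently skipped. Whether unmatched files should raise instead is a design choice the sketch does not settle.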

Reviewed changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 7 comments.

| File | Description |
| --- | --- |
| examples/image_quality_metric/lpips.py | Example usage downloading a dataset and computing LPIPS for image-vs-image and dir-vs-dir. |
| diffsynth/utils/state_dict_converters/image_metrics.py | Adds an LPIPS state-dict converter for ImageMetrics weights. |
| diffsynth/models/lpips.py | Implements LPIPS backbones, scaling/linear layers, and compute wrapper handling files/dirs and resizing. |
| diffsynth/metrics/lpips.py | Adds LPIPSMetric with from_pretrained integration into the model download/load flow. |
| diffsynth/metrics/__init__.py | Exports LPIPSMetric in the package API. |
| diffsynth/configs/model_configs.py | Registers three LPIPS model entries (alex/vgg/squeeze) with hashes and extra kwargs. |
| PR_LPIPS.md | PR documentation describing API/behavior/weights/test plan. |


Comment thread diffsynth/models/lpips.py
Comment on lines +50 to +55
def _open_rgb(image: ImageInput) -> Image.Image:
    if isinstance(image, (str, os.PathLike)):
        image = Image.open(image)
    if not isinstance(image, Image.Image):
        raise TypeError(f"LPIPS expects PIL images or image paths, got {type(image)}.")
    return image.convert("RGB")
Comment thread diffsynth/models/lpips.py
Comment on lines +306 to +313
    def _compute_pairs(self, pairs, do_resize: bool) -> float:
        scores = []
        batch_size = max(1, self.batch_size)
        for start in range(0, len(pairs), batch_size):
            chunk = pairs[start : start + batch_size]
            xs0 = torch.stack([self._to_tensor(_open_rgb(a), do_resize) for a, _ in chunk]).to(self.device)
            xs1 = torch.stack([self._to_tensor(_open_rgb(b), do_resize) for _, b in chunk]).to(self.device)
            scores.append(self.model(xs0, xs1).detach().cpu())
Comment thread diffsynth/models/lpips.py
Comment on lines +264 to +271
    def __init__(
        self,
        model: LPIPSModel,
        device: Union[str, torch.device] = "cpu",
        batch_size: int = 16,
        num_workers: int = 0,
        target_size: int = 512,
    ):
Comment on lines +79 to +80
def ImageMetricsLPIPSStateDictConverter(state_dict):
    return {key: state_dict[key] for key in state_dict}

from ..core import ModelConfig
from ..core.device.npu_compatible_device import get_device_type
from ..models.lpips import LPIPSModel, LPIPS_NET_CHOICES, LPIPSCompute
Comment on lines +4 to +12
dataset_snapshot_download(
    "DiffSynth-Studio/diffsynth_example_dataset",
    allow_file_pattern=["flux/FLUX.1-dev/*", "flux2/FLUX.2-dev/*"],
    local_dir="./data/diffsynth_example_dataset",
)
metric = LPIPSMetric.from_pretrained(
    net="alex",
    device="cuda",
)
Comment thread diffsynth/models/lpips.py
Comment on lines +335 to +343
        if a_is_dir:
            pairs = _pair_directories_by_stem(image_a, image_b)
            sizes = set()
            for path_a, path_b in pairs:
                with Image.open(path_a) as ia, Image.open(path_b) as ib:
                    sizes.add(ia.size)
                    sizes.add(ib.size)
            do_resize = len(sizes) > 1
            return self._compute_pairs(pairs, do_resize=do_resize)

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request introduces the LPIPS (Learned Perceptual Image Patch Similarity) image-quality metric to diffsynth.metrics, supporting AlexNet, VGG, and SqueezeNet backbones. Key feedback includes a critical issue where the model is not set to evaluation mode (.eval()), which causes non-deterministic scores due to active dropout layers during inference. Additionally, the num_workers parameter is currently unused, and it is recommended to add a force_resize option to optimize directory comparisons by skipping the expensive image size-checking loop.

Comment thread diffsynth/models/lpips.py
Comment on lines +284 to +285
self._raw_transform = transforms.ToTensor()
self.to(device)

critical

The model is not set to evaluation mode (.eval()). Since LPIPSModel contains _NetLinLayer which uses nn.Dropout, running the metric in default training mode will cause dropout to randomly zero out activations during inference. This makes the LPIPS score non-deterministic and incorrect. Setting the module to evaluation mode disables dropout.

Suggested change
-         self._raw_transform = transforms.ToTensor()
-         self.to(device)
+         self._raw_transform = transforms.ToTensor()
+         self.to(device)
+         self.eval()
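
To make the effect of the missing `.eval()` concrete without requiring PyTorch, here is a toy, dependency-free stand-in for inverted dropout. `ToyDropout` is a hypothetical class written for this illustration. It only mimics the behaviour described above: in training mode each value is zeroed with probability `p` and survivors are scaled by `1 / (1 - p)`; in evaluation mode the input passes through unchanged.

```python
import random

class ToyDropout:
    """Illustrative stand-in for a dropout layer; not the PyTorch implementation."""

    def __init__(self, p=0.5, seed=None):
        self.p = p
        self.training = True  # like PyTorch modules, start in training mode
        self._rng = random.Random(seed)

    def eval(self):
        """Switch to evaluation mode, which turns dropout into the identity."""
        self.training = False
        return self

    def __call__(self, values):
        if not self.training:
            return list(values)
        keep = 1.0 - self.p
        # Keep each value with probability `keep`, rescaling the survivors.
        return [v / keep if self._rng.random() < keep else 0.0 for v in values]

layer = ToyDropout(p=0.5, seed=0)
x = [1.0] * 8
train_out = layer(x)   # random mix of 0.0 and 2.0 entries
layer.eval()
eval_out_1 = layer(x)  # identical to x
eval_out_2 = layer(x)  # identical to x on every call
```

A metric computed from `train_out` would vary with the random mask, which is why the suggested `self.eval()` call matters for reproducible scores.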

Comment thread diffsynth/models/lpips.py Outdated
model: LPIPSModel,
device: Union[str, torch.device] = "cpu",
batch_size: int = 16,
num_workers: int = 0,

medium

The num_workers parameter is accepted in the constructor but is completely unused. The batch loop in _compute_pairs loads and processes images sequentially on the main thread. If parallel loading is not planned, consider removing this parameter to avoid confusion, or document that it is currently unused.
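
If the parameter were kept, one possible way to honour it is sketched below using a thread pool from the standard library. This is an assumption about how parallel loading might look, not something the PR implements; `load_batch` and `load_fn` are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor

def load_batch(paths, load_fn, num_workers=0):
    """Apply load_fn to every path, in order.

    num_workers <= 0 keeps the current sequential behaviour;
    a positive value loads items concurrently in a thread pool.
    """
    if num_workers <= 0:
        return [load_fn(p) for p in paths]
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        # Executor.map yields results in input order, so pairs stay aligned.
        return list(pool.map(load_fn, paths))
```

Threads tend to help here because image decoding and disk reads spend much of their time outside the interpreter lock, but the actual gain would need to be measured.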

Comment thread diffsynth/models/lpips.py
Comment on lines +264 to +271
    def __init__(
        self,
        model: LPIPSModel,
        device: Union[str, torch.device] = "cpu",
        batch_size: int = 16,
        num_workers: int = 0,
        target_size: int = 512,
    ):

medium

Comparing large directories requires opening every image file twice (once to check sizes, and once to load pixel data), which can be a significant performance bottleneck. Adding a force_resize parameter would allow users to skip the size-checking loop entirely when they already know they want to resize all images to target_size.

Suggested change
-     def __init__(
-         self,
-         model: LPIPSModel,
-         device: Union[str, torch.device] = "cpu",
-         batch_size: int = 16,
-         num_workers: int = 0,
-         target_size: int = 512,
-     ):
+     def __init__(
+         self,
+         model: LPIPSModel,
+         device: Union[str, torch.device] = "cpu",
+         batch_size: int = 16,
+         num_workers: int = 0,
+         target_size: int = 512,
+         force_resize: bool = False,
+     ):

Comment thread diffsynth/models/lpips.py
Comment on lines +335 to +343
        if a_is_dir:
            pairs = _pair_directories_by_stem(image_a, image_b)
            sizes = set()
            for path_a, path_b in pairs:
                with Image.open(path_a) as ia, Image.open(path_b) as ib:
                    sizes.add(ia.size)
                    sizes.add(ib.size)
            do_resize = len(sizes) > 1
            return self._compute_pairs(pairs, do_resize=do_resize)

medium

Use the force_resize attribute to skip the expensive size-checking loop when comparing directories.

Suggested change
-         if a_is_dir:
-             pairs = _pair_directories_by_stem(image_a, image_b)
-             sizes = set()
-             for path_a, path_b in pairs:
-                 with Image.open(path_a) as ia, Image.open(path_b) as ib:
-                     sizes.add(ia.size)
-                     sizes.add(ib.size)
-             do_resize = len(sizes) > 1
-             return self._compute_pairs(pairs, do_resize=do_resize)
+         if a_is_dir:
+             pairs = _pair_directories_by_stem(image_a, image_b)
+             if self.force_resize:
+                 do_resize = True
+             else:
+                 sizes = set()
+                 for path_a, path_b in pairs:
+                     with Image.open(path_a) as ia, Image.open(path_b) as ib:
+                         sizes.add(ia.size)
+                         sizes.add(ib.size)
+                 do_resize = len(sizes) > 1
+             return self._compute_pairs(pairs, do_resize=do_resize)
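
The resize decision in the suggestion can be isolated into a small pure function, sketched here under the assumption that image sizes arrive as a lazy iterable of `(size_a, size_b)` tuples. `needs_resize` is a hypothetical helper, and the early exit on the first mismatch is an extra optimisation that the suggested change does not include.

```python
def needs_resize(size_pairs, force_resize=False):
    """Decide whether images must be resized to a common target size.

    size_pairs: iterable of ((w, h), (w, h)) tuples, consumed lazily so that
    no size is read (and no file opened by the producer) when force_resize is set.
    """
    if force_resize:
        return True  # skip the size scan entirely
    seen = set()
    for size_a, size_b in size_pairs:
        seen.add(size_a)
        seen.add(size_b)
        if len(seen) > 1:
            return True  # a second distinct size is enough to decide
    return False
```

Passing a generator that opens each image only when iterated keeps the `force_resize=True` path free of any file I/O, which is the saving the review comment is after.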

Comment on lines +28 to +37
    def from_pretrained(
        cls,
        net: str = "alex",
        model_config: ModelConfig = None,
        device: torch.device = get_device_type(),
        batch_size: int = 16,
        num_workers: int = 0,
        target_size: int = 512,
        vram_limit: float = None,
    ):

medium

Expose the force_resize parameter in from_pretrained to allow skipping the expensive size-checking loop.

    def from_pretrained(
        cls,
        net: str = "alex",
        model_config: ModelConfig = None,
        device: torch.device = get_device_type(),
        batch_size: int = 16,
        num_workers: int = 0,
        target_size: int = 512,
        vram_limit: float = None,
        force_resize: bool = False,
    ):

Comment on lines +51 to +57
        compute_model = LPIPSCompute(
            model=backbone,
            device=device,
            batch_size=batch_size,
            num_workers=num_workers,
            target_size=target_size,
        )

medium

Pass force_resize to LPIPSCompute.

Suggested change
-         compute_model = LPIPSCompute(
-             model=backbone,
-             device=device,
-             batch_size=batch_size,
-             num_workers=num_workers,
-             target_size=target_size,
-         )
+         compute_model = LPIPSCompute(
+             model=backbone,
+             device=device,
+             batch_size=batch_size,
+             num_workers=num_workers,
+             target_size=target_size,
+             force_resize=force_resize,
+         )

@Artiprocher Artiprocher merged commit e5f88f0 into modelscope:main Jun 2, 2026

3 participants