Skip to content
This repository was archived by the owner on Mar 4, 2026. It is now read-only.
This repository was archived by the owner on Mar 4, 2026. It is now read-only.

DLRM Quantization #158

@Vasanta1

Description

@Vasanta1

I tried running IntelAI DLRM model with int8 precision with default int8_configure.json. Could someone clarify if quantization happens each time the inference_performance.sh script is triggered, or if the int8 weights are stored after the first run and reused for the later runs.
Currently, the run takes around 10 hours to complete on a 64 core machine. Please let me know if any additional info is required from my end.
int8_dlrm

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions