DLRM Quantization

I tried running the IntelAI DLRM model at int8 precision with the default int8_configure.json. Could someone clarify whether quantization happens each time the inference_performance.sh script is run, or whether the int8 weights are stored after the first run and reused in later runs?
Currently, a single run takes around 10 hours to complete on a 64-core machine. Please let me know if any additional information is needed from my side.
![int8_dlrm](https://github.com/IntelAI/models/assets/106138353/e8b5dd31-79c6-47ea-914b-d5a725976023)
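
In case it helps narrow this down, here is a minimal sketch of how I could check on my side whether a run writes any artifacts (for example, saved int8 weights or an updated int8_configure.json) that a later run might reuse. It only compares file sizes and modification times before and after a run; it does not call any IntelAI or quantization API. The helper names `snapshot` and `diff_snapshots` and the directory passed in are hypothetical and chosen for illustration.

```python
import os


def snapshot(root):
    """Map every file path under root to its (size, mtime_ns) pair."""
    state = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.stat(path)
            except OSError:
                # File vanished or is unreadable; skip it.
                continue
            state[path] = (st.st_size, st.st_mtime_ns)
    return state


def diff_snapshots(before, after):
    """Return (created, modified) file paths between two snapshots."""
    created = sorted(p for p in after if p not in before)
    modified = sorted(
        p for p in after if p in before and after[p] != before[p]
    )
    return created, modified


# Sketch of intended use (the directory is a placeholder, not a known path):
#   before = snapshot("path/to/working/dir")
#   ... run inference_performance.sh ...
#   after = snapshot("path/to/working/dir")
#   created, modified = diff_snapshots(before, after)
```

If the first run creates or modifies files and a second run leaves them untouched while finishing faster, that would suggest the quantized state is being reused. If nothing is written at all, quantization is probably repeated on every run. Either way this is only an indirect check and not a substitute for an answer from the maintainers.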

DLRM Quantization #158