cpu_matmul_quantization_cpp_short

C++ API example demonstrating how to perform reduced-precision matrix-matrix multiplication using MatMul, and how the accuracy of the result compares with floating-point computation.

Concepts:

- Static and dynamic quantization
- Asymmetric quantization
- Run-time output scales: dnnl::primitive_attr::set_output_scales() and DNNL_RUNTIME_F32_VAL
- Run-time zero points: dnnl::primitive_attr::set_zero_points() and DNNL_RUNTIME_S32_VAL
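The arithmetic behind these concepts can be sketched in plain C++ without calling oneDNN. The snippet below is only an illustration under stated assumptions: `QParams`, `compute_qparams_u8`, `quantize_u8`, `matmul_u8`, `matmul_f32`, and `max_abs_diff` are hypothetical helper names, not part of the oneDNN API, and both operands are quantized to unsigned 8-bit values with a float result, which may differ from the data types the actual example uses. Computing the scale and zero point from the data at run time corresponds to dynamic quantization; passing precomputed values would correspond to static quantization.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Asymmetric quantization parameters: real ~= scale * (q - zero_point).
// Hypothetical helper type, not a oneDNN structure.
struct QParams {
    float scale;
    int32_t zero_point;
};

// Derive scale and zero point for the u8 range [0, 255] from the data.
// The range always includes 0 so that 0.0f is exactly representable.
QParams compute_qparams_u8(const std::vector<float> &x) {
    float lo = 0.f, hi = 0.f;
    for (float v : x) {
        lo = std::min(lo, v);
        hi = std::max(hi, v);
    }
    float scale = (hi - lo) / 255.f;
    if (scale == 0.f) scale = 1.f; // all-zero input: any scale works
    long zp = std::lround(-lo / scale);
    zp = std::min(255L, std::max(0L, zp));
    return QParams {scale, static_cast<int32_t>(zp)};
}

// q = clamp(round(x / scale) + zero_point, 0, 255)
std::vector<uint8_t> quantize_u8(const std::vector<float> &x, QParams p) {
    std::vector<uint8_t> q(x.size());
    for (std::size_t i = 0; i < x.size(); ++i) {
        long v = std::lround(x[i] / p.scale) + p.zero_point;
        q[i] = static_cast<uint8_t>(std::min(255L, std::max(0L, v)));
    }
    return q;
}

// Row-major (M x K) * (K x N) with int32 accumulation; the zero points are
// subtracted per element and the product of the scales is applied once.
std::vector<float> matmul_u8(const std::vector<uint8_t> &a, QParams pa,
        const std::vector<uint8_t> &b, QParams pb, int M, int K, int N) {
    assert(a.size() == static_cast<std::size_t>(M) * K);
    assert(b.size() == static_cast<std::size_t>(K) * N);
    std::vector<float> c(static_cast<std::size_t>(M) * N);
    for (int m = 0; m < M; ++m)
        for (int n = 0; n < N; ++n) {
            int32_t acc = 0;
            for (int k = 0; k < K; ++k)
                acc += (int32_t(a[m * K + k]) - pa.zero_point)
                        * (int32_t(b[k * N + n]) - pb.zero_point);
            c[m * N + n] = pa.scale * pb.scale * static_cast<float>(acc);
        }
    return c;
}

// Floating-point reference used to judge the accuracy of the int8 result.
std::vector<float> matmul_f32(const std::vector<float> &a,
        const std::vector<float> &b, int M, int K, int N) {
    assert(a.size() == static_cast<std::size_t>(M) * K);
    assert(b.size() == static_cast<std::size_t>(K) * N);
    std::vector<float> c(static_cast<std::size_t>(M) * N, 0.f);
    for (int m = 0; m < M; ++m)
        for (int n = 0; n < N; ++n)
            for (int k = 0; k < K; ++k)
                c[m * N + n] += a[m * K + k] * b[k * N + n];
    return c;
}

// Largest element-wise absolute difference between two results.
float max_abs_diff(const std::vector<float> &x, const std::vector<float> &y) {
    assert(x.size() == y.size());
    float d = 0.f;
    for (std::size_t i = 0; i < x.size(); ++i)
        d = std::max(d, std::fabs(x[i] - y[i]));
    return d;
}
```

A typical use would quantize both inputs with their own `QParams`, run `matmul_u8`, and compare against `matmul_f32` with `max_abs_diff`. In oneDNN itself the scales and zero points are attached to the primitive through `dnnl::primitive_attr` rather than applied by hand as done here.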

oneDNN v2.7.1 documentation
