TensorRT  7.2.1.6
NVIDIA TensorRT
pytorch_quantization.tensor_quant.TensorQuantFunction Class Reference
Inheritance diagram for pytorch_quantization.tensor_quant.TensorQuantFunction:
Collaboration diagram for pytorch_quantization.tensor_quant.TensorQuantFunction:

Static Public Member Functions

def forward (ctx, inputs, amax, num_bits=8, unsigned=False, narrow_range=True)
 
def backward (ctx, grad_outputs, grad_scale)
 

Detailed Description

A universal tensor quantization function

Takes an input tensor and outputs a quantized tensor. The granularity of the scale can be interpreted from the
shape of amax.
output_dtype indicates whether the quantized value will be stored as an integer or a float. The reason to store
it as a float is that the PyTorch function that consumes the quantized value may not accept integer input, e.g. Conv2D.

It uses 2^num_bits - 1 values instead of 2^num_bits, e.g., for num_bits=8, it uses [-127, 127] instead of [-128, 127].
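The narrow-range scheme above can be sketched in a few lines. This is a hypothetical plain-Python stand-in, not the library's implementation; the names `quantize`, `max_bound`, and the example `amax` value are illustrative assumptions.

```python
# Sketch of narrow-range symmetric quantization (illustrative, not the
# library implementation): with num_bits=8 the integer grid is
# [-127, 127], i.e. 2^8 - 1 = 255 values rather than 2^8 = 256.
num_bits = 8
amax = 2.0                             # clipping range is [-amax, amax]
max_bound = 2 ** (num_bits - 1) - 1    # 127 for num_bits=8
scale = max_bound / amax

def quantize(x):
    # Clamp to [-amax, amax], then scale and round to the integer grid.
    clipped = max(-amax, min(amax, x))
    return round(clipped * scale)

print(quantize(2.0))    # 127: top of the narrow range
print(quantize(-2.0))   # -127: bottom of the narrow range (not -128)
print(quantize(3.5))    # 127: values beyond amax are clipped first
```

Note that the range is symmetric: -amax maps to -127, not -128, which is what distinguishes narrow_range=True from the full two's-complement range.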

Member Function Documentation

◆ forward()

def pytorch_quantization.tensor_quant.TensorQuantFunction.forward (   ctx,
  inputs,
  amax,
  num_bits = 8,
  unsigned = False,
  narrow_range = True 
)
static
Following TensorFlow convention, the max value is passed in and used to determine the scale, instead of
passing the scale directly, though passing the scale directly may be more natural to use.

Args:
    ctx: A Context object to store tensors for backward.
    inputs: A Tensor of type float32.
    amax: A Tensor of type float32. Inputs will be quantized within the range [-amax, amax].
amax will be broadcast to the inputs tensor.
    num_bits: An integer used to calculate the scaling factor, scale = (2^(num_bits-1) - 1) / amax.
Effectively, it indicates how many integer bits are used to represent the value. Default 8.
    output_dtype: A type of Tensor. torch.int32 or torch.float32.
    unsigned: A boolean. Use unsigned integer range. E.g. [0, 255] for num_bits=8. Default False.
    narrow_range: A boolean. Use a symmetric integer range for signed quantization,
e.g. [-127, 127] instead of [-128, 127] for num_bits=8. Default True.

Returns:
    outputs: A Tensor of type output_dtype.
    scale: A Tensor of type float32. outputs / scale will dequantize outputs tensor.

Raises:
    ValueError:
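The documented contract of forward() — outputs holds the quantized values and outputs / scale dequantizes them — can be illustrated with a plain-Python round trip. This is a hypothetical sketch under the default num_bits=8, narrow_range=True settings, not a call into the library.

```python
# Illustrative round trip (not the library implementation): quantize a
# few values against amax, then recover approximations via outputs / scale.
num_bits = 8
amax = 1.0
scale = (2 ** (num_bits - 1) - 1) / amax   # 127.0 for num_bits=8

inputs = [0.1, -0.5, 0.9, 1.3]
# Quantize: clamp to [-amax, amax], scale, round to the integer grid.
outputs = [round(max(-amax, min(amax, x)) * scale) for x in inputs]
# Dequantize: outputs / scale, as stated in the Returns section.
dequant = [q / scale for q in outputs]

print(outputs)   # [13, -64, 114, 127] -- 1.3 was clipped to amax
```

The dequantized values differ from the inputs by at most half a quantization step (1 / (2 * scale)), except for values clipped at amax.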

◆ backward()

def pytorch_quantization.tensor_quant.TensorQuantFunction.backward (   ctx,
  grad_outputs,
  grad_scale 
)
static
Implements straight-through estimation with clipping. For -amax <= input <= amax,
the gradient passes straight through; otherwise the gradient is zero.

Args:
    ctx: A Context object with saved tensors from forward.
    grad_outputs: A tensor of gradient of outputs.
    grad_scale: A tensor of gradient of scale.

Returns:
    grad_inputs: A tensor of gradient.
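The clipped straight-through estimator described above reduces to a mask over the inputs. A hypothetical plain-Python sketch (not the library's implementation; the element-wise list form is an illustrative simplification of the tensor operation):

```python
# Sketch of the clipped straight-through estimator: the upstream
# gradient passes through where -amax <= input <= amax, else it is zeroed.
amax = 1.0
inputs = [0.3, -1.5, 0.8, 2.0]          # saved in forward via ctx
grad_outputs = [1.0, 1.0, 1.0, 1.0]     # upstream gradient

grad_inputs = [
    g if -amax <= x <= amax else 0.0
    for x, g in zip(inputs, grad_outputs)
]
print(grad_inputs)   # [1.0, 0.0, 1.0, 0.0]
```

Zeroing the gradient outside [-amax, amax] prevents clipped values from pulling weights further out of range, while in-range values train as if quantization were the identity.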

The documentation for this class was generated from the following file: