module owlite.nn.modules.qlinear
class QLinear
Applies a linear transformation with fake-quantized weight $$ W_q $$ to the incoming data: $$ y = xW_q^T + b $$.
Additionally, fake quantization can be applied to both operands of the bias addition, that is, to the matrix product and to the bias: $$y = \text{quant}(xW_q^T) + \text{quant}(b)$$, where $$\text{quant}$$ represents the fake-quantize function. The module copies the weights and biases from the original linear instance.
Quantized linear layer inherited from torch.nn.Linear.
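The formula $$ y = xW_q^T + b $$ can be illustrated with a minimal sketch. It assumes a symmetric, per-tensor, round-to-nearest fake quantizer and uses plain Python lists instead of tensors so that it runs without any dependencies. The helper names `fake_quantize` and `linear_fake_quant` and the `num_bits` parameter are hypothetical and are not part of OwLite; the quantizer OwLite actually applies is configured through `FakeQuantizerOptions` and may behave differently.

```python
def fake_quantize(values, num_bits=8):
    """Quantize to a signed integer grid, then map back to floats (assumed scheme)."""
    qmax = 2 ** (num_bits - 1) - 1          # e.g. 127 for 8 bits
    max_abs = max(abs(v) for v in values)
    if max_abs == 0.0:
        return list(values)                  # nothing to scale
    step = max_abs / qmax                    # size of one quantization step
    return [max(-qmax, min(qmax, round(v / step))) * step for v in values]


def linear_fake_quant(x, weight, bias, num_bits=8):
    """Compute y = x W_q^T + b for a single input vector x.

    weight has shape (out_features, in_features), as in torch.nn.Linear.
    """
    n_in = len(weight[0])
    # Quantize the whole weight matrix with one shared step (per-tensor).
    flat_q = fake_quantize([w for row in weight for w in row], num_bits)
    weight_q = [flat_q[i * n_in:(i + 1) * n_in] for i in range(len(weight))]
    return [sum(xi * wi for xi, wi in zip(x, row)) + b
            for row, b in zip(weight_q, bias)]


y = linear_fake_quant([1.0, 2.0], [[0.9, -0.4], [0.2, 0.0]], [0.5, -0.5])
```

The values stay in floating point throughout; only the weights are snapped to the quantization grid, which is what distinguishes fake quantization from integer inference.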
method __init__
```python
__init__(linear: Linear, weight_opts: FakeQuantizerOptions | None = None) → None
```
Convert a Linear instance to the analogous QLinear instance, copying the weights and, if it exists, the bias.
Args:
linear (torch.nn.Linear): a Linear instance to be converted to a QLinear instance.
weight_opts (FakeQuantizerOptions | None, optional): Options for the fake weight quantizer. Defaults to None.
method forward
```python
forward(inputs: Tensor) → Tensor
```
Forward with quantized weight if available.
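The conversion and forward behavior described above can be sketched with dependency-free stand-ins. `TinyLinear`, `TinyQLinear`, and the `weight_quantizer` callable are hypothetical names used only for illustration; they are not OwLite classes. The sketch also assumes one reading of "if available": the quantized weight is used when a quantizer is configured, and the copied weight is used unchanged otherwise.

```python
class TinyLinear:
    """Hypothetical stand-in for torch.nn.Linear using plain lists."""

    def __init__(self, weight, bias=None):
        self.weight = weight   # shape: (out_features, in_features)
        self.bias = bias       # shape: (out_features,) or None


class TinyQLinear(TinyLinear):
    """Hypothetical stand-in for QLinear; not OwLite's implementation."""

    def __init__(self, linear, weight_quantizer=None):
        # Copy the weights, and the bias if one exists, from the source layer.
        super().__init__(
            [row[:] for row in linear.weight],
            None if linear.bias is None else linear.bias[:],
        )
        self.weight_quantizer = weight_quantizer  # callable: float -> float

    def forward(self, inputs):
        weight = self.weight
        if self.weight_quantizer is not None:
            # Use the fake-quantized weight when a quantizer is available.
            weight = [[self.weight_quantizer(w) for w in row] for row in weight]
        bias = self.bias if self.bias is not None else [0.0] * len(weight)
        return [sum(x * w for x, w in zip(inputs, row)) + b
                for row, b in zip(weight, bias)]


source = TinyLinear([[0.3, -0.6]], [0.1])
plain = TinyQLinear(source)                                   # no quantizer
quantized = TinyQLinear(source, lambda w: round(w * 4) / 4)   # 0.25-wide grid
```

Because the parameters are copied rather than shared, later changes to `source` do not affect `plain` or `quantized` in this sketch.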
Updated: 2024-06-13T23:42:42