# owlite

## method `convert`

```python
convert(model: Module, *args: Any, **kwargs: Any) → GraphModule
```

Convert the model into a `torch.fx.GraphModule` object using the example input(s) provided.
{% hint style="warning" %}
The example input(s) provided to `owl.convert` will also be used by `owl.export` for the subsequent ONNX and engine conversion. It is therefore crucial to provide appropriate example input(s) to ensure the correct behavior of your model.
{% endhint %}
Args:

* `model` (`torch.nn.Module`): The model to be compressed. Note that it must be an instance of `torch.nn.Module`, but not `torch.nn.DataParallel` or `torch.nn.DistributedDataParallel`. See Troubleshooting - Models wrapped with `torch.nn.DataParallel` or `torch.nn.parallel.DistributedDataParallel` for more details.
* `*args`: the example input(s) that would be passed to the model's `forward` method.
* `**kwargs`: the example input(s) that would be passed to the model's `forward` method.

> These example inputs are required to convert the model into a `torch.fx.GraphModule` instance. Each input must be one of the following:
>
> * A `torch.Tensor` object
> * A tuple of `torch.Tensor` objects
> * A dictionary whose keys are strings and values are `torch.Tensor` objects
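To make the accepted forms concrete, here is a plain PyTorch sketch. The `is_valid_example_input` helper is hypothetical (not part of the OwLite API); it only mirrors the three rules listed above.

```python
import torch

# The three example-input forms accepted as example input(s):
single = torch.randn(4, 3, 64, 64)                             # a torch.Tensor
as_tuple = (torch.randn(4, 3), torch.randn(4, 5))              # a tuple of torch.Tensor objects
as_dict = {"x": torch.randn(4, 3), "mask": torch.ones(4, 3)}   # str keys -> torch.Tensor values

def is_valid_example_input(value) -> bool:
    """Hypothetical helper mirroring the rules above; not part of the OwLite API."""
    if isinstance(value, torch.Tensor):
        return True
    if isinstance(value, tuple):
        return all(isinstance(v, torch.Tensor) for v in value)
    if isinstance(value, dict):
        return all(
            isinstance(k, str) and isinstance(v, torch.Tensor)
            for k, v in value.items()
        )
    return False

# Such inputs would then be passed positionally or as keywords, e.g.:
#   owl.convert(model, single)
#   owl.convert(model, *as_tuple)
#   owl.convert(model, **as_dict)
```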
Returns:

`GraphModule`: The `torch.fx.GraphModule` object converted from the `model`.
Raises:

`HTTPError`: When the request for the compression configuration was not successful.
### Behavior in each mode

`owl.convert` behaves differently depending on the mode triggered by `owlite.init`.

* **Baseline mode**: `owl.convert` traces the input model with the example input(s).
* **Experiment mode**: the converted `torch.fx.GraphModule` object is further modified according to the compression configuration from the experiment. This configuration could have been created by the user on the OwLite website, or copied from another experiment (in 'duplicate from' mode). If there is no compression configuration, `owl.convert` returns the same model as in baseline mode. For a dynamic-batch-size baseline model without compression, create an experiment.
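Conceptually, baseline-mode conversion behaves like plain `torch.fx` symbolic tracing: the result is a `torch.fx.GraphModule` that computes the same outputs as the original model. A minimal sketch of that idea, using `torch.fx.symbolic_trace` directly as a stand-in for OwLite's own tracer:

```python
import torch
import torch.fx

class TinyModel(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.fc = torch.nn.Linear(8, 2)

    def forward(self, x):
        return torch.nn.functional.relu(self.fc(x))

model = TinyModel()
# Stand-in for baseline-mode conversion: produce a GraphModule from the model
traced = torch.fx.symbolic_trace(model)

x = torch.randn(4, 8)
# With no compression configuration applied, the traced module is
# numerically identical to the original model
assert isinstance(traced, torch.fx.GraphModule)
assert torch.allclose(model(x), traced(x))
```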
### Workflow

`owl.convert` goes through the following steps:

1. **Conversion**: it converts the input model into the format configurable by OwLite, namely a `torch.fx.GraphModule` instance, using the example input(s) provided via `*args` and `**kwargs`. This procedure might fail depending on your model's implementation and the coverage of `torch.compile` in the PyTorch version you are using. If so, you may need to find and fix the causes of the failure reported in the error message.
2. **Compression**: in experiment mode, it further compresses the converted model if the experiment's compression configuration exists. Keep in mind that you must set up the compression configuration via the OwLite Web UI before running `owl.convert` in order to compress your model.
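The conversion step typically fails on models whose `forward` contains data-dependent Python control flow, which symbolic tracing cannot capture. A minimal reproduction of such a failure, again using `torch.fx.symbolic_trace` as a stand-in for OwLite's tracer:

```python
import torch
import torch.fx

class BranchyModel(torch.nn.Module):
    def forward(self, x):
        # Data-dependent branch: bool() on a traced proxy raises TraceError
        if x.sum() > 0:
            return x + 1
        return x - 1

try:
    torch.fx.symbolic_trace(BranchyModel())
    failed = False
except torch.fx.proxy.TraceError as e:
    failed = True
    print(f"Tracing failed: {e}")

# One common fix is to express the branch with tensor ops instead,
# e.g. torch.where(x.sum() > 0, x + 1, x - 1), which is traceable.
```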
### Examples

#### Baseline Mode

```python
import owlite
import torch

owl = owlite.init(project="testProject", baseline="sampleModel")

# Create a sample model
class SampleModel(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = torch.nn.Conv2d(3, 64, 3)
        self.pool1 = torch.nn.MaxPool2d(2, 2)
        self.conv2 = torch.nn.Conv2d(64, 128, 3)
        self.pool2 = torch.nn.MaxPool2d(2, 2)
        self.fc1 = torch.nn.Linear(128 * 7 * 7, 10)

    def forward(self, x):
        x = self.conv1(x)
        x = torch.nn.functional.relu(x)
        x = self.pool1(x)
        x = self.conv2(x)
        x = torch.nn.functional.relu(x)
        x = self.pool2(x)
        x = x.view(-1, 128 * 7 * 7)
        x = self.fc1(x)
        return x

# Create a model instance
model = SampleModel()

# Convert the model
model = owl.convert(model, torch.randn(4, 3, 64, 64))

# Print the model
print(model)
```
This code will create a sample model, convert it to a GraphModule in baseline mode, and print the converted module. The output of the code is as follows:
```
OwLite [INFO] Connected device: NVIDIA RTX A6000
OwLite [WARNING] Existing local directory found at /home/sqzb/workspace/owlite/testProject/sampleModel/sampleModel. Continuing this code will overwrite the data
OwLite [INFO] Created new project 'testProject'
OwLite [INFO] Created new baseline 'sampleModel' at project 'testProject'
OwLite [INFO] Converted the model
GraphModule(
  (self_conv1): Conv2d(3, 64, kernel_size=(3, 3), stride=(1, 1))
  (self_pool1): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  (self_conv2): Conv2d(64, 128, kernel_size=(3, 3), stride=(1, 1))
  (self_pool2): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  (self_fc1): Linear(in_features=6272, out_features=10, bias=True)
)

def forward(self, x : torch.Tensor):
    sqzb_module_device_canary = self.sqzb_module_device_canary
    getattr_1 = sqzb_module_device_canary.device;  sqzb_module_device_canary = None
    self_conv1 = self.self_conv1(x);  x = None
    relu = torch.nn.functional.relu(self_conv1);  self_conv1 = None
    self_pool1 = self.self_pool1(relu);  relu = None
    self_conv2 = self.self_conv2(self_pool1);  self_pool1 = None
    relu_1 = torch.nn.functional.relu(self_conv2);  self_conv2 = None
    self_pool2 = self.self_pool2(relu_1);  relu_1 = None
    view = self_pool2.view(-1, 6272);  self_pool2 = None
    self_fc1 = self.self_fc1(view);  view = None
    output_adapter = owlite_backend_fx_trace_output_adapter((self_fc1,));  self_fc1 = None
    return output_adapter
```
#### Experiment Mode

```python
import owlite
import torch

owl = owlite.init(project="testProject", baseline="sampleModel", experiment="conv")

# Create a sample model
class SampleModel(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = torch.nn.Conv2d(3, 64, 3)
        self.pool1 = torch.nn.MaxPool2d(2, 2)
        self.conv2 = torch.nn.Conv2d(64, 128, 3)
        self.pool2 = torch.nn.MaxPool2d(2, 2)
        self.fc1 = torch.nn.Linear(128 * 7 * 7, 10)

    def forward(self, x):
        x = self.conv1(x)
        x = torch.nn.functional.relu(x)
        x = self.pool1(x)
        x = self.conv2(x)
        x = torch.nn.functional.relu(x)
        x = self.pool2(x)
        x = x.view(-1, 128 * 7 * 7)
        x = self.fc1(x)
        return x

# Create a model instance
model = SampleModel()

# Convert the model
model = owl.convert(model, torch.randn(4, 3, 64, 64))

# Print the model
print(model)
```
This code will create a sample model, convert it to a GraphModule in experiment mode, and apply the compression configuration of the experiment specified in `owlite.init`. The output of the code is as follows:
```
OwLite [INFO] Connected device: NVIDIA RTX A6000
OwLite [INFO] Experiment data will be saved in /home/sqzb/workspace/owlite/testProject/sampleModel/conv
OwLite [INFO] Loaded existing project 'testProject'
OwLite [INFO] Existing compression configuration for 'conv' found
OwLite [INFO] Model conversion initiated
OwLite [INFO] Compression configuration found for 'conv'
OwLite [INFO] Applying compression configuration
OwLite [INFO] Converted the model
GraphModule(
  (self_conv1): QConv2d(
    3, 64, kernel_size=(3, 3), stride=(1, 1)
    (weight_quantizer): FakeQuantizer(ste(precision: 8, per_channel, quant_min: -127, quant_max: 127, is_enabled: True, calib: AbsmaxCalibrator))
    (input_quantizer): FakeQuantizer(ste(precision: 8, per_tensor, quant_min: -128, quant_max: 127, zero_point: 0.0, is_zero_point_folded: False, is_enabled: True, calib: AbsmaxCalibrator))
  )
  (self_pool1): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  (self_conv2): QConv2d(
    64, 128, kernel_size=(3, 3), stride=(1, 1)
    (weight_quantizer): FakeQuantizer(ste(precision: 8, per_channel, quant_min: -127, quant_max: 127, is_enabled: True, calib: AbsmaxCalibrator))
    (input_quantizer): FakeQuantizer(ste(precision: 8, per_tensor, quant_min: -128, quant_max: 127, zero_point: 0.0, is_zero_point_folded: False, is_enabled: True, calib: AbsmaxCalibrator))
  )
  (self_pool2): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  (self_fc1): QLinear(
    in_features=6272, out_features=10, bias=True
    (weight_quantizer): FakeQuantizer(ste(precision: 8, per_channel, quant_min: -127, quant_max: 127, is_enabled: True, calib: AbsmaxCalibrator))
    (input_quantizer): FakeQuantizer(ste(precision: 8, per_tensor, quant_min: -128, quant_max: 127, zero_point: 0.0, is_zero_point_folded: False, is_enabled: True, calib: AbsmaxCalibrator))
  )
  (self_conv1_0_quantizer): FakeQuantizer(ste(precision: 8, per_tensor, quant_min: -128, quant_max: 127, zero_point: 0.0, is_zero_point_folded: False, is_enabled: True, calib: AbsmaxCalibrator))
  (self_pool1_0_quantizer): FakeQuantizer(ste(precision: 8, per_tensor, quant_min: -128, quant_max: 127, zero_point: 0.0, is_zero_point_folded: False, is_enabled: True, calib: AbsmaxCalibrator))
  (self_conv2_0_quantizer): FakeQuantizer(ste(precision: 8, per_tensor, quant_min: -128, quant_max: 127, zero_point: 0.0, is_zero_point_folded: False, is_enabled: True, calib: AbsmaxCalibrator))
  (self_pool2_0_quantizer): FakeQuantizer(ste(precision: 8, per_tensor, quant_min: -128, quant_max: 127, zero_point: 0.0, is_zero_point_folded: False, is_enabled: True, calib: AbsmaxCalibrator))
  (self_fc1_0_quantizer): FakeQuantizer(ste(precision: 8, per_tensor, quant_min: -128, quant_max: 127, zero_point: 0.0, is_zero_point_folded: False, is_enabled: True, calib: AbsmaxCalibrator))
)

def forward(self, x : torch.Tensor):
    self_conv1_0_quantizer = self.self_conv1_0_quantizer(x);  x = None
    self_conv1 = self.self_conv1(self_conv1_0_quantizer);  self_conv1_0_quantizer = None
    relu = torch.nn.functional.relu(self_conv1);  self_conv1 = None
    self_pool1_0_quantizer = self.self_pool1_0_quantizer(relu);  relu = None
    self_pool1 = self.self_pool1(self_pool1_0_quantizer);  self_pool1_0_quantizer = None
    self_conv2_0_quantizer = self.self_conv2_0_quantizer(self_pool1);  self_pool1 = None
    self_conv2 = self.self_conv2(self_conv2_0_quantizer);  self_conv2_0_quantizer = None
    relu_1 = torch.nn.functional.relu(self_conv2);  self_conv2 = None
    self_pool2_0_quantizer = self.self_pool2_0_quantizer(relu_1);  relu_1 = None
    self_pool2 = self.self_pool2(self_pool2_0_quantizer);  self_pool2_0_quantizer = None
    view = self_pool2.view(-1, 6272);  self_pool2 = None
    self_fc1_0_quantizer = self.self_fc1_0_quantizer(view);  view = None
    self_fc1 = self.self_fc1(self_fc1_0_quantizer);  self_fc1_0_quantizer = None
    output_adapter = owlite_backend_fx_trace_output_adapter((self_fc1,));  self_fc1 = None
    return output_adapter
```
Updated: 2024-06-13T23:42:42