method `export`

```python
export(
    model: GraphModule,
    onnx_export_options: ONNXExportOptions | None = None,
    dynamic_axis_options: DynamicAxisOptions | dict[str, int] | None = None,
) -> None
```
Export the model converted by `owl.convert` to ONNX format.
{% hint style="warning" %}
The ONNX model created by `owl.export` will also be used by `owl.benchmark` for the engine conversion afterward. Therefore, it is crucial to provide an appropriately pre-trained or calibrated model to ensure the correct behavior of your model.

Generally, you can export any model with `owl.export`, whether it is trained or not. However, keep in mind that some graph-level optimizations performed while building the engine depend on the values of your model's weights. For example, when you benchmark a quantized model without calibration, the `step_size` parameters of the fake quantizers in the model are all initialized to zero. These zero `step_size` values can change the behavior of the graph-level optimizations, leading to a different latency from a calibrated model's in the benchmarking stage.

Therefore, we strongly recommend:

* exporting a pre-trained model for benchmarking in baseline mode; and
* performing either PTQ calibration or QAT in experiment mode.
{% endhint %}
Args:

* `model` (`torch.fx.GraphModule`): The model converted by `owl.convert`. It must not be a `torch.nn.DataParallel` or `torch.nn.DistributedDataParallel` instance.
* `onnx_export_options` (`owlite.ONNXExportOptions`, optional): Additional options for exporting ONNX. OwLite exports your model into ONNX during the conversion using `torch.onnx.export` behind the scenes. You can control some of the behaviors of `torch.onnx.export` by passing an `owlite.ONNXExportOptions` object to the `onnx_export_options` argument of `owlite.export`. Currently, you can only set `opset_version`, which defaults to 17. Other parameters of `torch.onnx.export` might be added in the future.
* `dynamic_axis_options` (`DynamicAxisOptions | dict[str, int]`, optional): By default, the exported model will have the shapes of all input tensors set to match exactly those given when calling `owl.convert`. To specify an axis of a tensor as dynamic (i.e., known only at run time), set `dynamic_axis_options` to a dictionary with the schema:
  * KEY (`str`): the name of the input tensor.
  * VALUE (`int`): the axis to be dynamic.

Raises:

* `TypeError`: When `model` is an instance of `torch.nn.DataParallel` or `torch.nn.DistributedDataParallel`.
* `RuntimeError`: When `dynamic_axis_options` is set for a baseline export.
* `ValueError`: When an invalid `dynamic_axis_options` is given.
Behavior in each mode

`owl.export` behaves differently depending on the mode triggered by `owlite.init`.

* Baseline Mode: In this mode, `owl.export` traces the input model with the example input(s) and exports it to ONNX. Then, it sends the ONNX graph and the model to the server. This allows users to view the model graph on the web and apply compression.
* Experiment Mode: In this mode, `owl.export` exports the model after applying the compression configuration from the experiment or the dynamic export options.
Workflow:

`owl.export` goes through the following steps:

1. Exporting ONNX: It exports the input model into ONNX and saves it at your local workspace. In experiment mode, the model will be equipped with a dynamic axis if `dynamic_axis_options` was provided.
2. Uploading ONNX: It then uploads the ONNX (without weights) to the OwLite server.
Examples:
Baseline Mode
```python
import owlite

# Initialize a baseline or experiment
owl = owlite.init(...)

# Initialize your model
model = ...

# Convert the model
model = owl.convert(model, ...)

# Export the model into ONNX
owl.export(model)
```
```
Checking 0/1...
OwLite [INFO] Saving exported ONNX proto at /home/sqzb/workspace/owlite/testProject/sampleModel/testProject_sampleModel_sampleModel.onnx with external data testProject_sampleModel_sampleModel.bin
OwLite [WARNING] External data file at /home/sqzb/workspace/owlite/testProject/sampleModel/testProject_sampleModel_sampleModel.bin will be overwritten.
OwLite [INFO] Baseline ONNX saved at /home/sqzb/workspace/owlite/testProject/sampleModel/testProject_sampleModel_sampleModel.onnx
OwLite [INFO] Uploaded the model excluding parameters
```
Experiment Mode with dynamic batch
```python
import owlite

# Initialize a baseline or experiment
owl = owlite.init(...)

# Initialize your model
model = ...

# Convert the model
model = owl.convert(model, ...)

# Export the model into ONNX with dynamic axis options
owl.export(model, dynamic_axis_options={"x": 0})
```
```
Checking 0/1...
OwLite [INFO] `dynamic_axis_options` provided for the following inputs: 'x'
OwLite [INFO] Saving exported ONNX proto at /home/sqzb/workspace/owlite/testProject/sampleModel/dynamic/testProject_sampleModel_dynamic.onnx with external data testProject_sampleModel_dynamic.bin
OwLite [INFO] Experiment ONNX saved at /home/sqzb/workspace/owlite/testProject/sampleModel/dynamic/testProject_sampleModel_dynamic.onnx
OwLite [INFO] Uploading /home/sqzb/workspace/owlite/testProject/sampleModel/dynamic/testProject_sampleModel_dynamic.onnx
100%|█████████████████████████████████████████████████████████████████████| 2.29k/2.29k [00:00<00:00, 123kiB/s]
OwLite [INFO] Uploading done
```
OwLite will create the ONNX graph file and parameter file with the hierarchical structure below:

- owlite
  - testProject
    - sampleModel
      - dynamic
        - testProject_sampleModel_dynamic.onnx
        - testProject_sampleModel_dynamic.bin
Updated: 2024-06-13T23:42:42