Unexpected error during export
While testing my models with OwLite, I came across an unexpected error during OwLite.export. The error occurred when I was trying to export the baseline of my torchvideo/x3d model. Exporting to ONNX seems to succeed, but uploading the exported ONNX then fails.
I managed to track down the error and discovered that it only happens with models that use a custom torch.autograd.Function. I am attaching a minimal reproducer and its error traceback.
I have found a workaround, which is simply to replace the custom autograd function with the native torch equivalent (torch.nn.SiLU, which does essentially the same thing). I'd like to know if a better way exists, since I don't want to fix all the models in the repository myself, and I'm fairly sure there are cases where no appropriate native torch replacement exists.
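For reference, a minimal sketch of that workaround applied to the Bar module from my reproducer below: the custom SwishFunction.apply call is swapped for the built-in torch.nn.SiLU, which computes the same x * sigmoid(x) (and has a native ONNX mapping).

```python
import torch

class Bar(torch.nn.Module):
    def __init__(self) -> None:
        super().__init__()
        self.proj = torch.nn.Linear(in_features=100, out_features=10)
        # torch.nn.SiLU computes x * sigmoid(x), matching the custom SwishFunction
        self.act = torch.nn.SiLU()

    def forward(self, x):
        return self.act(self.proj(x))

model = Bar().eval()
out = model(torch.randn(100))
```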
Also, it would be great to print an appropriate error message rather than just raising a bare HTTP error response. I spent a lot of time tracking this down to the torch.autograd.Function, which is impossible to infer from the error the owlite package throws.
Environment:
- OS: both on macOS 14.5 and ubuntu 22.04
- python: 3.10
- owlite: 2.1.0
- pytorch: both on 2.2.2 and 2.1.2
- pytorchvideo: 0.1.3
Code:
import torch

from owlite import init


class SwishFunction(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x):
        result = x * torch.sigmoid(x)
        ctx.save_for_backward(x)
        return result

    @staticmethod
    def backward(ctx, grad_output):
        (x,) = ctx.saved_tensors
        sigmoid_x = torch.sigmoid(x)
        return grad_output * (sigmoid_x * (1 + x * (1 - sigmoid_x)))


class Bar(torch.nn.Module):
    def __init__(self, *args, **kwargs) -> None:
        super().__init__(*args, **kwargs)
        self.proj = torch.nn.Linear(in_features=100, out_features=10)

    def forward(self, x):
        x = self.proj(x)
        x = SwishFunction.apply(x)
        return x


class Foo(torch.nn.Module):
    def __init__(self, *args, **kwargs) -> None:
        super().__init__(*args, **kwargs)
        self.bar = Bar()

    def forward(self, x):
        x = self.bar(x)
        return x


model = Foo().eval()
args = (torch.randn(100),)
owl = init("bug", "my_custom_autograd")
model = owl.convert(model, *args)
owl.export(model)
Traceback:
(test) ~ # python custom_autograd.py
OwLite [INFO] Created a new project 'bug'
OwLite [INFO] Connecting to the first device at NEST
OwLite [INFO] Device connected: NVIDIA RTX A4000 [TensorRT]
OwLite [INFO] Created a new baseline 'my_custom_autograd' in the project 'bug'
OwLite [INFO] Experiment data will be saved in /Users/hjdbhj/tmp/owlite/bug/my_custom_autograd
OwLite [INFO] Converting the model
OwLite [INFO] Saving exported ONNX proto at /Users/hjdbhj/tmp/owlite/bug/my_custom_autograd/bug_my_custom_autograd_my_custom_autograd.onnx with external data bug_my_custom_autograd_my_custom_autograd.bin
OwLite [INFO] Baseline ONNX saved at /Users/hjdbhj/tmp/owlite/bug/my_custom_autograd/bug_my_custom_autograd_my_custom_autograd.onnx
Traceback (most recent call last):
  File "/Users/hjdbhj/owlite-bug-repro/custom_autograd.py", line 43, in <module>
    owl.export(model)
  File "/Users/hjdbhj/opt/miniconda3/envs/test/lib/python3.10/site-packages/owlite/owlite.py", line 493, in export
    self.target.upload(proto, model)
  File "/Users/hjdbhj/opt/miniconda3/envs/test/lib/python3.10/site-packages/owlite/api/baseline.py", line 149, in upload
    DOVE_API_BASE.post(
  File "/Users/hjdbhj/opt/miniconda3/envs/test/lib/python3.10/site-packages/owlite/owlite_core/api_base.py", line 159, in post
    return self._request(request_callable)
  File "/Users/hjdbhj/opt/miniconda3/envs/test/lib/python3.10/site-packages/owlite/owlite_core/api_base.py", line 107, in _request
    response.raise_for_status()
  File "/Users/hjdbhj/opt/miniconda3/envs/test/lib/python3.10/site-packages/requests/models.py", line 1024, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 400 Client Error: Bad Request for url: https://dove.owlite.ai/upload
-
Hello hjd_bhj. Thank you for using OwLite and for reporting the issue you experienced.
It seems you encountered an unexpected error, which must have caused inconvenience while using OwLite. OwLite aims to provide clear solutions and explanations for problems users face. We believe your inquiry will help us improve OwLite and make it a more reliable service.
I have just notified our development team about your request, and they will get back to you soon with a response.
Thank you!
-
Hi,
Our team and I were able to reproduce the error with your code and verify that the workaround you suggested is valid.
As you mentioned, this is an unexpected bug that occurs when a model with a custom torch.autograd function is given as input to OwLite. Currently, OwLite lacks the ability to handle custom torch.autograd functions. More precisely, the error occurs during the mapping logic between the torch.fx.Graph converted with OwLite.convert and the onnx.ModelProto exported with OwLite.export.
We are actively seeking a way to support custom torch.autograd functions, since, as you mentioned, there are cases where they cannot easily be replaced. We will let you know once we have a concrete schedule for this. Until then, please stick to the workaround you've found.
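In the meantime, if editing every model file by hand is impractical, one option is to swap the offending modules programmatically before calling OwLite.convert, assuming the custom autograd function is invoked through a wrapper module (as pytorchvideo's Swish wraps SwishFunction). The swap_modules helper below is a hypothetical sketch, not an OwLite API:

```python
import torch

def swap_modules(model: torch.nn.Module, target: type, replacement_factory):
    """Recursively replace every submodule of type `target` with a fresh
    module built by `replacement_factory`. Hypothetical helper, not part
    of OwLite."""
    for name, child in model.named_children():
        if isinstance(child, target):
            setattr(model, name, replacement_factory())
        else:
            swap_modules(child, target, replacement_factory)
    return model

# Stand-in for a wrapper such as pytorchvideo's Swish module
class Swish(torch.nn.Module):
    def forward(self, x):
        return x * torch.sigmoid(x)

net = torch.nn.Sequential(torch.nn.Linear(8, 8), Swish())
swap_modules(net, Swish, torch.nn.SiLU)
```

After the swap, the model can be passed to OwLite.convert as usual, since torch.nn.SiLU is numerically equivalent to the custom Swish.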
We will also update the log messages so that you can detect these kinds of problems with less effort.
Thanks