Unexpected error during export
While testing my models with OwLite, I came across an unexpected error during OwLite.export. The error occurred when I was trying to export the baseline of my torchvideo/x3d model. Exporting to ONNX seems to succeed, but uploading the exported ONNX then fails.
I managed to track down the error and discovered that it only happens with models that use a custom torch.autograd.Function. I am attaching a minimal reproducer and its error traceback.
I have found a workaround, which is simply to replace the custom autograd function with the native torch equivalent (torch.nn.SiLU, which does essentially the same thing). I'd like to know if a better way exists, since I don't want to fix all the models in the repository myself, and I'm fairly sure there are cases where no appropriate native torch replacement exists.
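For reference, a minimal sketch of that workaround applied to the Bar module from my reproducer below: the custom SwishFunction.apply call is swapped for the built-in torch.nn.SiLU, which computes the same x * sigmoid(x) (and has a native ONNX mapping).

```python
import torch

class Bar(torch.nn.Module):
    def __init__(self) -> None:
        super().__init__()
        self.proj = torch.nn.Linear(in_features=100, out_features=10)
        # torch.nn.SiLU computes x * sigmoid(x), matching the custom SwishFunction
        self.act = torch.nn.SiLU()

    def forward(self, x):
        return self.act(self.proj(x))

model = Bar().eval()
out = model(torch.randn(100))
```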
Also, it would be great to print an appropriate error message rather than just raising a bare HTTP error response. I spent a lot of time tracking this down to the torch.autograd.Function, which is impossible to infer from the error the owlite package throws.
Environment:
- OS: both on macOS 14.5 and ubuntu 22.04
- python: 3.10
- owlite: 2.1.0
- pytorch: both on 2.2.2 and 2.1.2
- pytorchvideo: 0.1.3
Code:
import torch

from owlite import init


class SwishFunction(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x):
        result = x * torch.sigmoid(x)
        ctx.save_for_backward(x)
        return result

    @staticmethod
    def backward(ctx, grad_output):
        (x,) = ctx.saved_tensors
        sigmoid_x = torch.sigmoid(x)
        return grad_output * (sigmoid_x * (1 + x * (1 - sigmoid_x)))


class Bar(torch.nn.Module):
    def __init__(self, *args, **kwargs) -> None:
        super().__init__(*args, **kwargs)
        self.proj = torch.nn.Linear(in_features=100, out_features=10)

    def forward(self, x):
        x = self.proj(x)
        x = SwishFunction.apply(x)
        return x


class Foo(torch.nn.Module):
    def __init__(self, *args, **kwargs) -> None:
        super().__init__(*args, **kwargs)
        self.bar = Bar()

    def forward(self, x):
        x = self.bar(x)
        return x


model = Foo().eval()
args = (torch.randn(100),)
owl = init("bug", "my_custom_autograd")
model = owl.convert(model, *args)
owl.export(model)
Traceback:
(test) ~ # python custom_autograd.py
OwLite [INFO] Created a new project 'bug'
OwLite [INFO] Connecting to the first device at NEST
OwLite [INFO] Device connected: NVIDIA RTX A4000 [TensorRT]
OwLite [INFO] Created a new baseline 'my_custom_autograd' in the project 'bug'
OwLite [INFO] Experiment data will be saved in /Users/hjdbhj/tmp/owlite/bug/my_custom_autograd
OwLite [INFO] Converting the model
OwLite [INFO] Saving exported ONNX proto at /Users/hjdbhj/tmp/owlite/bug/my_custom_autograd/bug_my_custom_autograd_my_custom_autograd.onnx with external data bug_my_custom_autograd_my_custom_autograd.bin
OwLite [INFO] Baseline ONNX saved at /Users/hjdbhj/tmp/owlite/bug/my_custom_autograd/bug_my_custom_autograd_my_custom_autograd.onnx
Traceback (most recent call last):
  File "/Users/hjdbhj/owlite-bug-repro/custom_autograd.py", line 43, in <module>
    owl.export(model)
  File "/Users/hjdbhj/opt/miniconda3/envs/test/lib/python3.10/site-packages/owlite/owlite.py", line 493, in export
    self.target.upload(proto, model)
  File "/Users/hjdbhj/opt/miniconda3/envs/test/lib/python3.10/site-packages/owlite/api/baseline.py", line 149, in upload
    DOVE_API_BASE.post(
  File "/Users/hjdbhj/opt/miniconda3/envs/test/lib/python3.10/site-packages/owlite/owlite_core/api_base.py", line 159, in post
    return self._request(request_callable)
  File "/Users/hjdbhj/opt/miniconda3/envs/test/lib/python3.10/site-packages/owlite/owlite_core/api_base.py", line 107, in _request
    response.raise_for_status()
  File "/Users/hjdbhj/opt/miniconda3/envs/test/lib/python3.10/site-packages/requests/models.py", line 1024, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 400 Client Error: Bad Request for url: https://dove.owlite.ai/upload
-
Hello hjd_bhj. Thank you for using OwLite and for reporting the issue you experienced.
It seems you encountered an unexpected error, which must have caused inconvenience while using OwLite. OwLite aims to provide clear solutions and explanations for problems users face. We believe your inquiry will help us improve OwLite and make it a more reliable service.
I have just notified our development team about your request, and they will get back to you soon with a response.
Thank you!
-
Hi,
Our team and I were able to reproduce the error with your code and verify that the workaround you suggested is valid.
As you mentioned, this is an unexpected bug that occurs when a model with a custom torch.autograd function is given as input to OwLite. Currently, OwLite lacks the ability to handle custom torch.autograd functions. More precisely, the error occurs during the mapping logic between the torch.fx.Graph converted with OwLite.convert and the onnx.ModelProto exported with OwLite.export.
We are actively seeking a way to support custom torch.autograd functions, since, as you mentioned, there are cases where they cannot easily be replaced. We will let you know once we have a concrete schedule for this. Until then, please stick to the workaround you've found.
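In the meantime, if editing every model file by hand is impractical, one option is to swap the offending modules programmatically before calling OwLite.convert, assuming the custom autograd function is invoked through a wrapper module (as pytorchvideo's Swish wraps SwishFunction). The swap_modules helper below is a hypothetical sketch, not an OwLite API:

```python
import torch

def swap_modules(model: torch.nn.Module, target: type, replacement_factory):
    """Recursively replace every submodule of type `target` with a fresh
    module built by `replacement_factory`. Hypothetical helper, not part
    of OwLite."""
    for name, child in model.named_children():
        if isinstance(child, target):
            setattr(model, name, replacement_factory())
        else:
            swap_modules(child, target, replacement_factory)
    return model

# Stand-in for a wrapper such as pytorchvideo's Swish module
class Swish(torch.nn.Module):
    def forward(self, x):
        return x * torch.sigmoid(x)

net = torch.nn.Sequential(torch.nn.Linear(8, 8), Swish())
swap_modules(net, Swish, torch.nn.SiLU)
```

After the swap, the model can be passed to OwLite.convert as usual, since torch.nn.SiLU is numerically equivalent to the custom Swish.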
We will also update the log messages so that you can detect these kinds of problems with less effort.
Thanks