OwLite
https://owlite.ai
Easy and Powerful AI Compression with OwLite
- OwLite incorporates SqueezeBits' unique technology for interoperability between PyTorch, ONNX, and TensorRT.
- Converts PyTorch models to ONNX and visualizes them so you can better understand the model structure.
- Converts the ONNX model to TensorRT and benchmarks it on GPUs registered with OwLite.
- This enables users to assess end-to-end latency and the latency contribution of each part within the visualized ONNX graph.
- Then, the user can compress the model based on the rich information mapped onto the visualized ONNX graph.
- OwLite offers powerful automatic recommendations for beginners and fine-grained control over even the most intricate details for advanced users.
- Furthermore, it provides an archive of compression results and analytical tools to boost the compression capabilities of individuals and organizations.
Start here!
- Start by following the 'Getting Started' section to set up OwLite.
- Then, dive into the 'Tutorial' section, which guides you through your first project and showcases OwLite's efficient project management capabilities.
- Get started now and experience the ease of OwLite!
OwLite python package
- The OwLite Python package integrates seamlessly into your existing PyTorch code.
- It generates a compressed AI model based on the compression configuration edited on the web.
- With the compressed model, you can proceed with Post-Training Quantization (PTQ) and Quantization-Aware Training (QAT).
- For detailed instructions on how to use this, please take a look at this link.
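OwLite's own calls are covered in the linked documentation. As a generic illustration of what post-training quantization does to a model (not OwLite's API), here is PyTorch's built-in dynamic quantization applied to a toy network:

```python
# Generic PTQ illustration using PyTorch's built-in dynamic quantization.
# This is NOT OwLite's API; OwLite drives compression from the web-edited
# configuration instead.
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Linear(128, 64),
    nn.ReLU(),
    nn.Linear(64, 10),
)
model.eval()

# Replace Linear layers with int8 dynamically quantized equivalents.
quantized = torch.ao.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)

x = torch.randn(1, 128)
print(quantized(x).shape)  # torch.Size([1, 10])
```

The quantized model keeps the same interface and output shape while storing its weights in int8, which is the basic effect PTQ tools, OwLite included, build on.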
OwLite Web service
- The OwLite web editor visualizes the structure of your model, indicating the latency of specific parts of the model on your custom-selected device. It also informs you of the peak VRAM required to run the model.
- After gaining these insights into your model, you can apply compression algorithms developed by SqueezeBits engineers and optimized for the model's structure, performing one-click automatic compression.
- If you have a deeper understanding of AI models, OwLite also lets you fine-tune even the most detailed aspects to your specific requirements.
- For detailed instructions on how to use this, please take a look at this link.
Contact
Please get in touch with owlite@squeezebits.com for any questions or suggestions.