added docs for converter #46

Open · wants to merge 1 commit into base: main
4 changes: 4 additions & 0 deletions README.md
@@ -101,6 +101,10 @@ Inference cost for CNV_2W2A.onnx

You can read more about the BOPS metric in [this paper](https://www.frontiersin.org/articles/10.3389/frai.2021.676564/full), Section 4.2 Bit Operations.

### QKeras to QONNX Converter

For details on the QKeras converter and its current limitations, see this [document](docs/qkeras-converter/qkeras_to_qonnx.md).

### Development

Install in editable mode in a venv:
23 changes: 23 additions & 0 deletions docs/qkeras-converter/qkeras_to_qonnx.md
@@ -0,0 +1,23 @@
### <a name="Qkeras to Qonnx"></a><a name="abs">**QKeras to QONNX**</a>

The converter works in three steps: (1) strip the QKeras model of its quantization attributes and store them in a dictionary; (2) convert the resulting plain Keras model using tf2onnx; (3) insert "Quant" nodes at the appropriate locations, based on the stored dictionary of quantization attributes.
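The three steps above can be sketched schematically. The helper names below are hypothetical stand-ins for illustration only, not the converter's actual API, and plain dictionaries stand in for Keras layers and ONNX nodes:

```python
# Schematic of the three-step conversion (hypothetical helper names).

def strip_quantizers(layer_configs):
    """Step (1): record each layer's quantizer, return de-quantized configs."""
    quant_map, plain_configs = {}, []
    for cfg in layer_configs:
        quant_map[cfg["name"]] = cfg.pop("quantizer", None)
        plain_configs.append(cfg)
    return plain_configs, quant_map

def insert_quant_nodes(onnx_nodes, quant_map):
    """Step (3): follow each exported node with a Quant node if needed."""
    out = []
    for node in onnx_nodes:
        out.append(node)
        if quant_map.get(node) is not None:
            out.append(f"Quant({quant_map[node]})")
    return out

# Toy walk-through: two layers, one quantized.
configs = [{"name": "dense0", "quantizer": "quantized_bits(4,0)"},
           {"name": "act0"}]
plain, qmap = strip_quantizers(configs)          # step (1)
# Step (2) would run tf2onnx on the plain Keras model; here we just
# pretend the exported graph has one node per layer.
nodes = [c["name"] for c in plain]
print(insert_quant_nodes(nodes, qmap))
# → ['dense0', 'Quant(quantized_bits(4,0))', 'act0']
```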

The current version has a few issues stemming from how tf2onnx inserts the quantization nodes. Each has a suitable workaround, detailed below.

### Quantized-Relu
The quantized_relu quantizer inserts a redundant quantization node when used as the output activation of a Dense/Conv2D layer.

Workaround: use quantized_relu only in a separate QActivation layer.

<img src="https://user-images.githubusercontent.com/31563706/209125992-e03078e4-ec92-4796-982f-2a31292687d6.png" width="300" height="500">
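For intuition, quantized_relu can be sketched as a ReLU snapped onto a fixed-point grid. This is a simplified numpy sketch assuming default behavior (it ignores QKeras options such as `use_sigmoid` and `negative_slope`):

```python
import numpy as np

def quantized_relu(x, bits=4, integer=0):
    """Simplified quantized_relu: ReLU rounded to a fixed-point grid
    with `bits` total bits, of which `integer` are integer bits."""
    step = 2.0 ** (integer - bits)        # grid spacing
    y = np.maximum(x, 0.0)                # ReLU
    y = np.round(y / step) * step         # snap to the grid
    return np.clip(y, 0.0, 2.0 ** integer - step)

x = np.array([-0.3, 0.2, 0.49, 1.7])
print(quantized_relu(x, bits=4, integer=0))
# → [0. 0.1875 0.5 0.9375]  (step 1/16, max 15/16)
```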

### Quantized-Bits
The quantized_bits quantization node is not added to the model when quantized_bits is used in a QActivation layer.

Workaround: use quantized_bits only at the output of a Dense/Conv2D layer.

<img src="https://user-images.githubusercontent.com/31563706/209126623-9956ecea-748e-4d7c-930c-d46d06ab6a14.png" width="350" height="350">
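For reference, quantized_bits is a symmetric fixed-point quantizer. The sketch below is a simplified numpy version assuming default settings (it ignores QKeras alpha scaling and alternative rounding modes):

```python
import numpy as np

def quantized_bits(x, bits=4, integer=0, keep_negative=True):
    """Simplified quantized_bits: round to a signed fixed-point grid
    with `bits` total bits and `integer` integer bits."""
    sign_bit = 1 if keep_negative else 0
    step = 2.0 ** (integer - bits + sign_bit)       # grid spacing
    lo = -(2.0 ** integer) if keep_negative else 0.0
    hi = 2.0 ** integer - step
    return np.clip(np.round(x / step) * step, lo, hi)

x = np.array([-1.2, -0.26, 0.3, 0.9])
print(quantized_bits(x, bits=4, integer=0))
# → [-1. -0.25 0.25 0.875]  (step 1/8, range [-1, 7/8])
```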

### Ternary Quantization
A threshold of 0.5 must be used with ternary quantization.
(Conversion can still be unstable even with t = 0.5.)
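A ternary quantizer with a fixed threshold can be sketched as follows; this simplified version assumes a constant threshold, whereas QKeras can also adapt it during training, which is where the instability noted above shows up:

```python
import numpy as np

def ternary(x, threshold=0.5):
    """Simplified ternary quantizer: map values to {-1, 0, +1}
    around a fixed threshold."""
    return np.where(x > threshold, 1.0,
                    np.where(x < -threshold, -1.0, 0.0))

print(ternary(np.array([-0.8, -0.2, 0.4, 0.9])))
# → [-1. 0. 0. 1.]
```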