
Is the current NNC able to compress an already quantized model? #1

Open
ColdCodeCool opened this issue Nov 28, 2023 · 1 comment

ColdCodeCool commented Nov 28, 2023

Hello, as the title says: how can I use NNC to compress an already quantized model?

phaase-hhi (Collaborator) commented Jan 15, 2024

Generally, yes! There are different ways, depending on how the model is represented. If "quantized" means that you have a model represented as integers, NNCodec will detect the integer tensors and simply compress the integer values losslessly.
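
For illustration, a minimal sketch of this integer path (untested, and assuming the dict-based `nnc.compress`/`nnc.decompress` calls shown in the NNCodec Python examples; names and defaults may differ in your installed version):

```python
import numpy as np
import nnc

# An "already quantized" model, represented here as integer tensors.
parameters = {
    "layer1.weight": np.random.randint(-128, 128, size=(64, 32), dtype=np.int32),
    "layer1.bias":   np.random.randint(-128, 128, size=(64,), dtype=np.int32),
}

# NNCodec should detect the integer dtype and entropy-code the values losslessly.
nnc.compress(parameters, bitstream_path="./bitstream.nnc")

# Decoding should then return exactly the same integer values.
rec_parameters = nnc.decompress("./bitstream.nnc")
```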

For a model that is quantized but still in float precision, it depends on how the model was quantized. If it was quantized uniformly, you could simply try to find a combination of qp_density and qp that yields a quantization step size matching yours. Usually, this should result in more or less lossless compression.
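
To find such a combination, you can compute the step size for candidate values. A hedged sketch (the step-size derivation below follows my reading of the NNC spec, where qp = 0 maps to a step size of 1 and the step size doubles every 2^qp_density increments of qp; please verify against your NNCodec version):

```python
def nnc_stepsize(qp: int, qp_density: int) -> float:
    # Step-size derivation as I understand it from the NNC standard:
    # mul picks one of 2**qp_density fractional refinements, shift sets the octave.
    mul = (1 << qp_density) + (qp & ((1 << qp_density) - 1))
    shift = qp >> qp_density  # arithmetic shift == floor(qp / 2**qp_density)
    return mul * 2.0 ** (shift - qp_density)

def find_qp(target_step: float, qp_density: int = 2) -> int:
    # Brute-force the qp whose step size is closest to your quantizer's step.
    return min(range(-100, 1),
               key=lambda qp: abs(nnc_stepsize(qp, qp_density) - target_step))

# Example: a model uniformly quantized with step size 1/1024.
qp = find_qp(1.0 / 1024.0)      # -> qp = -40 with qp_density = 2
print(qp, nnc_stepsize(qp, 2))  # the step size matches 1/1024 exactly here
# Then compress with the matched parameters:
# nnc.compress(parameters, bitstream_path="./bitstream.nnc", qp=qp, qp_density=2)
```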

Another way would be to use the "codebook" method. Here, NNCodec derives a codebook from each tensor that holds all unique values within the tensor and assigns an index to each. However, it is important to mention that the values within the codebook are themselves quantized to integers with a step size derived from the qp value. But if you choose a very small qp value (e.g. -75), this should again be more or less lossless.
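
A corresponding call might look like the following (hedged: the NNCodec examples expose a `codebook_mode` argument on the compress calls, but the exact name and accepted values may differ in your version, so treat both as assumptions to verify):

```python
import nnc

# Compress an already quantized model via the codebook path. With a very
# small qp (e.g. -75), the codebook entries are quantized so finely that
# the result should be more or less lossless.
nnc.compress_model(
    "quantized_model.pt",             # hypothetical path to your model file
    bitstream_path="./bitstream.nnc",
    qp=-75,
    codebook_mode=1,                  # assumed value for "use a codebook"
)
```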

BTW: Sorry for the late answer. I have been very busy lately.
