Generally, yes! There are different ways, depending on how the model is represented. If "quantized" means that you have a model represented as integers, NNCodec will detect the integer tensors and simply compress the integer values losslessly.
For a model that is quantized but still stored in float precision, it depends on how the model was quantized. If it was quantized uniformly, you could simply try to find a combination of qp_density and qp that yields a quantization step size matching your model's step size. Usually, this should result in more or less lossless compression.
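To illustrate why matching step sizes makes the round trip lossless, here is a minimal sketch of uniform quantization in plain Python. This is not NNCodec's actual implementation (and ignores how qp/qp_density map to a step size internally); it only shows that values already lying on a grid of step size Δ survive re-quantization with the same Δ unchanged:

```python
def quantize(values, step):
    """Uniform quantization: map each float to the nearest multiple of `step`."""
    return [round(v / step) for v in values]

def dequantize(indices, step):
    """Reconstruct floats from integer quantization indices."""
    return [i * step for i in indices]

# Weights that were already uniformly quantized with step size 0.25
weights = [0.25, -0.5, 1.0, 0.75]

# If the codec's step size matches the model's original step size,
# quantize -> dequantize reproduces the weights exactly (lossless).
recovered = dequantize(quantize(weights, 0.25), 0.25)
```

If the codec's step size does not divide the original grid evenly, the round trip introduces rounding error, which is why hunting for a matching qp/qp_density combination matters.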
Another way would be to use the "codebook" method. Here, NNCodec derives a codebook from each tensor that holds all unique values within the tensor and assigns an index to each. However, it is important to mention that the values within the codebook are quantized to integer values with a step size derived from the qp value. But if you choose a very small qp value (e.g. -75), this should again be more or less lossless.
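The codebook idea can be sketched like this. Again, this is only an illustration, not NNCodec's internal code: here the codebook entries stay in float, whereas NNCodec would additionally quantize them with the qp-derived step size mentioned above. The point is that an already-quantized tensor has few unique values, so the tensor collapses into small integer indices plus a tiny codebook:

```python
def build_codebook(tensor):
    """Collect the unique values of a tensor and replace each value by its index."""
    codebook = sorted(set(tensor))
    index_of = {v: i for i, v in enumerate(codebook)}
    indices = [index_of[v] for v in tensor]  # small integers, compressed losslessly
    return codebook, indices

def reconstruct(codebook, indices):
    """Invert the mapping: look each index up in the codebook."""
    return [codebook[i] for i in indices]

tensor = [0.5, -0.5, 0.5, 0.0, -0.5]   # e.g. a tensor quantized to 3 levels
codebook, indices = build_codebook(tensor)
restored = reconstruct(codebook, indices)
```

As long as the codebook entries themselves are not perturbed (i.e. a sufficiently small qp), reconstruction is exact.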
BTW: Sorry for the late answer. I have been very busy lately.
Hello, as the title says: how can I use NNC to compress an already quantized model?