Computation of compression parameters via OpenVINO models #2727

Merged
109 commits merged on Jan 23, 2025
Changes from 107 commits

Commits (109)
10d1ddb
Initial draft. Rebased.
nikita-savelyevv Jul 3, 2024
bd2629b
Unstage helper scripts
nikita-savelyevv Oct 22, 2024
3e69252
WIP
nikita-savelyevv Oct 23, 2024
166dd04
Reshape weights beforehand
nikita-savelyevv Oct 24, 2024
edbe913
BF16 support
nikita-savelyevv Oct 25, 2024
b636c66
Tweak lora type hint
nikita-savelyevv Oct 25, 2024
f0129ef
Tweaks
nikita-savelyevv Oct 25, 2024
e887e70
Added share_inputs
nikita-savelyevv Oct 25, 2024
9141a8a
Modeling tweaks
nikita-savelyevv Oct 25, 2024
a43c514
Move results_cache into separate file
nikita-savelyevv Oct 25, 2024
1216f65
Implement astype for ov backend for bf16, u4, i4
nikita-savelyevv Oct 25, 2024
8611b75
Experiments
nikita-savelyevv Oct 26, 2024
0718668
Support case of (weight, scale) -> (c_weight, zp)
nikita-savelyevv Oct 26, 2024
283a821
SE improvements
nikita-savelyevv Oct 28, 2024
6964844
Accelerate AWQ
nikita-savelyevv Oct 28, 2024
80e2c92
SE changes
nikita-savelyevv Oct 29, 2024
fc82866
Add access counts to caching decorator
nikita-savelyevv Oct 29, 2024
f3891cd
Comment out env vars
nikita-savelyevv Oct 29, 2024
353aac1
Fix existing tests
nikita-savelyevv Oct 29, 2024
d20e593
Unstage helper scripts
nikita-savelyevv Oct 30, 2024
dc30d8d
Tests WIP
nikita-savelyevv Oct 31, 2024
c5606ce
Invert Tensor division
nikita-savelyevv Nov 1, 2024
e6a9d56
Add fns.divide
nikita-savelyevv Nov 4, 2024
ab90a08
Adopt misalignment test to check the degree of misalignment
nikita-savelyevv Nov 6, 2024
2e308b7
Merge branch 'develop' into compress-via-openvino
nikita-savelyevv Nov 7, 2024
6289c5c
Merge-related fixes
nikita-savelyevv Nov 7, 2024
f60fd17
Tweaks
nikita-savelyevv Nov 7, 2024
57a0931
Strict input/output data types
nikita-savelyevv Nov 11, 2024
1010fcf
Add dynamic shapes test
nikita-savelyevv Nov 11, 2024
6e54fba
ov modeling tests
nikita-savelyevv Nov 13, 2024
8ac0fe2
Move cache_results decorator
nikita-savelyevv Nov 13, 2024
ded66f3
Tests reorgantization
nikita-savelyevv Nov 13, 2024
69ae5fa
cache_results decorator test
nikita-savelyevv Nov 13, 2024
d0f49ae
get_const_value test
nikita-savelyevv Nov 13, 2024
a282976
OVModelParameters minor refactor
nikita-savelyevv Nov 13, 2024
b13f186
Added OV tensor tests
nikita-savelyevv Nov 14, 2024
9e90d5a
Minor file reorg
nikita-savelyevv Nov 14, 2024
5f46593
Tweaks
nikita-savelyevv Nov 14, 2024
e7617f1
Tweaks
nikita-savelyevv Nov 14, 2024
925f830
Switch to OV 2024.5 rc2
nikita-savelyevv Nov 15, 2024
5831fcd
Additional tests for ov_modeling
nikita-savelyevv Nov 15, 2024
9160de3
Type hints
nikita-savelyevv Nov 15, 2024
c7c63eb
Ignore mypy
nikita-savelyevv Nov 15, 2024
764f722
Reuse DTYPE_MAP_REV
nikita-savelyevv Nov 15, 2024
4a448e1
Added docstrings
nikita-savelyevv Nov 18, 2024
73f61fc
Remove inverted NP division. Add non-convertable OV division.
nikita-savelyevv Dec 11, 2024
16ccf50
Merge branch 'develop' into compress-via-openvino
nikita-savelyevv Dec 11, 2024
cd884eb
Remove OV 2024.5 RC installation
nikita-savelyevv Dec 11, 2024
608cfe9
Add a test for non-convertable division
nikita-savelyevv Dec 11, 2024
9569e1e
Make the test more strict
nikita-savelyevv Dec 11, 2024
f962bd1
Remove unnecessary lines
nikita-savelyevv Dec 11, 2024
5dcd83d
Update get_integer_quantization_error implementation
nikita-savelyevv Dec 11, 2024
6e22ef5
Remove unnecessary convert
nikita-savelyevv Dec 11, 2024
b45e788
Move create_ov_const_from_tensor to node_utils
nikita-savelyevv Dec 11, 2024
b2cebd0
Separate checking logic into standalone methods
nikita-savelyevv Dec 11, 2024
3a71141
Add debug conditions
nikita-savelyevv Dec 11, 2024
eeadf1d
Move ov model cache clearing to ov backend destructor
nikita-savelyevv Dec 12, 2024
40aef54
Update default ov model parameters
nikita-savelyevv Dec 12, 2024
ab3d35f
Revert debug logic
nikita-savelyevv Dec 12, 2024
d48c748
Update reference
nikita-savelyevv Dec 12, 2024
9a56fae
Add debug conditions
nikita-savelyevv Dec 11, 2024
e10d806
Disable dynamic shapes by default
nikita-savelyevv Dec 12, 2024
b372dc7
Revert "Add debug conditions"
nikita-savelyevv Dec 12, 2024
63858d3
Linters
nikita-savelyevv Dec 12, 2024
87b5c10
Fix lora correction
nikita-savelyevv Dec 13, 2024
7134e6d
Remove not used argument
nikita-savelyevv Dec 13, 2024
5a1866f
Remove static shapes testing because it is not needed with non-conver…
nikita-savelyevv Dec 13, 2024
6a2c9fc
Set dynamic shapes by default
nikita-savelyevv Dec 13, 2024
204fb21
Merge branch 'develop' into compress-via-openvino
nikita-savelyevv Dec 13, 2024
dca5376
Merge branch 'develop' into compress-via-openvino
nikita-savelyevv Dec 16, 2024
92fbba5
Guarantee call order
nikita-savelyevv Dec 16, 2024
b27c720
Add convertable_division parameter
nikita-savelyevv Dec 16, 2024
6ab1c08
Cleanup
nikita-savelyevv Dec 16, 2024
a0fe91a
Add convertable division test
nikita-savelyevv Dec 16, 2024
97bd61d
Add explicit inference precision
nikita-savelyevv Dec 16, 2024
58963ab
Fix import
nikita-savelyevv Dec 16, 2024
ec21996
Update tests/post_training/data/wc_reference_data.yaml
nikita-savelyevv Dec 16, 2024
aeffc8b
Suggested renaming
nikita-savelyevv Jan 14, 2025
476287b
Merge branch 'compress-via-openvino' of github.com:nikita-savelyevv/n…
nikita-savelyevv Jan 14, 2025
d2d66b1
to_backend -> as_numpy_tensor
nikita-savelyevv Jan 14, 2025
f4a08b9
Use duplicate filter
nikita-savelyevv Jan 14, 2025
9f2a79b
Revert "Use duplicate filter"
nikita-savelyevv Jan 14, 2025
05b3eb8
Align log message
nikita-savelyevv Jan 14, 2025
84c88fc
Add TODO regarding share memory during constant creation from tensor
nikita-savelyevv Jan 14, 2025
d821e7d
Create infer request in both cases
nikita-savelyevv Jan 14, 2025
467b5b8
Merge branch 'develop' into compress-via-openvino
nikita-savelyevv Jan 14, 2025
1c485ec
Update copyright year
nikita-savelyevv Jan 14, 2025
57de030
Remove not used import
nikita-savelyevv Jan 14, 2025
882c9b1
Fix ov modeling test
nikita-savelyevv Jan 14, 2025
a9f4e70
Add return type annotation
nikita-savelyevv Jan 14, 2025
fc64966
Fix docs api conf.py
nikita-savelyevv Jan 14, 2025
68e734f
mypy
nikita-savelyevv Jan 14, 2025
48d47c8
Revert "Create infer request in both cases"
nikita-savelyevv Jan 15, 2025
234f698
Set shared_memory=True when creating ov.constant from ov.tensor
nikita-savelyevv Jan 15, 2025
0698c17
Remove Optional type hint
nikita-savelyevv Jan 16, 2025
1d5a7d7
Remove DuplicateFilter
nikita-savelyevv Jan 16, 2025
990fd72
Move ov models to separate module
nikita-savelyevv Jan 20, 2025
2054e46
Introduce NNCFLogger
nikita-savelyevv Jan 20, 2025
07e3060
Addressed other comments
nikita-savelyevv Jan 20, 2025
9cc933a
Fixed caching test
nikita-savelyevv Jan 20, 2025
f1dc6ac
Introduced suggested changes
nikita-savelyevv Jan 21, 2025
c384df9
Implement requested changes
nikita-savelyevv Jan 22, 2025
2a13717
Add debug conditions
nikita-savelyevv Jan 22, 2025
92a3d9a
Revert "Add debug conditions"
nikita-savelyevv Jan 22, 2025
74d1d74
Merge branch 'develop' into compress-via-openvino
nikita-savelyevv Jan 22, 2025
06c3447
Fix tests
nikita-savelyevv Jan 22, 2025
841b807
Using < is being deprecated
nikita-savelyevv Jan 22, 2025
9a533d0
Address minor comments
nikita-savelyevv Jan 23, 2025
68eef07
Add todo
nikita-savelyevv Jan 23, 2025
1 change: 1 addition & 0 deletions docs/api/source/conf.py
@@ -145,6 +145,7 @@ def collect_api_entities() -> APIInfo:
"nncf.tensor.functions.torch_linalg",
"nncf.tensor.functions.torch_io",
"nncf.tensor.functions.numpy_io",
"nncf.tensor.functions.ov_numeric",
]

with mock(mock_modules):
44 changes: 32 additions & 12 deletions nncf/common/logging/logger.py
@@ -11,11 +11,41 @@

import logging
import sys
from typing import Set
from functools import lru_cache
from typing import cast


class NNCFLogger(logging.Logger):
def __init__(self, name: str, level: int = logging.NOTSET):
super().__init__(name, level)

@lru_cache(None)
def _log_once(self, level: int, msg: str) -> None:
self.log(level, msg)

def debug_once(self, msg: str) -> None:
"""
Log a message at the DEBUG level, ensuring the message is logged only once.
"""
self._log_once(logging.DEBUG, msg)

def info_once(self, msg: str) -> None:
"""
Log a message at the INFO level, ensuring the message is logged only once.
"""
self._log_once(logging.INFO, msg)

def warning_once(self, msg: str) -> None:
"""
Log a message at the WARNING level, ensuring the message is logged only once.
"""
self._log_once(logging.WARNING, msg)


NNCF_LOGGER_NAME = "nncf"

nncf_logger = logging.getLogger(NNCF_LOGGER_NAME)
logging.setLoggerClass(NNCFLogger)
nncf_logger = cast(NNCFLogger, logging.getLogger(NNCF_LOGGER_NAME))
nncf_logger.propagate = False

stdout_handler = logging.StreamHandler(sys.stdout)
@@ -60,16 +90,6 @@ def disable_logging() -> None:
nncf_logger.handlers = []


class DuplicateFilter:
def __init__(self) -> None:
self.msgs: Set[str] = set()

def filter(self, rec: logging.LogRecord) -> bool:
retval = rec.msg not in self.msgs
self.msgs.add(rec.msg)
return retval


NNCFDeprecationWarning = FutureWarning


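For context, a minimal usage sketch of the new *_once helpers (the loop and messages below are hypothetical, not part of the diff):

from nncf.common.logging.logger import nncf_logger

for layer_name in ("fc1", "fc2", "fc3"):
    # _log_once is backed by lru_cache, so an identical (level, message) pair is emitted only once
    # even though this line runs on every iteration.
    nncf_logger.warning_once("Some weights were skipped during compression.")
    # Distinct messages are still logged individually.
    nncf_logger.debug_once(f"Visited layer {layer_name}")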
15 changes: 15 additions & 0 deletions nncf/common/utils/backend.py
@@ -16,6 +16,13 @@

TModel = TypeVar("TModel")

try:
import openvino # type: ignore # noqa: F401

_OPENVINO_AVAILABLE = True
except ImportError:
_OPENVINO_AVAILABLE = False


class BackendType(Enum):
TORCH = "Torch"
@@ -159,3 +166,11 @@ def copy_model(model: TModel) -> TModel:
model = TFModelTransformer(model).transform(TFTransformationLayout())
return model
return deepcopy(model)


def is_openvino_available() -> bool:
"""
Check if OpenVINO is available.
:return: True if openvino package is installed, False otherwise.
"""
return _OPENVINO_AVAILABLE
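A short, hedged sketch of how the new check might gate the OpenVINO-accelerated path (the fallback branch is illustrative, not part of this diff):

from nncf.common.utils.backend import is_openvino_available

if is_openvino_available():
    # Safe to import the OpenVINO-backed implementations added in this PR.
    from nncf.openvino.optimized_functions import do_int_quantization
else:
    do_int_quantization = None  # hypothetical fallback: callers would use the NumPy reference path instead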
102 changes: 102 additions & 0 deletions nncf/common/utils/caching.py
@@ -0,0 +1,102 @@
# Copyright (c) 2025 Intel Corporation
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
# http://www.apache.org/licenses/LICENSE-2.0
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import copy
import inspect
from contextlib import contextmanager
from functools import wraps
from typing import Any, Callable, Dict, Iterator, TypeVar, cast


class ResultsCache:
"""
A container for results decorated with @cache_results decorator.
"""

def __init__(self) -> None:
self._enabled = True
# Stores the results of the decorated function
self._cache: Dict[Any, Any] = {}
# Stores the number of times the cached result was accessed
self._access_count: Dict[Any, int] = {}

def enable(self) -> None:
self._enabled = True

def disable(self) -> None:
self._enabled = False

def enabled(self) -> bool:
return self._enabled

def access_count(self) -> Dict[Any, int]:
return copy.deepcopy(self._access_count)

def clear(self) -> None:
self._cache.clear()
self._access_count.clear()

def __getitem__(self, key: Any) -> Any:
self._access_count[key] += 1
return self._cache[key]

def __setitem__(self, key: Any, value: Any) -> None:
self._access_count[key] = 0
self._cache[key] = value

def __contains__(self, key: Any) -> bool:
return key in self._cache


TFunc = TypeVar("TFunc", bound=Callable[..., Any])


def cache_results(cache: ResultsCache) -> Callable[[TFunc], TFunc]:
"""
Decorator to cache the results of a function. When the decorated function is called with the same set of arguments,
it returns the cached result instead of recomputing it. On the first call with a given set of arguments, the result
is computed and stored in the provided `cache` container. Function arguments must be hashable.

:param cache: A cache container where results will be stored.
"""

def decorator(func: TFunc) -> TFunc:
@wraps(func)
def wrapper(*args: Any, **kwargs: Any) -> Any:
if not cache.enabled():
return func(*args, **kwargs)
sig = inspect.signature(func)
new_kwargs = {name: arg for name, arg in zip(sig.parameters, args)}
new_kwargs.update(kwargs)
cache_key = (func.__name__, frozenset(new_kwargs.items()))
if cache_key in cache:
return cache[cache_key]
result = func(*args, **kwargs)
cache[cache_key] = result
return result

return cast(TFunc, wrapper)

return decorator


@contextmanager
def disable_results_caching(cache: ResultsCache) -> Iterator[None]:
"""
Context manager to disable caching of results for a block of code.
:param cache: A cache container where results are stored.
"""
if cache.enabled():
cache.disable()
yield
cache.enable()
else:
yield
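To illustrate the intended behavior, a small self-contained sketch (the function below is made up for the example):

from nncf.common.utils.caching import ResultsCache, cache_results, disable_results_caching

_cache = ResultsCache()

@cache_results(_cache)
def expensive_square(x: int) -> int:
    return x * x

expensive_square(3)    # computed and stored in the cache
expensive_square(3)    # returned from the cache; access count for this key becomes 1
expensive_square(x=3)  # same cache key: positional arguments are normalized to keyword form

with disable_results_caching(_cache):
    expensive_square(3)  # recomputed; the cache is bypassed inside the context

_cache.clear()  # drop cached results and access counts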
57 changes: 50 additions & 7 deletions nncf/openvino/graph/node_utils.py
@@ -13,6 +13,7 @@

import numpy as np
import openvino.runtime as ov
import openvino.runtime.op as op
import openvino.runtime.opset13 as opset

import nncf
@@ -41,6 +42,8 @@
from nncf.openvino.graph.metatypes.openvino_metatypes import OVMatMulMetatype
from nncf.openvino.graph.metatypes.openvino_metatypes import OVOpMetatype
from nncf.openvino.graph.metatypes.openvino_metatypes import get_node_metatype
from nncf.tensor import Tensor
from nncf.tensor import TensorBackend

InplaceInsertionFnType = Callable[[ov.Node, int, str], ov.Node]

@@ -97,26 +100,27 @@ def get_number_if_op(model: ov.Model) -> int:
"""

def cnt_if_op(model: ov.Model, cnt: int) -> int:
for op in model.get_ops():
if get_node_metatype(op) == OVIfMetatype:
for model_op in model.get_ops():
if get_node_metatype(model_op) == OVIfMetatype:
cnt += 1
cnt = cnt_if_op(op.get_function(0), cnt)
cnt = cnt_if_op(op.get_function(1), cnt)
cnt = cnt_if_op(model_op.get_function(0), cnt)
cnt = cnt_if_op(model_op.get_function(1), cnt)
return cnt

return cnt_if_op(model, 0)


def get_const_value(const_node: ov.Node) -> np.ndarray:
def get_const_value(const_node: ov.Node, cast_bf16_to_fp32: bool = True) -> np.ndarray:
"""
Returns the constant tensor for the given node.
This method is applicable only to floating-point constant data.

:param const_node: OpenVINO node.
:param cast_bf16_to_fp32: Whether to cast bf16 node data to fp32 or not. If False and the node contains bf16 data,
the resulting bf16 value will be returned encoded inside a numpy.float16 array.
:return: The constant value.
"""
if const_node.get_element_type() == ov.Type.bf16:
# Fixed FP32 data type as the result for BF16 constant
if const_node.get_element_type() == ov.Type.bf16 and cast_bf16_to_fp32:
return const_node.get_data(dtype=np.float32)
return const_node.data

@@ -635,3 +639,42 @@ def get_activation_channel_axis(node: NNCFNode, port_id: int, input_shape: Tuple
channel_axis = activations_layout.index(OVLayoutElem.C_IN)

return channel_axis


def convert_op(node: ov.Node, target_dtype: ov.Type) -> ov.Node:
"""
Return a subgraph which converts the given node output to the target data type. If the output is already in the
target data type then the given node is returned.

:param node: The input node to convert.
:param target_dtype: The target data type to convert the input node to.
:return: The converted node.
"""
if node.get_element_type() == target_dtype:
return node
return opset.convert(node, target_dtype)


def non_convertable_divide_op(a: ov.Node, b: ov.Node) -> ov.Node:
"""
Creates a "non-convertable" divide operation. It won't be converted to a*(1/b).
"""
divide_node = a / b
divide_node.get_rt_info()["nonconvertable_divide_0"] = True
return divide_node


def create_ov_const_from_tensor(x: Tensor, dtype: ov.Type, name: Optional[str] = None) -> op.Constant:
"""
Create an OpenVINO Constant node from the given tensor.
:param x: Data tensor. Supports NumPy and OV tensor backends. If x is backed by an OV tensor, the constant node is
created directly from the underlying OV tensor.
:param dtype: Data type of the constant.
:param name: Optional name of the constant.
:return: OpenVINO Constant node.
"""
if x.backend == TensorBackend.ov:
assert x.data.get_element_type() == dtype
return opset.constant(x.data, name=name, shared_memory=True)
const = opset.constant(x.data, dtype=dtype, name=name)
return const
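A rough sketch of how these helpers could compose when building a dequantization-style subgraph (shapes, scale values, and names are made up for illustration):

import numpy as np
import openvino.runtime as ov
import openvino.runtime.opset13 as opset

from nncf.openvino.graph.node_utils import convert_op
from nncf.openvino.graph.node_utils import create_ov_const_from_tensor
from nncf.openvino.graph.node_utils import non_convertable_divide_op
from nncf.tensor import Tensor

weight = opset.parameter([128, 128], dtype=np.float32, name="weight")
# Scale kept as an fp16 constant, created from an NNCF tensor with a NumPy backend.
scale = create_ov_const_from_tensor(Tensor(np.full((128, 1), 0.5, dtype=np.float16)), ov.Type.f16, name="scale")
scale_f32 = convert_op(scale, ov.Type.f32)  # no-op when the dtype already matches
# The rt_info flag keeps this divide from being rewritten as a multiply by the reciprocal.
scaled_weight = non_convertable_divide_op(weight, scale_f32)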
16 changes: 16 additions & 0 deletions nncf/openvino/optimized_functions/__init__.py
@@ -0,0 +1,16 @@
# Copyright (c) 2025 Intel Corporation
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
# http://www.apache.org/licenses/LICENSE-2.0
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

from nncf.openvino.optimized_functions.functions import astype as astype
from nncf.openvino.optimized_functions.functions import do_int_quantization as do_int_quantization
from nncf.openvino.optimized_functions.functions import quantize_dequantize_weight as quantize_dequantize_weight
from nncf.openvino.optimized_functions.models import OVModelParameters as OVModelParameters
from nncf.openvino.optimized_functions.models import clear_ov_model_cache as clear_ov_model_cache