I see that these arguments are correctly passed to the API and forwarded to the optimum part via the llama-index integration, so the statement that it is not implemented is incorrect.
compile_only has an effect only if the user has compiled the model at least once; it is not expected to bring any benefit on the first usage (on the contrary, it requires additional disk space to store the precompiled model). Possibly this is the reason why you do not see a benefit; if not, then it is a GPU plugin issue, not an issue in openvino notebooks, llama-index, or optimum-intel.
Another possible reason: to avoid recompilation, you need to use the same OpenVINO version on every run. If you update the OpenVINO runtime, even without reconverting the model, the model will be recompiled. Since the LLM notebooks use the nightly package as the default OpenVINO version, the runtime is continuously updated, which may prevent you from seeing the advantage of compile_only.
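A minimal sketch of the workflow described above, assuming the `compile_only` keyword introduced by the referenced PRs and OpenVINO model caching via `CACHE_DIR`; the model directory and cache path are illustrative, and the exact keyword name should be checked against the installed optimum-intel version:

```python
from optimum.intel import OVModelForCausalLM

model_dir = "llm_model_dir"               # hypothetical: an already exported OpenVINO model
ov_config = {"CACHE_DIR": "model_cache"}  # cache directory that stores the compiled blob

# First run: the model is compiled for GPU and the compiled blob is written to
# CACHE_DIR. This run is NOT faster and consumes extra disk space.
model = OVModelForCausalLM.from_pretrained(model_dir, device="GPU", ov_config=ov_config)

# Subsequent runs with the SAME OpenVINO runtime version: compile_only is
# expected to reuse the precompiled blob from CACHE_DIR instead of recompiling.
# Updating the nightly openvino package between runs invalidates the cache.
model = OVModelForCausalLM.from_pretrained(
    model_dir, device="GPU", ov_config=ov_config, compile_only=True
)
```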
Describe the bug
We saw PRs that make compile-only mode work:
huggingface/optimum-intel#873
huggingface/optimum-intel#1101
Then we made the config modification with the latest optimum-intel (1.22.0.dev0+58aec63), but compile-only mode doesn't work in the llm-rag-llamaindex sample (https://github.com/openvinotoolkit/openvino_notebooks/tree/latest/notebooks/llm-rag-llamaindex).
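A rough sketch of the kind of config modification meant here, assuming the options are forwarded through the llama-index OpenVINO integration's `model_kwargs` to optimum-intel; the model path is illustrative and the `compile_only` entry is our assumption about how the option from the PRs above is passed down:

```python
from llama_index.llms.openvino import OpenVINOLLM

ov_config = {"CACHE_DIR": "model_cache"}

llm = OpenVINOLLM(
    model_id_or_path="llm_model_dir",   # hypothetical: locally exported OpenVINO model
    device_map="GPU",
    model_kwargs={
        "ov_config": ov_config,
        "compile_only": True,           # assumed to reach OVModelForCausalLM.from_pretrained
    },
)
```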
Screenshots
Expected behavior
Compile-only mode works and reduces the memory footprint.
Installation instructions (Please mark the checkbox)
[yes] I followed the installation guide at https://github.com/openvinotoolkit/openvino_notebooks#-installation-guide to install the notebooks.