The llm-rag-llamaindex hasn't yet implemented Compile-mode but PRs are ready #2647

Closed
JamieVC opened this issue Jan 13, 2025 · 2 comments

@JamieVC

JamieVC commented Jan 13, 2025

Describe the bug
We saw PRs that make compile-only mode work:
huggingface/optimum-intel#873
huggingface/optimum-intel#1101

We then applied the config modification with the latest optimum-intel (1.22.0.dev0+58aec63),
but compile-only mode still doesn't work in the llm-rag-llamaindex sample (https://github.com/openvinotoolkit/openvino_notebooks/tree/latest/notebooks/llm-rag-llamaindex).

Screenshots
image

Expected behavior
The compile-only mode works to reduce the memory footprint.

Installation instructions (Please mark the checkbox)
[yes] I followed the installation guide at https://github.com/openvinotoolkit/openvino_notebooks#-installation-guide to install the notebooks.

@eaidova
Collaborator

eaidova commented Jan 13, 2025

I see that these arguments are correctly passed to the API and correctly forwarded to the optimum part via the llama-index integration, so the statement that it is not implemented is incorrect.

compile_only only has an effect if the user has compiled the model at least once. It is not expected to bring any benefit on first use (on the contrary, it requires more disk space to save the precompiled model). Possibly that is why you do not see a benefit. If not, then it is a GPU plugin issue, not an issue in openvino notebooks, llama-index, or optimum-intel.
Another possible reason: to avoid recompilation, you need to use the same openvino version on every run. If you update the openvino runtime, even without reconverting the model, the model will be recompiled. Since the llm notebooks use the nightly package as the default openvino version, the ov runtime is continuously updated, which may prevent you from seeing the advantage of compile_only.
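
For illustration, a minimal sketch of a compile-only load through optimum-intel, assuming the model has already been exported to OpenVINO IR in a local directory. `build_compile_only_kwargs` and `load_llm` are hypothetical helper names, and the `model_cache` directory and `GPU` device are placeholder values, not the configuration from the screenshot.

```python
def build_compile_only_kwargs(cache_dir: str) -> dict:
    """Assemble keyword arguments for a compile-only load (hypothetical helper)."""
    # CACHE_DIR tells OpenVINO where to store and look up the compiled model blob.
    ov_config = {"CACHE_DIR": cache_dir}
    return {"ov_config": ov_config, "compile_only": True}


def load_llm(model_dir: str, device: str = "GPU", cache_dir: str = "model_cache"):
    """Load an already exported OpenVINO model in compile-only mode."""
    # Imported lazily so the helper above can be used without optimum-intel installed.
    from optimum.intel import OVModelForCausalLM

    kwargs = build_compile_only_kwargs(cache_dir)
    # Assumption: model_dir holds an exported IR model, not original framework weights.
    return OVModelForCausalLM.from_pretrained(model_dir, device=device, **kwargs)
```

With this setup, the first run populates `model_cache`, so any saving would only be expected on later runs that reuse the cached blob.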
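
One way to notice a runtime change between runs is to record the runtime version next to the cache. This is only a sketch: `runtime_changed` and the marker file name are hypothetical and not part of openvino, optimum-intel, or the notebooks.

```python
from pathlib import Path


def runtime_changed(cache_dir: str, runtime_version: str) -> bool:
    """Return True if the cache was last used with a different runtime version.

    Hypothetical helper: stores the version string in a marker file in cache_dir.
    """
    marker = Path(cache_dir) / "ov_runtime_version.txt"
    previous = marker.read_text().strip() if marker.exists() else None
    # Record the current version for the next run.
    marker.parent.mkdir(parents=True, exist_ok=True)
    marker.write_text(runtime_version)
    return previous is not None and previous != runtime_version
```

In practice the version string could come from `openvino.get_version()`, for example `runtime_changed("model_cache", openvino.get_version())`. A `True` result suggests the model will be recompiled on that run.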

@JamieVC
Author

JamieVC commented Jan 14, 2025

Thanks!
I ran the experiment again, and it does lower the memory footprint when following the configuration in the snapshot.
