Describe the issue
Thanks for the interesting work. I tried to reproduce the results of LLMLingua on the MeetingBank QA dataset with Mistral-7B as the target LLM.
The small LLM I use is https://huggingface.co/NousResearch/Llama-2-7b-hf
However, my results are much lower than those reported in Table 4 of the LLMLingua-2 paper (around 20, versus 50.45 in the paper). Here is my implementation:
from llmlingua import PromptCompressor

# Original (token-level) LLMLingua, not LLMLingua-2, with Llama-2-7B as the small compression model.
compressor = PromptCompressor(
    model_name=args.model_name,  # NousResearch/Llama-2-7b-hf
    model_config={},
    use_llmlingua2=False,
)

iterative_size = 200
comp_dict = compressor.compress_prompt(
    context=origin,                  # original MeetingBank transcript(s) to compress
    instruction="",
    question="",
    rate=args.compression_rate,      # target compression rate
    iterative_size=iterative_size,
    context_budget="*2.0",
)
I'm wondering if there is any issue with my implementation.
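For reference, a minimal sketch of how the compressed prompt could then be fed to the target Mistral-7B for the QA step. The checkpoint name, the Question/Answer prompt template, the qa_question variable, and the generation settings below are illustrative assumptions rather than the exact evaluation code; comp_dict["compressed_prompt"] is the field returned by compress_prompt.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed target LLM: the base (non-instruct) Mistral-7B checkpoint.
tok = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-v0.1")
llm = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-v0.1", torch_dtype=torch.float16, device_map="auto"
)

# Build the QA prompt from the compressed context; the template and qa_question are assumptions.
prompt = comp_dict["compressed_prompt"] + "\n\nQuestion: " + qa_question + "\nAnswer:"
inputs = tok(prompt, return_tensors="pt").to(llm.device)

# Greedy decoding of a short answer span.
out = llm.generate(
    **inputs,
    max_new_tokens=64,
    do_sample=False,
    pad_token_id=tok.eos_token_id,
)
answer = tok.decode(out[0, inputs["input_ids"].shape[1]:], skip_special_tokens=True)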
Hi @jzhang538, thank you for raising the question!

I think there are two possible reasons for this gap. The first is the LLMLingua parameters, such as iterative_size or context_budget. The second is the evaluation: note that we do not use the instruct version of Mistral in the experiments, so the model may generate lengthy responses and even append follow-up questions of its own, which drags the scores down. It is therefore necessary to truncate the responses at an appropriate point before scoring.
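Below is a minimal sketch of the kind of response truncation this suggests, assuming the base (non-instruct) Mistral keeps generating after the answer. The cut markers used here (a blank-line paragraph break and a "Question:" prefix) are illustrative heuristics, not the exact post-processing used in the paper.

def truncate_response(response: str) -> str:
    # Keep only the first paragraph of the generation.
    answer = response.strip().split("\n\n")[0]
    # Drop any self-generated follow-up question the model may append.
    for marker in ("Question:", "Q:"):
        idx = answer.find(marker)
        if idx != -1:
            answer = answer[:idx]
    return answer.strip()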