I'm interested in your LongLLMLingua results on LongBench.
I reproduced the LongBench BM25 baseline under the 2,000-token constraint using ChatGPT.
Unlike your paper's results, my performance is too high: the TREC task score is 72.5, and most of the other tasks are also high.
I would like to know how you produced the BM25 result.
I'll show you the parameters I used to reproduce BM25, and I'd appreciate it if you could tell me which one is different.
I use the same split and parameters for the other tasks (only q_format and first_inst change, following the original LongBench config).
Thank you.
first_inst="Please determine the type of the question below. Here are some examples of questions."
q_format="{input}"
question= q_format.format(input=input)
instruction=first_inst
contexts_list = df['ctxs'][i].split("\n")
contexts_list = [
"\n".join(contexts_list[ii : ii + 4]) for ii in range(0, len(contexts_list), 4)
]
compressed_prompt = llm_lingua.compress_prompt(
contexts_list,
instruction=instruction,
question=question,
target_token=1800,
condition_compare=True,
condition_in_question="after",
rank_method="bm25",
use_sentence_level_filter=False,
use_token_level_filter=False,
context_budget="+100",
dynamic_context_compression_ratio=0.4, # enable dynamic_context_compression_ratio
)
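The compressed prompt is then evaluated with ChatGPT. The exact evaluation call isn't shown here; a minimal sketch, assuming the OpenAI chat completions client and gpt-3.5-turbo as the model (the issue only says "ChatGPT"), would look like:

from openai import OpenAI  # hypothetical evaluation step, not part of the snippet above

client = OpenAI()
response = client.chat.completions.create(
    model="gpt-3.5-turbo",  # assumed model choice
    messages=[{"role": "user", "content": compressed_prompt["compressed_prompt"]}],
    temperature=0.0,
)
answer = response.choices[0].message.content  # scored against the LongBench reference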
Hi @JUNE515, thanks for your support of LLMLingua.
I checked the parameters you used and found that your actual compression rate might be relatively low. You can refer to the following code:
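(The snippet referenced above is not reproduced in this thread. As a hypothetical illustration of how one might check the effective compression rate, assuming the dict returned by compress_prompt exposes origin_tokens, compressed_tokens, and ratio fields as in recent LLMLingua releases:)

# Hypothetical check, not the maintainer's referenced code: inspect how much the
# prompt was actually compressed relative to the 2,000-token constraint.
print("origin tokens:    ", compressed_prompt["origin_tokens"])
print("compressed tokens:", compressed_prompt["compressed_tokens"])
print("compression ratio:", compressed_prompt["ratio"])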
I have one more question.
I also reproduced LongBench with LongLLMLingua under the 2,000-token constraint using ChatGPT.
But I get 22.0 on the summarization tasks (5.4 points lower), 65.1 on the few-shot tasks (4.2 points lower), and 49.4 on the code tasks (7.2 points lower).
I think my results may be low because I used the same context split method and parameters for every task.
I applied the same split method and parameters as in your code.ipynb repobench-p example.
I would like to know how you produced the LongLLMLingua results.